Automatic image and video caption generation with deep learning: A concise review and algorithmic overlap

S Amirian, K Rasheed, TR Taha, HR Arabnia - IEEE access, 2020 - ieeexplore.ieee.org
Methodologies that utilize Deep Learning offer great potential for applications that
automatically attempt to generate captions or descriptions about images and video frames …

FUNSD: A dataset for form understanding in noisy scanned documents

G Jaume, HK Ekenel, JP Thiran - … International Conference on …, 2019 - ieeexplore.ieee.org
We present a new dataset for form understanding in noisy scanned documents (FUNSD)
that aims at extracting and structuring the textual content of forms. The dataset comprises …

Arabic Optical Character Recognition: A Review.

S Alghyaline - CMES-Computer Modeling in Engineering & …, 2023 - researchgate.net
This study aims to review the latest contributions in Arabic Optical Character Recognition
(OCR) during the last decade, which helps interested researchers know the existing …

OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment

T Hegghammer - Journal of Computational Social Science, 2022 - Springer
Optical Character Recognition (OCR) can open up understudied historical
documents to computational analysis, but the accuracy of OCR software varies. This article …

What we teach about race and gender: Representation in images and text of children's books

A Adukia, A Eble, E Harrison… - The Quarterly Journal …, 2023 - academic.oup.com
Books shape how children learn about society and norms, in part through representation of
different characters. We use computational tools to characterize representation in children's …

Two-step CNN framework for text line recognition in camera-captured images

YS Chernyshova, AV Sheshkus, VV Arlazarov - IEEE Access, 2020 - ieeexplore.ieee.org
In this paper, we introduce an “on the device” text line recognition framework that is
designed for mobile or embedded systems. We consider per-character segmentation as a …

The ethics of artificial intelligence in pathology and laboratory medicine: principles and practice

BR Jackson, Y Ye, JM Crawford… - Academic …, 2021 - journals.sagepub.com
Growing numbers of artificial intelligence applications are being developed and applied to
pathology and laboratory medicine. These technologies introduce risks and benefits that …

Text-MCL: Autonomous mobile robot localization in similar environment using text-level semantic information

G Ge, Y Zhang, W Wang, Q Jiang, L Hu, Y Wang - Machines, 2022 - mdpi.com
Localization is one of the most important issues in mobile robotics, especially when an
autonomous mobile robot performs a navigation task. The current and popular occupancy …

Which OCR toolset is good and why: A comparative study

P Jain, K Taneja, H Taneja - Kuwait Journal of Science, 2021 - journalskuwait.org
Optical Character Recognition (OCR) is a very active research area in many
scientific disciplines like pattern recognition, natural language processing (NLP), computer …

A short review on image caption generation with deep learning

S Amirian, K Rasheed, TR Taha… - Proceedings of the …, 2019 - search.proquest.com
Methodologies that utilize Deep Learning offer great potential for applications that
automatically attempt to generate captions or descriptions about images. Image captioning …