Survey on multi-output learning

D Xu, Y Shi, IW Tsang, YS Ong… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
The aim of multi-output learning is to simultaneously predict multiple outputs given an input.
It is an important learning problem for decision-making since making decisions in the real …

Automatic image and video caption generation with deep learning: A concise review and algorithmic overlap

S Amirian, K Rasheed, TR Taha, HR Arabnia - IEEE access, 2020 - ieeexplore.ieee.org
Methodologies that utilize Deep Learning offer great potential for applications that
automatically attempt to generate captions or descriptions about images and video frames …

Exploring visual relationship for image captioning

T Yao, Y Pan, Y Li, T Mei - Proceedings of the European …, 2018 - openaccess.thecvf.com
It is always well believed that modeling relationships between objects would be helpful for
representing and eventually describing an image. Nevertheless, there has not been …

Survey on deep neural networks in speech and vision systems

M Alam, MD Samad, L Vidyaratne, A Glandon… - Neurocomputing, 2020 - Elsevier
This survey presents a review of state-of-the-art deep neural network architectures,
algorithms, and systems in speech and vision applications. Recent advances in deep …

A survey on automatic image caption generation

S Bai, S An - Neurocomputing, 2018 - Elsevier
Image captioning means automatically generating a caption for an image. As a recently
emerged research area, it is attracting more and more attention. To achieve the goal of …

Deep multi-path convolutional neural network joint with salient region attention for facial expression recognition

S Xie, H Hu, Y Wu - Pattern recognition, 2019 - Elsevier
Facial Expression Recognition (FER) has long been a challenging task in the field
of computer vision. In this paper, we present a novel model, named Deep Attentive Multi …

Learning visual relationship and context-aware attention for image captioning

J Wang, W Wang, L Wang, Z Wang, DD Feng, T Tan - Pattern Recognition, 2020 - Elsevier
Image captioning which automatically generates natural language descriptions for images
has attracted lots of research attentions and there have been substantial progresses with …

Incorporating attentive multi-scale context information for image captioning

J Prudviraj, Y Sravani, CK Mohan - Multimedia Tools and Applications, 2023 - Springer
In this paper, we propose a novel encoding framework to learn the multi-scale context
information of the visual scene for image captioning task. The devised multi-scale context …

Robust change captioning

DH Park, T Darrell, A Rohrbach - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Describing what has changed in a scene can be useful to a user, but only if generated text
focuses on what is semantically relevant. It is thus important to distinguish distractors (e.g. a …

Changes to captions: An attentive network for remote sensing change captioning

S Chang, P Ghamisi - IEEE Transactions on Image Processing, 2023 - ieeexplore.ieee.org
In recent years, advanced research has focused on the direct learning and analysis of
remote-sensing images using natural language processing (NLP) techniques. The ability to …