MLM: a benchmark dataset for multitask learning with multiple languages and modalities

J Armitage, E Kacupaj, G Tahmasebzadeh… - Proceedings of the 29th …, 2020 - dl.acm.org
In this paper, we introduce the MLM (Multiple Languages and Modalities) dataset, a new
resource to train and evaluate multitask systems on samples in multiple modalities and three …

Topic-based image caption generation

SK Dash, S Acharya, P Pakray, R Das… - Arabian Journal for …, 2020 - Springer
Image captioning is to generate captions for a given image based on the content of the
image. To describe an image efficiently, it requires extracting as much information from it as …

Multimodal learning based spatial relation identification

SK Dash, YV Sureshchandra, Y Mishra… - Computación y …, 2020 - scielo.org.mx
Spatial Relation identification is one of the integral parts of Spatial Information Retrieval. It
deals with identifying the spatially related objects in view of their physical orientation or …

Image retrieval system for citizen services using penalized logistic regression models

E de Ves, X Benavent, G Ayala… - Proceedings of the 10th …, 2020 - dl.acm.org
This paper describes a procedure to deal with large image collections obtained by smart city
services based on interaction with citizens providing pictures. The semantic gap between …

Cross-modal retrieval by an end to end way

BH Yin, X Li - IOP Conference Series: Materials Science and …, 2020 - iopscience.iop.org
Cross-modal retrieval has attracted most attention in the recent years. For the image and
text, how to measure the semantic similarity among them is still a challenging problem in …