Learning more universal representations for transfer-learning

Y Tamaazousti, H Le Borgne, C Hudelot… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
A representation is supposed universal if it encodes any element of the visual world (e.g.,
objects, scenes) in any configuration (e.g., scale, context). While not expecting pure universal …

Humans meet models on object naming: A new dataset and analysis

C Silberer, S Zarrieß, M Westera… - Scott D, Bel N, Zong C …, 2020 - repositori.upf.edu
We release ManyNames v2 (MN v2), a verified version of an object naming dataset that
contains dozens of valid names per object for 25K images. We analyze issues in the data …

Enhancing the quality of image tagging using a visio-textual knowledge base

C Chaudhary, P Goyal, DN Prasad… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Auto-tagging of images is important for image understanding and for tag-based applications
viz. image retrieval, visual question-answering, image captioning, etc. Although existing …

Mucale-net: Multi categorical-level networks to generate more discriminating features

Y Tamaazousti, H Le Borgne… - Proceedings of the …, 2017 - openaccess.thecvf.com
In a transfer-learning scheme, the intermediate layers of a pre-trained CNN are employed as
universal image representation to tackle many visual classification problems. The current …

A survey on automatic image captioning

G Srivastava, R Srivastava - … , ICMC 2018, Varanasi, India, January 9-11 …, 2018 - Springer
Automatic image captioning is the process of providing natural language captions for
images automatically. Considering the huge number of images available in recent time …

Image retrieval for complex queries using knowledge embedding

C Chaudhary, P Goyal, N Goyal… - ACM Transactions on …, 2020 - dl.acm.org
With the increase in popularity of image-based applications, users are retrieving images
using more sophisticated and complex queries. We present three types of complex queries …

Building a voice based image caption generator with deep learning

M Anu, S Divya - 2021 5th International Conference on …, 2021 - ieeexplore.ieee.org
Image processing is used in various industries and it is remaining as one of the most
advanced technologies used in Google, medical field etc. Recently, this technology has also …

Enhancing Cross-Linguistic Image Caption Generation with Indian Multilingual Voice Interfaces using Deep Learning Techniques

VA Sangolgi, MB Patil, SS Vidap, SS Doijode… - Procedia Computer …, 2024 - Elsevier
The Multilingual Voice-Based Image Caption Generator (MVBICG) is a versatile tool
with numerous applications spanning communications, culture preservation, business, and …

Synthetic textual features for the large-scale detection of basic-level categories in English and Mandarin

Y Chen, S Teufel - Proceedings of the 2021 Conference on …, 2021 - aclanthology.org
Basic-level categories (BLC) are an important psycholinguistic concept introduced by Rosch
et al. (1976); they are defined as the most inclusive categories for which a concrete mental …

On the universality of visual and multimodal representations

Y Tamaazousti - Theses, Université Paris-Saclay, 2018 - people.csail.mit.edu
Artificial Intelligence is a hot topic that is on everyone's lips, in the news, in industry and
even in politics, because of the key societal, economic and cultural challenges it implies [75 …