This study carries out a systematic intrinsic evaluation of the semantic representations learned by state-of-the-art pre-trained multimodal Transformers. These representations are …
Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from …
Multimodal embeddings aim to enrich the semantic information in neural representations of language compared to text-only models. While different embeddings exhibit different …
CD Nath, SM Hazarika - Spatial Cognition & Computation, 2025 - Taylor & Francis
For visual information processing, the derivation of meaningful low-level spatio-temporal information is challenging. In line with human visualisation and perception in spatial …
Recent years have seen an exponential increase in unstructured data, primarily in the form of text, images, and videos. Extracting useful features and trends from large-scale …
GSR Nath, J Acharjee, S Deb - 2021 International Conference …, 2021 - ieeexplore.ieee.org
Due to the rapid increase in the number of vehicles, road casualties are increasing even on highly sophisticated roadways. This depicts the natural …
Semantic information is important in eye movement control. An important semantic influence on gaze guidance relates to object-scene relationships: objects that are semantically …
MP Fortin, B Chaib-draa - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Few-shot Learning (FSL) aims to classify new concepts from a small number of examples. While there has been an increasing amount of work on few-shot object classification in the …
A Farhan, MS Hossain - … Conference on Big Data (Big Data), 2020 - ieeexplore.ieee.org
While the use of the distributional hypothesis has become popular in creating embeddings for text corpora, it is rarely used for generating the contextual (distributed) representation of …