Multi-view graph convolutional networks with attention mechanism

K Yao, J Liang, J Liang, M Li, F Cao - Artificial Intelligence, 2022 - Elsevier
Recent advances in graph convolutional networks (GCNs), which mainly focus on how to
exploit information from different hops of neighbors in an efficient way, have brought …

Word representation learning in multimodal pre-trained transformers: An intrinsic evaluation

S Pezzelle, E Takmaz, R Fernández - Transactions of the Association …, 2021 - direct.mit.edu
This study carries out a systematic intrinsic evaluation of the semantic representations
learned by state-of-the-art pre-trained multimodal Transformers. These representations are …

Language with vision: A study on grounded word and sentence embeddings

H Shahmohammadi, M Heitmeier… - Behavior Research …, 2024 - Springer
Grounding language in vision is an active field of research seeking to construct cognitively
plausible word and sentence representations by incorporating perceptual knowledge from …

Leverage points in modality shifts: Comparing language-only and multimodal word representations

A Tikhonov, L Bylinina, D Paperno - arXiv preprint arXiv:2306.02348, 2023 - arxiv.org
Multimodal embeddings aim to enrich the semantic information in neural representations of
language compared to text-only models. While different embeddings exhibit different …

Exploring diagram-based visual problem representation and relational abstraction

CD Nath, SM Hazarika - Spatial Cognition & Computation, 2025 - Taylor & Francis
For visual information processing, the derivation of meaningful low-level spatio-temporal
information is challenging. In line with human visualisation and perception in spatial …

Context-Aware Temporal Embeddings for Text and Video Data

A Farhan - 2023 - search.proquest.com
Recent years have seen an exponential increase in unstructured data, primarily in the form
of text, images, and videos. Extracting useful features and trends from large-scale …

Traffic sign recognition and distance estimation with YOLOv3 model

GSR Nath, J Acharjee, S Deb - 2021 International Conference …, 2021 - ieeexplore.ieee.org
Due to the expeditious increase in the number of vehicles, there is an increase in the
number of road casualties even in a highly sophisticated roadway. This depicts the natural …

Semantic object-scene inconsistencies affect eye movements, but not in the way predicted by contextualized meaning maps

MA Pedziwiatr, M Kümmerer, TSA Wallis… - Journal of …, 2022 - jov.arvojournals.org
Semantic information is important in eye movement control. An important semantic influence
on gaze guidance relates to object-scene relationships: objects that are semantically …

Towards contextual learning in few-shot object classification

MP Fortin, B Chaib-draa - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Few-shot Learning (FSL) aims to classify new concepts from a small number of examples.
While there has been an increasing amount of work on few-shot object classification in the …

VizObj2Vec: Contextual representation learning for visual objects in video-frames

A Farhan, MS Hossain - … Conference on Big Data (Big Data), 2020 - ieeexplore.ieee.org
While the use of the distributional hypothesis has become popular in creating embedding for
text corpus, it is rarely used for generating the contextual (distributed) representation of …