AutoAMS: Automated attention-based multi-modal graph learning architecture search

R Al-Sabri, J Gao, J Chen, BM Oloulade, Z Wu - Neural Networks, 2024 - Elsevier
Multi-modal attention mechanisms have been successfully used in multi-modal graph
learning for various tasks. However, existing attention-based multi-modal graph learning …

Self-supervised incremental learning of object representations from arbitrary image sets

G Leotescu, AI Popa, D Grigore, D Voinea, P Perona - 2025 - amazon.science
Computing a comprehensive and robust visual representation of an arbitrary object or
category of objects is a complex problem. The difficulty increases when one starts from a set …

De-noised Vision-language Fusion Guided by Visual Cues for E-commerce Product Search

Z Hu, S Li, M Du, A Dhua… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
In e-commerce applications, vision-language multimodal transformer models play a pivotal
role in product search. The key to successfully training a multimodal model lies in the …