Towards Identity-Aware Cross-Modal Retrieval: a Dataset and a Baseline

N Messina, L Vadicamo, L Maltese… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in deep learning have significantly enhanced content-based retrieval
methods, notably through models like CLIP that map images and texts into a shared …

Multimedia Information Retrieval in XR

R Arnold, W Bailer, R Gasser, BÞ Jónsson… - Proceedings of the …, 2024 - dl.acm.org
The way we create, consume and interact with multimedia content has changed significantly
in recent years with the advent of affordable recording devices and easy sharing and access …

Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment

K Schall, KU Barthel, N Hezel, K Jung - International Conference on …, 2024 - Springer
Contrastive Language and Image Pairing (CLIP), a transformative method in
multimedia retrieval, typically trains two neural networks concurrently to generate joint …

Comparative Analysis of Relevance Feedback Techniques for Image Retrieval

L Vadicamo, F Scotti, A Dearle, R Connor - International Conference on …, 2025 - Springer
Relevance feedback mechanisms have garnered significant attention in content-based
image and video retrieval thanks to their effectiveness in refining search results to better …

Interactive Video Search with Multi-modal LLM Video Captioning

YT Cheng, J Wu, Z Ma, J He, XY Wei… - … Conference on Multimedia …, 2025 - Springer
Cross-modal representation learning is essential for interactive text-to-video search tasks.
However, the representation learning is limited by the size and quality of video-caption pairs …

PraK Tool V3: Enhancing Video Item Search Using Localized Text and Texture Queries

M Stroh, V Kloda, B Verner, Z Vopálková… - … on Multimedia Modeling, 2025 - Springer
We present a third version of the PraK system designed around an effective text-image and
image-image search model. The system integrates sub-image search options for localized …

IMSearch 2.0: Toward User-Centric and Efficient Interactive Multimedia Retrieval System

DT Luu, KAC Quan, DN Nguyen, KL Bui-Le… - … on Multimedia Modeling, 2025 - Springer
The rapid growth of the internet and technology has led to an exponential increase in the
volume of information individuals must manage, which results in a rising demand for search …

ViFi: A Video Finding System at Video Browser Showdown 2025

KAC Quan, QN Nguyen, MT Tran - International Conference on Multimedia …, 2025 - Springer
This paper presents ViFi, a Video Finding System for Video Browser Showdown 2025. Our
retrieval system is mainly based on SigLIP, a recent and robust visual-textual …

NII-UIT at VBS2025: Multimodal Video Retrieval with LLM Integration and Dynamic Temporal Search

BT Gia, TBC Khanh, TLT Thanh, TT Doan, K Le… - … on Multimedia Modeling, 2025 - Springer
In summary, our innovative retrieval system for interactive video search, developed for the
VBS 2025 competition, significantly elevates the user experience through the utilization of …

[BOOK][B] MultiMedia Modeling: 31st International Conference on Multimedia Modeling, MMM 2025, Nara, Japan, January 8–10, 2025, Proceedings, Part IV

I Ide - books.google.com
It is with great pleasure that we welcome you to the 31st International Conference on
Multimedia Modeling (MMM 2025), held from January 8 to 10, 2025, in the historic city of …