VISIONE at Video Browser Showdown 2023

G Amato, P Bolettieri, F Carrara, F Falchi… - … on multimedia modeling, 2023 - Springer
In this paper, we present the fourth release of VISIONE, a tool for fast and effective video
search on a large-scale dataset. It includes several search functionalities like text search …

MarineInst: A foundation model for marine image analysis with instance visual description

Z Zheng, Y Chen, H Zeng, TA Vu, BS Hua… - … on Computer Vision, 2025 - Springer
Recent foundation models trained on a tremendous scale of data have shown great promise
in a wide range of computer vision tasks and application domains. However, less attention …

Vibro: video browsing with semantic and visual image embeddings

K Schall, N Hezel, K Jung, KU Barthel - International Conference on …, 2023 - Springer
Vibro represents a powerful tool for interactive video retrieval and browsing and is the
winner of the Video Browser Showdown 2022. Following the saying of “never change a …

Exploring effective interactive text-based video search in vitrivr

L Sauter, R Gasser, S Heller, L Rossetto… - … on Multimedia Modeling, 2023 - Springer
vitrivr is a general purpose retrieval system that supports a wide range of query
modalities. In this paper, we briefly introduce the system and describe the changes and …

A Survey of Video Datasets for Grounded Event Understanding

K Sanders, B Van Durme - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
While existing video benchmarks largely consider specialized downstream tasks like
retrieval or question-answering (QA), contemporary multimodal AI systems must be capable …

VideoCLIP: an interactive CLIP-based video retrieval system at VBS2023

TN Nguyen, B Puangthamawathanakun… - … on Multimedia Modeling, 2023 - Springer
In this paper, we present an interactive video retrieval system named VideoCLIP developed
for the Video Browser Showdown 2023. To support users in solving retrieval tasks, the …

DiveXplore at the Video Browser Showdown 2024

K Schoeffmann, S Nasirihaghighi - International Conference on Multimedia …, 2024 - Springer
According to our experience from VBS2023 and the feedback from the IVR4B special
session at CBMI2023, we have largely revised the diveXplore system for VBS2024. It now …

Exploring boundary of GPT-4V on marine analysis: A preliminary case study

Z Zheng, Y Chen, J Zhang, TA Vu, H Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated a powerful ability to answer various
queries as a general-purpose assistant. The continuous multi-modal large language models …

VISIONE: a large-scale video retrieval system with advanced search functionalities

G Amato, P Bolettieri, F Carrara, F Falchi… - Proceedings of the …, 2023 - dl.acm.org
VISIONE is a large-scale video retrieval system that integrates multiple search
functionalities, including free text search, spatial color and object search, visual and …

Video search with CLIP and interactive text query reformulation

J Lokoč, Z Vopálková, P Dokoupil, L Peška - International Conference on …, 2023 - Springer
Nowadays, deep learning based models like CLIP allow simple design of cross-modal video
search systems that are able to solve many tasks considered as highly challenging several …