Hyperbolic audio-visual zero-shot learning

J Hong, Z Hayder, J Han, P Fang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Audio-visual zero-shot learning aims to classify samples consisting of a pair of
corresponding audio and video sequences from classes that are not present during training …

Boosting Audio-visual Zero-shot Learning with Large Language Models

H Chen, Y Li, Y Hong, Z Huang, Z Xu, Z Gu… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-visual zero-shot learning aims to recognize unseen categories based on paired
audio-visual sequences. Recent methods mainly focus on learning aligned and discriminative …

Out-Of-Distribution Detection for Audio-visual Generalized Zero-Shot Learning: A General Framework

L Wen - arXiv preprint arXiv:2408.01284, 2024 - arxiv.org
Generalized Zero-Shot Learning (GZSL) is a challenging task requiring accurate
classification of both seen and unseen classes. Within this domain, Audio-visual GZSL …

Enhancing Multi-modal Contrastive Learning via Optimal Transport-Based Consistent Modality Alignment

S Zhu, D Luo - Chinese Conference on Pattern Recognition and …, 2024 - Springer
Multi-modal contrastive learning has gained significant attention in recent years due to the
rapid growth of multi-modal data and the increasing application demands in practice, e.g. …