Seeing voices and hearing faces: Cross-modal biometric matching

A Nagrani, S Albanie… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
We introduce a seemingly impossible task: given only an audio clip of someone speaking,
decide which of two face images is the speaker. In this paper we study this, and a number of …

Disjoint mapping network for cross-modal matching of voices and faces

Y Wen, MA Ismail, W Liu, B Raj, R Singh - arXiv preprint arXiv:1807.04836, 2018 - arxiv.org
We propose a novel framework, called Disjoint Mapping Network (DIMNet), for cross-modal
biometric matching, in particular of voices and faces. Different from the existing methods …

On learning associations of faces and voices

C Kim, HV Shin, TH Oh, A Kaspar, M Elgharib… - Computer Vision–ACCV …, 2019 - Springer
In this paper, we study the associations between human faces and voices. Audiovisual
integration, specifically the integration of facial and vocal information is a well-researched …

Disentangled representation learning for cross-modal biometric matching

H Ning, X Zheng, X Lu, Y Yuan - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Cross-modal biometric matching (CMBM) aims to determine the corresponding voice from a
face, or identify the corresponding face from a voice. Recently, many CMBM methods have …

Speech2Face: Learning the face behind a voice

TH Oh, T Dekel, C Kim, I Mosseri… - Proceedings of the …, 2019 - openaccess.thecvf.com
How much can we infer about a person's looks from the way they speak? In this paper, we
study the task of reconstructing a facial image of a person from a short audio recording of …

Name-it: Association of face and name in video

S Satoh, T Kanade - … of IEEE Computer Society Conference on …, 1997 - ieeexplore.ieee.org
This paper proposes a novel approach to extract meaningful content information from video
by collaborative integration of image understanding and natural language processing. As an …

Visual acoustic matching

C Chen, R Gao, P Calamia… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce the visual acoustic matching task, in which an audio clip is transformed to
sound like it was recorded in a target environment. Given an image of the target environment …

Face recognition in unconstrained videos with matched background similarity

L Wolf, T Hassner, I Maoz - CVPR 2011, 2011 - ieeexplore.ieee.org
Recognizing faces in unconstrained videos is a task of mounting importance. While
obviously related to face recognition in still images, it has its own unique characteristics and …

The gender gap in face recognition accuracy is a hairy problem

A Bhatta, V Albiero, KW Bowyer… - Proceedings of the …, 2023 - openaccess.thecvf.com
It is broadly accepted that there is a "gender gap" in face recognition accuracy, with females
having higher false match and false non-match rates. However, relatively little is known …

Disentangled variational representation for heterogeneous face recognition

X Wu, H Huang, VM Patel, R He, Z Sun - … of the AAAI conference on artificial …, 2019 - aaai.org
Visible (VIS) to near infrared (NIR) face matching is a challenging problem due to the
significant domain discrepancy between the domains and a lack of sufficient data for training …