A Nagrani, S Albanie, A Zisserman - Conference on Computer Vision …, 2018 - ora.ox.ac.uk
We introduce a seemingly impossible task: given only an audio clip of someone speaking,
decide which of two face images is the speaker. In this paper we study this, and a number of …