Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden Markov models

JS Chung, A Zisserman - … Vision–ACCV 2016 Workshops: ACCV 2016 …, 2017 - Springer

The goal of this work is to determine the audio-video synchronisation between mouth motion
and speech in a video. We propose a two-stream ConvNet architecture that enables the …

Cited by 802 · Related articles · All 8 versions

Audio-visual speech and gesture recognition by sensors of mobile devices

D Ryumin, D Ivanko, E Ryumina - Sensors, 2023 - mdpi.com

Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable
speech recognition, particularly when audio is corrupted by noise. Additional visual …

Cited by 74 · Related articles · All 9 versions

Biometric antispoofing methods: A survey in face recognition

J Galbally, S Marcel, J Fierrez - IEEE Access, 2014 - ieeexplore.ieee.org

In recent decades, we have witnessed the evolution of biometric technology from the first
pioneering works in face and voice recognition to the current state of development wherein a …

Cited by 522 · Related articles · All 11 versions

Audio-visual biometric recognition and presentation attack detection: A comprehensive survey

H Mandalapu, AR PN, R Ramachandra, KS Rao… - IEEE …, 2021 - ieeexplore.ieee.org

Biometric recognition is a trending technology that uses unique characteristics data to
identify or verify/authenticate security applications. Amidst the classically used biometrics …

Cited by 35 · Related articles · All 8 versions

3D convolutional neural networks for cross audio-visual matching recognition

A Torfi, SM Iranmanesh, N Nasrabadi, J Dawson - IEEE Access, 2017 - ieeexplore.ieee.org

Audio–visual recognition (AVR) has been considered as a solution for speech recognition
tasks when the audio is corrupted, as well as a visual recognition method used for speaker …

Cited by 142 · Related articles · All 7 versions

Audio-visual speech recognition with scattering operators

E Marcheret, J Vopicka, V Goel - US Patent 9,697,833, 2017 - Google Patents

Aspects described herein are directed towards methods, computing devices, systems, and
computer-readable media that apply scattering operations to extracted visual features of …

Cited by 47 · Related articles · All 4 versions

Multimodal activation: Awakening dialog robots without wake words

L Nie, M Jia, X Song, G Wu, H Cheng, J Gu - Proceedings of the 44th …, 2021 - dl.acm.org

When talking to the dialog robots, users have to activate the robot first from the standby
mode with special wake words, such as "Hey Siri", which is apparently not user-friendly. The …

Cited by 12 · Related articles

Presentation attack detection based on score level fusion and challenge-response technique

CL Chou - The Journal of Supercomputing, 2021 - Springer

Biometrics is the state of the art in dealing with identity identification and verification based
on the physical and behavioral characteristics and widely used in the fields of Fintech, such …

Cited by 15 · Related articles · All 2 versions

Detecting audio-visual synchrony using deep neural networks

E Marcheret, G Potamianos, J Vopicka, V Goel - INTERSPEECH, 2015 - isca-archive.org

In this paper, we address the problem of automatically detecting whether the audio and
visual speech modalities in frontal pose videos are synchronous or not. This is of interest in …

Cited by 40 · Related articles · All 7 versions

Lip activity detection for talking faces classification in TV-content

M Bendris, D Charlet, G Chollet - International conference on …, 2010 - academia.edu

Our objective is to index people in a TV-Content. In this context, because of multi-face shots
and non-speaking face shots, it is difficult to determine which face is speaking. There is no …

Cited by 47 · Related articles · All 3 versions
