A comprehensive survey on multi-modal conversational emotion recognition with deep learning

Y Shou, T Meng, W Ai, N Yin, K Li - arXiv preprint arXiv:2312.05735, 2023 - arxiv.org
Multi-modal conversation emotion recognition (MCER) aims to recognize and track the
speaker's emotional state using text, speech, and visual information in the conversation …

An attention-based, context-aware multimodal fusion method for sarcasm detection using inter-modality inconsistency

Y Li, Y Li, S Zhang, G Liu, Y Chen, R Shang… - Knowledge-Based …, 2024 - Elsevier
Sarcasm, a subtle and complex form of expression, presents significant challenges in
detection, especially in the context of social media and meta universe applications where …

Biometrics in extended reality: a review

A Agarwal, R Ramachandra, S Venkatesh… - Discover Artificial …, 2024 - Springer
In the domain of Extended Reality (XR), particularly Virtual Reality (VR), extensive research
has been devoted to harnessing this transformative technology in various real-world …

Easy, Interpretable, Effective: openSMILE for voice deepfake detection

O Pascu, D Oneata, H Cucu, NM Müller - arXiv preprint arXiv:2408.15775, 2024 - arxiv.org
In this paper, we demonstrate that attacks in the latest ASVspoof5 dataset--a de facto
standard in the field of voice authenticity and deepfake detection--can be identified with …

Continuous Speech-Based Fatigue Detection and Transition State Prediction for Air Traffic Controllers

S Vekkot, ST Chavali, CT Kandavalli, RSA Podila… - IEEE …, 2024 - ieeexplore.ieee.org
Air traffic controllers (ATC) play a critical role in ensuring aviation safety, but their demanding
workload can lead to fatigue, potentially compromising their performance. This paper …

Source and system-based modulation approach for fake speech detection

R Sadashiv TN, D Kumar, A Agarwal, M Tzudir… - … Conference on Speech …, 2023 - Springer
The advancement of deep learning technology in speech generation has made fake speech
almost perceptually indistinguishable from real speech. Most of the attempts in literature are …

Spoofing countermeasure for fake speech detection using brute force features

AR Mirza, AK Al-Talabani - Computer Speech & Language, 2025 - Elsevier
Due to the progress in deep learning technology, techniques that generate spoofed speech
have significantly emerged. Such synthetic speech can be exploited for harmful purposes …

Enhancing Multimodal Emotional Information Extraction in Film and Television through Adaptive Feature Fusion with DenseNet, Transformer, and 3D CNN Models

S Liang - Applied Artificial Intelligence, 2024 - Taylor & Francis
The extraction of multimodal emotional information enables a more nuanced representation
of the emotional subtleties embedded in film and television works. However, conventional …

Driver Speech Detection in Real Driving Scenario

M Bhattacharjee, S Baghel, SRM Prasanna - International Conference on …, 2023 - Springer
Developing high-quality artificial intelligence based driver assistance systems is an active
research area. One critical challenge is developing efficient methods to detect a driver …

Fake Speech Detection in Domain Variability Scenario

RS TN, A Agarwal… - 2024 National Conference …, 2024 - ieeexplore.ieee.org
Domain variability refers to a condition where the train and test speech data are from
different environments. This seems to be challenging to deal with in fake speech detection …