Sparks of large audio models: A survey and outlook

S Latif, M Shoukat, F Shamshad, M Usama… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey paper provides a comprehensive overview of the recent advancements and
challenges in applying large language models to the field of audio signal processing. Audio …

Transformers in speech processing: A survey

S Latif, A Zaidi, H Cuayahuitl, F Shamshad… - arXiv preprint arXiv …, 2023 - arxiv.org
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …

Dialogue management and language generation for a robust conversational virtual coach: Validation and user study

A Vázquez, A López Zorrilla, JM Olaso, MI Torres - Sensors, 2023 - mdpi.com
Designing human–machine interactive systems requires cooperation between different
disciplines. In this work, we present a Dialogue Manager and a Language …

Speaking of accent: A content analysis of accent misconceptions in ASR research

K Prinos, N Patwari, CA Power - The 2024 ACM Conference on Fairness …, 2024 - dl.acm.org
Automatic speech recognition (ASR) researchers are working to address the differing
transcription performance of ASR by accent or dialect. However, research often has a limited …

Speech Classification using Acoustic Embedding and Large Language Models Applied on Alzheimer's Disease Prediction Task

M Kheirkhahzadeh - 2023 - diva-portal.org
Alzheimer's disease is a neurodegenerative disease that leads to dementia. It can begin
silently in the early stages and progresses over the years to a severe and incurable stage …

On the Use of Audio to Improve Dialogue Policies

D Roncel, F Costa, J Hernando - arXiv preprint arXiv:2410.13385, 2024 - arxiv.org
With the significant progress of speech technologies, spoken goal-oriented dialogue
systems are becoming increasingly popular. One of the main modules of a dialogue system …

On the Use of Audio to Improve Dialogue Policies

DR Díaz, F Costa, J Hernando - isca-archive.org
With the significant progress of speech technologies, spoken goal-oriented dialogue
systems are becoming increasingly popular. One of the main modules of a dialogue system …

Audio embeddings for chatbots

D Roncel Díaz - 2024 - upcommons.upc.edu
Spoken goal-oriented dialogue systems are increasingly popular for task management.
They are composed of multiple modules, and one of the most important ones is the dialogue …

ANOMALOUS SOUND DETECTION WITH THREE-SUBNETWORKS AND PRE-TRAINED MODELS Technical Report

T Wu, J Wen, Z Yan, X Cheng - dcase.community
Unsupervised pretrained models have been used successfully in a wide range of scenarios.
This report presents our work for DCASE 2024 Task 2: First-shot unsupervised anomalous …

[CITATION][C] Dialogue Management and Language Generation for a Robust Conversational Virtual Coach: Validation and User Study

A Vázquez Risco, A López Zorrilla… - 2023 - MDPI