Domain specialization as the key to make large language models disruptive: A comprehensive survey

C Ling, X Zhao, J Lu, C Deng, C Zheng, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have significantly advanced the field of natural language
processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of …

Transformers in speech processing: A survey

S Latif, A Zaidi, H Cuayáhuitl, F Shamshad… - arXiv preprint arXiv …, 2023 - arxiv.org
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …

A survey on deep reinforcement learning for audio-based applications

S Latif, H Cuayáhuitl, F Pervez, F Shamshad… - Artificial Intelligence …, 2023 - Springer
Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence
(AI) by endowing autonomous systems with high levels of understanding of the real world …

Audio Embedding-Aware Dialogue Policy Learning

AL Zorrilla, MI Torres… - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Following the success of Natural Language Processing (NLP) transformers pretrained via
self-supervised learning, similar models have been proposed recently for speech …

On the Use of Audio to Improve Dialogue Policies

D Roncel, F Costa, J Hernando - arXiv preprint arXiv:2410.13385, 2024 - arxiv.org
With the significant progress of speech technologies, spoken goal-oriented dialogue
systems are becoming increasingly popular. One of the main modules of a dialogue system …

On the Use of Audio to Improve Dialogue Policies

DR Díaz, F Costa, J Hernando - isca-archive.org
With the significant progress of speech technologies, spoken goal-oriented dialogue
systems are becoming increasingly popular. One of the main modules of a dialogue system …

The Use of Reinforcement Learning in Dialogue System Research

河野誠也, 吉野幸一郎 - 人工知能, 2022 - jstage.jst.go.jp
is also often called control (dialogue management: DM). Natural language generation (NLG)
takes the action frame output by dialogue management as input and generates the corresponding natural language. Finally, …