Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers

S Bengesi, H El-Sayed, MK Sarker, Y Houkpati… - IEEE …, 2024 - ieeexplore.ieee.org
The launch of ChatGPT in 2022 garnered global attention, marking a significant milestone in
the Generative Artificial Intelligence (GAI) field. While GAI has been in effect for the past …

Linguistic analysis of human-computer interaction

G Zellou, N Holliday - Frontiers in Computer Science, 2024 - frontiersin.org
This article reviews recent literature investigating speech variation in production and
comprehension during spoken language communication between humans and devices …

A virtual simulation-pilot agent for training of air traffic controllers

J Zuluaga-Gomez, A Prasad, I Nigmatulina, P Motlicek… - Aerospace, 2023 - mdpi.com
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic
controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) …

Enhancing Semantic Communication with Deep Generative Models: An ICASSP Special Session Overview

E Grassucci, Y Mitsufuji, P Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is poised to play a pivotal role in shaping the landscape of future
AI-driven communication systems. Its challenge of extracting semantic information from the …

A method of noise reduction for radio communication signal based on RaGAN

L Peng, S Fang, Y Fan, M Wang, Z Ma - Sensors, 2023 - mdpi.com
Radio signals are polluted by noise in the process of channel transmission, which will lead
to signal distortion. Noise reduction of radio signals is an effective means to eliminate the …

Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

H Barakat, O Turk, C Demiroglu - EURASIP Journal on Audio, Speech, and …, 2024 - Springer
Speech synthesis has made significant strides thanks to the transition from machine learning
to deep learning models. Contemporary text-to-speech (TTS) models possess the capability …

CharacterMeet: Supporting Creative Writers' Entire Story Character Construction Processes Through Conversation with LLM-Powered Chatbot Avatars

HX Qin, S Jin, Z Gao, M Fan, P Hui - … of the CHI Conference on Human …, 2024 - dl.acm.org
Support for story character construction is as essential as characters are for stories. Building
upon past research on early character construction stages, we explore how conversation …

A multi-task learning speech synthesis optimization method based on CWT: a case study of Tacotron2

G Hu, Z Ruan, W Guo, Y Quan - EURASIP Journal on Advances in Signal …, 2024 - Springer
Text-to-speech synthesis plays an essential role in facilitating human-computer interaction.
Currently, the predominant approach in Text-to-speech acoustic models selects only the Mel …

Synthesizing Lithuanian voice replacement for laryngeal cancer patients with Pareto-optimized flow-based generative synthesis network

R Maskeliunas, R Damasevicius, A Kulikajevas… - Applied Acoustics, 2024 - Elsevier
This study presents a Pareto-optimized flow-based generative network for speech synthesis,
the P-GLOW model, in Lithuanian speech synthesis for substituting original voices affected …

Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation

Y Li, A Mehrish, B Chew, B Cheng, S Poria - arXiv preprint arXiv …, 2024 - arxiv.org
Different languages have distinct phonetic systems and vary in their prosodic features,
making it challenging to develop a Text-to-Speech (TTS) model that can effectively …