Applications of deep learning to audio generation

A Prashanth, SL Jayalakshmi… - Multimedia Tools and …, 2024 - Springer

In our day-to-day life, observation of human and social actions are highly important for public
protection and security. Additionally, identifying suspicious activity is also essential in critical …

Cited by 7 - Related articles - All 2 versions

PVASS-MDD: predictive visual-audio alignment self-supervision for multimodal deepfake detection

Y Yu, X Liu, R Ni, S Yang, Y Zhao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Deepfake techniques can forge the visual or audio signals in the video, which leads to
inconsistencies between visual and audio (VA) signals. Therefore, multimodal detection …

Cited by 21 - Related articles

[PDF] smu.edu.sg

Generating music with emotions

C Bao, Q Sun - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org

We focus on the music generation conditional on human emotions, specifically the positive
and negative emotions. There is no existing large-scale music datasets with the annotation …

Cited by 22 - Related articles - All 3 versions

[PDF] springer.com

Multi-task learning-based spoofing-robust automatic speaker verification system

Y Zhao, R Togneri, V Sreeram - Circuits, Systems, and Signal Processing, 2022 - Springer

Spoofing attacks posed by generating artificial speech can severely degrade the
performance of a speaker verification system. Recently, many anti-spoofing …

Cited by 21 - Related articles - All 8 versions

Chord-based music generation using long short-term memory neural networks in the context of artificial intelligence

F Li - The Journal of Supercomputing, 2024 - Springer

With the rapid development of artificial intelligence (AI), music generation has gained
widespread attention. Long short-term memory (LSTM) has advantages in handling time …

Cited by 5 - Related articles - All 2 versions

[PDF] aaai.org

Neural synthesis of sound effects using flow-based deep generative models

S Andreu, MV Aylagas - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org

Creating variations of sound effects for video games is a time-consuming task that grows
with the size and complexity of the games themselves. The process usually comprises …

Cited by 10 - Related articles - All 7 versions

Replay anti-spoofing countermeasure based on data augmentation with post selection

Y Zhao, R Togneri, V Sreeram - Computer Speech & Language, 2020 - Elsevier

Automatic Speaker Verification (ASV) systems have been widely applied for
speaker authentication for biometric security especially in e-business scenarios. However …

Cited by 14 - Related articles - All 2 versions

[PDF] arxiv.org

Gotta Hear Them All: Sound Source Aware Vision to Audio Generation

W Guo, H Wang, W Cai, J Ma - arXiv preprint arXiv:2411.15447, 2024 - arxiv.org

Vision-to-audio (V2A) synthesis has broad applications in multimedia. Recent
advancements of V2A methods have made it possible to generate relevant audios from …

Machine and Deep Learning Methods for Predicting Immune Checkpoint Blockade Response

D Ho, M Motani - Machine Learning for Health, 2022 - proceedings.mlr.press

Immune checkpoint blockade (ICB) therapy has improved treatment options in various
cancer malignancies and holds promise for increasing the overall survival of treated …

Cited by 1 - Related articles

[PDF] arxiv.org

A survey of deep learning audio generation methods

M Božić, M Horvat - arXiv preprint arXiv:2406.00146, 2024 - arxiv.org

This article presents a review of typical techniques used in three distinct aspects of deep
learning model development for audio generation. In the first part of the article, we provide …
