Related articles - Academic Resource Search

Robust speech activity detection in movie audio: Data resources and experimental evaluation

R Hebbar, K Somandepalli… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Speech activity detection in highly variable acoustic conditions is a challenging task. Many
approaches to detect speech activity in such conditions involve an inherent knowledge of …

Cited by 24 - Related articles

[HTML] springer.com

[HTML] Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

D de Benito-Gorron, A Lozano-Diez… - EURASIP Journal on …, 2019 - Springer

Audio signals represent a wide diversity of acoustic events, from background environmental
noise to spoken communication. Machine learning models such as neural networks have …

Cited by 50 - Related articles - All 11 versions

[PDF] ieee.org

Siamese style convolutional neural networks for sound search by vocal imitation

Y Zhang, B Pardo, Z Duan - IEEE/ACM Transactions on Audio …, 2018 - ieeexplore.ieee.org

Conventional methods for finding audio in databases typically search text labels, rather than
the audio itself. This can be problematic as labels may be missing, irrelevant to the audio …

Cited by 60 - Related articles - All 5 versions

[PDF] arxiv.org

DNN and CNN with weighted and multi-task loss functions for audio event detection

H Phan, M Krawczyk-Becker, T Gerkmann… - arXiv preprint arXiv …, 2017 - arxiv.org

This report presents our audio event detection system submitted for Task 2, "Detection of
rare sound events", of DCASE 2017 challenge. The proposed system is based on …

Cited by 52 - Related articles - All 7 versions

[PDF] arxiv.org

Audio retrieval with wavtext5k and clap training

S Deshmukh, B Elizalde, H Wang - arXiv preprint arXiv:2209.14275, 2022 - arxiv.org

Audio-Text retrieval takes a natural language query to retrieve relevant audio files in a
database. Conversely, Text-Audio retrieval takes an audio file as a query to retrieve relevant …

Cited by 32 - Related articles - All 4 versions

[PDF] arxiv.org

Language transfer of audio word2vec: Learning audio segment representations without target language data

CH Shen, JY Sung, HY Lee - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Audio Word2Vec offers vector representations of fixed dimensionality for variable-length
audio segments using Sequence-to-sequence Autoencoder (SA). These vector …

Cited by 9 - Related articles - All 4 versions

[PDF] arxiv.org

Retrieval-augmented text-to-audio generation

Y Yuan, H Liu, X Liu, Q Huang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Despite recent progress in text-to-audio (TTA) generation, we show that the state-of-the-art
models, such as AudioLDM, trained on datasets with an imbalanced class distribution, such …

Cited by 4 - Related articles - All 5 versions

[PDF] arxiv.org

Swishnet: A fast convolutional neural network for speech, music and noise classification and segmentation

MS Hussain, MA Haque - arXiv preprint arXiv:1812.00149, 2018 - arxiv.org

Speech, Music and Noise classification/segmentation is an important preprocessing step for
audio processing/indexing. To this end, we propose a novel 1D Convolutional Neural …

Cited by 30 - Related articles - All 3 versions

[PDF] arxiv.org

EEG2Mel: Reconstructing sound from brain responses to music

AG Ramirez-Aristizabal, C Kello - arXiv preprint arXiv:2207.13845, 2022 - arxiv.org

Information retrieval from brain responses to auditory and visual stimuli has shown success
through classification of song names and image classes presented to participants while …

Cited by 2 - Related articles - All 2 versions

[PDF] arxiv.org

Deep Learning for MIR Tutorial

A Schindler, T Lidy, S Böck - arXiv preprint arXiv:2001.05266, 2020 - arxiv.org

Deep Learning has become state of the art in visual computing and continuously emerges
into the Music Information Retrieval (MIR) and audio retrieval domain. In order to bring …

Cited by 5 - Related articles - All 2 versions
