- Academic resource search

A review of physical and perceptual feature extraction techniques for speech, music and environmental sounds

F Alías, JC Socoró, X Sevillano - Applied Sciences, 2016 - mdpi.com

Endowing machines with sensing capabilities similar to those of humans is a prevalent
quest in engineering and computer science. In the pursuit of making computers sense their …

Cited by 308 · Related articles · All 11 versions

[PDF] ismir.net

[PDF] Music emotion recognition: A state of the art review

YE Kim, EM Schmidt, R Migneco, BG Morton… - Proc. ISMIR, 2010 - archives.ismir.net

This paper surveys the state of the art in automatic emotion recognition in music. Music is
oftentimes referred to as a “language of emotion”[1], and it is natural for us to categorize …

Cited by 697 · Related articles · All 8 versions

[PDF] arxiv.org

Text-to-audio generation using instruction-tuned LLM and latent diffusion model

D Ghosal, N Majumder, A Mehrish, S Poria - arXiv preprint arXiv …, 2023 - arxiv.org

The immense scale of the recent large language models (LLM) allows many interesting
properties, such as, instruction- and chain-of-thought-based fine-tuning, that has significantly …

Cited by 143 · Related articles · All 3 versions

[PDF] surrey.ac.uk

PANNs: Large-scale pretrained audio neural networks for audio pattern recognition

Q Kong, Y Cao, T Iqbal, Y Wang… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org

Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …

Cited by 1248 · Related articles · All 8 versions

[PDF] aaai.org

SMIL: Multimodal learning with severely missing modality

M Ma, J Ren, L Zhao, S Tulyakov, C Wu… - Proceedings of the AAAI …, 2021 - ojs.aaai.org

A common assumption in multimodal learning is the completeness of training data, i.e., full
modalities are available in all training examples. Although there exists research endeavor in …

Cited by 249 · Related articles · All 6 versions

[PDF] arxiv.org

MERT: Acoustic music understanding model with large-scale self-supervised training

Y Li, R Yuan, G Zhang, Y Ma, X Chen, H Yin… - arXiv preprint arXiv …, 2023 - arxiv.org

Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …

Cited by 89 · Related articles · All 6 versions

[PDF] calebrascon.info

Trends in audio signal feature extraction methods

G Sharma, K Umapathy, S Krishnan - Applied Acoustics, 2020 - Elsevier

Audio signal processing algorithms generally involve analysis of signal, extracting its
properties, predicting its behaviour, recognizing if any pattern is present in the signal, and …

Cited by 470 · Related articles · All 3 versions

[PDF] oup.com

Deep forest

ZH Zhou, J Feng - National science review, 2019 - academic.oup.com

Current deep-learning models are mostly built upon neural networks, i.e., multiple layers of
parameterized differentiable non-linear modules that can be trained by backpropagation. In …

Cited by 1782 · Related articles · All 22 versions

[PDF] city.ac.uk

Singing voice separation with deep U-Net convolutional networks

A Jansson, E Humphrey, N Montecchio, R Bittner… - 2017 - openaccess.city.ac.uk

The decomposition of a music audio signal into its vocal and backing track components is
analogous to image-to-image translation, where a mixed spectrogram is transformed into its …

Cited by 550 · Related articles · All 7 versions

[PDF] arxiv.org

FMA: A dataset for music analysis

M Defferrard, K Benzi, P Vandergheynst… - arXiv preprint arXiv …, 2016 - arxiv.org

We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable
for evaluating several tasks in MIR, a field concerned with browsing, searching, and …

Cited by 541 · Related articles · All 4 versions
