YE Kim, EM Schmidt, R Migneco, BG Morton… - Proc. ISMIR, 2010 - archives.ismir.net
This paper surveys the state of the art in automatic emotion recognition in music. Music is often referred to as a “language of emotion” [1], and it is natural for us to categorize …
The immense scale of recent large language models (LLMs) enables many interesting properties, such as instruction- and chain-of-thought-based fine-tuning, that has significantly …
Audio pattern recognition is an important research topic in the machine learning area, and includes several tasks such as audio tagging, acoustic scene classification, music …
A common assumption in multimodal learning is the completeness of training data, i.e., full modalities are available in all training examples. Although research efforts exist in …
Self-supervised learning (SSL) has recently emerged as a promising paradigm for training generalisable models on large-scale data in the fields of vision, text, and speech. Although …
Audio signal processing algorithms generally involve analyzing a signal, extracting its properties, predicting its behaviour, recognizing whether any pattern is present in the signal, and …
Current deep-learning models are mostly built upon neural networks, i.e., multiple layers of parameterized differentiable non-linear modules that can be trained by backpropagation. In …
The decomposition of a music audio signal into its vocal and backing track components is analogous to image-to-image translation, where a mixed spectrogram is transformed into its …
We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and …