Automatic music transcription: An overview

E Benetos, S Dixon, Z Duan… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
The capability of transcribing music audio into music notation is a fascinating example of
human intelligence. It involves perception (analyzing complex auditory scenes), cognition …

A tutorial on deep learning for music information retrieval

K Choi, G Fazekas, K Cho, M Sandler - arXiv preprint arXiv:1709.04396, 2017 - arxiv.org
Following their success in Computer Vision and other areas, deep learning techniques have
recently become widely adopted in Music Information Retrieval (MIR) research. However …

Onsets and frames: Dual-objective piano transcription

C Hawthorne, E Elsen, J Song, A Roberts… - arXiv preprint arXiv …, 2017 - arxiv.org
We advance the state of the art in polyphonic piano music transcription by using a deep
convolutional and recurrent neural network which is trained to jointly predict onsets and …

MT3: Multi-task multitrack music transcription

J Gardner, I Simon, E Manilow, C Hawthorne… - arXiv preprint arXiv …, 2021 - arxiv.org
Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a
challenging task at the core of music understanding. Unlike Automatic Speech Recognition …

High-resolution piano transcription with pedals by regressing onset and offset times

Q Kong, B Li, X Song, Y Wan… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Automatic music transcription (AMT) is the task of transcribing audio recordings into
symbolic representations. Recently, neural network-based methods have been applied to …

HEAR: Holistic evaluation of audio representations

J Turian, J Shier, HR Khan, B Raj… - NeurIPS 2021 …, 2022 - proceedings.mlr.press
What audio embedding approach generalizes best to a wide range of downstream tasks
across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark …

Learning features of music from scratch

J Thickstun, Z Harchaoui, S Kakade - arXiv preprint arXiv:1611.09827, 2016 - arxiv.org
This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of
supervision and evaluation of machine learning methods for music research. MusicNet …

nnAudio: An on-the-fly GPU audio to spectrogram conversion toolbox using 1D convolutional neural networks

KW Cheuk, H Anderson, K Agres, D Herremans - IEEE Access, 2020 - ieeexplore.ieee.org
In this paper, we present nnAudio, a new neural network-based audio processing framework
with graphics processing unit (GPU) support that leverages 1D convolutional neural …

ASAP: a dataset of aligned scores and performances for piano transcription

F Foscarin, A McLeod, P Rigaux… - … Society for Music …, 2020 - infoscience.epfl.ch
In this paper we present Aligned Scores and Performances (ASAP): a new dataset of 222
digital musical scores aligned with 1068 performances (more than 92 hours) of Western …

Multi-instrument automatic music transcription with self-attention-based instance segmentation

YT Wu, B Chen, L Su - IEEE/ACM Transactions on Audio …, 2020 - ieeexplore.ieee.org
Multi-instrument automatic music transcription (AMT) is a critical but less investigated
problem in the field of music information retrieval (MIR). With all the difficulties faced by …