Facilitating Comprehensive Benchmarking Experiments on the Million Song Dataset.

Authors

Alexander Schindler, Rudolf Mayer, Andreas Rauber

Publication date

2012

Conference

ISMIR

Pages

469-474

Description

The Million Song Dataset (MSD), a collection of one million music pieces, enables a new era of research on Music Information Retrieval methods for large-scale applications. It comes as a collection of metadata such as the song names, artists and albums, together with a set of features extracted with The Echo Nest services, such as loudness, tempo, and MFCC-like features. There is, however, no easily obtainable download for the audio files. Furthermore, labels for supervised machine learning tasks are missing. Researchers are thus currently restricted to working solely with the features provided, limiting the usefulness of the MSD. We therefore present in this paper a more comprehensive set of data based on the MSD, allowing its broader use as a benchmark collection. Specifically, we provide a wide and growing collection of other well-known features in the MIR domain, as well as ground truth data with a set of recommended training/test splits.

We obtained these features from audio samples provided by 7digital.com, and metadata from the All Music Guide. While copyright prevents re-distribution of the audio snippets per se, the features as well as metadata are publicly available on our website for benchmarking evaluations. In this paper we describe the pre-processing and cleansing steps applied, as well as feature sets and tools made available, together with first baseline classification results.

Total citations

Cited by 78

Citations per year: 2012: 3, 2013: 9, 2014: 5, 2015: 11, 2016: 13, 2017: 8, 2018: 4, 2019: 6, 2020: 6, 2021: 5, 2022: 4, 2023: 3

Scholar articles

Facilitating Comprehensive Benchmarking Experiments on the Million Song Dataset.

A Schindler, R Mayer, A Rauber - ISMIR, 2012

Cited by 78 · Related articles · All 12 versions