Comparative Analysis of Deep Learning Architectures and Vision Transformers for Musical Key Estimation

Authors

Manav Garg, Pranshav Gajjar, Pooja Shah, Madhu Shukla, Biswaranjan Acharya, Vassilis C Gerogiannis, Andreas Kanavos

Publication date

2023/9/28

Journal

Information

Volume

Issue

Pages

527

Publisher

MDPI

Description

The musical key serves as a crucial element in a piece, offering vital insights into the tonal center, harmonic structure, and chord progressions while enabling tasks such as transposition and arrangement. Moreover, accurate key estimation finds practical applications in music recommendation systems and automatic music transcription, making it relevant across academic and industrial domains. This paper presents a comprehensive comparison between standard deep learning architectures and emerging vision transformers, leveraging their success in various domains. We evaluate their performance on a specific subset of the GTZAN dataset, analyzing six different deep learning models. Our results demonstrate that DenseNet, a conventional deep learning architecture, achieves remarkable accuracy of 91.64%, outperforming vision transformers. However, we delve deeper into the analysis to shed light on the temporal characteristics of each deep learning model. Notably, the vision transformer and SWIN transformer exhibit a slight decrease in overall performance (1.82% and 2.29%, respectively), yet they demonstrate superior performance in temporal metrics compared to the DenseNet architecture. The significance of our findings lies in their contribution to the field of musical key estimation, where accurate and efficient algorithms play a pivotal role. By examining the strengths and weaknesses of deep learning architectures and vision transformers, we can gain valuable insights for practical implementations, particularly in music recommendation systems and automatic music transcription. Our research provides a foundation for future …

Total citations

Cited by 3
