Musical genre classification is a promising yet difficult task in the field of music information retrieval. As a widely used feature in genre classification systems, the MFCC is typically believed to encode timbral information, since it represents short-duration musical textures. In this paper, we investigate the invariance of MFCC to musical key and tempo, and show that MFCCs in fact encode both timbral and key information. We also show that musical genres, which should be independent of key, are in practice influenced by the fundamental keys of the instruments involved. As a result, genre classifiers based on MFCC features are biased toward the dominant keys of each genre and perform poorly on songs in less common keys. We propose an approach to address this problem, which consists of augmenting both classifier training and prediction with various key and tempo transformations of the songs. The resulting genre classifier is invariant to key, and thus more timbre-oriented, yielding improved classification accuracy in our experiments.
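The augmentation strategy described above can be sketched in a few lines. This is an illustrative outline only, not the authors' implementation: the function names, the toy "songs", and the stand-in feature extractor are all hypothetical. In practice the transforms would be pitch shifts and time stretches of the audio (e.g. via a resampling library) and the features would be MFCCs.

```python
import numpy as np

def augment_training_set(songs, labels, transforms, extract_features):
    """Expand the training set with transformed copies of every song,
    so the classifier sees each song in several keys/tempi."""
    X, y = [], []
    for song, label in zip(songs, labels):
        for t in transforms:
            X.append(extract_features(t(song)))
            y.append(label)
    return np.array(X), np.array(y)

def predict_invariant(score_fn, song, transforms, extract_features):
    """At prediction time, average classifier scores over all
    transformed versions of a song to wash out key dependence."""
    scores = [score_fn(extract_features(t(song))) for t in transforms]
    return np.mean(scores, axis=0)

# Toy demonstration: a "song" is a signal array; a "transform" is a
# fake key shift; the "MFCCs" are just mean and standard deviation.
songs = [np.ones(8), np.zeros(8)]
labels = [0, 1]
transforms = [lambda s, k=k: s + k for k in (-1, 0, 1)]
feats = lambda s: np.array([s.mean(), s.std()])

X, y = augment_training_set(songs, labels, transforms, feats)
print(X.shape)  # one row per (song, transform) pair -> (6, 2)
```

The key design point is that the same set of transforms is applied at training and at prediction time, so the classifier both learns from and is evaluated on a key-balanced view of each song.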