Modeling timbre distance with temporal statistics from polyphonic music

Authors:
F. Morchen; A. Ultsch; M. Thies; I. Lohken
Affiliations:
Data Bionics Res. Group, Philipps Univ., Marburg, Germany
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2006

Abstract

Timbre distance and similarity express the phenomenon that some pieces of music sound similar to us while others sound very different. The notion of genre is often used to categorize music, but songs from a single genre do not necessarily sound similar, and songs that sound similar do not necessarily share a genre. In this work, we analyze and compare a large number of audio features and psychoacoustic variants thereof for the purpose of modeling timbre distance. The sound of polyphonic music is commonly described by extracting audio features on short time windows during which the sound is assumed to be stationary. The resulting downsampled time series are aggregated to form a high-level feature vector describing the music. We generated high-level features by systematically applying static and temporal statistics for aggregation. The temporal structure of features in particular has previously been largely neglected. A novel supervised feature selection method is applied to the huge set of possible features. Distances computed on the selected features correspond to timbre differences in music. The features show few redundancies and have high potential for explaining possible clusters. They outperform seven other previously proposed feature sets on several datasets with respect to the separation of the known groups of timbrally different music.
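
As a rough illustration of the aggregation step described in the abstract, the sketch below summarizes a short-time feature time series with two static statistics (mean, standard deviation) and two temporal statistics (mean absolute frame-to-frame change, lag-1 autocorrelation), then compares two pieces by Euclidean distance. The particular statistics, the function names (`aggregate`, `lag1_autocorr`, `euclidean`), and the toy series are assumptions chosen for illustration; they are not the feature set or the selection method of the paper.

```python
import math
from statistics import fmean, pstdev


def lag1_autocorr(series):
    """Lag-1 autocorrelation of a series; 0.0 if the series is constant."""
    mu = fmean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:]))
    return num / denom


def aggregate(series):
    """Summarize one low-level feature time series as a 4-element vector.

    Hypothetical choice of statistics: two static, two temporal.
    Requires at least two frames.
    """
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return [
        fmean(series),          # static: average level
        pstdev(series),         # static: spread around the mean
        fmean(diffs),           # temporal: mean absolute frame-to-frame change
        lag1_autocorr(series),  # temporal: how smoothly the feature evolves
    ]


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy per-frame feature values: same values, different temporal order.
smooth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
jagged = [0.0, 5.0, 1.0, 4.0, 2.0, 3.0]

distance = euclidean(aggregate(smooth), aggregate(jagged))
```

The two toy series contain the same values in a different order, so their static statistics coincide and only the temporal statistics separate them. This is the kind of information that purely static aggregation discards.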