Structural Segmentation of Musical Audio by Constrained Clustering

Authors:
M. Levy; M. Sandler
Affiliations:
Dept. of Electron. Eng., Queen Mary Univ. of London, London
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2008

Citing 0
Cited 10

Music structure analysis using a probabilistic fitness measure and a greedy search algorithm
IEEE Transactions on Audio, Speech, and Language Processing

Music information retrieval using social tags and audio
IEEE Transactions on Multimedia - Special section on communities and media computing

Clustering for music search results
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo

Modeling music as a dynamic texture
IEEE Transactions on Audio, Speech, and Language Processing

Representing musical sounds with an interpolating state model
IEEE Transactions on Audio, Speech, and Language Processing

Mining transposed motifs in music
Journal of Intelligent Information Systems

Indexing musical pieces using their major repetition
Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries

Speech/music discrimination in audio podcast using structural segmentation and timbre recognition
CMMR'10 Proceedings of the 7th international conference on Exploring music contents

Semi-supervised constrained clustering with cluster outlier filtering
CIARP'11 Proceedings of the 16th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

Elastic Net subspace clustering applied to pop/rock music structure analysis
Pattern Recognition Letters

Abstract

We describe a method of segmenting musical audio into structural sections based on a hierarchical labeling of spectral features. Frames of audio are first labeled as belonging to one of a number of discrete states using a hidden Markov model trained on the features. Histograms of neighboring frames are then clustered into segment-types representing distinct distributions of states, using a clustering algorithm in which temporal continuity is expressed as a set of constraints modeled by a hidden Markov random field. We give experimental results which show that in many cases the resulting segmentations correspond well to conventional notions of musical form. We show further how the constrained clustering approach can easily be extended to include prior musical knowledge, input from other machine approaches, or semi-supervision.
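
The pipeline described in the abstract can be illustrated with a rough sketch. It assumes the per-frame state labels have already been decoded by an HMM, and it replaces the hidden Markov random field with a much simpler stand-in: a k-means-style assignment that adds a fixed penalty whenever a frame's cluster differs from that of a temporal neighbor. The function names, the window size, and the `neighbor_weight` parameter are all hypothetical and do not come from the paper.

```python
import numpy as np


def state_histograms(states, n_states, half_window):
    """Normalized histogram of state labels in a window around each frame."""
    states = np.asarray(states)
    n = len(states)
    hists = np.zeros((n, n_states))
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        counts = np.bincount(states[lo:hi], minlength=n_states)
        hists[t] = counts / counts.sum()
    return hists


def smoothed_cluster(hists, n_clusters, neighbor_weight=0.5, n_iter=20):
    """Cluster histograms, penalizing label changes between adjacent frames.

    This is a simplified stand-in for constrained clustering (an assumption
    for illustration), not the HMRF formulation the abstract refers to.
    """
    n = len(hists)
    # Deterministic initialization: centroids taken at evenly spaced frames.
    centroids = hists[np.linspace(0, n - 1, n_clusters).astype(int)].copy()
    dists = ((hists[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    cluster_ids = np.arange(n_clusters)
    for _ in range(n_iter):
        dists = ((hists[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Update each frame in turn, given the current labels of its neighbors.
        for t in range(n):
            cost = dists[t].copy()
            for nb in (t - 1, t + 1):
                if 0 <= nb < n:
                    cost += neighbor_weight * (cluster_ids != labels[nb])
            labels[t] = cost.argmin()
        # Recompute centroids from the frames currently assigned to them.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = hists[labels == k].mean(axis=0)
    return labels


# Toy input: two "sections", each alternating between its own pair of states.
toy_states = [0, 1] * 20 + [2, 3] * 20
toy_hists = state_histograms(toy_states, n_states=4, half_window=4)
toy_labels = smoothed_cluster(toy_hists, n_clusters=2)
```

On the toy input the frame-level states alternate rapidly, but the windowed histograms are nearly constant within each section, which is why clustering histograms rather than raw states yields section-level labels.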