Music structure analysis using a probabilistic fitness measure and a greedy search algorithm

Authors:
Jouni Paulus; Anssi Klapuri
Affiliations:
Department of Signal Processing, Tampere University of Technology, Tampere, Finland (both authors)
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2009

Abstract

This paper proposes a method for recovering the sectional form of a musical piece from an acoustic signal. The description of form consists of a segmentation of the piece into musical parts, a grouping of the segments that represent the same part, and an assignment of musically meaningful labels, such as "chorus" or "verse," to the groups. The method uses a fitness function over candidate descriptions to select the one that best matches the acoustic properties of the input piece. Different aspects of the input signal are described with three acoustic features: mel-frequency cepstral coefficients, chroma, and rhythmogram. The features are used to estimate the probability that two segments in the description are repeats of each other, and these probabilities determine the total fitness of the description. Creating the candidate descriptions is a combinatorial problem, and a novel greedy algorithm that constructs descriptions gradually is proposed to solve it. The group labeling uses a musicological model consisting of N-grams. The proposed method is evaluated on three data sets of musical pieces with manually annotated ground truth. The evaluations show that the proposed method recovers the structural description more accurately than the state-of-the-art reference method.
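The idea of scoring a description with pairwise repeat probabilities and building it up greedily can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes the segmentation is already fixed, uses a hypothetical log-likelihood fitness over segment pairs, and only decides the grouping. The names `pair_score`, `fitness`, `greedy_grouping`, and the matrix `example_prob` are illustrative assumptions, and the probabilities are made up rather than estimated from MFCC, chroma, or rhythmogram features.

```python
import math


def pair_score(p, same_group):
    """Log-probability contribution of one segment pair (assumed fitness form)."""
    eps = 1e-9
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    return math.log(p) if same_group else math.log(1.0 - p)


def fitness(groups, repeat_prob):
    """Total fitness of a partial description: sum over all assigned pairs."""
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            total += pair_score(repeat_prob[i][j], groups[i] == groups[j])
    return total


def greedy_grouping(repeat_prob):
    """Assign segments to groups one at a time, keeping the best local choice."""
    groups = []
    for _ in range(len(repeat_prob)):
        # Candidates: every existing group id, plus one new group id.
        candidates = sorted(set(groups)) + [len(set(groups))]
        best = max(candidates, key=lambda g: fitness(groups + [g], repeat_prob))
        groups.append(best)
    return groups


# Hypothetical repeat probabilities for four segments with an A-B-A-B form.
example_prob = [
    [1.0, 0.1, 0.9, 0.1],
    [0.1, 1.0, 0.1, 0.9],
    [0.9, 0.1, 1.0, 0.1],
    [0.1, 0.9, 0.1, 1.0],
]
result = greedy_grouping(example_prob)
print(result)
```

In this toy setting, segments 0 and 2 end up in one group and segments 1 and 3 in another. A greedy construction like this evaluates only a small part of the combinatorial space of descriptions, which is the trade-off the abstract alludes to.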