Unsupervised analysis and generation of audio percussion sequences

Authors:
Marco Marchini;Hendrik Purwins
Affiliations:
Music Technology Group, Department of Information and Communications Technologies, Universitat Pompeu Fabra, Barcelona, Spain;Music Technology Group, Department of Information and Communications Technologies, Universitat Pompeu Fabra, Barcelona, Spain
Venue:
CMMR'10 Proceedings of the 7th international conference on Exploring music contents
Year:
2010

Abstract

A system is presented that learns the structure of an audio recording of a rhythmical percussion fragment in an unsupervised manner and synthesizes musical variations from it. The procedure consists of 1) segmentation, 2) symbolization (feature extraction, clustering, sequence structure analysis, temporal alignment), and 3) synthesis. The symbolization step yields a sequence of event classes. Several representations are maintained simultaneously, clustering the events into few or many classes. Based on the most regular clustering level, a tempo estimation procedure is used to preserve the metrical structure in the generated sequence. The final synthesis employs variable-length Markov chains to recombine audio material taken from the sample itself. Representations with different numbers of classes are used to trade off statistical significance (short context sequence, low clustering refinement) against specificity (long context, high clustering refinement) of the generated sequence. For a broad variety of musical styles, the musical characteristics of the original are preserved, while considerable variability is introduced in the generated sequence.
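The variable-length Markov chain idea in the synthesis step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`train_vlmc`, `next_symbol`, `generate`), the parameters `max_order` and `min_count`, and the toy event labels are all assumptions made for illustration. The sketch counts which event class follows every context up to a maximum length, then samples from the longest context that has been observed often enough, which mirrors the trade-off between statistical significance (short context) and specificity (long context). Mapping the generated symbols back to audio segments is outside its scope.

```python
import random
from collections import Counter, defaultdict


def train_vlmc(sequence, max_order=3):
    """Count which symbol follows every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(sequence):
        for order in range(min(i, max_order) + 1):
            context = tuple(sequence[i - order:i])
            counts[context][symbol] += 1
    return counts


def next_symbol(counts, history, max_order=3, min_count=2, rng=random):
    """Sample from the longest context observed at least min_count times."""
    for order in range(min(len(history), max_order), -1, -1):
        # Explicit start index avoids the history[-0:] pitfall for order 0.
        context = tuple(history[len(history) - order:])
        followers = counts.get(context)
        if followers and sum(followers.values()) >= min_count:
            symbols = list(followers)
            weights = [followers[s] for s in symbols]
            return rng.choices(symbols, weights=weights, k=1)[0]
    raise ValueError("no context has enough observations")


def generate(counts, seed, length, max_order=3, min_count=2, rng=random):
    """Extend the seed symbol by symbol until it has `length` entries."""
    out = list(seed)
    while len(out) < length:
        out.append(next_symbol(counts, out, max_order, min_count, rng))
    return out


# Hypothetical event classes: k = kick, h = hi-hat, s = snare.
events = list("khshkhshkkshkhsh")
model = train_vlmc(events, max_order=3)
variation = generate(model, events[:2], 16, rng=random.Random(0))
print("".join(variation))
```

Raising `min_count` pushes the sampler toward shorter, better-supported contexts, while raising `max_order` lets it reproduce longer stretches of the original sequence.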