Residue-driven architecture for computational auditory scene analysis
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
A principal problem of auditory scene analysis is stream segregation: decomposing an input acoustic signal into the signals of the individual sound sources it contains. Existing signal processing algorithms cannot properly solve this inverse problem, and a multiagent-based architecture has been considered a promising alternative for its modularity and scalability. However, most attempts made so far depend on subjectively defined rules to cope with the variability of sounds. Here we propose an architecture whose agent interaction is quantitatively principled, obtained by formulating the problem as least-squares optimization. The essential idea of this architecture is agent adaptation. We have developed two kinds of processing to realize this adaptivity: template filtering and phase tracking. These mechanisms enable each agent to track an individual sound optimally in the least-squares sense. As an example application of the proposed architecture, we have built a music recognition system that recognizes the instrument names and pitches of the notes in ensemble music performances. Experimental results show that these adaptive mechanisms significantly improve recognition accuracy.
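The least-squares formulation of stream segregation can be sketched in miniature as follows. This is a hypothetical illustration under assumed names (`fit_templates`, a matrix of per-source spectral templates), not the paper's actual algorithm: template filtering is modeled as finding the gains that best explain an observed spectrum frame as a mixture of the templates the agents hold, with the unexplained residue left over for other agents to claim.

```python
import numpy as np

def fit_templates(observed, templates):
    """Solve min_g ||observed - templates @ g||^2 for the gain vector g.

    observed  : (n_bins,) magnitude spectrum of one analysis frame
    templates : (n_bins, n_sources) one spectral template per tracked source

    Returns the least-squares gains and the residual spectrum that the
    templates cannot explain.
    """
    gains, _, _, _ = np.linalg.lstsq(templates, observed, rcond=None)
    return gains, observed - templates @ gains

# Two toy harmonic templates (sources) mixed into one observed frame.
t1 = np.array([1.0, 0.0, 0.5, 0.0, 0.25, 0.0])
t2 = np.array([0.0, 1.0, 0.0, 0.5, 0.0, 0.25])
templates = np.column_stack([t1, t2])
observed = 2.0 * t1 + 3.0 * t2

gains, residue = fit_templates(observed, templates)
# gains recovers the mixing weights (2.0, 3.0); residue is ~0 because the
# observation lies exactly in the span of the templates.
```

In a full system each agent would adapt its own template over time (and track phase across frames) rather than use a fixed one; this sketch only shows the per-frame least-squares step.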