A music stream segregation system based on adaptive multi-agents

  • Authors:
  • Kunio Kashino; Hiroshi Murase

  • Affiliations:
  • NTT Basic Research Laboratories, Atsugi-shi, Kanagawa, Japan (both authors)

  • Venue:
  • IJCAI'97: Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1997


Abstract

A principal problem of auditory scene analysis is stream segregation: decomposing an input acoustic signal into the signals of the individual sound sources it contains. Existing signal processing algorithms cannot properly solve this inverse problem, and multi-agent-based architectures have been considered a promising alternative for their modularity and scalability. However, most attempts made so far rely on subjectively defined rules to handle the variability of sounds. Here we propose an architecture whose agent interactions are quantitatively principled, formulating the problem as least-squares optimization. In this architecture, adaptation of the agents is the essential idea. We have developed two kinds of processing to realize this adaptivity: template filtering and phase tracking. These mechanisms enable each agent to track an individual sound optimally in the least-squares sense. As an example application of the proposed architecture, we have built a music recognition system that recognizes the instrument names and pitches of the notes in ensemble music performances. Experimental results show that these adaptive mechanisms significantly improve recognition accuracy.
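To illustrate the least-squares idea behind the agent interactions, here is a minimal sketch, not the authors' actual algorithm: each "agent" holds a hypothetical spectral template for one note and, given an input frame, picks the gain that fits its template to the remaining signal optimally in the least-squares sense, then subtracts its estimate from a shared residual. Template vectors, frame values, and function names are all illustrative assumptions.

```python
def ls_gain(template, frame):
    """Least-squares-optimal scalar gain a minimizing ||frame - a*template||^2:
    a = <template, frame> / <template, template>."""
    num = sum(t * x for t, x in zip(template, frame))
    den = sum(t * t for t in template)
    return num / den if den else 0.0

def segregate(frame, templates):
    """Greedy pass: each agent fits its template to the current residual,
    records its gain, and removes its contribution from the mixture."""
    residual = list(frame)
    gains = []
    for tpl in templates:
        a = ls_gain(tpl, residual)
        gains.append(a)
        residual = [x - a * t for x, t in zip(residual, tpl)]
    return gains, residual

# Toy example: two disjoint "note" templates mixed with gains 2.0 and 3.0.
t1 = [1.0, 0.0, 1.0, 0.0]
t2 = [0.0, 1.0, 0.0, 1.0]
mix = [2.0 * a + 3.0 * b for a, b in zip(t1, t2)]
gains, res = segregate(mix, [t1, t2])  # gains ≈ [2.0, 3.0], residual ≈ 0
```

In the paper's architecture the templates themselves adapt (template filtering) and the agents follow the signal over time (phase tracking); this sketch only shows the per-frame least-squares fit that both mechanisms optimize.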