Person identification based on multichannel and multimodality fusion

Authors:
Ming Liu;Hao Tang;Huazhong Ning;Thomas Huang
Affiliations:
IFP Group, University of Illinois at Urbana-Champaign, Urbana, IL;IFP Group, University of Illinois at Urbana-Champaign, Urbana, IL;IFP Group, University of Illinois at Urbana-Champaign, Urbana, IL;IFP Group, University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
CLEAR'06 Proceedings of the 1st international evaluation conference on Classification of events, activities and relationships
Year:
2006

Abstract

Person identity is useful information for high-level video analysis and retrieval. In some scenarios, the recording is not only multimodal but also multichannel (microphone array, camera array). In this paper, we describe a multimodal person ID system based on multichannel and multimodal fusion. The audio-only system combines seven-channel microphone recordings at the decision output of the individual audio-only systems. The audio system is modeled with a Universal Background Model (UBM) and the maximum a posteriori (MAP) adaptation framework, which is very popular in the speaker recognition literature. The visual-only system works directly in the appearance space via the l1 norm and a nearest-neighbor classifier. Linear fusion then combines the two modalities to improve ID performance. The experiments indicate the effectiveness of microphone array fusion and audio/visual fusion.
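The abstract gives no formulas, so the following is only a minimal sketch of the three steps it names: l1 nearest-neighbor matching, decision-level combination of microphone channels, and linear audio/visual fusion. All function names, the use of a plain average across channels, and the fusion weight `weight` are assumptions made for illustration, not details taken from the paper.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor_l1(probe, gallery):
    """Return the label of the gallery vector closest to `probe` in l1 norm.

    `gallery` is a list of (label, vector) pairs (hypothetical layout).
    """
    best_label, best_dist = None, float("inf")
    for label, vec in gallery:
        dist = l1_distance(probe, vec)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def average_channel_scores(channel_scores):
    """Combine per-channel identity scores at the decision level.

    `channel_scores` is a list of dicts {identity: score}, one per
    microphone channel. Averaging is an assumed combination rule.
    """
    identities = channel_scores[0].keys()
    n = len(channel_scores)
    return {pid: sum(ch[pid] for ch in channel_scores) / n for pid in identities}


def linear_fusion(audio_scores, visual_scores, weight):
    """Weighted sum of audio and visual scores; `weight` is hypothetical.

    Both score dicts are assumed to be on comparable scales, with higher
    meaning a better match.
    """
    return {
        pid: weight * audio_scores[pid] + (1.0 - weight) * visual_scores[pid]
        for pid in audio_scores
    }


def identify(fused_scores):
    """Pick the identity with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)
```

In practice the audio scores would be log-likelihoods from the MAP-adapted UBM speaker models and the visual scores would be derived from l1 distances (for example, negated so that larger is better); both would likely need normalization before a weighted sum is meaningful.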