Multichannel and Multimodality Person Identification

Authors:
Ming Liu;Yanxiang Chen;Xi Zhou;Xiaodan Zhuang;Mark Hasegawa-Johnson;Thomas Huang
Affiliations:
Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801;Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801
Venue:
Multimodal Technologies for Perception of Humans
Year:
2008

Abstract

A person's identity is important high-level information for video analysis and retrieval. With the growth of multimedia data, recordings are not only multimodal but also multichannel (microphone arrays, camera arrays). In this paper, we describe the multimodal person identification system of the UIUC team for the CLEAR 2007 evaluation. The audio-only system is based on a newly proposed model, the Chain of Gaussian Mixtures. The visual-only system is a face recognition module based on a nearest-neighbor classifier in appearance space. The final system fuses 7 microphone channels and 4 camera recordings at the decision level. The experimental results indicate the effectiveness of the speaker modeling methods and the fusion scheme.
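
The abstract does not state which decision-level fusion rule was used, so the following is only a minimal sketch of one common choice: min-max normalize each channel's per-identity scores, average within each modality, then take a weighted sum across modalities. The function names, the `audio_weight` parameter, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def normalize(scores: Sequence[float]) -> List[float]:
    """Min-max scale one channel's scores to [0, 1] so channels are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]  # a flat channel expresses no preference
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_decisions(
    audio_scores: List[Sequence[float]],
    video_scores: List[Sequence[float]],
    audio_weight: float = 0.5,
) -> int:
    """Return the index of the identity with the highest fused score.

    Each inner sequence holds one channel's score per enrolled identity
    (higher means more likely). The weighting rule is an assumption made
    for illustration, not the rule reported by the authors.
    """
    n_ids = len(audio_scores[0])

    def modality_mean(channels: List[Sequence[float]]) -> List[float]:
        # Average the normalized scores over all channels of one modality.
        normed = [normalize(c) for c in channels]
        return [sum(ch[i] for ch in normed) / len(normed) for i in range(n_ids)]

    audio = modality_mean(audio_scores)
    video = modality_mean(video_scores)
    fused = [
        audio_weight * audio[i] + (1.0 - audio_weight) * video[i]
        for i in range(n_ids)
    ]
    return max(range(n_ids), key=fused.__getitem__)


# Toy example: 2 microphone channels, 1 camera, 3 enrolled identities.
toy_audio = [[0.1, 0.9, 0.3], [0.2, 0.8, 0.1]]
toy_video = [[0.5, 0.4, 0.1]]
best = fuse_decisions(toy_audio, toy_video)
```

In a setting like the one described, `audio_scores` would hold 7 channels and `video_scores` would hold 4; the weight between modalities would normally be tuned on held-out data.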