Multimodal people ID for a multimedia meeting browser

  • Authors:
  • Jie Yang;Xiaojin Zhu;Ralph Gross;John Kominek;Yue Pan;Alex Waibel

  • Affiliations:
  • Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, PA (all authors)

  • Venue:
  • MULTIMEDIA '99 Proceedings of the seventh ACM international conference on Multimedia (Part 1)
  • Year:
  • 1999

Abstract

A meeting browser is a system that allows users to review a multimedia meeting record using a variety of indexing methods. Identification of meeting participants is essential for creating such a multimedia meeting record. Moreover, knowing who is speaking can enhance the performance of speech recognition and the indexing of meeting transcriptions. In this paper, we present an approach that identifies meeting participants by fusing multimodal inputs. We use face ID, speaker ID, color appearance ID, and sound source directional ID to identify and track meeting participants. After describing the different modules in detail, we discuss a framework for combining the information sources. Integration of the multimodal people ID into the multimedia meeting browser is in its preliminary stage.
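
The abstract does not specify how the modality scores are combined, so the following is only a minimal sketch of one plausible fusion scheme: a weighted linear combination of per-modality identification scores. The modality names, weights, and score values are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict

# Hypothetical per-modality scores for each candidate participant,
# e.g. as produced by face ID, speaker ID, color appearance ID,
# and sound source direction modules.
ModalityScores = Dict[str, Dict[str, float]]  # modality -> {person: score}


def fuse_identities(scores: ModalityScores,
                    weights: Dict[str, float]) -> str:
    """Return the participant with the highest weighted combined score."""
    combined: Dict[str, float] = {}
    for modality, person_scores in scores.items():
        w = weights.get(modality, 0.0)
        for person, score in person_scores.items():
            combined[person] = combined.get(person, 0.0) + w * score
    # Pick the identity with the largest fused score.
    return max(combined, key=combined.get)


if __name__ == "__main__":
    example_scores = {
        "face": {"alice": 0.7, "bob": 0.3},
        "speaker": {"alice": 0.6, "bob": 0.4},
        "color_appearance": {"alice": 0.5, "bob": 0.5},
        "sound_direction": {"alice": 0.8, "bob": 0.2},
    }
    example_weights = {"face": 0.4, "speaker": 0.3,
                       "color_appearance": 0.1, "sound_direction": 0.2}
    print(fuse_identities(example_scores, example_weights))  # -> "alice"
```

A weighted sum is only one option; the paper's framework for combining information sources may instead use probabilistic or confidence-based fusion, and the weights would in practice depend on the reliability of each module.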