Speaker change detection and tracking in real-time news broadcasting analysis

Authors:
Lie Lu;Hong-Jiang Zhang
Affiliations:
Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China
Venue:
Proceedings of the tenth ACM international conference on Multimedia
Year:
2002

Abstract

This paper addresses real-time speaker change detection and speaker tracking for broadcast news video analysis. In this setting, both the speaker identities and the number of speakers are assumed to be unknown. A two-step speaker change detection algorithm is proposed, consisting of potential change detection followed by refinement. Speaker tracking is then performed on the results of speaker change detection. A Bayesian fusion method combines multiple audio features to give a more reliable result. The algorithm has low complexity and runs in real time with a very limited analysis delay. Our experiments show that the algorithms produce very satisfactory results.
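
The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, distance measure, thresholds, or the exact form of the Bayesian fusion. The sketch assumes a precomputed sequence of dissimilarity scores between adjacent analysis windows, flags local peaks above an adaptive threshold as potential changes, and then refines them with a naive-Bayes style fusion of per-feature likelihood ratios. All function names, the parameters `alpha`, `history`, `prior`, and `min_posterior`, and the conditional-independence assumption are hypothetical choices made for illustration.

```python
import math


def potential_changes(dist, alpha=1.2, history=3):
    """Step 1 (assumed form): flag local peaks of the dissimilarity curve
    that exceed alpha times the mean of the preceding `history` values."""
    candidates = []
    for i in range(1, len(dist) - 1):
        is_peak = dist[i] > dist[i - 1] and dist[i] > dist[i + 1]
        past = dist[max(0, i - history):i]
        threshold = alpha * sum(past) / len(past)
        if is_peak and dist[i] > threshold:
            candidates.append(i)
    return candidates


def fuse_posterior(likelihood_ratios, prior=0.5):
    """Fuse per-feature likelihood ratios p(x | change) / p(x | no change)
    into a posterior probability of a change, assuming the features are
    conditionally independent (a naive-Bayes assumption, not the paper's)."""
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(r) for r in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))


def refine(candidates, ratios_per_candidate, min_posterior=0.5):
    """Step 2 (assumed form): keep only candidates whose fused posterior
    exceeds min_posterior."""
    return [
        c
        for c, ratios in zip(candidates, ratios_per_candidate)
        if fuse_posterior(ratios) > min_posterior
    ]


# Toy dissimilarity curve with two peaks (made-up values).
dist = [0.1, 0.2, 0.9, 0.2, 0.1, 0.2, 0.8, 0.1]
candidates = potential_changes(dist)

# Two hypothetical feature likelihood ratios per candidate.
ratios = [[4.0, 3.0], [0.5, 0.8]]
confirmed = refine(candidates, ratios)
```

In this toy run both peaks pass the first step, but only the first survives refinement because the fused evidence for the second favors "no change". Splitting the work this way keeps the first pass cheap and permissive, and leaves the more reliable fused decision for the few candidate boundaries.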