The Rich Transcription 2006 Spring Meeting Recognition Evaluation

  • Authors:
  • Jonathan G. Fiscus; Jerome Ajot; Martial Michel; John S. Garofolo

  • Affiliations:
  • National Institute of Standards and Technology, Gaithersburg, MD (all authors)

  • Venue:
  • MLMI'06: Proceedings of the Third International Conference on Machine Learning for Multimodal Interaction
  • Year:
  • 2006


Abstract

We present the design and results of the Spring 2006 (RT-06S) Rich Transcription Meeting Recognition Evaluation, the fourth in a series of community-wide evaluations of language technologies in the meeting domain. For 2006, we supported three evaluation tasks in two meeting sub-domains: the Speech-To-Text (STT) transcription task, and the “Who Spoke When” and “Speech Activity Detection” diarization tasks. The meetings came from the Conference Meeting and Lecture Meeting sub-domains. The lowest STT word error rate, scoring up to four simultaneous speakers in the multiple distant microphone condition, was 46.3% for the conference sub-domain and 53.4% for the lecture sub-domain. For the “Who Spoke When” task, the lowest diarization error rates for all speech were 35.8% and 24.0% for the conference and lecture sub-domains, respectively. For the “Speech Activity Detection” task, the lowest diarization error rates were 4.3% and 8.0% for the conference and lecture sub-domains, respectively.
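
For context, the two metrics quoted above follow the standard NIST scoring definitions used throughout the Rich Transcription evaluations; the notation below (the T terms in particular) is ours, added here for clarity rather than taken from the paper.

    % Word Error Rate (WER): word substitutions (S), deletions (D), and
    % insertions (I) made by the recognizer, relative to the number of
    % words (N) in the reference transcript.
    \mathrm{WER} = \frac{S + D + I}{N}

    % Diarization Error Rate (DER): time scored as missed speech,
    % false-alarm speech, or speech attributed to the wrong speaker,
    % relative to the total scored speech time. For the "Speech Activity
    % Detection" task only speech/non-speech decisions are scored, so the
    % speaker-attribution term is effectively zero there.
    \mathrm{DER} = \frac{T_{\mathrm{miss}} + T_{\mathrm{fa}} + T_{\mathrm{spkr}}}{T_{\mathrm{total}}}

This is why the “Speech Activity Detection” error rates (4.3% and 8.0%) are so much lower than the “Who Spoke When” rates: the former omits the speaker-attribution component that dominates the latter.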