Cross-domain speech disfluency detection

Authors:
Kallirroi Georgila;Ning Wang;Jonathan Gratch
Affiliations:
University of Southern California, Playa Vista, CA;University of Southern California, Playa Vista, CA;University of Southern California, Playa Vista, CA
Venue:
SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Year:
2010

References:

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning

Maximum entropy models for natural language ambiguity resolution

Speech repairs, intonational phrases, and discourse markers: modeling speakers' utterances in spoken dialogue. Computational Linguistics

Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13

Statistical language modeling for speech disfluencies. ICASSP '96 Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01

Creating Rapport with Virtual Agents. IVA '07 Proceedings of the 7th international conference on Intelligent Virtual Agents

Hybrid Multi-step Disfluency Detection. MLMI '08 Proceedings of the 5th international workshop on Machine Learning for Multimodal Interaction

Using integer linear programming for detecting speech disfluencies. NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Global inference for sentence compression: an integer linear programming approach. Journal of Artificial Intelligence Research

Enriching speech recognition with automatic detection of sentence boundaries and disfluencies. IEEE Transactions on Audio, Speech, and Language Processing

Abstract

We build a model for speech disfluency detection based on conditional random fields (CRFs) using the Switchboard corpus. This model is then applied to a new domain without any adaptation. We show that a technique for detecting speech disfluencies based on Integer Linear Programming (ILP) (Georgila, 2009) significantly outperforms CRFs. In particular, in terms of F-score and NIST Error Rate, the absolute improvement of ILP over CRFs exceeds 20% and 25%, respectively. We conclude that ILP is an approach with great potential for speech disfluency detection when in-domain training data is scarce or unavailable.