Early deletion of fillers in processing conversational speech

Authors:
Matthew Lease; Mark Johnson
Affiliations:
Brown University, Providence, RI; Brown University, Providence, RI
Venue:
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Year:
2006

Citing 9
Cited 3

Procedure for quantitatively comparing the syntactic coverage of English grammars
HLT '91 Proceedings of the workshop on Speech and Natural Language

Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II

A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference

A syntactic framework for speech repairs and other disruptions
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics

Markov parsing: lattice rescoring with a statistical parser
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Edit detection and parsing for transcribed speech
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies

Parsing and disfluency placement
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

A TAG-based noisy channel model of speech repairs
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

Effective use of prosody in parsing conversational speech
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing

Where Do Parsing Errors Come From
TSD '08 Proceedings of the 11th international conference on Text, Speech and Dialogue

Wide-coverage parsing of speech transcripts
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies

Automatic identification of discourse markers in dialogues: An in-depth study of like and well
Computer Speech and Language

Abstract

This paper evaluates the benefit of deleting fillers (e.g., "you know", "like") early in the parsing of conversational speech. Readability studies have shown that disfluencies (fillers and speech repairs) may be deleted from transcripts without compromising meaning (Jones et al., 2003), and deleting repairs prior to parsing has been shown to improve parsing accuracy (Charniak and Johnson, 2001). We explore whether this strategy of early deletion is also beneficial for fillers. The reported experiments measure the effect of early deletion under in-domain and out-of-domain parser training conditions using a state-of-the-art parser (Charniak, 2000). While early deletion yields only modest benefit for in-domain parsing, significant improvement is achieved for out-of-domain adaptation. This suggests a potentially broader role for disfluency modeling in adapting text-based tools for processing conversational speech.
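
To make the idea of early deletion concrete, here is a minimal sketch, not the authors' implementation. It assumes that each token already carries a filler flag (from gold annotation or some upstream detector, neither of which is specified here), and the function name delete_fillers is hypothetical. Flags are used instead of a word list because a word such as "like" can be either a filler or a content word. The sketch drops flagged tokens before parsing and keeps the original positions so that a parse of the cleaned utterance can be aligned back to the transcript.

```python
from typing import List, Tuple


def delete_fillers(tokens: List[str], is_filler: List[bool]) -> Tuple[List[str], List[int]]:
    """Drop tokens flagged as fillers (hypothetical helper for illustration).

    Returns the remaining tokens together with their indices in the
    original utterance, so downstream output can be mapped back.
    """
    if len(tokens) != len(is_filler):
        raise ValueError("tokens and is_filler must have the same length")
    kept = [(i, tok) for i, (tok, flag) in enumerate(zip(tokens, is_filler)) if not flag]
    kept_tokens = [tok for _, tok in kept]
    kept_indices = [i for i, _ in kept]
    return kept_tokens, kept_indices


# Assumed example: the flags are supplied by hand here. The first "like"
# is a verb and is kept; the second is flagged as a filler and removed.
tokens = ["you", "know", "I", "like", "that", "like", "a", "lot"]
is_filler = [True, True, False, False, False, True, False, False]

cleaned, index_map = delete_fillers(tokens, is_filler)
# `cleaned` is what would be handed to the parser in place of `tokens`.
```

In this sketch the parser itself is left out; any parser that accepts a token list could consume cleaned, and index_map records where each kept token sat in the original utterance.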