Edit detection and parsing for transcribed speech

  • Authors:
  • Eugene Charniak; Mark Johnson

  • Affiliations:
  • Brown University, Providence, RI; Brown University, Providence, RI

  • Venue:
  • NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
  • Year:
  • 2001

Abstract

We present a simple architecture for parsing transcribed speech in which an edited-word detector first removes such words from the sentence string, and then a standard statistical parser trained on transcribed speech parses the remaining words. The edit detector achieves a misclassification rate on edited words of 2.2%. (The NULL-model, which marks everything as not edited, has an error rate of 5.9%.) To evaluate our parsing results we introduce a new evaluation metric, the purpose of which is to make evaluation of a parse tree relatively indifferent to the exact tree position of EDITED nodes. By this metric the parser achieves 85.3% precision and 86.5% recall.
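The two-stage architecture and the figures quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `detect_edits` and `parse` are hypothetical placeholders for the edited-word detector and the statistical parser, the example sentence and its edit flags are invented, and the relative error reduction and F-measure are derived here from the quoted percentages rather than reported in the abstract.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def remove_edited(words: Sequence[str], is_edited: Sequence[bool]) -> List[str]:
    """Drop every word that the detector flagged as edited."""
    return [w for w, edited in zip(words, is_edited) if not edited]


def parse_transcript(
    words: Sequence[str],
    detect_edits: Callable[[Sequence[str]], Sequence[bool]],  # hypothetical detector
    parse: Callable[[List[str]], T],                          # hypothetical parser
) -> T:
    """Stage 1: flag and remove edited words. Stage 2: parse what remains."""
    flags = detect_edits(words)
    if len(flags) != len(words):
        raise ValueError("detector must return one flag per word")
    return parse(remove_edited(words, flags))


def misclassification_rate(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Fraction of words whose edited/not-edited label is wrong."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(p != g for p, g in zip(predicted, gold)) / len(gold)


def relative_error_reduction(baseline: float, system: float) -> float:
    """Share of the baseline's errors that the system eliminates."""
    return (baseline - system) / baseline


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Invented example: the speaker abandons "to Boston" and restarts.
words = "I want a flight to Boston uh I mean to Denver".split()
gold = [False, False, False, False, True, True,
        False, False, False, False, False]

# The null model marks nothing as edited, so its error rate is simply
# the proportion of edited words in the data.
null_flags = [False] * len(words)
null_error = misclassification_rate(null_flags, gold)

# Figures derived from the percentages quoted in the abstract.
reduction = relative_error_reduction(5.9, 2.2)
f1 = f_measure(85.3, 86.5)
```

Passing the detector and parser in as callables keeps the sketch independent of any particular model; a trained classifier and a parser trained on transcribed speech would be plugged into the same two slots.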