Comma restoration using constituency information

  • Authors:
  • Stuart M. Shieber; Xiaopeng Tao

  • Affiliations:
  • Harvard University; Harvard University

  • Venue:
  • NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
  • Year:
  • 2003

Abstract

Automatic restoration of punctuation from unpunctuated text has application in improving the fluency and applicability of speech recognition systems. We explore the possibility that syntactic information can be used to improve the performance of an HMM-based system for restoring punctuation (specifically, commas) in text. Our best methods reduce sentence error rate substantially, by some 20%, with an additional 8% reduction possible given improvements in extraction of the requisite syntactic information.