Tagging sentence boundaries

  • Authors:
  • Andrei Mikheev

  • Affiliations:
  • The University of Edinburgh, Edinburgh, UK

  • Venue:
  • NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we tackle sentence boundary disambiguation through a part-of-speech (POS) tagging framework. We describe necessary changes in text tokenization and the implementation of a POS tagger and provide results of an evaluation of this system on two corpora. We also describe an extension of the traditional POS tagging by combining it with the document-centered approach to proper name identification and abbreviation handling. This made the resulting system robust to domain and topic shifts.