Representing text chunks

  • Authors:
  • Erik F. Tjong Kim Sang;Jorn Veenstra

  • Affiliations:
  • University of Antwerp, Wilrijk, Belgium;Tilburg University, Tilburg, The Netherlands

  • Venue:
  • EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
  • Year:
  • 1999

Abstract

Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we will examine seven different data representations for the problem of recognizing noun phrase chunks. We will show that the data representation choice has a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve the best published chunking results for a standard data set.
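
To make "converting chunking to a tagging task" concrete, the sketch below encodes noun phrase chunks as one tag per word. The abstract does not list the seven representations, so this is only an illustration: it shows two encodings commonly called IOB1 and IOB2 in the chunking literature, and the function names, the span format (start index inclusive, end index exclusive) and the example sentence are assumptions made here, not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end) token indices, end exclusive.
Span = Tuple[int, int]


def spans_to_iob2(n_tokens: int, spans: List[Span]) -> List[str]:
    """IOB2-style tags: B on every chunk-initial word, I inside, O outside."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def spans_to_iob1(n_tokens: int, spans: List[Span]) -> List[str]:
    """IOB1-style tags: B only when a chunk directly follows another chunk."""
    tags = ["O"] * n_tokens
    prev_end = None
    for start, end in sorted(spans):
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:
            # Adjacent chunks: mark the boundary so they are not merged.
            tags[start] = "B"
        prev_end = end
    return tags


# Illustrative sentence with three noun phrase chunks:
# [He] gave [the boy] [a book]
tokens = ["He", "gave", "the", "boy", "a", "book"]
np_spans = [(0, 1), (2, 4), (4, 6)]

iob1 = spans_to_iob1(len(tokens), np_spans)
iob2 = spans_to_iob2(len(tokens), np_spans)

for word, t1, t2 in zip(tokens, iob1, iob2):
    print(f"{word}\t{t1}\t{t2}")
```

Both encodings describe the same chunks; they differ only in when the B tag is used. That difference changes the label distribution a tagger learns from, which is the kind of representation choice the abstract refers to.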