A Practical Chunker for Unrestricted Text

  • Authors:
  • E. Stamatatos;Nikos Fakotakis;George K. Kokkinakis

  • Affiliations:
  • -;-;-

  • Venue:
  • NLP '00 Proceedings of the Second International Conference on Natural Language Processing
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of this chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case the morphological analysis is performed using exclusively two small lexicons containing closed-class words and common suffixes of the Modern Greek words. We give comparative performance results on the basis of a corpus of unrestricted text and show that very good results can be obtained by omitting the large and complicate resources. Moreover, the considerable time cost introduced by the use of the large lexicon indicates that the minimal-resources chunker is the best solution regarding a practical application that requires rapid response and less than perfect parsing results.