In this paper we present a practical approach to text chunking for unrestricted Modern Greek text that is based on multiple-pass parsing. Two versions of the chunker are proposed: one based on a large lexicon and one based on minimal resources. In the latter case, morphological analysis is performed using only two small lexicons containing closed-class words and common suffixes of Modern Greek words. We give comparative performance results on a corpus of unrestricted text and show that very good results can be obtained without the large and complicated resources. Moreover, the considerable time cost introduced by the large lexicon indicates that the minimal-resources chunker is the better solution for practical applications that require rapid response and can tolerate less-than-perfect parsing results.
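To make the minimal-resources idea concrete, the following is a rough Python sketch, not the authors' implementation. It assumes a tiny closed-class lexicon and a tiny suffix list (both hypothetical toy entries, far smaller than any real resource), a coarse tag set (`ART`, `PREP`, `CONJ`, `VERB`, `NOMINAL`, `UNK`) invented for illustration, and two simple chunking passes standing in for the multiple-pass parsing mentioned above.

```python
# Toy, hypothetical resources: NOT the lexicons used in the paper.
CLOSED_CLASS = {
    "ο": "ART", "η": "ART", "το": "ART", "του": "ART", "της": "ART",
    "και": "CONJ", "σε": "PREP", "με": "PREP", "από": "PREP", "στον": "PREP",
}
# Common word endings mapped to coarse, assumed tags.
SUFFIXES = [
    ("ουμε", "VERB"), ("ουν", "VERB"), ("ει", "VERB"),
    ("ος", "NOMINAL"), ("ης", "NOMINAL"), ("α", "NOMINAL"), ("ο", "NOMINAL"),
]


def guess_tag(token):
    """Tag a token from the closed-class lexicon, else from its suffix."""
    word = token.lower()
    if word in CLOSED_CLASS:
        return CLOSED_CLASS[word]
    # Try the longest suffixes first so "ουμε" wins over "ο"-like endings.
    for suffix, tag in sorted(SUFFIXES, key=lambda s: -len(s[0])):
        if word.endswith(suffix):
            return tag
    return "UNK"


def group_nominals(tagged):
    """First pass: an article or nominal opens an NP; nominals extend it."""
    chunks = []
    for token, tag in tagged:
        if tag == "NOMINAL" and chunks and chunks[-1][0] == "NP":
            chunks[-1][1].append(token)
        elif tag in ("ART", "NOMINAL"):
            chunks.append(("NP", [token]))
        else:
            label = "VP" if tag == "VERB" else tag
            chunks.append((label, [token]))
    return chunks


def attach_prepositions(chunks):
    """Second pass: merge a preposition with the NP that follows it."""
    merged = []
    for label, tokens in chunks:
        if label == "NP" and merged and merged[-1][0] == "PREP":
            _, prep_tokens = merged.pop()
            merged.append(("PP", prep_tokens + tokens))
        else:
            merged.append((label, tokens))
    return merged


def chunk(sentence):
    """Tag whitespace-separated tokens, then run both chunking passes."""
    tagged = [(token, guess_tag(token)) for token in sentence.split()]
    return attach_prepositions(group_nominals(tagged))


if __name__ == "__main__":
    print(chunk("ο σκύλος τρέχει στον κήπο"))
```

With these toy entries, the sample sentence is split into a noun chunk, a verb chunk, and a prepositional chunk. The sketch only illustrates why such a chunker can be fast: every token is resolved by one dictionary lookup or a handful of suffix comparisons, with no large lexicon to load or search.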