Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Noun phrase recognition by system combination
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
A memory-based approach to learning shallow natural language patterns
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Computational complexity of probabilistic disambiguation by means of tree-grammars
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Noun phrase recognition with tree patterns
Acta Cybernetica
Manually annotated Hungarian corpus
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
Hi-index | 0.00 |
This paper presents a method for parsing Hungarian texts using a machine learning approach. The method collects the initial grammar for a learner from an annotated corpus with the help of tree shapes. The PGS algorithm, an improved version of the RGLearn algorithm, was developed and applied to learning tree patterns with various phrase types described by regular expressions. The method also calculates the probability values of the learned tree patterns. The syntactic parser of learned grammar using the Viterbi algorithm performs a quick search for finding the most probable derivation of a sentence. The results were built into an information extraction pipeline.