This paper proposes a new unsupervised learning method for acquiring English part-of-speech (POS) disambiguation rules that improve the accuracy of a POS tagger. The method is implemented in the experimental system APRAS (Automatic POS Rule Acquisition System), which extracts POS disambiguation rules from plain-text corpora using two types of coded linguistic knowledge, POS tagging rules and syntactic parsing rules, both already stored in a fully implemented MT system. In our experiment, the obtained rules were applied to 1.7% of the sentences in a non-training corpus. For this group of sentences, 78.4% of the changes made to the tagging results were improvements. We also saw a 15.5% improvement in tagging and parsing speed and an 8.0% increase in parsable sentences.
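The general idea of letting a parser supply the training signal for POS disambiguation can be illustrated with a toy sketch. This is not the APRAS algorithm, whose details the abstract does not give. The lexicon, the regular-expression "grammar" standing in for a syntactic parser, the rule format (previous tag, ambiguity class, next tag), and the names `LEXICON`, `SENTENCE`, `parses`, and `extract_rules` are all assumptions made for illustration. The sketch treats a sentence as evidence only when exactly one tag assignment parses, and keeps a context rule only if every observation of that context agreed on the same tag.

```python
import itertools
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon: each word maps to its possible POS tags.
LEXICON = {
    "the": {"DET"},
    "dog": {"NOUN"},
    "dogs": {"NOUN"},
    "rusts": {"VERB"},
    "can": {"NOUN", "VERB", "AUX"},
    "bark": {"NOUN", "VERB"},
    "barks": {"NOUN", "VERB"},
}

# Stand-in for a syntactic parser (an assumption, not a real grammar):
# a tag sequence "parses" if it matches this pattern.
SENTENCE = re.compile(r"^(DET )?NOUN (AUX )?VERB( DET NOUN)?$")


def parses(tags):
    """Return True if the toy grammar accepts the tag sequence."""
    return SENTENCE.match(" ".join(tags)) is not None


def extract_rules(sentences):
    """Collect (prev_tag, ambiguity_class, next_tag) -> tag rules from plain text."""
    votes = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        if any(word not in LEXICON for word in words):
            continue  # skip sentences with unknown words
        options = [sorted(LEXICON[word]) for word in words]
        parsable = [t for t in itertools.product(*options) if parses(t)]
        if len(parsable) != 1:
            continue  # no parse, or still ambiguous: no usable evidence
        tags = parsable[0]
        padded = ("<s>",) + tags + ("</s>",)
        for i, opts in enumerate(options):
            if len(opts) > 1:  # only ambiguous words yield rules
                context = (padded[i], tuple(opts), padded[i + 2])
                votes[context][tags[i]] += 1
    # Keep only contexts that always resolved to the same tag.
    return {ctx: next(iter(c)) for ctx, c in votes.items() if len(c) == 1}


rules = extract_rules(["the can rusts", "dogs can bark", "the dog barks"])
for context, tag in sorted(rules.items()):
    print(context, "->", tag)
```

Requiring a unique parse is a deliberately conservative choice in this sketch: it discards sentences the toy grammar cannot resolve instead of guessing, which trades coverage for cleaner rules.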