Grammatical category disambiguation by statistical optimization
Computational Linguistics
A corpus-based approach to language learning
A corpus-based approach to language learning
A Computational Approach to Grammatical Coding of English Words
Journal of the ACM (JACM)
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Coping with ambiguity and unknown words through probabilistic models
Computational Linguistics - Special issue on using large corpora: II
A stochastic parts program and noun phrase parser for unrestricted text
ANLC '88 Proceedings of the second conference on Applied natural language processing
A practical part-of-speech tagger
ANLC '92 Proceedings of the third conference on Applied natural language processing
A simple rule-based part of speech tagger
ANLC '92 Proceedings of the third conference on Applied natural language processing
Automatic grammar induction and parsing free text: a transformation-based approach
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
A new quantitative quality measure for machine translation systems
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Retrieving NASA Problem Reports with Natural Language
NLDB '02 Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers
Retrieving NASA problem reports: a case study in natural language information retrieval
Data & Knowledge Engineering - NLDB2002
Activity detection for information access to oral communication
HLT '01 Proceedings of the first international conference on Human language technology research
Empirical merging of ontologies: a proposal of universal uncertainty representation framework
ESWC'06 Proceedings of the 3rd European conference on The Semantic Web: research and applications
Most recent research on trainable part-of-speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, they capture linguistic information only indirectly, typically in tens of thousands of lexical and contextual probabilities. In [Brill 92], a trainable rule-based tagger was described that achieved performance comparable to that of stochastic taggers but captured the relevant linguistic information in a small number of simple non-stochastic rules. In this paper, we describe a number of extensions to this rule-based tagger. First, we describe a method for expressing lexical relations in tagging that stochastic taggers are currently unable to express. Next, we show a rule-based approach to tagging unknown words. Finally, we show how the tagger can be extended into a k-best tagger, in which multiple tags can be assigned to a word in some cases of uncertainty.
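To make the idea of "a small number of simple non-stochastic rules" concrete, the following is a minimal sketch of how a contextual rule of the kind used in rule-based tagging might be applied: each word first receives a default tag from a lexicon, and a rule then changes one tag to another when the preceding tag matches. The `Rule` type, the toy lexicon, the single rule, and the left-to-right application order are all illustrative assumptions, not the tagger described in the paper.

```python
from typing import Dict, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical contextual rule: retag from_tag as to_tag
    when the immediately preceding word carries prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words: List[str], lexicon: Dict[str, str], default: str = "NN") -> List[str]:
    # Assign each word its lexicon tag; fall back to a default tag (assumed "NN").
    return [lexicon.get(w.lower(), default) for w in words]


def apply_rules(tags: List[str], rules: List[Rule]) -> List[str]:
    # Apply each rule in order, scanning left to right (an assumed order).
    tags = list(tags)
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy lexicon and rule, invented for illustration only.
lexicon = {"we": "PRP", "want": "VBP", "to": "TO", "run": "NN"}
rules = [Rule(from_tag="NN", to_tag="VB", prev_tag="TO")]

words = ["We", "want", "to", "run"]
tagged = list(zip(words, apply_rules(initial_tags(words, lexicon), rules)))
print(tagged)
```

Here the single rule retags "run" from NN to VB because it follows a word tagged TO. A k-best variant could, under similar assumptions, let a rule add a second tag to a word instead of replacing the first.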