A report of recent progress in transformation-based error-driven learning

Authors:
Eric Brill
Affiliations:
Massachusetts Institute of Technology, Cambridge, Massachusetts
Venue:
HLT '94 Proceedings of the workshop on Human Language Technology
Year:
1994

Citing 11

Grammatical category disambiguation by statistical optimization. Computational Linguistics
A corpus-based approach to language learning
A Computational Approach to Grammatical Coding of English Words. Journal of the ACM (JACM)
Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics - Special issue on using large corpora: II
Coping with ambiguity and unknown words through probabilistic models. Computational Linguistics - Special issue on using large corpora: II
A stochastic parts program and noun phrase parser for unrestricted text. ANLC '88 Proceedings of the second conference on Applied natural language processing
A practical part-of-speech tagger. ANLC '92 Proceedings of the third conference on Applied natural language processing
A simple rule-based part of speech tagger. ANLC '92 Proceedings of the third conference on Applied natural language processing
Automatic grammar induction and parsing free text: a transformation-based approach. ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Parsing the LOB corpus. ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
A new quantitative quality measure for machine translation systems. COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2

Cited 4

Retrieving NASA Problem Reports with Natural Language. NLDB '02 Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers
Retrieving NASA problem reports: a case study in natural language information retrieval. Data & Knowledge Engineering - NLDB2002
Activity detection for information access to oral communication. HLT '01 Proceedings of the first international conference on Human language technology research
Empirical merging of ontologies: a proposal of universal uncertainty representation framework. ESWC'06 Proceedings of the 3rd European conference on The Semantic Web: research and applications

Abstract

Most recent research on trainable part-of-speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, they capture linguistic information only indirectly, typically in tens of thousands of lexical and contextual probabilities. [Brill 92] described a trainable rule-based tagger that performed comparably to stochastic taggers but captured the relevant linguistic information in a small number of simple non-stochastic rules. In this paper, we describe a number of extensions to this rule-based tagger. First, we describe a method for expressing lexical relations in tagging that stochastic taggers are currently unable to express. Next, we show a rule-based approach to tagging unknown words. Finally, we show how the tagger can be extended into a k-best tagger, which can assign multiple tags to a word in some cases of uncertainty.
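
To make the idea of a rule-based tagger more concrete, the following is a minimal Python sketch of how such a tagger might apply its rules: each word first receives an initial tag, and an ordered list of contextual rules then rewrites tags where the context matches. Everything here is an assumption made for illustration and is not taken from the paper: the names Rule, initial_tags and apply_rules, the toy lexicon, the default tag, and the single example rule are all hypothetical, and the rule-learning step is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: change from_tag to to_tag when the previous tag is prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words, lexicon, default="NN"):
    # Start state: look each word up in the lexicon (lowercased);
    # words missing from the lexicon fall back to a default tag.
    return [lexicon.get(w.lower(), default) for w in words]


def apply_rules(tags, rules):
    # Apply each rule in order, scanning the sentence left to right.
    # A change takes effect immediately, so it can trigger later matches.
    tags = list(tags)
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy data, invented for this example only.
LEXICON = {"the": "DT", "can": "MD", "rusted": "VBD"}
RULES = [Rule(from_tag="MD", to_tag="NN", prev_tag="DT")]

words = ["The", "can", "rusted"]
start = initial_tags(words, LEXICON)
final = apply_rules(start, RULES)
print(list(zip(words, final)))
```

In this toy run, "can" starts with the modal tag MD from the lexicon and is rewritten to NN because it follows a determiner. The whole behavior of the sketch is readable from one lexicon entry and one rule, which is the contrast the abstract draws with large tables of probabilities.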