Pattern-based disambiguation for natural language processing

Authors:
Eric Brill
Affiliations:
Microsoft Research, Redmond, Wa.
Venue:
EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Year:
2000

Abstract

A wide range of natural language problems can be viewed as disambiguating among a small set of alternatives based on the string context surrounding the ambiguity site. In this paper we demonstrate that classification accuracy can be improved by invoking a more descriptive feature set than is typically used. We present a technique that disambiguates by learning regular expressions describing the string contexts in which the ambiguity sites appear.
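
To make the idea of regular expressions over string contexts concrete, here is a minimal Python sketch. It is not the algorithm from the paper, which the abstract does not detail. It assumes a toy "then"/"than" confusion set, marks the ambiguity site with "_", and scores a small hand-written list of candidate patterns instead of searching a pattern space. The names TRAIN, CANDIDATES, learn_rules and classify are all hypothetical.

```python
import re
from collections import Counter

# Toy labelled contexts (hypothetical data): "_" marks the ambiguity site,
# and the label is the word that belongs there.
TRAIN = [
    ("bigger _ a house", "than"),
    ("more _ ten people", "than"),
    ("rather _ wait", "than"),
    ("and _ we left", "then"),
    ("since _ it has rained", "then"),
    ("first eat , _ sleep", "then"),
]

# Hand-written candidate patterns (an assumption of this sketch). A real
# learner would search a space of regular expressions instead.
CANDIDATES = [
    r"\b\w+er _",               # word ending in "er" right before the site
    r"\b(more|less|rather) _",  # comparative word right before the site
    r"\b(and|since) _",         # conjunction right before the site
    r", _",                     # comma right before the site
]


def learn_rules(train, candidates, default):
    """Score each candidate regex on the training data and build a decision list."""
    scored = []
    for pattern in candidates:
        regex = re.compile(pattern)
        votes = Counter(label for context, label in train if regex.search(context))
        if not votes:
            continue  # pattern never fires on the training data
        label, hits = votes.most_common(1)[0]
        precision = hits / sum(votes.values())
        scored.append((precision, hits, pattern, label))
    # Most precise rules first; ties broken by how often the rule fired.
    scored.sort(key=lambda rule: (-rule[0], -rule[1]))
    rules = [(re.compile(pattern), label) for _, _, pattern, label in scored]
    return rules, default


def classify(context, model):
    """Return the label of the first rule whose regex matches the context."""
    rules, default = model
    for regex, label in rules:
        if regex.search(context):
            return label
    return default


model = learn_rules(TRAIN, CANDIDATES, default="then")
print(classify("taller _ his brother", model))  # than
print(classify("wait , _ go", model))           # then
```

The decision-list ordering is a design choice of this sketch. Any classifier that takes regex matches as features would serve equally well to illustrate the point that patterns can describe richer context than fixed word windows.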