Grammatical category disambiguation by statistical optimization

Authors:
Steven J. DeRose
Affiliations:
Brown University, Dallas, TX
Venue:
Computational Linguistics
Year:
1988

Abstract

Several algorithms have been developed in the past that attempt to resolve categorial ambiguities in natural language text without recourse to syntactic or semantic level information. An innovative method (called "CLAWS") was recently developed by those working with the Lancaster-Oslo/Bergen Corpus of British English. This algorithm uses a systematic calculation based upon the probabilities of co-occurrence of particular tags. Its accuracy is high, but it is very slow, and it has been manually augmented in a number of ways. The effects upon accuracy of this manual augmentation are not individually known.

The current paper presents an algorithm for disambiguation that is similar to CLAWS but that operates in linear rather than in exponential time and space, and which minimizes the unsystematic augments. Tests of the algorithm using the million words of the Brown Standard Corpus of English are reported; the overall accuracy is 96%. This algorithm can provide a fast and accurate front end to any parsing or natural language processing system for English.
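
The abstract gives no implementation details, so the following is only a minimal sketch of how tag co-occurrence probabilities might be combined in time linear in the number of words, using a standard dynamic-programming search over tag paths. It should not be read as the paper's actual procedure. The function name `best_tag_path`, the tag labels, the probability values, and the `floor` fallback for unseen tag pairs are all hypothetical, chosen for illustration.

```python
from math import log


def best_tag_path(candidates, trans, floor=1e-6):
    """Pick one tag per word so the product of tag-pair probabilities is maximal.

    candidates: list (one entry per word) of lists of possible tags.
    trans: dict mapping (previous_tag, tag) -> co-occurrence probability.
    floor: assumed fallback probability for tag pairs missing from `trans`.
    """
    if not candidates:
        return []

    # scores[tag] = best log-probability of any tag path ending in `tag`.
    scores = {tag: 0.0 for tag in candidates[0]}
    backpointers = []  # one dict per later word: tag -> best previous tag

    for tags in candidates[1:]:
        new_scores, back = {}, {}
        for tag in tags:
            # Only the best path into each tag is kept, so the work per word
            # is bounded by (tags per word)^2 rather than growing with the
            # number of complete paths through the sentence.
            prev = max(
                scores,
                key=lambda p: scores[p] + log(trans.get((p, tag), floor)),
            )
            new_scores[tag] = scores[prev] + log(trans.get((prev, tag), floor))
            back[tag] = prev
        scores = new_scores
        backpointers.append(back)

    # Follow the backpointers from the best final tag to recover the path.
    tag = max(scores, key=scores.get)
    path = [tag]
    for back in reversed(backpointers):
        tag = back[tag]
        path.append(tag)
    path.reverse()
    return path


# Hypothetical three-word example: one unambiguous word, then two ambiguous ones.
example_candidates = [["DET"], ["NOUN", "VERB"], ["NOUN", "VERB"]]
example_trans = {
    ("DET", "NOUN"): 0.6,
    ("DET", "VERB"): 0.01,
    ("NOUN", "NOUN"): 0.1,
    ("NOUN", "VERB"): 0.3,
    ("VERB", "NOUN"): 0.2,
    ("VERB", "VERB"): 0.05,
}
print(best_tag_path(example_candidates, example_trans))
```

Enumerating every combination of candidate tags grows exponentially with the length of an ambiguous span, whereas this search keeps one best partial path per candidate tag at each word. With a bounded number of tags per word, time and space therefore grow linearly with the number of words.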