Augmenting a hidden Markov model for phrase-dependent word tagging

Authors:
Julian Kupiec
Affiliations:
Xerox Palo Alto Research Center, Palo Alto, CA
Venue:
HLT '89 Proceedings of the workshop on Speech and Natural Language
Year:
1989

Abstract

This paper describes refinements currently being investigated in a hidden Markov model for assigning parts of speech to words in unrestricted text. The model has the advantage that it does not require a pre-tagged training corpus. Words are represented by equivalence classes, which reduces the number of parameters required and makes the model essentially vocabulary-independent. State chains provide selective higher-order conditioning, which avoids the proliferation of parameters that accompanies uniformly higher-order models. The structure of the state chains is based on both an analysis of errors and linguistic knowledge. Examples show how word dependency across phrases can be modeled.
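
To make the idea of word equivalence classes concrete, the following is a minimal sketch and not the paper's implementation. It assumes that words are pooled into one class when they share the same set of permitted part-of-speech tags; the abstract does not spell out the grouping criterion, so treat that as an assumption. The toy lexicon, the tag names, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon: each word maps to the set of tags it may take.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "run": {"NOUN", "VERB"},
    "walk": {"NOUN", "VERB"},
    "dog": {"NOUN"},
    "quickly": {"ADV"},
}


def equivalence_class(word, lexicon):
    """Return a hashable class label: the frozenset of tags allowed for `word`.

    Assumption: words with identical tag sets belong to the same class.
    """
    return frozenset(lexicon[word])


def group_by_class(lexicon):
    """Group words by equivalence class, with the words in each class sorted."""
    classes = defaultdict(list)
    for word in sorted(lexicon):
        classes[equivalence_class(word, lexicon)].append(word)
    return dict(classes)


classes = group_by_class(LEXICON)

# The model's observation alphabet is the set of classes, not the vocabulary,
# so its size no longer grows with every new word that fits an existing class.
print(len(LEXICON), "words ->", len(classes), "classes")
```

In this sketch a tagger's output symbols would be the class labels, so adding a new word whose tag set is already known adds no parameters. That is one way to read the abstract's description of the model as essentially vocabulary-independent.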