Tagging text with a probabilistic model

Authors:
B. Merialdo
Affiliations:
IBM France Sci. Center, Paris, France
Venue:
ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing
Year:
1991

Cited 5

Statistical Part-of-Speech Tagging for Classical Chinese
TSD '02 Proceedings of the 5th International Conference on Text, Speech and Dialogue

The importance of the lexicon in tagging biological text
Natural Language Engineering

Part of speech tagging in context
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Prototype-driven learning for sequence models
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics

EM works for pronoun anaphora resolution
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics

Abstract

This paper presents experiments on using a probabilistic model to tag English text, that is, to assign each word its correct tag (part of speech) in the context of the sentence. A simple triclass Markov model is used, and the best way to estimate its parameters is determined as a function of the kind and amount of training data available. Two approaches are compared: using text that has been tagged by hand and computing relative frequency counts; and using untagged text and training the model as a hidden Markov process according to a maximum likelihood principle. Experiments show that the best training is obtained by using as much tagged text as is available; maximum likelihood training may improve the accuracy of the tagging.
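
To make the first training approach concrete, the following is a minimal sketch of relative frequency estimation for a triclass (tag trigram) model. It is an illustration only, not the paper's implementation: the function name `relative_frequency_estimates`, the `<s>` padding tag, the tag names, and the toy corpus are all assumptions introduced here. No smoothing is applied, so unseen events simply have no entry. The second approach (maximum likelihood training on untagged text) is not shown.

```python
from collections import Counter

START = "<s>"  # hypothetical padding tag marking the start of a sentence


def relative_frequency_estimates(tagged_sentences):
    """Estimate triclass model parameters from hand-tagged text.

    Returns two dicts:
      transition[(t1, t2, t3)] ~ P(t3 | t1, t2)
      emission[(tag, word)]    ~ P(word | tag)
    Both are plain relative frequencies (no smoothing).
    """
    trigram_counts = Counter()
    context_counts = Counter()
    emission_counts = Counter()
    tag_counts = Counter()

    for sentence in tagged_sentences:
        # Pad with two start tags so the first word also has a two-tag context.
        tags = [START, START] + [tag for _, tag in sentence]
        for i in range(2, len(tags)):
            context = (tags[i - 2], tags[i - 1])
            trigram_counts[context + (tags[i],)] += 1
            context_counts[context] += 1
        for word, tag in sentence:
            emission_counts[(tag, word)] += 1
            tag_counts[tag] += 1

    transition = {
        trigram: count / context_counts[trigram[:2]]
        for trigram, count in trigram_counts.items()
    }
    emission = {
        (tag, word): count / tag_counts[tag]
        for (tag, word), count in emission_counts.items()
    }
    return transition, emission


# Toy hand-tagged corpus (invented for illustration).
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

transition, emission = relative_frequency_estimates(toy_corpus)
print(transition[(START, START, "DET")])  # estimate of P(DET | <s>, <s>)
print(emission[("NOUN", "dog")])          # estimate of P(dog | NOUN)
```

In this toy corpus two of the three sentences begin with a determiner, so the sentence-initial estimate for `DET` is two thirds. With realistic data, some form of smoothing or interpolation would normally be needed for trigrams and words that never occur in the tagged text; that choice is left open here.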