Does Baum-Welch re-estimation help taggers?

  • Authors: David Elworthy
  • Affiliation: Sharp Laboratories of Europe Ltd., Oxford, United Kingdom
  • Venue: ANLC '94: Proceedings of the Fourth Conference on Applied Natural Language Processing
  • Year: 1994

Abstract

In part-of-speech tagging with a Hidden Markov Model, a statistical model is used to assign grammatical categories to words in a text. Early work in the field relied on a corpus which had been tagged by a human annotator to train the model. More recently, Cutting et al. (1992) suggest that training can be achieved with a minimal lexicon and a limited amount of a priori information about probabilities, by using Baum-Welch re-estimation to automatically refine the model. In this paper, I report two experiments designed to determine how much manual training information is needed. The first experiment suggests that initial biasing of either lexical or transition probabilities is essential to achieve good accuracy. The second experiment reveals that there are three distinct patterns of Baum-Welch re-estimation. In two of the patterns, the re-estimation ultimately reduces the accuracy of the tagging rather than improving it. Which pattern applies can be predicted from the quality of the initial model and the similarity between the tagged training corpus (if any) and the corpus to be tagged. Heuristics for deciding how to use re-estimation in an effective manner are given. The conclusions are broadly in agreement with those of Merialdo (1994), but give greater detail about the contributions of different parts of the model.
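
For readers unfamiliar with the procedure the abstract refers to, the following is a minimal illustrative sketch of one Baum-Welch re-estimation step for a discrete HMM tagger. It is not the paper's implementation; the function names, array layout, and NumPy dependency are assumptions made here for illustration.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass for a discrete HMM.

    pi  : (N,)   initial tag probabilities
    A   : (N, N) transition probabilities, A[i, j] = P(tag j follows tag i)
    B   : (N, M) lexical (emission) probabilities, B[i, k] = P(word k | tag i)
    obs : (T,)   sequence of word indices
    """
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    scale = np.zeros(T)

    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

    return alpha, beta, scale


def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation step; returns updated (pi, A, B)."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    alpha, beta, scale = forward_backward(pi, A, B, obs)

    # gamma[t, i] = P(tag i at position t | word sequence)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)

    # xi[t, i, j] = P(tag i at t and tag j at t+1 | word sequence)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A
                 * B[:, obs[t + 1]][None, :] * beta[t + 1][None, :]) / scale[t + 1]

    # Re-estimated parameters from the expected counts
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B
```

In these terms, the paper's question is how strongly the initial pi, A, and B must be biased (for example, from a tagged training corpus or a hand-built lexicon) for repeated applications of such a step to raise, rather than lower, tagging accuracy.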