In this paper we present part-of-speech taggers based on hidden Markov models that adopt a less strict Markov assumption in order to consider rich contexts. In models whose parameters are highly specific, such as lexicalized models, the data-sparseness problem is severe and conditional probabilities tend to be estimated unreliably. To overcome data sparseness, a simplified version of the well-known back-off smoothing method is used. To mitigate the unreliable-estimation problem, our models assume joint independence instead of conditional independence, because joint probabilities share the same degree of estimation reliability. In experiments on the Brown corpus, models with rich contexts achieve relatively high accuracy, and some models assuming joint independence outperform the corresponding HMMs.
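The back-off idea referred to above can be sketched in a few lines. The following is a generic illustrative back-off estimator for tag trigram probabilities, not the paper's exact simplified scheme; the function names and the fall-back order (trigram, then bigram, then unigram) are assumptions for illustration:

```python
from collections import defaultdict

def train_counts(tagged_sentences):
    """Collect unigram, bigram, and trigram tag counts from tag sequences."""
    uni, bi, tri = defaultdict(int), defaultdict(int), defaultdict(int)
    for tags in tagged_sentences:
        padded = ("<s>", "<s>") + tuple(tags)  # pad so every tag has a trigram history
        for i in range(2, len(padded)):
            uni[padded[i]] += 1
            bi[(padded[i - 1], padded[i])] += 1
            tri[(padded[i - 2], padded[i - 1], padded[i])] += 1
    return uni, bi, tri

def backoff_prob(t2, t1, t, uni, bi, tri, total):
    """Estimate P(t | t2, t1): use the trigram relative frequency when the
    trigram was observed, otherwise back off to the bigram, then the unigram."""
    if tri.get((t2, t1, t), 0) > 0 and bi.get((t2, t1), 0) > 0:
        return tri[(t2, t1, t)] / bi[(t2, t1)]
    if bi.get((t1, t), 0) > 0 and uni.get(t1, 0) > 0:
        return bi[(t1, t)] / uni[t1]
    return uni.get(t, 0) / total  # final fall-back: unigram relative frequency

# Toy corpus of tag sequences (not Brown-corpus data):
corpus = [["DT", "NN", "VB"], ["DT", "NN", "NN"]]
uni, bi, tri = train_counts(corpus)
total = sum(uni.values())
p_seen = backoff_prob("DT", "NN", "VB", uni, bi, tri, total)    # trigram observed
p_backed = backoff_prob("NN", "NN", "VB", uni, bi, tri, total)  # falls back to bigram
```

In this toy corpus the trigram (DT, NN, VB) is observed, so its estimate uses the trigram relative frequency, while the unseen history (NN, NN) forces a back-off to the bigram (NN, VB).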