Markov random field based English part-of-speech tagging system

Authors:
Sung-Young Jung;Young C. Park;Key-Sun Choi;Youngwhan Kim
Affiliations:
Korea Advanced Institute of Science and Technology, Taejon, Korea;Korea Advanced Institute of Science and Technology, Taejon, Korea;Korea Advanced Institute of Science and Technology, Taejon, Korea;Multimedia Research Laboratories, Korea Telecom
Venue:
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Year:
1996

Abstract

Probabilistic models have been widely used for natural language processing. Part-of-speech tagging, which assigns the most likely tag to each word in a given sentence, is one of the problems that can be solved by a statistical approach. Many researchers have tried to solve the problem with the hidden Markov model (HMM), a well-known statistical model. But the HMM has several difficulties: integrating heterogeneous information, coping with the data sparseness problem, and adapting to new environments. In this paper, we propose a Markov random field (MRF) based approach to the tagging problem. The MRF provides the framework for combining various kinds of statistical information with the maximum entropy (ME) method. Because a Gibbs distribution can describe the a posteriori probability of a tagging, we use it for maximum a posteriori (MAP) estimation in the optimization process. In addition, several tagging models are developed to show the effect of adding information. Experimental results show that the performance of the tagger improves as more statistical information is added, and that the MRF-based tagging model outperforms HMM-based tagging under data sparseness.
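
As an illustration of the MAP idea in the abstract, the sketch below scores a tag sequence T for a word sequence W with a Gibbs distribution, P(T | W) = exp(-U(T, W)) / Z, where the energy U is a sum of clique potentials. It is not the paper's model or optimizer. It assumes a linear-chain neighborhood with only single-site (word, tag) and pair (tag, tag) cliques, and every name and number in it (`TAGS`, `LEXICAL`, `PAIR`, `DEFAULT`, `energy`, `map_tagging`, `posterior`) is hypothetical, chosen by hand for the example. Under that chain assumption the minimum-energy tagging can be found exactly by dynamic programming.

```python
import math
from itertools import product

# Hypothetical toy tag set and clique energies (lower = more compatible).
# None of these values come from the paper.
TAGS = ["DT", "NN", "VB"]

# Single-site clique energies V1(word, tag).
LEXICAL = {
    ("the", "DT"): 0.1,
    ("dog", "NN"): 0.3,
    ("dog", "VB"): 2.0,
    ("barks", "VB"): 0.4,
    ("barks", "NN"): 1.5,
}

# Pair clique energies V2(previous_tag, tag).
PAIR = {
    ("DT", "NN"): 0.2,
    ("NN", "VB"): 0.3,
    ("DT", "VB"): 2.5,
    ("NN", "NN"): 1.0,
    ("VB", "NN"): 1.2,
}

DEFAULT = 5.0  # assumed energy for any clique not listed above


def energy(words, tags):
    """Total energy U(T, W): sum of single-site and pair clique energies."""
    u = sum(LEXICAL.get((w, t), DEFAULT) for w, t in zip(words, tags))
    u += sum(PAIR.get((a, b), DEFAULT) for a, b in zip(tags, tags[1:]))
    return u


def map_tagging(words):
    """Return (tags, energy) minimizing U, i.e. maximizing exp(-U) / Z."""
    # best[t] = (lowest energy of a prefix ending in tag t, that prefix's tags)
    best = {t: (LEXICAL.get((words[0], t), DEFAULT), [t]) for t in TAGS}
    for w in words[1:]:
        new_best = {}
        for t in TAGS:
            cost, path = min(
                ((c + PAIR.get((p, t), DEFAULT), pth)
                 for p, (c, pth) in best.items()),
                key=lambda item: item[0],
            )
            new_best[t] = (cost + LEXICAL.get((w, t), DEFAULT), path + [t])
        best = new_best
    cost, path = min(best.values(), key=lambda item: item[0])
    return path, cost


def posterior(words, tags):
    """Gibbs posterior exp(-U) / Z, with Z computed by brute force (toy sizes only)."""
    z = sum(math.exp(-energy(words, list(cand)))
            for cand in product(TAGS, repeat=len(words)))
    return math.exp(-energy(words, tags)) / z


if __name__ == "__main__":
    sentence = ["the", "dog", "barks"]
    tags, u = map_tagging(sentence)
    print(tags, round(u, 3), round(posterior(sentence, tags), 3))
```

In this formulation a further information source is one more potential added to the energy sum, which is the kind of extensibility the abstract attributes to the MRF framework. A neighborhood richer than a chain would generally need an approximate optimizer in place of the dynamic program used here.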