Hybrid neuro and rule-based part of speech taggers

Authors:
Qing Ma;Masaki Murata;Kiyotaka Uchimoto;Hitoshi Isahara
Affiliations:
Communications Research Laboratory, Ministry of Posts and Telecommunications, Kobe, Japan;Communications Research Laboratory, Ministry of Posts and Telecommunications, Kobe, Japan;Communications Research Laboratory, Ministry of Posts and Telecommunications, Kobe, Japan;Communications Research Laboratory, Ministry of Posts and Telecommunications, Kobe, Japan
Venue:
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Year:
2000


Abstract

This paper describes a hybrid part-of-speech tagging system that consists of a neuro tagger and a rule-based corrector. The neuro tagger is an initial-state annotator that uses contexts of different lengths based on longest-context priority. Its inputs are weighted by information gains obtained through information maximization. The rule-based corrector is built from a set of transformation rules that compensate for the shortcomings of the neuro tagger. Computer experiments show that these transformation rules correct almost 20% of the errors made by the neuro tagger, so that the hybrid system reaches an accuracy of 95.5% counting only the ambiguous words and 99.1% counting all words when trained on a small Thai corpus containing 22,311 ambiguous words. This accuracy is far higher than that obtained with an HMM and is also higher than that obtained with a rule-based model.
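
The two-stage design in the abstract (an initial-state annotator followed by a transformation-rule corrector) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Rule` template (retag a word when the previous tag matches), the names `apply_rules` and `hybrid_tag`, the toy English lexicon, and the tag names. A lexicon-lookup stub stands in for the neuro tagger, whose network and information-gain weighting are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule template: retag from_tag as to_tag
    when the preceding word carries prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def apply_rules(tags: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    """Apply each rule in order, scanning left to right.
    A change takes effect immediately for later positions."""
    out = list(tags)
    for rule in rules:
        for i in range(1, len(out)):
            if out[i] == rule.from_tag and out[i - 1] == rule.prev_tag:
                out[i] = rule.to_tag
    return out


def hybrid_tag(
    words: Sequence[str],
    initial_tagger: Callable[[Sequence[str]], List[str]],
    rules: Sequence[Rule],
) -> List[str]:
    """Stage 1: initial-state annotation. Stage 2: rule-based correction."""
    return apply_rules(initial_tagger(words), rules)


def toy_initial_tagger(words: Sequence[str]) -> List[str]:
    """Stand-in for the neuro tagger (assumption): a fixed lexicon lookup
    that defaults to NOUN for unknown words."""
    lexicon = {"the": "DET", "can": "VERB", "rusts": "VERB"}
    return [lexicon.get(w, "NOUN") for w in words]


# One illustrative rule: a VERB directly after a DET is retagged as NOUN.
toy_rules = [Rule(from_tag="VERB", to_tag="NOUN", prev_tag="DET")]

if __name__ == "__main__":
    print(hybrid_tag(["the", "can", "rusts"], toy_initial_tagger, toy_rules))
```

In this toy run the stub mistags "can" as a verb and the single rule repairs it, which mirrors the corrector's role of fixing a portion of the initial tagger's errors. In a real transformation-based setup the rules would be learned from the tagger's errors on a training corpus rather than written by hand.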