Hybrid neuro and rule-based part of speech taggers

  • Authors:
  • Qing Ma; Masaki Murata; Kiyotaka Uchimoto; Hitoshi Isahara

  • Affiliations:
  • Communications Research Laboratory, Ministry of Posts and Telecommunications, Kobe, Japan (all authors)

  • Venue:
  • COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
  • Year:
  • 2000

Abstract

This paper describes a hybrid part-of-speech tagging system that consists of a neuro tagger and a rule-based corrector. The neuro tagger serves as the initial-state annotator; it uses contexts of different lengths, giving priority to the longest context. Its inputs are weighted by information gains obtained through information maximization. The rule-based corrector is built from a set of transformation rules that compensate for the shortcomings of the neuro tagger. Computer experiments show that these transformation rules correct almost 20% of the errors made by the neuro tagger. When trained on a small Thai corpus containing 22,311 ambiguous words, the hybrid system reaches an accuracy of 95.5% counting only the ambiguous words and 99.1% counting all words. This accuracy is far higher than that obtained with an HMM and is also higher than that obtained with a rule-based model.
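
The two-stage design in the abstract (an initial-state annotator followed by transformation rules) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy lexicon stands in for the neuro tagger, the `Rule` shape (rewrite a tag when the previous tag matches) is one common form of transformation rule and not necessarily the one used in the paper, and the English words and tag names are hypothetical placeholders for the Thai data.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical transformation rule: rewrite from_tag as to_tag
    when the preceding word carries prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tagger(words: Sequence[str]) -> List[str]:
    # Stand-in for the neuro tagger (assumption): a toy lexicon lookup
    # that assigns each word one tag and defaults to NOUN.
    lexicon = {"the": "DET", "can": "MODAL", "rusts": "VERB"}
    return [lexicon.get(word, "NOUN") for word in words]


def apply_rules(tags: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    # Apply each rule in order, scanning the sentence left to right.
    corrected = list(tags)
    for rule in rules:
        for i in range(1, len(corrected)):
            if corrected[i] == rule.from_tag and corrected[i - 1] == rule.prev_tag:
                corrected[i] = rule.to_tag
    return corrected


def hybrid_tag(words: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    # Stage 1: initial annotation. Stage 2: rule-based correction.
    return apply_rules(initial_tagger(words), rules)


if __name__ == "__main__":
    words = ["the", "can", "rusts"]
    rules = [Rule(from_tag="MODAL", to_tag="NOUN", prev_tag="DET")]
    print(initial_tagger(words))     # tags before correction
    print(hybrid_tag(words, rules))  # tags after correction
```

In this sketch the corrector only ever changes tags that the first stage has already assigned, which mirrors the abstract's description of the rules as a way to make up for the initial tagger's errors.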