Unknown Word Recognition Based on Maximal Cliques

  • Authors:
  • Hao Chen; Bo Xiao; ZhiQing Lin

  • Venue:
  • CYBERC '11 Proceedings of the 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery
  • Year:
  • 2011

Abstract

Unknown word recognition is a key problem in Chinese information processing. Traditional algorithms for unknown word recognition fall broadly into two types: rule-based methods and statistical methods. Both have limitations in identifying unknown words coined on the Internet. Such words follow no obvious rules and are composed of common words, which limits rule-based methods; statistical methods are also limited because they rely on mutual information. This paper therefore proposes an unknown word recognition algorithm that is based on the bigram model and mines maximal cliques to identify unknown Internet words. Experimental results show that the algorithm achieves higher accuracy than traditional statistical methods based on the N-gram model.
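
The abstract does not say how the graph is built or how cliques are scored, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that text is already segmented into tokens, that an undirected edge links two tokens whose adjacent (bigram) co-occurrence count reaches a threshold, and that maximal cliques in this graph are treated as candidate unknown words. The names `bigram_graph`, `maximal_cliques`, and `min_count` are hypothetical, and the clique enumeration uses the basic Bron-Kerbosch algorithm without pivoting.

```python
from collections import Counter


def bigram_graph(sentences, min_count=2):
    """Link two tokens if they occur adjacently at least `min_count` times.

    `sentences` is an iterable of token lists. The threshold and the use of
    undirected edges are assumptions made for this sketch.
    """
    counts = Counter()
    for tokens in sentences:
        for a, b in zip(tokens, tokens[1:]):
            if a != b:  # skip self-pairs so every edge has two endpoints
                counts[frozenset((a, b))] += 1

    graph = {}
    for pair, n in counts.items():
        if n >= min_count:
            a, b = tuple(pair)
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already handled
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques
```

Calling `maximal_cliques(bigram_graph(sentences))` on a tokenized corpus returns a list of token sets. A real system would still need to rank or filter these candidates, a step the abstract does not describe.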