A Data Structure Using Hashing and Tries For Efficient Chinese Lexical Access

  • Authors:
  • Yat-Kin LAM; Qiang HUO

  • Affiliations:
  • University of Hong Kong; University of Hong Kong

  • Venue:
  • ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
  • Year:
  • 2005

Abstract

A lexicon is needed in many applications. In the past, different structures such as tries, hash tables, and their variants have been investigated for lexicon organization and lexical access. In this paper, we propose a new data structure that combines a hash table with tries for storing a Chinese lexicon. The data structure supports efficient lexical access yet requires less memory than a trie lexicon. Experiments are conducted to evaluate its performance on in-vocabulary lexical access, out-of-vocabulary word rejection, and substring matching. The results confirm the effectiveness of the proposed approach.
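The abstract does not say how the hash table and the tries are combined, so the following is only an illustrative sketch under one plausible assumption: the first character of a word is looked up in a hash table, and the remaining characters are stored in a small trie attached to that entry. The names `TrieNode`, `HashTrieLexicon`, `contains`, and `matches_at` are hypothetical and do not come from the paper. The sketch covers the three operations the abstract evaluates: in-vocabulary lookup, out-of-vocabulary rejection, and substring matching.

```python
class TrieNode:
    """One node of a per-first-character trie (hypothetical layout)."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.is_word = False  # True if the path ending here spells a word


class HashTrieLexicon:
    """Assumed hybrid: hash table on the first character, trie on the rest.

    This is not the paper's actual design, only one way to combine the two.
    """

    def __init__(self, words=()):
        # first character -> root TrieNode for the remaining characters
        self.heads = {}
        for word in words:
            self.add(word)

    def add(self, word):
        if not word:
            return
        node = self.heads.setdefault(word[0], TrieNode())
        for ch in word[1:]:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        """In-vocabulary lookup; returns False for out-of-vocabulary words."""
        if not word:
            return False
        node = self.heads.get(word[0])
        for ch in word[1:]:
            if node is None:
                return False
            node = node.children.get(ch)
        return node is not None and node.is_word

    def matches_at(self, text, start):
        """Substring matching: every lexicon word that begins at text[start]."""
        found = []
        if start >= len(text):
            return found
        node = self.heads.get(text[start])
        end = start + 1
        while node is not None:
            if node.is_word:
                found.append(text[start:end])
            if end >= len(text):
                break
            node = node.children.get(text[end])
            end += 1
        return found
```

With a lexicon built from "中", "中国", "中国人", and "人民", `contains("中国")` succeeds, `contains("中国人民")` is rejected as out-of-vocabulary, and `matches_at("中国人民", 0)` returns the three words that start at the first character. In this sketch, hashing the first character replaces a wide root fan-out with a single table probe; whether that is the source of the memory saving the abstract reports is not stated there.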