Lexicon-based orthographic disambiguation in CJK intelligent information retrieval

  • Authors: Jack Halpern
  • Affiliations: The CJK Dictionary Institute, Saitama, Japan
  • Venue: COLING '02 Proceedings of the 3rd workshop on Asian language resources and international standardization - Volume 12
  • Year: 2002

Abstract

The orthographic complexity of Chinese, Japanese and Korean (CJK) poses a special challenge to developers of computational linguistic tools, especially in the area of intelligent information retrieval. The challenge is compounded by the lack of a standardized orthography in these languages, most notably in the highly irregular Japanese orthography. This paper focuses on the typology of CJK orthographic variation, provides a brief analysis of the linguistic issues, and discusses why lexical databases should play a central role in the disambiguation process.
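
To make the idea of lexicon-based disambiguation concrete, the following is a minimal sketch in Python, not the paper's actual method or data. It assumes a tiny hand-built lexicon (the name `VARIANT_LEXICON`, the helper functions, and the entries are all hypothetical) in which each canonical headword lists its known orthographic variants, such as okurigana and kana/kanji alternations in Japanese. A search term is normalized to its canonical form, or expanded to all of its variants, by table lookup rather than by rule.

```python
# Hypothetical toy lexicon: canonical headword -> known orthographic variants.
# The entries are illustrative only and are not taken from the paper's database.
VARIANT_LEXICON = {
    "取り扱い": ["取り扱い", "取扱い", "取扱", "とりあつかい"],  # okurigana / kana variants
    "猫": ["猫", "ねこ", "ネコ"],                              # kanji / hiragana / katakana
}


def build_index(lexicon):
    """Invert the lexicon so that every variant points to its canonical form."""
    index = {}
    for canonical, variants in lexicon.items():
        for variant in variants:
            index[variant] = canonical
    return index


VARIANT_INDEX = build_index(VARIANT_LEXICON)


def normalize(term):
    """Return the canonical form of a term, or the term itself if unknown."""
    return VARIANT_INDEX.get(term, term)


def expand_query(term):
    """Return every known variant of a term, for use in query expansion."""
    canonical = normalize(term)
    return sorted(set(VARIANT_LEXICON.get(canonical, [term])))


if __name__ == "__main__":
    print(normalize("取扱"))      # canonical form of an okurigana variant
    print(expand_query("ねこ"))   # all variants that a search should also match
```

This sketch assumes each variant belongs to exactly one headword. Real CJK text contains homographs and context-dependent forms, so a production lexicon would need to map one surface form to several candidates and use context to choose among them.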