Comparison of word-based and syllable-based retrieval for Tibetan (poster session)

  • Authors:
  • Paul G. Hackett; Douglas W. Oard

  • Affiliations:
  • College of Information Studies, University of Maryland, College Park, MD; College of Information Studies, University of Maryland, College Park, MD

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

Tibetan retrieval based on automatically segmented words is compared with the use of overlapping syllable n-grams using a known-item retrieval evaluation. The optimal span of fixed-length n-grams is found to be 2 syllables, and indexing words is found to be as effective as indexing syllable bigrams.
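
As a rough illustration of what "overlapping syllable n-grams" might look like in practice, the sketch below splits Tibetan text into syllables and emits overlapping n-grams, with n = 2 as the default. It is not the authors' implementation. It assumes syllables are delimited by the tsheg (U+0F0B) and that the shad (U+0F0D) and whitespace also end a syllable; the function names `syllables` and `syllable_ngrams` are hypothetical.

```python
TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (assumed syllable delimiter)
SHAD = "\u0f0d"   # Tibetan clause/sentence mark (assumed to end a syllable too)


def syllables(text):
    """Split text into syllables on tsheg, shad, and whitespace."""
    for ch in (SHAD, " ", "\n", "\t"):
        text = text.replace(ch, TSHEG)
    # Drop the empty strings produced by adjacent or trailing delimiters.
    return [s for s in text.split(TSHEG) if s]


def syllable_ngrams(text, n=2):
    """Return overlapping n-grams of syllables, joined by tsheg.

    Text shorter than n syllables yields a single shorter term, so that
    very short strings still produce something to index (a choice made
    for this sketch only).
    """
    syls = syllables(text)
    if not syls:
        return []
    if len(syls) < n:
        return [TSHEG.join(syls)]
    return [TSHEG.join(syls[i:i + n]) for i in range(len(syls) - n + 1)]


if __name__ == "__main__":
    sample = "བོད་ཀྱི་སྐད་ཡིག།"  # four syllables followed by a shad
    for term in syllable_ngrams(sample, n=2):
        print(term)
```

With n = 2, a string of k syllables yields k - 1 overlapping terms, so each syllable other than the first and last appears in two index terms. A word-based index would instead need a segmenter to decide where word boundaries fall, which is the alternative the abstract compares against.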