Using N-grams to Process Hindi Queries with Transliteration Variations

  • Authors:
  • Anand Natrajan;Allison L. Powell;James C. French

  • Affiliations:
  • -;-;-

  • Venue:
  • Using N-grams to Process Hindi Queries with Transliteration Variations
  • Year:
  • 1997

Abstract

Retrieval systems based on N-grams have been used as alternatives to word-based systems. N-grams offer a language-independent technique that allows retrieval based on portions of words. A query that contains misspellings or differences in transliteration can defeat word-based systems. N-gram systems are more resistant to these problems. We present a retrieval system based on N-grams that uses a collection of Hindi songs. Within this retrieval system, we study the effect of varying N on retrievability. Additionally, we present an alternative spell-checking tool based on N-grams. We conclude with a discussion of the number of N-grams produced by different values of N for different languages and a discussion of the choice of N.
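
To illustrate why N-gram matching tolerates transliteration variation, the following is a minimal sketch and not the authors' system. The function names, the space padding at word boundaries, and the Dice coefficient as the similarity measure are all assumptions made for this example; the abstract does not say how the paper scores matches.

```python
def char_ngrams(word, n):
    """Return the set of character n-grams of `word`.

    Assumption: the word is lowercased and padded with one space on each
    side so that word-initial and word-final characters form n-grams too.
    """
    padded = " " + word.lower() + " "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient over the n-gram sets of two strings (assumed measure)."""
    grams_a = char_ngrams(a, n)
    grams_b = char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Two plausible romanizations of the same Hindi word: an exact word match
# fails, but most of the bigrams are shared.
query, indexed = "jindagi", "zindagi"
print(query == indexed)                       # False
print(dice_similarity(query, indexed, n=2))   # 0.75
```

In this sketch a word-based system sees two unrelated terms, while the bigram sets differ only in the two grams that contain the first letter. Raising `n` makes each differing character affect more grams, which is one way to see the trade-off in the choice of N that the abstract mentions.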