Making MIRACLEs: Interactive translingual search for Cebuano and Hindi
ACM Transactions on Asian Language Information Processing (TALIP)
Retrieval systems based on N-grams have been used as alternatives to word-based systems. N-grams offer a language-independent technique that allows retrieval based on portions of words. A query that contains misspellings or differences in transliteration can defeat a word-based system, whereas N-gram systems are more resistant to these problems. We present an N-gram-based retrieval system that uses a collection of Hindi songs. Within this system, we study the effect of varying N on retrievability. Additionally, we present an alternative spell-checking tool based on N-grams. We conclude with a discussion of the number of N-grams produced by different values of N for different languages, and of the choice of N.
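The resistance of N-gram matching to transliteration differences can be illustrated with a minimal sketch. This is not the system described in the abstract: the function names `char_ngrams` and `dice_similarity`, the `_` boundary padding, and the use of the Dice coefficient as the overlap score are all assumptions chosen for illustration.

```python
def char_ngrams(text, n):
    """Return the set of character N-grams of `text`.

    The text is lowercased and padded with "_" on both sides so that
    word boundaries contribute N-grams too (an illustrative choice).
    """
    padded = "_" + text.lower() + "_"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient over the N-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Two transliterations of the same Hindi word: an exact word match fails,
# but most bigrams are shared.
print("zindagi" == "jindagi")                  # False
print(dice_similarity("zindagi", "jindagi"))   # 0.75

# The number of distinct N-grams a word yields shrinks as N grows.
for n in (2, 3, 4):
    print(n, len(char_ngrams("zindagi", n)))
```

In this sketch, larger values of N yield fewer N-grams per word, and a single differing character affects more of them, which is one way to see why the choice of N matters for retrievability.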