Previous studies have reported that bigrams work well for many Asian languages, including Chinese, Korean, and Japanese. Most of these studies focused on newspaper text. We report an experiment on a very different genre, technical abstracts, and find that performance can be improved by combining short and long n-grams. Working with n-grams of all lengths is a sound approach because together they carry more information than bigrams alone.
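As a rough illustration of what combining short and long n-grams can mean at the indexing stage, the sketch below extracts every character n-gram within a length range from a string. It is not the authors' implementation: the function name `char_ngrams`, its parameters, and the sample string are assumptions made for illustration only.

```python
from collections import Counter


def char_ngrams(text, min_n=1, max_n=4):
    """Count every character n-gram of length min_n..max_n in text.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = Counter()
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the text.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Sample four-character string (an assumption, chosen only as an example).
sample = "情報検索"

# Bigram-only indexing terms.
bigrams_only = char_ngrams(sample, min_n=2, max_n=2)

# Indexing terms of every length from 1 to 4.
mixed = char_ngrams(sample, min_n=1, max_n=4)
```

Here `bigrams_only` holds the three overlapping bigrams, while `mixed` also includes the unigrams, trigrams, and the full four-character string, so a retrieval model can weight short and long matches together. Using every length is the simplest choice; a real system would likely cap `max_n` or prune rare n-grams to keep the index manageable.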