The recent enormous increase in the use of networked information access and on-line databases has led to more databases being available in languages other than English. The Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts is involved in a variety of industrial, government, and digital library applications that need multilingual text retrieval. Most information retrieval research, however, has been evaluated using English databases and queries, and relatively little is known about how well advanced statistical techniques that incorporate ranking and term weighting perform in different languages.

We describe our experience with a range of projects involving text retrieval in Spanish, Japanese, and Chinese. The issues covered by these projects include document representation techniques such as morphology and segmentation, query formulation and expansion techniques, relevance feedback, and comparisons of retrieval effectiveness with English databases. The results indicate that advanced statistical techniques are effective in a wide range of languages, and that new languages can be incorporated with only moderate effort.