Applications of Multilingual Text Retrieval

  • Authors:
  • W. Bruce Croft; John Broglio; Hideo Fujii

  • Venue:
  • HICSS '96 Proceedings of the 29th Hawaii International Conference on System Sciences Volume 5: Digital Documents
  • Year:
  • 1996

Abstract

The recent enormous increase in the use of networked information access and on-line databases has led to more databases being available in languages other than English. The Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts is involved in a variety of industrial, government, and digital library applications which have a need for multilingual text retrieval. Most information retrieval research, however, has been evaluated using English databases and queries, and relatively little is known about how well advanced statistical techniques that incorporate ranking and term weighting perform in different languages. We describe our experience with a range of projects involving text retrieval in Spanish, Japanese and Chinese. The issues covered by these projects include document representation techniques such as morphology and segmentation, query formulation and expansion techniques, relevance feedback, and comparisons of retrieval effectiveness with English databases. The results indicate that advanced statistical techniques are effective in a wide range of languages, and that new languages can be incorporated with only moderate effort.