Concept unification of terms in different languages for IR

  • Authors:
  • Qing Li;Sung-Hyon Myaeng;Yun Jin;Bo-yeong Kang

  • Affiliations:
  • Information & Communications University, Korea;Information & Communications University, Korea;Chungnam National University, Korea;Seoul National University, Korea

  • Venue:
  • ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
  • Year:
  • 2006

Abstract

For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Chinese and Korean. Although these English terms and their equivalents in the Asian languages refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and suggests a novel technique to solve it. Our method first extracts an English phrase from Asian-language Web pages, and then unifies the extracted phrase and its equivalent(s) in that language as one index unit. Experimental results show that the high precision of our conceptual unification approach greatly improves IR performance.
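
The idea of treating an English term and its Asian-language equivalents as one index unit can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not say how equivalents are found, so the sketch assumes a small hand-built table (EQUIVALENTS) and uses naive substring matching. All names in it (EQUIVALENTS, CONCEPT_VITERBI, extract_english_phrases, index_units, build_index) are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumption: a hand-built equivalence table. The paper derives equivalents
# automatically; that step is not reproduced here.
EQUIVALENTS = {
    "viterbi": "CONCEPT_VITERBI",   # English form
    "维特比": "CONCEPT_VITERBI",     # Chinese transliteration
    "비터비": "CONCEPT_VITERBI",     # Korean transliteration
}

# Runs of ASCII-letter words separated by single spaces.
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?: [A-Za-z]+)*")


def extract_english_phrases(text):
    """Return runs of ASCII-letter words found inside mixed-language text."""
    return ENGLISH_RUN.findall(text)


def index_units(text):
    """Map every known term in the text to its shared concept identifier."""
    lowered = text.lower()
    # Naive substring matching; a real system would tokenize first.
    return {concept for term, concept in EQUIVALENTS.items() if term in lowered}


def build_index(docs):
    """Build an inverted index keyed by concept rather than surface form."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for unit in index_units(text):
            index[unit].add(doc_id)
    return index


docs = {
    1: "Viterbi 算法用于解码",    # Chinese page using the English term
    2: "维特比算法的介绍",         # Chinese page using the transliteration
    3: "비터비 알고리즘 설명",      # Korean page using the transliteration
    4: "其他内容",                # unrelated page
}
index = build_index(docs)
```

With this toy collection, documents 1, 2 and 3 are all listed under the single key CONCEPT_VITERBI, so a query in any of the three surface forms reaches the same set of pages. A conventional index would split them across three unrelated terms.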