Using statistical testing in the evaluation of retrieval experiments
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Using n-grams for Korean text retrieval
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Effective foreign word extration for Korean information retrieval
Information Processing and Management: an International Journal
An IR approach for translating new words from nonparallel, comparable texts
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Anchor text mining for translation of Web queries: A transitive translation approach
ACM Transactions on Information Systems (TOIS)
Translating unknown queries with web corpora for cross-language information retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Using the web for automated translation extraction in cross-language information retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Using mutual information to resolve query translation ambiguities and query term weighting
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Mining the Web for bilingual text
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Information retrieval system evaluation: effort, sensitivity, and reliability
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Mining translations of OOV terms from the web through cross-lingual query expansion
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Using the web as a bilingual dictionary
DMMT '01 Proceedings of the workshop on Data-driven methods in machine translation - Volume 14
Learning inter-related statistical query translation models for English-Chinese bi-directional CLIR
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Three
Hi-index | 0.00 |
For historical and cultural reasons, English phases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalences in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalences in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which can not be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.