We devised a novel statistical technique for identifying the translation equivalents of source words obtained by transformation rule-based translation (TRT). The effectiveness of the technique, called frequency-based identification of translation equivalents (FITE), was tested using biological and medical cross-lingual spelling variants and out-of-vocabulary (OOV) words in Spanish-English and Finnish-English TRT. The results showed that, depending on the source language and frequency corpus, FITE-TRT (the identification of translation equivalents from TRT's translation set by means of the FITE technique) may achieve high translation recall. With the Web as the frequency corpus, translation recall was 89.2%–91.0% for Spanish-English FITE-TRT. For both language pairs, FITE-TRT achieved high translation precision: 95.0%–98.8%. The technique also reliably identified native source-language words, that is, source words that cannot be correctly translated by TRT. Dictionary-based cross-language information retrieval (CLIR) augmented with FITE-TRT performed substantially better than basic dictionary-based CLIR in which OOV keys were kept intact. FITE-TRT with Web document frequencies was the best technique among several fuzzy translation/matching approaches tested in cross-language retrieval experiments. We also discuss the application of FITE-TRT in the automatic construction of multilingual dictionaries.
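The abstract does not state the selection rule that FITE uses, so the following is only an illustrative sketch of the general idea of frequency-based candidate selection, not the authors' algorithm. It assumes that TRT yields a set of candidate target-language forms and that a frequency corpus supplies a document frequency for each form. The function name `pick_equivalent`, the `min_freq` threshold, and all words and counts are hypothetical.

```python
from typing import Mapping, Optional, Sequence


def pick_equivalent(
    candidates: Sequence[str],
    doc_freq: Mapping[str, int],
    min_freq: int = 1,
) -> Optional[str]:
    """Return the candidate with the highest document frequency, or None.

    None means no candidate reached min_freq. This sketch treats that as a
    hint that the source word may be a native word that TRT cannot translate
    (an assumption, not a rule taken from the paper).
    """
    best, best_freq = None, 0
    for cand in candidates:
        freq = doc_freq.get(cand, 0)  # unseen forms count as frequency 0
        if freq >= min_freq and freq > best_freq:
            best, best_freq = cand, freq
    return best


# Hypothetical TRT candidate set with invented document frequencies.
candidates = ["chemotherapy", "quimotherapy", "kimotherapy"]
doc_freq = {"chemotherapy": 120000, "quimotherapy": 3}

print(pick_equivalent(candidates, doc_freq, min_freq=10))
```

In this toy setup only one candidate clears the threshold, so it is returned. A source word whose candidates are all absent from the frequency corpus yields `None`, which is one simple way to flag words that should be left untranslated.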