Fuzzy translation of cross-lingual spelling variants
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Technical terms and proper names constitute a major problem in dictionary-based cross-language information retrieval (CLIR). However, technical terms and proper names in different languages often share the same Latin or Greek origin and are thus spelling variants of each other. In this paper we present a novel two-step fuzzy translation technique for cross-lingual spelling variants. In the first step, transformation rules are applied to source words to render them more similar to their target language equivalents. The rules are generated automatically using translation dictionaries as source data. In the second step, the intermediate forms obtained in the first step are translated into the target language using fuzzy matching. The effectiveness of the technique was evaluated empirically using five source languages and English as the target language. The two-step technique performed better, in some cases considerably better, than fuzzy matching alone. Even the first step used on its own showed promising results.
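
The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rules in `RULES` are hand-written and hypothetical (the paper generates its rules automatically from translation dictionaries), and the fuzzy matcher is assumed here to be a Dice coefficient over padded character bigrams, which the abstract does not specify. The function names and the sample lexicon are likewise made up for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical, hand-written rules for illustration only. The paper derives
# its transformation rules automatically from translation dictionaries.
RULES: List[Tuple[str, str]] = [
    ("logia", "logy"),  # e.g. "biologia" -> "biology"
    ("k", "c"),         # e.g. "elektron" -> "electron" (crude: replaces every k)
]


def apply_rules(word: str, rules: List[Tuple[str, str]]) -> str:
    """Step 1: rewrite a source word into an intermediate form."""
    for source_part, target_part in rules:
        word = word.replace(source_part, target_part)
    return word


def bigrams(word: str) -> Set[str]:
    """Return the set of character bigrams of a word padded with '#'."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient over bigram sets (assumed similarity measure)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def fuzzy_translate(source_word: str, lexicon: List[str],
                    rules: List[Tuple[str, str]], top_k: int = 3) -> List[str]:
    """Step 2: rank target-language words by similarity to the intermediate form."""
    intermediate = apply_rules(source_word.lower(), rules)
    ranked = sorted(lexicon, key=lambda target: dice(intermediate, target),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Made-up target lexicon for the demonstration.
    lexicon = ["biography", "geology", "biology", "electron", "election"]
    print(fuzzy_translate("biologia", lexicon, RULES))
    print(fuzzy_translate("elektron", lexicon, RULES))
```

In this sketch the rule step moves the source word closer to its target form before matching, so the matcher compares `biology` rather than `biologia` against the lexicon. Dropping the call to `apply_rules` gives the fuzzy-matching-only baseline that the abstract compares against.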