The World Wide Web is an important source of information, and search engines can retrieve large numbers of Web pages, including pages about foreign subjects. To be better understood by local readers, proper names in a foreign language such as English are often transliterated into a local language such as Chinese. Because translators differ and no translation standard exists, the same foreign proper noun may be given several different transliterations. This variation causes incomplete search results: a query that uses one transliteration fails to retrieve Web pages that use a different one, so important information may be missed. We present a framework for mining as many synonymous transliterations as possible from the Web for a given transliteration. The results can be used to construct a database of synonymous transliterations, which can then be used for query expansion to alleviate the incomplete search problem. Experimental results show that the proposed framework can effectively retrieve the set of snippets that may contain synonymous transliterations and then extract the target terms. Most of the extracted synonymous transliterations rank higher in similarity to the input transliteration than noise terms do.
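The query-expansion use of such a database can be illustrated with a minimal sketch. It is not the framework described above: the function names `build_synonym_index` and `expand_query`, the `OR` query syntax, and the sample group of Chinese transliterations of "Obama" are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Set


def build_synonym_index(groups: Iterable[Set[str]]) -> Dict[str, Set[str]]:
    """Map every transliteration to the full group of its synonyms.

    `groups` stands in for the mined database (hypothetical format):
    each set holds transliterations of the same foreign proper name.
    """
    index: Dict[str, Set[str]] = {}
    for group in groups:
        for term in group:
            index[term] = set(group)
    return index


def expand_query(term: str, index: Dict[str, Set[str]]) -> str:
    """Build an OR-query covering the term and its known synonyms.

    The original term comes first; unknown terms are returned unchanged
    (as a single quoted phrase).
    """
    variants = sorted(index.get(term, {term}))
    ordered = [term] + [v for v in variants if v != term]
    # Quote each variant so a search engine treats it as one phrase.
    return " OR ".join(f'"{v}"' for v in ordered)


# Illustrative data only: three written forms used for "Obama" in Chinese.
sample_groups = [{"奧巴馬", "奥巴马", "歐巴馬"}]
synonym_index = build_synonym_index(sample_groups)
expanded = expand_query("歐巴馬", synonym_index)
```

Submitting `expanded` instead of the single keyword lets one search cover pages that use any of the listed variants, which is the incomplete-search problem the database is meant to alleviate.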