Mining transliterations from comparable or parallel text can enhance natural language processing applications such as machine translation and cross-language information retrieval. This paper presents an enhanced transliteration mining technique that uses a generative graph reinforcement model to infer mappings between source and target character sequences. An initial set of mappings is learned through automatic alignment of transliteration pairs at the character-sequence level. These mappings are then modeled as a bipartite graph, and a graph reinforcement algorithm enriches the graph by inferring additional mappings. During graph reinforcement, link reweighting is used to promote good mappings and demote bad ones. The technique is tested on mining transliterations from parallel Wikipedia titles in four alphabet-based language pairs, namely English-Arabic, English-Russian, English-Hindi, and English-Tamil. The improvements in F1-measure over the baseline system were 18.7, 1.0, 4.5, and 32.5 basis points for the four language pairs respectively. These results outperform the best reported results in the literature by 2.6, 4.8, 0.8, and 4.1 basis points for the four language pairs respectively.
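The idea of enriching a bipartite mapping graph can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the update rule, so the code below assumes that a new source-to-target mapping is inferred from every path source -> target -> source -> target in the graph, and that the path scores are combined with a noisy-or. The function names (`reverse_mappings`, `reinforce`), the toy weights, and the way backward weights are derived are all hypothetical, and the link reweighting step mentioned in the abstract is omitted.

```python
from collections import defaultdict


def reverse_mappings(forward):
    """Derive target->source weights by renormalizing forward weights per target.

    Assumption for this sketch only; an aligner would normally supply them.
    """
    backward = defaultdict(dict)
    for src, targets in forward.items():
        for tgt, weight in targets.items():
            backward[tgt][src] = weight
    for sources in backward.values():
        total = sum(sources.values())
        for src in sources:
            sources[src] /= total
    return dict(backward)


def reinforce(forward):
    """One reinforcement pass over a bipartite mapping graph.

    forward maps each source sequence to {target sequence: weight}.
    Every path s -> t_mid -> s_mid -> t contributes the product of its
    three edge weights; contributions are combined with a noisy-or
    (an assumed combination rule, not taken from the source).
    """
    backward = reverse_mappings(forward)
    # miss[s][t] accumulates the probability that no path supports s -> t.
    miss = defaultdict(lambda: defaultdict(lambda: 1.0))
    for s, targets in forward.items():
        for t_mid, w1 in targets.items():
            for s_mid, w2 in backward[t_mid].items():
                for t, w3 in forward[s_mid].items():
                    miss[s][t] *= 1.0 - w1 * w2 * w3
    return {s: {t: 1.0 - m for t, m in ts.items()} for s, ts in miss.items()}


if __name__ == "__main__":
    # Toy graph with placeholder labels: s1 only maps to t1, s2 maps to t1 and t2.
    toy = {"s1": {"t1": 1.0}, "s2": {"t1": 0.5, "t2": 0.5}}
    for source, scores in reinforce(toy).items():
        print(source, scores)
```

In the toy graph, `s1` has no direct link to `t2`, but both `s1` and `s2` map to `t1`, so the pass gives `s1 -> t2` a nonzero score. That is the kind of additional mapping the reinforcement step is meant to surface; in a real system the pass would be followed by reweighting and possibly repeated.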