Key phrases are usually among the most information-bearing linguistic structures, so translating them correctly can improve many natural language processing applications. We propose a new framework for mining key phrase translations from web corpora. We first submit a source phrase to a search engine as a query, then expand the query by adding the translations of topic-relevant hint words found in the returned snippets. Using the expanded queries, we retrieve mixed-language web pages. Finally, we extract the key phrase translation from the snippets returned in this second round, using phonetic, semantic, and frequency-distance features. We achieve 46% phrase translation accuracy when using the top 10 returned snippets, and 80% accuracy with 165 snippets. Both results are significantly better than those of several existing methods.
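The two-round procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `search` callable, the `hint_dict` bilingual hint-word dictionary, and the scoring formula are all hypothetical stand-ins. The sketch assumes an English source phrase and a target language written in a non-ASCII script, and it scores candidates with a simple frequency-distance heuristic only; the phonetic and semantic features mentioned in the abstract are omitted.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Optional

# Hypothetical search interface: takes a query string, returns text snippets.
Search = Callable[[str], List[str]]


def expand_query(phrase: str, snippets: List[str],
                 hint_dict: Dict[str, str], max_hints: int = 2) -> str:
    """Append translations of the most frequent hint words in the snippets."""
    phrase_words = set(phrase.lower().split())
    counts = Counter(
        word
        for snippet in snippets
        for word in re.findall(r"[a-z]+", snippet.lower())
        if word in hint_dict and word not in phrase_words
    )
    hints = [hint_dict[word] for word, _ in counts.most_common(max_hints)]
    return " ".join([phrase] + hints)


def score_candidates(phrase: str, snippets: List[str],
                     exclude=frozenset()) -> Counter:
    """Assumed frequency-distance score: each co-occurrence of a candidate
    with the source phrase adds 1 / (1 + character distance)."""
    scores = Counter()
    for snippet in snippets:
        pos = snippet.lower().find(phrase.lower())
        if pos < 0:
            continue  # snippet does not mention the source phrase
        end = pos + len(phrase)
        # Candidates: maximal runs of non-ASCII characters (assumption).
        for match in re.finditer(r"[^\x00-\x7f]+", snippet):
            if match.group() in exclude:
                continue  # skip the hint translations we added ourselves
            if match.start() >= end:
                distance = match.start() - end
            else:
                distance = max(0, pos - match.end())
            scores[match.group()] += 1.0 / (1 + distance)
    return scores


def mine_translation(phrase: str, search: Search,
                     hint_dict: Dict[str, str]) -> Optional[str]:
    """Round 1: plain query. Round 2: expanded query, then extraction."""
    first_round = search(phrase)
    expanded = expand_query(phrase, first_round, hint_dict)
    second_round = search(expanded)
    scores = score_candidates(phrase, second_round,
                              exclude=frozenset(hint_dict.values()))
    return max(scores, key=scores.get) if scores else None
```

In practice `search` would wrap a real search engine and the scorer would combine several features; the point here is only the order of the steps: query, expand with translated hint words, re-query, then rank candidates that appear near the source phrase.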