Cross-lingual slot filling from comparable corpora

Authors:
Matthew Snover;Xiang Li;Wen-Pin Lin;Zheng Chen;Suzanne Tamang;Mingmin Ge;Adam Lee;Qi Li;Hao Li;Sam Anzaroot;Heng Ji
Affiliations:
City University of New York, New York, NY (all authors)
Venue:
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
Year:
2011

Abstract

This paper introduces a new task, cross-lingual slot filling, which aims to discover attributes for entity queries from cross-lingual comparable corpora and then present the answers in a desired language. It is a very challenging task that suffers from both information extraction and machine translation errors. We analyze the types of errors produced by five different baseline approaches and present a novel supervised, rescoring-based validation approach that incorporates global evidence from very large bilingual comparable corpora. Without using any additional labeled data, this new approach obtained a 38.5% relative improvement in precision and an 86.7% relative improvement in recall over several state-of-the-art approaches. The final system outperformed monolingual slot filling pipelines built on much larger monolingual corpora.