This paper reports on experiments in creating a bilingual Textual Entailment corpus with a non-expert workforce under strict cost and time limits ($100, 10 days). To this end, workers were hired for translation and validation tasks through the CrowdFlower channel to Amazon Mechanical Turk. The result is an accurate and reliable corpus of 426 English/Spanish entailment pairs, produced more cost-effectively than with other crowdsourcing-based methods for acquiring translations. Focusing on two orthogonal dimensions (i.e., the reliability of annotations made by non-experts and the overall cost of corpus creation), we summarize the methodology we adopted, the results achieved, the main problems encountered, and the lessons learned.