Improving Machine Translation Performance by Exploiting Non-Parallel Corpora
Computational Linguistics
wikiBABEL: community creation of multilingual data
WikiSym '08 Proceedings of the 4th International Symposium on Wikis
Language independent identification of parallel sentences using Wikipedia
Proceedings of the 20th international conference companion on World wide web
Multilingual schema matching for Wikipedia infoboxes
Proceedings of the VLDB Endowment
Analysis of discussion contributions in translated Wikipedia articles
Proceedings of the 4th international conference on Intercultural Collaboration
In this demo, we present WikiBABEL, a wiki-style platform that enables easy collaborative creation of multilingual content in many non-English Wikipedias by leveraging the relatively larger and more stable content of the English Wikipedia. The platform provides an intuitive user interface that keeps the user focused on creating multilingual Wikipedia content: search tools make related English source material easy to discover, and a set of linguistic and collaborative tools simplifies translating that content. We present two usage scenarios and discuss our experience in testing them with real users. Such an integrated content creation platform for Wikipedia may yield, as a by-product, parallel corpora that are critical for research on statistical machine translation systems in many of the world's languages.