A program for aligning sentences in bilingual corpora
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
Harvesting Wiki Consensus: Using Wikipedia Entries as Vocabulary for Knowledge Management
IEEE Internet Computing
Extracting parallel sentences from comparable corpora using document level alignment
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Cross language indexing and retrieval of the cypriot digital antiquities repository
Proceedings of the 2013 ACM symposium on Document engineering
Bilingual and multilingual dictionaries are necessary resources for machine translation and cross-language information retrieval. With the help of such dictionaries, an information retrieval system can find documents with similar content in different languages. Maintaining these dictionaries is itself an interesting research topic: researchers can collect multilingual parallel corpora from the Internet and use them to find translations of new words, so parallel corpora benefit both machine translation and cross-language information retrieval. Sentence alignment of parallel corpora is one way to mine this knowledge. In practice, however, many documents are available only as comparable corpora rather than as strictly parallel ones. We therefore introduce a technique for extracting parallel sentences from Wikipedia, treated as a multilingual comparable corpus.
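To make the idea of extracting parallel sentences from a comparable corpus concrete, here is a minimal sketch. It is not the method described in this paper, whose details the abstract does not give. It assumes a simple dictionary-overlap heuristic: a source sentence is paired with the target sentence that contains translations of the largest fraction of its words, and the pair is kept only if that fraction reaches a threshold. The names `overlap_score`, `extract_pairs`, the `threshold` value, and the toy English-German lexicon are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple


def overlap_score(src_tokens: List[str], tgt_tokens: List[str],
                  lexicon: Dict[str, Set[str]]) -> float:
    """Fraction of source tokens that have a lexicon translation in the target."""
    if not src_tokens:
        return 0.0
    tgt_set = set(tgt_tokens)
    hits = sum(1 for tok in src_tokens if lexicon.get(tok, set()) & tgt_set)
    return hits / len(src_tokens)


def extract_pairs(src_sents: List[str], tgt_sents: List[str],
                  lexicon: Dict[str, Set[str]],
                  threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Pair each source sentence with its best-scoring target sentence.

    Assumption: whitespace tokenization and lowercasing are good enough
    for the toy data; real text would need proper tokenization.
    """
    pairs: List[Tuple[str, str, float]] = []
    for src in src_sents:
        src_tokens = src.lower().split()
        best_score = 0.0
        best_tgt: Optional[str] = None
        for tgt in tgt_sents:
            score = overlap_score(src_tokens, tgt.lower().split(), lexicon)
            if score > best_score:
                best_score, best_tgt = score, tgt
        # Keep the pair only if enough source words are covered.
        if best_tgt is not None and best_score >= threshold:
            pairs.append((src, best_tgt, best_score))
    return pairs


# Hypothetical toy lexicon and sentences from two "comparable" articles.
toy_lexicon = {
    "the": {"das", "der", "die"},
    "house": {"haus"},
    "is": {"ist"},
    "red": {"rot"},
}
english = ["The house is red", "I like tea"]
german = ["Wikipedia ist eine Enzyklopädie", "Das Haus ist rot"]

for src, tgt, score in extract_pairs(english, german, toy_lexicon):
    print(f"{score:.2f}\t{src}\t{tgt}")
```

On the toy data only the first English sentence finds a partner, because the second has no lexicon coverage in either German sentence. The exhaustive comparison of every sentence pair is quadratic in the number of sentences, which is one reason approaches in this area often first align documents and only then compare sentences within aligned documents.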