We have aligned Japanese and English news articles and sentences to build a large parallel corpus. We first used a method based on cross-language information retrieval (CLIR) to align the Japanese and English articles, and then used a method based on dynamic programming (DP) matching to align the Japanese and English sentences within these articles. However, the results included many incorrect alignments. To remove these, we propose two measures (scores) that evaluate the validity of alignments. The measure for article alignment uses the similarities of sentences aligned by DP matching, and the measure for sentence alignment uses the similarities of articles aligned by CLIR. The two measures reinforce each other and improve the accuracy of alignment. Using these measures, we have successfully constructed a large-scale, publicly available corpus of aligned articles and sentences.
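The DP matching step and the idea of scoring an article pair from its aligned sentences can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the token-overlap similarity `jaccard`, the restriction to 1-to-1 matches and skips, the `skip_penalty` parameter, and the averaging used in `article_score`. The sketch also assumes both sides have already been mapped into a shared vocabulary (for example via a bilingual dictionary).

```python
from typing import Callable, List, Sequence, Tuple

Sentence = Sequence[str]


def jaccard(a: Sentence, b: Sentence) -> float:
    """Toy similarity: overlap of token sets (an assumption, not the paper's measure)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def dp_align(
    src: List[Sentence],
    tgt: List[Sentence],
    sim: Callable[[Sentence, Sentence], float] = jaccard,
    skip_penalty: float = 0.0,
) -> List[Tuple[int, int, float]]:
    """Monotonic DP alignment allowing 1-to-1 matches and skips on either side."""
    n, m = len(src), len(tgt)
    # best[i][j] is the best total score for aligning src[:i] with tgt[:j].
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((best[i - 1][j - 1] + sim(src[i - 1], tgt[j - 1]), "match"))
            if i > 0:
                candidates.append((best[i - 1][j] - skip_penalty, "skip_src"))
            if j > 0:
                candidates.append((best[i][j - 1] - skip_penalty, "skip_tgt"))
            best[i][j], back[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the end, keeping only matches with non-zero similarity.
    pairs: List[Tuple[int, int, float]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "match":
            score = sim(src[i - 1], tgt[j - 1])
            if score > 0.0:
                pairs.append((i - 1, j - 1, score))
            i, j = i - 1, j - 1
        elif move == "skip_src":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def article_score(pairs: List[Tuple[int, int, float]], n_src: int, n_tgt: int) -> float:
    """Hypothetical article-level measure: summed sentence similarity over the longer side."""
    longest = max(n_src, n_tgt)
    return sum(score for _, _, score in pairs) / longest if longest else 0.0


if __name__ == "__main__":
    src = [["tokyo", "stock", "rise"], ["weather", "rain"], ["prime", "minister", "visit"]]
    tgt = [["tokyo", "stock", "rise", "sharply"], ["prime", "minister", "visit", "china"]]
    alignment = dp_align(src, tgt)
    print(alignment)
    print(article_score(alignment, len(src), len(tgt)))
```

In this toy setup, an article pair whose sentences align poorly receives a low `article_score` and could be filtered out, which mirrors in spirit how the abstract describes using sentence-level similarities to validate article alignments.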