In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. A cross-language information retrieval system must handle many languages, so automatic extraction of bilingual word pairs from parallel corpora in those languages is important. However, previous statistical methods are insufficient because they suffer from the sparse data problem. Our learning method automatically acquires rules that are effective against the sparse data problem, using only parallel corpora and requiring no previously prepared bilingual resource (e.g., a bilingual dictionary or a machine translation system). We call this learning method Inductive Chain Learning (ICL). Moreover, a system using ICL can extract bilingual word pairs even from bilingual sentence pairs in which the grammatical structure of the source language differs from that of the target language, because the acquired rules carry the information needed to cope with differing word orders in local parts of bilingual sentence pairs. Evaluation experiments demonstrated that ICL improved the recall of systems based on several statistical approaches.
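To make the sparse data problem concrete, the following minimal sketch shows a purely statistical baseline, not ICL itself. It assumes one common co-occurrence measure, the Dice coefficient, which the abstract does not name. The toy English-French corpus, the function name `extract_pairs`, and the thresholds `min_count` and `min_score` are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product

# Toy sentence-aligned corpus (hypothetical data, for illustration only).
CORPUS = [
    ("the dog runs", "le chien court"),
    ("the dog sleeps", "le chien dort"),
    ("the cat sleeps", "le chat dort"),
]


def extract_pairs(sentence_pairs, min_count=2, min_score=0.9):
    """Return word pairs whose Dice score and co-occurrence count pass the thresholds.

    Assumed baseline: Dice(s, t) = 2 * cooc(s, t) / (freq(s) + freq(t)),
    where all counts are numbers of sentence pairs containing the word(s).
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        # Every source word co-occurs with every target word of the same pair.
        pair_freq.update(product(src_words, tgt_words))

    extracted = set()
    for (s, t), n in pair_freq.items():
        score = 2 * n / (src_freq[s] + tgt_freq[t])
        # Pairs seen fewer than min_count times are treated as unreliable.
        if n >= min_count and score >= min_score:
            extracted.add((s, t))
    return extracted


if __name__ == "__main__":
    print(sorted(extract_pairs(CORPUS)))
```

In this toy setting the frequency threshold discards the correct pairs ("runs", "court") and ("cat", "chat") because each co-occurs only once. That loss of recall on low-frequency words is the kind of gap the abstract says the acquired rules are meant to address.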