The use of corpora has become an important issue in information extraction (IE). In this chapter we consider a specific type of corpus, the bilingual parallel corpus, and ways of automatically extracting information from such corpora. This information, "linguistic metaknowledge", is essential for techniques used in IE, such as tokenization, POS tagging, and morphological analysis. To extract information from multilingual texts, we must rely on these linguistic resources being available in several languages. This chapter discusses locating and storing parallel texts, alignment at various levels (sentence, word, phrase), and extraction of bilingual vocabulary and terminology.