Aligning Multiword Terms Using a Hybrid Approach

Authors:
Arantza Casillas;Raquel Martínez
Affiliations:
-;-
Venue:
CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2002

Abstract

This paper presents an approach to aligning multiword terms in parallel corpora for language pairs that differ in important ways (e.g., structurally, lexically, morpho-syntactically). Our system, ALINTEC, implements a hybrid strategy that combines various kinds of linguistic knowledge (a corpus aligned at the sentence level, POS tagging, grammatical patterns, and a bilingual glossary) with quantitative criteria such as the frequency and distribution of terms in the corpus. The experiments were carried out on a parallel corpus consisting of a collection of administrative and legal documents in Spanish and Basque, a language pair representative of the context in which our work is framed. The results show that our approach performs reasonably well in aligning terms between typologically different languages such as Spanish and Basque.
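
To make the hybrid idea concrete, here is a minimal Python sketch of how POS patterns, co-occurrence over sentence-aligned text, and a bilingual glossary might be combined. It is not ALINTEC's implementation. The POS patterns, the use of the Dice coefficient, the glossary bonus, the frequency threshold, and the names `candidates` and `align_terms` are all assumptions made for illustration, and the same pattern set is applied to both languages for simplicity.

```python
from collections import Counter
from itertools import product

# Hypothetical POS patterns for multiword term candidates (assumed, not from the paper).
PATTERNS = {("NOUN", "ADJ"), ("NOUN", "NOUN"), ("NOUN", "ADP", "NOUN")}


def candidates(tagged_sentence, patterns=PATTERNS):
    """Return the set of lowercased word n-grams whose POS sequence matches a pattern."""
    found = set()
    for n in {len(p) for p in patterns}:
        for i in range(len(tagged_sentence) - n + 1):
            window = tagged_sentence[i:i + n]
            if tuple(tag for _, tag in window) in patterns:
                found.add(tuple(word.lower() for word, _ in window))
    return found


def align_terms(sentence_pairs, glossary=None, min_freq=2, bonus=0.1):
    """Pick, for each source candidate, the best-scoring target candidate.

    sentence_pairs: list of (source, target) sentences, each a list of (word, POS) tuples.
    glossary: optional dict mapping a source word to a target word (assumed format).
    The score is the Dice coefficient over aligned sentences, plus `bonus`
    when the glossary links a word of the source term to a word of the target term.
    """
    glossary = glossary or {}
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        s_cands, t_cands = candidates(src), candidates(tgt)
        src_freq.update(s_cands)   # number of sentences containing each source term
        tgt_freq.update(t_cands)   # number of sentences containing each target term
        joint.update(product(s_cands, t_cands))  # co-occurrence in aligned sentences

    best = {}
    for (s, t), both in joint.items():
        # Quantitative filter: ignore candidates that are too rare.
        if src_freq[s] < min_freq or tgt_freq[t] < min_freq:
            continue
        score = 2 * both / (src_freq[s] + tgt_freq[t])
        # Linguistic evidence: glossary support for any word pair.
        if any(glossary.get(w) in t for w in s):
            score += bonus
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best
```

Called with a list of POS-tagged, sentence-aligned pairs and an optional glossary, `align_terms` returns a dictionary from each source term to its best target term and score. A real system would need language-specific patterns and lemmatization, which matters especially for an agglutinative language such as Basque.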