Open-source Portuguese–Spanish machine translation

Authors:
Carme Armentano-Oller; Rafael C. Carrasco; Antonio M. Corbí-Bellot; Mikel L. Forcada; Mireia Ginestí-Rosell; Sergio Ortiz-Rojas; Juan Antonio Pérez-Ortiz; Gema Ramírez-Sánchez; Felipe Sánchez-Martínez; Miriam A. Scalco
Affiliations:
Transducens Group, Departament de Llenguatges i Sistemes Informàtics, Universitat d'Alacant, Alacant, Spain (all authors)
Venue:
PROPOR'06 Proceedings of the 7th international conference on Computational Processing of the Portuguese Language
Year:
2006

Abstract

This paper describes the current status of development of an open-source shallow-transfer machine translation (MT) system for the [European] Portuguese $\leftrightarrow$ Spanish language pair, built with the OpenTrad Apertium MT toolbox (www.apertium.org). Apertium uses finite-state transducers for lexical processing, hidden Markov models for part-of-speech tagging, and finite-state-based chunking for structural transfer. It rests on a simple rationale: to produce fast, reasonably intelligible and easily correctable translations between related languages, it suffices to use an MT strategy that refines word-for-word MT with shallow parsing techniques. The paper briefly describes the MT engine, the formats it uses for linguistic data, and the compilers that convert these data into an efficient format used by the engine, and then describes in more detail the pilot Portuguese $\leftrightarrow$ Spanish linguistic data.
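The shallow-transfer idea summarised in the abstract can be illustrated with a minimal sketch. It is not Apertium code and does not use Apertium's data formats: the dictionaries, the function names (`analyse`, `disambiguate`, `transfer`, `generate`, `translate`) and the three-word vocabulary are all hypothetical, plain Python dictionaries stand in for the finite-state transducers, and a single hand-written context rule stands in for the hidden Markov model tagger. The sketch only shows how a local agreement rule refines a word-for-word translation when a noun changes gender between the two languages (Portuguese "árvore" is feminine, Spanish "árbol" is masculine).

```python
# Toy shallow-transfer pipeline (Portuguese -> Spanish). All data are illustrative
# assumptions; this is not Apertium's engine or its dictionary format.

# Morphological analysis: surface form -> possible (lemma, pos, gender) readings.
PT_ANALYSES = {
    "a": [("o", "det", "f"), ("a", "pr", None)],  # determiner or preposition
    "árvore": [("árvore", "n", "f")],
    "casa": [("casa", "n", "f")],
    "alta": [("alto", "adj", "f")],
}

# Bilingual dictionary: (pt lemma, pos) -> (es lemma, gender fixed by the es noun).
BILINGUAL = {
    ("o", "det"): ("el", None),
    ("a", "pr"): ("a", None),
    ("árvore", "n"): ("árbol", "m"),
    ("casa", "n"): ("casa", "f"),
    ("alto", "adj"): ("alto", None),
}

# Generation: (es lemma, pos, gender) -> surface form.
ES_FORMS = {
    ("el", "det", "m"): "el",
    ("el", "det", "f"): "la",
    ("a", "pr", None): "a",
    ("árbol", "n", "m"): "árbol",
    ("casa", "n", "f"): "casa",
    ("alto", "adj", "m"): "alto",
    ("alto", "adj", "f"): "alta",
}


def analyse(words):
    """Look up every reading of each word (stand-in for a lexical transducer)."""
    return [PT_ANALYSES[w] for w in words]


def disambiguate(lattice):
    """Stand-in for the tagger: choose the determiner reading only before a noun."""
    chosen = []
    for i, readings in enumerate(lattice):
        if len(readings) == 1:
            chosen.append(readings[0])
            continue
        next_is_noun = i + 1 < len(lattice) and any(r[1] == "n" for r in lattice[i + 1])
        wanted = "det" if next_is_noun else "pr"
        chosen.append(next((r for r in readings if r[1] == wanted), readings[0]))
    return chosen


def transfer(tagged):
    """Word-for-word lexical transfer, then one local agreement rule."""
    units = []
    for lemma, pos, gender in tagged:
        es_lemma, es_gender = BILINGUAL[(lemma, pos)]
        units.append([es_lemma, pos, es_gender if es_gender is not None else gender])
    # Structural rule: a determiner before, and an adjective after, a noun
    # take the gender of the target-language noun.
    for i, unit in enumerate(units):
        if unit[1] == "n":
            if i > 0 and units[i - 1][1] == "det":
                units[i - 1][2] = unit[2]
            if i + 1 < len(units) and units[i + 1][1] == "adj":
                units[i + 1][2] = unit[2]
    return [tuple(u) for u in units]


def generate(units):
    """Inflect each target lexical unit into its surface form."""
    return " ".join(ES_FORMS[u] for u in units)


def translate(sentence):
    return generate(transfer(disambiguate(analyse(sentence.lower().split()))))


print(translate("a árvore alta"))  # el árbol alto
```

Without the agreement rule the same input would come out as "la árbol alta", which is the kind of word-for-word error the shallow-transfer stage is meant to repair.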