Improving translation quality of rule-based machine translation

Authors:
Paisarn Charoenpornsawat;Virach Sornlertlamvanich;Thatsanee Charoenporn
Affiliations:
National Electronics and Computer Technology Center, Pathumthani, Thailand;National Electronics and Computer Technology Center, Pathumthani, Thailand;National Electronics and Computer Technology Center, Pathumthani, Thailand
Venue:
COLING-MTIA '02 Proceedings of the 2002 COLING workshop on Machine translation in Asia - Volume 16
Year:
2002

References (7):
A statistical approach to machine translation. Computational Linguistics
C4.5: programs for machine learning
A Winnow-Based Approach to Context-Sensitive Spelling Correction. Machine Learning - Special issue on natural language learning
Adaptive sentence boundary disambiguation. ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Example-Based Machine Translation in the Pangloss system. COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Automatic corpus-based Thai word extraction with the C4.5 learning algorithm. COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Adapting an example-based translation system to Chinese. HLT '01 Proceedings of the first international conference on Human language technology research

Cited by (2):
Reducing grammar errors for translated English sentences. ICIC'11 Proceedings of the 7th international conference on Intelligent Computing: bio-inspired computing and applications
Statistical machine translation enhancements through linguistic levels: A survey. ACM Computing Surveys (CSUR)

Abstract

This paper proposes machine learning techniques that help disambiguate word meaning. The methods consider the relationship between a word and its surroundings, referred to in the paper as context information. Context information, such as part-of-speech tags, semantic concepts, and case relations, is produced by the rule-based translation. To extract the context information automatically, we apply the machine learning algorithms C4.5, C4.5rule and RIPPER. We test the approach on ParSit, an interlingual-based machine translation system for English to Thai. For evaluation, the verb "to be" is selected because it is increasingly frequent and is difficult to translate into Thai using linguistic rules alone. The results show that the accuracies of C4.5, C4.5rule and RIPPER are 77.7%, 73.1% and 76.1% respectively, whereas ParSit alone achieves an accuracy of only 48%.
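
As a rough illustration of how a learner such as C4.5 could pick out useful context information, the sketch below computes the gain ratio, the split criterion C4.5 uses, over a handful of toy examples for the verb "to be". Everything in it is an assumption made for illustration: the feature names (next_pos, subj_concept), the label names, and the data are hypothetical and are not taken from the paper or from ParSit. It covers only split selection for categorical features, not tree construction, pruning, or rule extraction.

```python
from collections import Counter
from math import log2

# Hypothetical toy data: context features for one occurrence of "to be",
# paired with an invented label for the translation choice.
EXAMPLES = [
    ({"next_pos": "NOUN", "subj_concept": "person"}, "copula"),
    ({"next_pos": "NOUN", "subj_concept": "thing"}, "copula"),
    ({"next_pos": "ADJ", "subj_concept": "person"}, "omitted"),
    ({"next_pos": "ADJ", "subj_concept": "thing"}, "omitted"),
    ({"next_pos": "PREP", "subj_concept": "person"}, "locative"),
    ({"next_pos": "PREP", "subj_concept": "thing"}, "locative"),
]


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(examples, feature):
    """Information gain of splitting on `feature`, divided by split information."""
    labels = [label for _, label in examples]
    groups = {}
    for features, label in examples:
        groups.setdefault(features[feature], []).append(label)
    total = len(examples)
    # Expected entropy of the labels after partitioning on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises features with many distinct values.
    split_info = entropy([features[feature] for features, _ in examples])
    return gain / split_info if split_info > 0 else 0.0


def best_feature(examples):
    """Return the feature with the highest gain ratio."""
    return max(examples[0][0], key=lambda f: gain_ratio(examples, f))


print(best_feature(EXAMPLES))
```

On this toy data the part of speech of the following word separates the three labels perfectly, while the subject concept carries no information, so a learner of this kind would split on next_pos first. Real context information from a rule-based system would be noisier and would involve many more features.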