Transliteration system using pair HMM with weighted FSTs

Authors:
Peter Nabende
Affiliations:
University of Groningen, Netherlands
Venue:
NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
Year:
2009

Citing 5

Machine transliteration
Computational Linguistics

A generic framework for machine transliteration
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval

Inducing sound segment differences using Pair Hidden Markov Models
SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology

Whitepaper of NEWS 2009 machine transliteration shared task
NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration

Computing word similarity and identifying cognates with pair hidden Markov models
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning

Cited 2

Mining transliterations from Wikipedia using pair HMMs
NEWS '10 Proceedings of the 2010 Named Entities Workshop

A method for generating rules for cross-lingual transliteration
Automatic Documentation and Mathematical Linguistics

Abstract

This paper presents a transliteration system based on pair Hidden Markov Model (pair HMM) training and Weighted Finite State Transducer (WFST) techniques. The parameters that the WFSTs use for transliteration generation are learned from a pair HMM. Parameters from pair HMM training on English-Russian data sets are found to give better transliteration quality than parameters trained for WFSTs with corresponding structures. Training a pair HMM on English vowel bigrams and standard bigrams for Cyrillic Romanization and applying a few transformation rules that test for context to the generated Russian transliterations improve the system's transliteration quality.
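
As a rough illustration of the idea in the abstract, the sketch below scores a source/target string pair with the forward algorithm of a three-state pair HMM (a match state M and two gap states X and Y) and then turns a probability into a WFST arc weight. Everything specific here is an assumption made for the example and is not taken from the paper: the function names `forward_score` and `to_wfst_weight`, the transition values `delta`, `eps` and `tau`, the toy emission table `TOY_PAIRS` (illustrative numbers, not a normalized distribution), and the negative-log weight convention.

```python
import math

def forward_score(src, tgt, p_pair, p_src, p_tgt, delta=0.1, eps=0.2, tau=0.1):
    """Total probability of (src, tgt) under a three-state pair HMM.

    M emits an aligned symbol pair, X emits a source symbol only,
    Y emits a target symbol only. delta = gap open, eps = gap extend,
    tau = transition to the end state (all hypothetical values).
    """
    n, m = len(src), len(tgt)
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0  # the begin state behaves like M

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                fM[i][j] = p_pair(src[i - 1], tgt[j - 1]) * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1])
                )
            if i > 0:
                fX[i][j] = p_src(src[i - 1]) * (
                    delta * fM[i - 1][j] + eps * fX[i - 1][j]
                )
            if j > 0:
                fY[i][j] = p_tgt(tgt[j - 1]) * (
                    delta * fM[i][j - 1] + eps * fY[i][j - 1]
                )
    return tau * (fM[n][m] + fX[n][m] + fY[n][m])

def to_wfst_weight(prob):
    """Map a probability to a negative-log arc weight (assumed convention)."""
    return -math.log(prob)

# Toy English-Russian emission table; values are illustrative only.
TOY_PAIRS = {("o", "о"): 0.9, ("m", "м"): 0.9, ("s", "с"): 0.9, ("k", "к"): 0.9}
p_pair = lambda a, b: TOY_PAIRS.get((a, b), 0.001)  # small floor for unseen pairs
p_src = lambda a: 0.05
p_tgt = lambda b: 0.05

good = forward_score("omsk", "омск", p_pair, p_src, p_tgt)
bad = forward_score("omsk", "ткал", p_pair, p_src, p_tgt)
print(good > bad)
```

In a setup like this, each trained emission probability could label one transducer arc through `to_wfst_weight`, so that the lowest-weight path through the transducer corresponds to the most probable character mapping.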