Hindi Urdu machine transliteration using finite-state transducers

  • Authors:
  • M. G. Abbas Malik;Christian Boitet;Pushpak Bhattacharyya

  • Affiliations:
  • Université Joseph Fourier, France;Université Joseph Fourier, France;IIT Bombay, India

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Finite-state Transducers (FST) can be very efficient to implement inter-dialectal transliteration. We illustrate this on the Hindi and Urdu language pair. FSTs can also be used for translation between surface-close languages. We introduce UIT (universal intermediate transcription) for the same pair on the basis of their common phonetic repository in such a way that it can be extended to other languages like Arabic, Chinese, English, French, etc. We describe a transliteration model based on FST and UIT, and evaluate it on Hindi and Urdu corpora.