A natural language processing infrastructure for Turkish

  • Authors:
  • A. C. Cem Say; Şeniz Demir; Özlem Çetinoğlu; Fatih Öğün

  • Affiliations:
  • Boğaziçi University, Bebek, Istanbul; Boğaziçi University, Bebek, Istanbul; Sabancı University, Tuzla, Istanbul; Boğaziçi University, Bebek, Istanbul

  • Venue:
  • COLING '04: Proceedings of the 20th International Conference on Computational Linguistics
  • Year:
  • 2004

Abstract

We built an open-source software platform intended to serve as a common infrastructure for developing new applications that process Turkish. The platform incorporates a lexicon, a morphological analyzer/generator, a DCG parser/generator that translates Turkish sentences to predicate logic formulas, and a knowledge base framework. Several developers have already used the platform for a variety of applications, including conversation programs and an artificial personal assistant, tools for automatic analysis of rhyme and meter in Turkish folk poems, a prototype sentence-level translator between Albanian, Turkish, and English, natural language interfaces for generating SQL queries and Java code, as well as a text tagger used for collecting statistics about Turkish morpheme order for a speech recognition algorithm. The results indicate that the infrastructure adapts to different kinds of applications and that it facilitates improvements and modifications.
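
To make the pipeline in the abstract more concrete, the following is a minimal Python sketch of how a lexicon lookup, a morphological analysis step, and a translation to a predicate logic formula might fit together. It is not the platform's code or API: the lexicon entries, the suffix table, the function names `analyze` and `to_logic`, and the output notation are all hypothetical, and a single fixed subject-object-verb pattern stands in for the DCG grammar the abstract mentions.

```python
# Toy lexicon: stem -> (category, logical constant). Entries are illustrative only.
LEXICON = {
    "ali": ("noun", "ali"),
    "kitap": ("noun", "book"),
    "oku": ("verb", "read"),
}

# A handful of suffixes and the feature each one contributes.
# Vowel-harmony variants are simply listed explicitly (an assumption of this sketch).
SUFFIXES = {
    "noun": {"ı": "acc", "i": "acc", "u": "acc", "ü": "acc"},
    "verb": {"dı": "past", "di": "past", "du": "past", "dü": "past"},
}

# Stem-final consonant softening (e.g. kitap -> kitab-ı), undone during lookup.
SOFTENING = {"b": "p", "c": "ç", "d": "t", "ğ": "k"}


def analyze(word):
    """Return every (stem, category, feature) analysis the toy tables allow."""
    word = word.lower()  # note: str.lower() is not Turkish-aware for I/ı
    analyses = []
    if word in LEXICON:
        analyses.append((word, LEXICON[word][0], None))
    for category, suffixes in SUFFIXES.items():
        for suffix, feature in suffixes.items():
            if not word.endswith(suffix) or len(word) <= len(suffix):
                continue
            stem = word[: -len(suffix)]
            candidates = [stem]
            if stem[-1] in SOFTENING:
                candidates.append(stem[:-1] + SOFTENING[stem[-1]])
            for candidate in candidates:
                if candidate in LEXICON and LEXICON[candidate][0] == category:
                    analyses.append((candidate, category, feature))
    return analyses


def to_logic(sentence):
    """Map a three-word subject-object-verb sentence to a formula string."""
    words = sentence.split()
    if len(words) != 3:
        raise ValueError("this sketch only handles three-word sentences")
    subjects, objects, verbs = (analyze(w) for w in words)
    for s_stem, s_cat, s_feat in subjects:
        for o_stem, o_cat, o_feat in objects:
            for v_stem, v_cat, v_feat in verbs:
                if (
                    (s_cat, s_feat) == ("noun", None)
                    and (o_cat, o_feat) == ("noun", "acc")
                    and (v_cat, v_feat) == ("verb", "past")
                ):
                    predicate = LEXICON[v_stem][1]
                    subject = LEXICON[s_stem][1]
                    obj = LEXICON[o_stem][1]
                    return f"past({predicate}({subject}, {obj}))"
    raise ValueError("no analysis matches the subject-object-verb pattern")


print(analyze("kitabı"))
print(to_logic("Ali kitabı okudu"))
```

Under these assumptions, "Ali kitabı okudu" ("Ali read the book") is analyzed word by word and then mapped to a single formula. A real grammar would have to cover free word order, suffix sequences, and ambiguity, none of which this sketch attempts.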