TectoMT: modular NLP framework

  • Authors:
  • Martin Popel;Zdeněk Žabokrtský

  • Affiliations:
  • Charles University in Prague, Institute of Formal and Applied Linguistics;Charles University in Prague, Institute of Formal and Applied Linguistics

  • Venue:
  • IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In the present paper we describe TectoMT, a multi-purpose open-source NLP framework. It allows for fast and efficient development of NLP applications by exploiting a wide range of software modules already integrated in TectoMT, such as tools for sentence segmentation, tokenization, morphological analysis, POS tagging, shallow and deep syntax parsing, named entity recognition, anaphora resolution, tree-to-tree translation, natural language generation, word-level alignment of parallel corpora, and other tasks. One of the most complex applications of TectoMT is the English-Czech machine translation system with transfer on deep syntactic (tectogrammatical) layer. Several modules are available also for other languages (German, Russian, Arabic).Where possible, modules are implemented in a language-independent way, so they can be reused in many applications.