Harnessing NLP techniques in the processes of multilingual content management

  • Authors:
  • Anelia Belogay;Svetla Koeva;Adam Przepiórkowski;Dan Cristea;Diman Karagyozov;Cristina Vertan;Polivios Raxis

  • Affiliations:
  • Tetracom IS Ltd.;Institute for Bulgarian Language;Instytut Podstaw Informatyki Polskiej Akademii Nauk;Universitatea Alexandru Ioan Cuza;Tetracom IS Ltd.;Universitaet Hamburg;Atlantis Consulting SA

  • Venue:
  • EACL '12 Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 2012

Abstract

The emergence of the WWW as the main channel for distributing content opened the floodgates of information. The sheer volume and diversity of this content call for a new approach to analysing it. The quantitative route to processing information, which relies on content management tools, provides structural analysis. The challenge we address is to move from streamlining data to a level of understanding that assigns value to content. We present ATLAS, an open-source multilingual platform that incorporates human language technologies into multilingual web content management. It complements i-Publisher, a software-as-a-service content management component used for creating, running and managing dynamic content-driven websites, with a linguistic platform. The platform enriches the content of these websites with revealing details and reduces the manual work of classification editors by automatically categorising content. The platform ASSET supports six European languages. We expect ASSET to serve as a basis for future development of deep analysis tools capable of generating abstractive summaries and training models for decision-making systems.