Multilingual extraction and mapping of dictionary entry names in business schema integration

  • Authors:
  • Michael Dietrich;Dirk Weissmann;Jörg Rech;Gunther Stuhec

  • Affiliations:
  • SAP Research, Karlsruhe, Germany;SAP AG, Walldorf, Germany;SAP Research, Karlsruhe, Germany;SAP AG, Walldorf, Germany

  • Venue:
  • Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2010

Abstract

Natural language processing (NLP) has been a research field for many years and has recently gained considerable attention due to the quality of translation software such as Google Translator or Babylon. In the context of the iGreen project, we are developing a schema integration service (working title: Warp 10) that helps to generate business transformations from individual business schemas. Natural language processing plays an important role in matching the different schemas. Information coming from user input or files is extracted and mapped to a canonical data model (CDM) based on the CCTS standard. In this paper, we illustrate the use of NLP for the extraction of dictionary entry names (DENs) and indicate some of the problems of NLP and term extraction. Furthermore, we describe the NLP-supported mapping of DENs and outline the problems and approaches in a multilingual setting.
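To make the idea of extracting a dictionary entry name from a schema field more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common CCTS-style DEN layout "Object Class Term. Property Term. Representation Term"; the function names `tokenize` and `to_den`, the default representation term, and the tiny German-to-English lexicon are all hypothetical and stand in for the NLP resources the abstract alludes to.

```python
import re

# Hypothetical toy lexicon for illustration only; a real service would rely
# on proper NLP resources rather than a hand-written dictionary.
GERMAN_TO_ENGLISH = {"kunde": "customer", "strasse": "street"}


def tokenize(field_name):
    """Split a schema field name on camelCase boundaries, underscores,
    hyphens and whitespace, returning lowercase tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", field_name)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def to_den(object_class, field_name, representation="Text"):
    """Assemble a DEN-like string (assumed layout:
    'Object Class. Property Term. Representation Term')."""
    # Map known German tokens to English; leave unknown tokens unchanged.
    terms = [GERMAN_TO_ENGLISH.get(t, t) for t in tokenize(field_name)]
    property_term = " ".join(t.capitalize() for t in terms)
    return f"{object_class}. {property_term}. {representation}"


print(to_den("Address", "streetName"))
print(to_den("Order", "Kunde_Name"))
```

The lookup step is where a multilingual setting becomes difficult in practice: compounds, inflection, and ambiguous terms are not handled by a plain dictionary, which is one reason NLP support is needed.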