Identifying the Truth: Aggregation of Named Entity Extraction Results

  • Authors:
  • Katja Pfeifer;Johannes Meinecke

  • Affiliations:
  • SAP AG, Chemnitzer Str. 48, 01187 Dresden, Germany;SAP AG, Chemnitzer Str. 48, 01187 Dresden, Germany

  • Venue:
  • Proceedings of International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2013

Abstract

Huge amounts of textual information relevant to market analysis, trend detection, or product monitoring can be found on the Web. To exploit that knowledge, a number of extraction services have been proposed that extract and categorize entities from a given text. Prior work has shown that combining individual extractors can increase quality. However, no existing system is fully applicable to real-world extraction services, which differ substantially in the entity types they extract and the schemata they use. In this paper, we propose an aggregation system and a corresponding aggregation process that can be used for these services. We present a number of novel aggregation techniques that incorporate schema information as well as characteristics specific to entity extraction into the aggregation process. The aggregation system is evaluated extensively on six real-world named entity recognition services and compared to state-of-the-art approaches.
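
To make the problem setting concrete, the following is a minimal sketch of a simple baseline: majority voting over extractor outputs after mapping each service's type labels onto a shared schema. It is not the aggregation technique proposed in the paper, whose details the abstract does not give. The service names, type labels, `TYPE_MAPPING`, `aggregate`, and the `min_votes` parameter are all hypothetical, and the sketch assumes that extractors report identical character offsets for the same entity.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from each service's own type labels to a shared schema.
# Real services use different and much larger taxonomies.
TYPE_MAPPING = {
    "service_a": {"Person": "PERSON", "Company": "ORGANIZATION", "City": "LOCATION"},
    "service_b": {"PER": "PERSON", "ORG": "ORGANIZATION", "LOC": "LOCATION"},
    "service_c": {"people": "PERSON", "organisation": "ORGANIZATION"},
}


def aggregate(annotations, min_votes=2):
    """Combine extractor outputs by majority vote on exactly matching spans.

    annotations: dict mapping a service name to a list of
                 (start, end, local_type) tuples.
    Returns a sorted list of (start, end, shared_type) tuples for spans
    whose most frequent shared type received at least min_votes votes.
    """
    votes = defaultdict(Counter)
    for service, spans in annotations.items():
        mapping = TYPE_MAPPING.get(service, {})
        for start, end, local_type in spans:
            shared_type = mapping.get(local_type)
            if shared_type is None:
                continue  # label has no counterpart in the shared schema
            votes[(start, end)][shared_type] += 1

    result = []
    for (start, end), counter in sorted(votes.items()):
        shared_type, count = counter.most_common(1)[0]
        if count >= min_votes:
            result.append((start, end, shared_type))
    return result


# Offsets refer to the text "Angela Merkel visited SAP in Dresden."
annotations = {
    "service_a": [(0, 13, "Person"), (22, 25, "Company"), (29, 36, "City")],
    "service_b": [(0, 13, "PER"), (22, 25, "ORG")],
    "service_c": [(0, 13, "people"), (22, 25, "people")],
}
print(aggregate(annotations))
```

In this example the span for "Dresden" is dropped because only one service reports it. A baseline of this kind treats all services equally and ignores which entity types each service can extract at all, which is the sort of schema information the abstract says the proposed techniques take into account.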