A weighting scheme for open information extraction

  • Authors: Yuval Merhav
  • Affiliations: Illinois Institute of Technology, Chicago, IL
  • Venue: NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop
  • Year: 2012

Abstract

We study the problem of extracting all possible relations among named entities from unstructured text, a task known as Open Information Extraction (Open IE). A state-of-the-art Open IE system uses natural language processing tools to identify entities and to extract the sentences that relate them, and then applies text clustering to identify the relations among co-occurring entity pairs. In particular, we study how the weighting scheme currently used for Open IE affects the clustering results, and we propose a term weighting scheme that significantly improves on the state of the art in relation extraction, both when used in conjunction with the standard tf·idf scheme and when used as a pruning filter.
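
To make the baseline concrete, the following is a minimal sketch of standard tf·idf weighting applied to the context terms of co-occurring entity pairs, followed by a cosine similarity that a clustering step could use. It illustrates only the standard scheme, not the weighting scheme the paper proposes, which the abstract does not describe. The function names (`tfidf_vectors`, `cosine`), the unsmoothed log-ratio form of idf, and the toy contexts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(contexts):
    """Build tf-idf vectors for entity pairs.

    contexts maps an entity pair to the list of context tokens taken from
    the sentences in which the two entities co-occur (hypothetical input).
    """
    n_pairs = len(contexts)
    # Document frequency: number of entity pairs whose context has the term.
    df = Counter()
    for tokens in contexts.values():
        df.update(set(tokens))
    vectors = {}
    for pair, tokens in contexts.items():
        tf = Counter(tokens)
        # Raw term frequency times log inverse document frequency.
        vectors[pair] = {
            term: count * math.log(n_pairs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data, invented for illustration only.
contexts = {
    ("Google", "YouTube"): ["acquired", "bought", "the", "company"],
    ("Oracle", "Sun"): ["acquired", "the", "company", "deal"],
    ("Obama", "Hawaii"): ["born", "in", "the"],
}

vectors = tfidf_vectors(contexts)
sim_acquisitions = cosine(vectors[("Google", "YouTube")], vectors[("Oracle", "Sun")])
sim_unrelated = cosine(vectors[("Google", "YouTube")], vectors[("Obama", "Hawaii")])
```

In this toy input, a term such as "the" that appears in every pair's context receives zero weight, so the two acquisition pairs end up more similar to each other than either is to the unrelated pair. A clustering algorithm run over such similarities would group entity pairs that plausibly share a relation.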