Assessing sparse information extraction using semantic contexts

  • Authors:
  • Peipei Li;Haixun Wang;Hongsong Li;Xindong Wu

  • Affiliations:
  • Hefei University of Technology, Hefei city, China;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China;University of Vermont, Vermont, USA

  • Venue:
  • Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
  • Year:
  • 2013

Abstract

One important assumption of information extraction is that extractions occurring more frequently are more likely to be correct. Sparse information extraction is challenging because, no matter how large a corpus is, some extractions are supported by only a small amount of evidence in it. A pioneering work known as REALM learns HMMs to model the context of a semantic relationship in order to assess such extractions. This is quite costly, and the semantics it reveals about the context are not explicit. In this work, we introduce a lightweight, explicit semantic approach for sparse information extraction. We use a large semantic network consisting of millions of concepts, entities, and attributes to explicitly model the context of semantic relationships. Experiments show that our approach improves the F-score of extraction by at least 11.2% over state-of-the-art HMM-based approaches while also being more efficient.