Turning web text and search queries into factual knowledge: hierarchical class attribute extraction

  • Author: Marius Pasca
  • Affiliation: Google Inc., Mountain View, California
  • Venue: AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
  • Year: 2008

Abstract

A seed-based framework for textual information extraction allows for weakly supervised acquisition of open-domain class attributes over conceptual hierarchies, from a combination of Web documents and query logs. Automatically extracted labeled classes, each consisting of a label (e.g., painkillers) and an associated set of instances (e.g., vicodin, oxycontin), are linked under existing conceptual hierarchies (e.g., brain disorders and skin diseases are linked under the concepts BrainDisorder and SkinDisease, respectively). Attributes extracted for the labeled classes are propagated upwards in the hierarchy, to determine the attributes of hierarchy concepts (e.g., Disease) from the attributes of their subconcepts (e.g., BrainDisorder and SkinDisease).
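
The upward propagation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how attribute scores from subconcepts are combined, so the averaging rule below is an assumption chosen for simplicity, not the paper's method. The function names (`propagate_attributes`, `ranked`), the attribute names, and the scores are all hypothetical; only the concept names Disease, BrainDisorder and SkinDisease come from the abstract.

```python
from collections import defaultdict


def propagate_attributes(children, own_attributes, root):
    """Propagate attribute scores from subconcepts up to their parents.

    children:       dict mapping a concept to a list of its subconcepts
    own_attributes: dict mapping a concept to {attribute: score} extracted
                    directly for the labeled classes linked under it
    root:           concept at which to start the traversal

    Assumption (not from the paper): a parent's score for an attribute is
    its own score plus the average of its subconcepts' scores. The
    hierarchy is assumed to be a tree (no cycles).
    """
    result = {}

    def visit(concept):
        scores = defaultdict(float)
        # Start from attributes attached directly to this concept, if any.
        for attribute, score in own_attributes.get(concept, {}).items():
            scores[attribute] += score
        subconcepts = children.get(concept, [])
        # Add each subconcept's propagated scores, averaged over subconcepts.
        for sub in subconcepts:
            for attribute, score in visit(sub).items():
                scores[attribute] += score / len(subconcepts)
        result[concept] = dict(scores)
        return result[concept]

    visit(root)
    return result


def ranked(scores):
    """Order attributes by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda attribute: (-scores[attribute], attribute))


# Hypothetical hierarchy fragment and made-up attribute scores.
children = {"Disease": ["BrainDisorder", "SkinDisease"]}
own_attributes = {
    "BrainDisorder": {"symptoms": 1.0, "treatment": 0.5},
    "SkinDisease": {"symptoms": 0.5, "causes": 0.5},
}

propagated = propagate_attributes(children, own_attributes, "Disease")
print(ranked(propagated["Disease"]))
```

Under this averaging assumption, an attribute shared by both subconcepts (here, symptoms) ranks above attributes seen under only one of them, which matches the intuition that a parent concept should inherit what its subconcepts have in common.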