Automated feature generation from structured knowledge

  • Authors:
  • Weiwei Cheng;Gjergji Kasneci;Thore Graepel;David Stern;Ralf Herbrich

  • Affiliations:
  • University of Marburg, Marburg, Germany;Microsoft Research, Cambridge, United Kingdom;Microsoft Research, Cambridge, United Kingdom;Microsoft Research, Cambridge, United Kingdom;Microsoft Research, Cambridge, United Kingdom

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

The prediction accuracy of any learning algorithm highly depends on the quality of the selected features; but often, the task of feature construction and selection is tedious and nonscalable. In recent years, however, there have been numerous projects with the goal of constructing general-purpose or domain-specific knowledge bases with entity-relationship-entity triples extracted from various Web sources or collected from user communities, e.g. YAGO, DBpedia, Freebase, UMLS, etc. This paper advocates the simple and yet far-reaching idea that the structured knowledge contained in such knowledge bases can be exploited to automatically extract features for general learning tasks. We introduce an expressive graph-based language for extracting features from such knowledge bases and a theoretical framework for constructing feature vectors from the extracted features. Our experimental evaluation on different learning scenarios provides evidence that the features derived through our framework can considerably improve the prediction accuracy, especially when the labeled data at hand is sparse.