Type Extension Trees for feature construction and learning in relational domains

  • Authors:
  • Manfred Jaeger; Marco Lippi; Andrea Passerini; Paolo Frasconi

  • Affiliations:
  • Institut for Datalogi, Aalborg Universitet, Denmark; Dipartimento di Ingegneria dell'Informazione e Scienze Matematiche, Università degli Studi di Siena, Italy; Dipartimento di Ingegneria e Scienza dell'Informazione, Università degli Studi di Trento, Italy; Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Firenze, Italy

  • Venue:
  • Artificial Intelligence
  • Year:
  • 2013

Abstract

Type Extension Trees (TETs) are a powerful representation language for "count-of-count" features that characterize the combinatorial structure of neighborhoods of entities in relational domains. In this paper we present a learning algorithm for TETs that discovers informative count-of-count features in the supervised learning setting. Experiments on bibliographic data show that TET learning is able to discover the count-of-count feature underlying the definition of the h-index, as well as the inverse document frequency feature commonly used in information retrieval. We also introduce a metric on TET feature values, defined as a recursive application of the Wasserstein-Kantorovich metric. Experiments with a k-NN classifier show that exploiting the recursive count-of-count statistics encoded in TET values improves classification accuracy over alternative methods based on simple count statistics.
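To make the notion of a count-of-count feature concrete, the sketch below computes the h-index on a toy bibliography. It illustrates only the standard h-index definition (the largest h such that an author has at least h papers with at least h citations each); it is not the TET representation or the learning algorithm from the paper. The dictionaries `author_papers` and `cited_by`, and both function names, are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def h_index(citations_per_paper: List[int]) -> int:
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations_per_paper, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical toy data (not from the paper):
# author -> papers written, paper -> papers citing it.
author_papers: Dict[str, List[str]] = {
    "alice": ["p1", "p2", "p3", "p4", "p5"],
}
cited_by: Dict[str, List[str]] = {
    "p1": ["q1", "q2", "q3", "q4"],
    "p2": ["q1", "q5", "q6"],
    "p3": ["q2", "q7", "q8"],
    "p4": ["q9"],
    "p5": [],
}


def author_h_index(author: str) -> int:
    # Inner count: number of citations for each paper of the author.
    inner_counts = [len(cited_by.get(p, [])) for p in author_papers[author]]
    # Outer count: how many papers reach a given citation threshold.
    return h_index(inner_counts)


print(author_h_index("alice"))
```

The inner count (citations per paper) feeds an outer count (papers per citation level), which is the nested structure the abstract calls count-of-count. A flat statistic such as the total number of citations would lose this structure.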