Demand-Driven Construction of Structural Features in ILP

  • Authors: Stefan Kramer

  • Venue: ILP '01 Proceedings of the 11th International Conference on Inductive Logic Programming

  • Year: 2001

Abstract

This paper tackles the problem that methods for propositionalization and feature construction in first-order logic to date construct features in a rather unspecific way. That is, they do not construct features "on demand", but rather in advance and without detecting the need for a representation change. Even if structural features are required, current methods do not construct these features in a goal-directed fashion. In previous work, we presented a method that creates structural features in a class-sensitive manner: we queried the molecular feature miner (MOLFEA) for features (linear molecular fragments) with a minimum frequency in the positive examples and a maximum frequency in the negative examples, such that they are statistically significantly over-represented in the positives and under-represented in the negatives. In the present paper, we go one step further. We construct structural features in order to discriminate between those examples from different classes that are particularly problematic to classify. In order to avoid overfitting, this is done in a boosting framework. We alternate AdaBoost re-weighting episodes and feature construction episodes in order to construct structural features "on demand". In a feature construction episode, we query for features with a minimum cumulative weight in the positives and a maximum cumulative weight in the negatives, where the weights stem from the previous AdaBoost iteration. In summary, we propose to construct structural features "on demand" by a combination of AdaBoost and an extension of MOLFEA to handle weighted learning instances.
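The alternation of feature construction and re-weighting described in the abstract can be illustrated with a minimal Python sketch. It is not MOLFEA and not the paper's implementation: molecules are represented as plain strings, contiguous substrings stand in for linear molecular fragments, and the names `fragments`, `weighted_query`, `boost_with_feature_construction`, and `predict`, as well as the thresholds `min_pos` and `max_neg`, are hypothetical choices made only for this illustration. The re-weighting step follows the standard discrete AdaBoost update.

```python
import math

def fragments(s, max_len):
    """Contiguous substrings of s up to max_len (toy stand-in for linear fragments)."""
    return {s[i:i + n] for n in range(1, max_len + 1) for i in range(len(s) - n + 1)}

def weighted_query(examples, labels, weights, min_pos, max_neg, max_len=3):
    """Fragments whose cumulative weight is >= min_pos in the positives
    and <= max_neg in the negatives."""
    pos, neg = {}, {}
    for x, y, w in zip(examples, labels, weights):
        target = pos if y == 1 else neg
        for f in fragments(x, max_len):  # each example counts once per fragment
            target[f] = target.get(f, 0.0) + w
    return sorted(f for f, wp in pos.items()
                  if wp >= min_pos and neg.get(f, 0.0) <= max_neg)

def boost_with_feature_construction(examples, labels, rounds=3,
                                    min_pos=0.2, max_neg=0.1):
    """Alternate a feature construction episode with an AdaBoost re-weighting episode."""
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, fragment) pairs
    for _ in range(rounds):
        # Feature construction episode: query with the current instance weights.
        candidates = weighted_query(examples, labels, weights, min_pos, max_neg)
        if not candidates:
            break

        def error(f):
            # Weak hypothesis: predict +1 if the fragment occurs, otherwise -1.
            return sum(w for x, y, w in zip(examples, labels, weights)
                       if (1 if f in x else -1) != y)

        best = min(candidates, key=error)
        eps = error(best)
        if eps >= 0.5:
            break
        perfect = eps == 0.0
        eps = max(eps, 1e-10)  # avoid division by zero for a perfect fragment
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, best))
        if perfect:
            break
        # Re-weighting episode: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * (1 if best in x else -1))
                   for x, y, w in zip(examples, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote over the selected fragments."""
    score = sum(alpha * (1 if f in x else -1) for alpha, f in ensemble)
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    examples = ["CCN", "CNO", "CCO", "OCC"]  # toy strings, not real molecules
    labels = [1, 1, -1, -1]
    model = boost_with_feature_construction(examples, labels)
    print([predict(model, x) for x in examples])
```

Because each query uses the weights from the preceding boosting round, fragments covering the currently hard positive examples clear the `min_pos` threshold more easily, which is the "on demand" behaviour the abstract refers to. A real system would replace the substring enumeration with a proper fragment miner over molecular graphs.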