Learning probabilistic datalog rules for information classification and transformation

  • Authors:
  • Henrik Nottelmann;Norbert Fuhr

  • Affiliations:
  • University of Dortmund, Dortmund, Germany;University of Dortmund, Dortmund, Germany

  • Venue:
  • Proceedings of the tenth international conference on Information and knowledge management
  • Year:
  • 2001

Abstract

Probabilistic Datalog combines classical Datalog (i.e., function-free Horn clause predicate logic) with probability theory, so probabilistic weights may be attached to both facts and rules. In practice, however, it is often impossible to assign exact rule weights or even to construct the rules themselves. Instead of specifying them manually, learning algorithms can be used to learn both rules and weights. These algorithms are typically very slow because they need a large example set and must test a large number of candidate rules. We apply a number of extensions to these algorithms in order to improve their efficiency. Several applications demonstrate the power of learning probabilistic Datalog rules and show that rule learning is suitable for low-dimensional problems (e.g., schema mapping) but inappropriate for high-dimensional ones such as text classification.
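
To illustrate what attaching probabilistic weights to both facts and rules can mean, the following is a minimal sketch and not the authors' system. It assumes that all facts and rule applications are independent events, so that one derivation has probability (rule weight × body fact probabilities) and several derivations of the same head are combined with a noisy-or. The predicates `term` and `about`, the weights, and the helper names `rule_probability` and `combine_derivations` are hypothetical and chosen only for this example.

```python
from math import prod

# Hypothetical weighted facts: (predicate, arguments) -> probability.
facts = {
    ("term", ("d1", "ir")): 0.8,
    ("term", ("d1", "db")): 0.5,
}


def rule_probability(rule_weight, body_probs):
    """Probability of a single derivation: the rule weight times the
    probabilities of the body facts (independence is an assumption here)."""
    return rule_weight * prod(body_probs)


def combine_derivations(probs):
    """Noisy-or over alternative derivations of the same head
    (again assuming the derivations are independent)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Two hypothetical weighted rules deriving about(d1, ir):
#   0.9 about(D, ir) :- term(D, ir).
#   0.4 about(D, ir) :- term(D, db).
p1 = rule_probability(0.9, [facts[("term", ("d1", "ir"))]])
p2 = rule_probability(0.4, [facts[("term", ("d1", "db"))]])
p_about = combine_derivations([p1, p2])

print(f"P(about(d1, ir)) = {p_about:.3f}")
```

In this sketch the rule weights 0.9 and 0.4 are fixed by hand; these are the kind of quantities that, according to the abstract, are hard to specify manually and are therefore learned from examples.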