Exploiting parameter domain knowledge for learning in Bayesian networks

  • Authors:
  • Tom Mitchell; Radu Stefan Niculescu

  • Affiliations:
  • Carnegie Mellon University; Carnegie Mellon University

  • Venue:
  • Doctoral thesis, Carnegie Mellon University
  • Year:
  • 2005

Abstract

The task of learning models for many real-world problems requires researchers to incorporate problem Domain Knowledge into the learning algorithms, because there is rarely enough training data to enable accurate learning of the structures and underlying relationships in the problem. Domain Knowledge comes in many forms. Domain Knowledge about the relevance of variables (Feature Selection) can help us ignore certain variables when building our model. Domain Knowledge specifying conditional independencies among variables can guide our search over possible model structures. This thesis presents a theoretical framework for incorporating a different kind of knowledge into learning algorithms for Bayesian Networks: Domain Knowledge about relationships among parameters. We develop a unified framework for incorporating general Parameter Domain Knowledge constraints into learning procedures for Bayesian Networks by formulating the task as a constrained optimization problem. We solve this problem using iterative algorithms based on the Newton-Raphson method for approximating the solutions of a system of equations. We approach learning from both a frequentist and a Bayesian point of view, and from both complete and incomplete data. We also derive closed-form solutions for our estimators for several types of Parameter Domain Knowledge: parameter sharing, as well as shared properties of groups of parameters (sum sharing and ratio sharing). While models such as Module Networks, Dynamic Bayesian Networks, and Context-Specific Independence models share parameters at either the conditional probability table level or the conditional distribution level (within one table), our framework is more flexible, allowing sharing at the level of individual parameters, across conditional distributions of different lengths, and across different conditional probability tables. Other results include several formal guarantees about our estimators, as well as methods for automatically learning domain knowledge. To validate our theory, we carry out experiments showing the benefits of taking advantage of domain knowledge for modelling the fMRI signal during a cognitive task. Additional experiments on synthetic data are also performed.
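
To make the parameter-sharing case concrete, here is a minimal sketch (not taken from the thesis) of the closed-form maximum-likelihood estimator in the simplest setting: equality constraints among parameters within a single conditional distribution of a discrete Bayesian Network. Maximizing the multinomial likelihood subject to the sharing constraints and the sum-to-one constraint gives each group of shared parameters the value obtained by pooling the group's counts and dividing by the group size times the total count. The function name `mle_with_parameter_sharing` and the toy counts are illustrative assumptions; the thesis's framework covers far more general constraints (sharing across distributions and tables, with Newton-Raphson iterations when no closed form exists).

```python
import numpy as np

def mle_with_parameter_sharing(counts, groups):
    """MLE of one conditional distribution (a single CPT row) under
    equality ("parameter sharing") constraints.

    counts : observed counts N_1..N_k for the k outcomes.
    groups : list of index lists; parameters within a group are
             constrained to be equal. Every index must appear in
             exactly one group.

    For a group g, the shared value is sum(N_i, i in g) / (|g| * N),
    which follows from Lagrange-multiplier algebra on the constrained
    multinomial log-likelihood.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    theta = np.empty_like(counts)
    for g in groups:
        # Pool the counts of the shared parameters, then spread the
        # resulting probability mass equally over the group.
        theta[g] = counts[g].sum() / (len(g) * total)
    return theta

# Example: a 4-valued variable where domain knowledge states
# P(x1) = P(x2) and P(x3) = P(x4).
print(mle_with_parameter_sharing([30, 10, 5, 15], [[0, 1], [2, 3]]))
# -> [0.3333 0.3333 0.1667 0.1667]
```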