Structure learning with independent non-identically distributed data

  • Author: Robert E. Tillman
  • Affiliation: Carnegie Mellon University, Pittsburgh, PA
  • Venue: ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
  • Year: 2009

Abstract

There are well-known algorithms for learning the structure of directed and undirected graphical models from data, but nearly all assume that the data consist of a single i.i.d. sample. In contexts such as fMRI analysis, the data may instead consist of an ensemble of independent samples that share a common data-generating mechanism but may not be identically distributed. Pooling such data can cause a number of well-known statistical problems, so each sample must be analyzed individually, which forfeits the gain in statistical power that multiple samples could provide. We show how existing constraint-based methods can be modified to learn structure from the aggregate of such data in a statistically sound manner. The prescribed method is simple to implement and is based on existing statistical methods used in meta-analysis and other areas, yet it works surprisingly well in this context, where issues such as retesting raise additional concerns. We report results for directed models, but the method is just as applicable to undirected models.
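
To make the idea of aggregating evidence across samples concrete, here is a minimal Python sketch of one standard meta-analysis technique, Fisher's combined probability test. The abstract does not name the specific statistic used, so the choice of Fisher's method is an assumption made for illustration, and the function names `fisher_combined_pvalue` and `judged_independent` are hypothetical. The sketch assumes that the same conditional independence test has already been run separately on each sample, giving one p-value per sample.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method (illustrative choice)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Under the null hypothesis, -2 * sum(log p_i) follows a chi-square
    # distribution with 2k degrees of freedom. We work with half of it.
    half_stat = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def judged_independent(p_values, alpha=0.05):
    """Hypothetical decision rule: accept independence if the combined p-value exceeds alpha."""
    return fisher_combined_pvalue(p_values) > alpha


# Hypothetical p-values for one test "X independent of Y given Z" on three samples.
per_sample = [0.10, 0.08, 0.12]
combined = fisher_combined_pvalue(per_sample)
print(combined, judged_independent(per_sample))
```

In this made-up example no single sample rejects independence at the 0.05 level, but the combined test does, which is the kind of power gain that per-sample analysis gives up. A constraint-based search could call such a combined test wherever it would otherwise run a single-sample independence test.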