An efficient causal discovery algorithm for linear models

  • Authors: Zhenxing Wang, Laiwan Chan
  • Affiliation: The Chinese University of Hong Kong, Hong Kong

  • Venue: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year: 2010

Abstract

Bayesian network learning algorithms have been widely used for causal discovery since the pioneering work of [13,18]. Among existing algorithms, the three-phase dependency analysis algorithm (TPDA) [5] is the most efficient in the sense that it has polynomial-time complexity. However, it still has some limitations. First, TPDA relies on mutual information-based conditional independence (CI) tests, so it is not easily applied to continuous data. In addition, TPDA uses two phases to obtain an approximate skeleton of the Bayesian network, which is inefficient in practice. In this paper, we propose a two-phase algorithm with partial correlation-based CI tests: the first phase constructs a Markov random field from the data, which provides a close approximation to the structure of the true Bayesian network; the second phase removes redundant edges according to CI tests to recover the true Bayesian network. We show that the two-phase algorithm with partial correlation-based CI tests can handle continuous data following arbitrary distributions, rather than only the Gaussian distribution.
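To make the abstract's central ingredient concrete, below is a minimal sketch of a partial correlation-based CI test of the kind referred to above. This is not the authors' implementation: it assumes a data matrix with samples in rows and variables in columns, and it uses the standard Fisher z-test of a vanishing partial correlation, which is the usual CI test for linear models. The names `partial_correlation` and `ci_test` are illustrative.

```python
import numpy as np
from scipy import stats

def partial_correlation(data, i, j, cond):
    """Partial correlation between columns i and j of `data`, given the
    columns in `cond`, computed from the inverse covariance (precision)
    matrix of the involved variables."""
    idx = [i, j] + list(cond)
    cov = np.cov(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(cov)
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def ci_test(data, i, j, cond=(), alpha=0.05):
    """Fisher z-test of the null hypothesis that the partial correlation of
    X_i and X_j given `cond` is zero (i.e. CI under a linear model).
    Returns True if independence is NOT rejected at level `alpha`."""
    n = data.shape[0]
    r = partial_correlation(data, i, j, cond)
    r = np.clip(r, -0.999999, 0.999999)        # guard against |r| == 1
    z = 0.5 * np.log((1 + r) / (1 - r))        # Fisher z-transform
    stat = np.sqrt(n - len(cond) - 3) * abs(z) # ~ N(0, 1) under the null
    p_value = 2 * (1 - stats.norm.cdf(stat))
    return p_value > alpha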