2011 Special Issue: Estimating exogenous variables in data with more variables than observations

  • Authors:
  • Yasuhiro Sogawa;Shohei Shimizu;Teppei Shimamura;Aapo Hyvärinen;Takashi Washio;Seiya Imoto

  • Affiliations:
  • The Institute of Scientific and Industrial Research, Osaka University, Mihogaoka 8-1, Ibaraki, Osaka 567-0047, Japan;The Institute of Scientific and Industrial Research, Osaka University, Mihogaoka 8-1, Ibaraki, Osaka 567-0047, Japan;Human Genome Center, Institute of Medical Science, University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo 108-8639, Japan;Department of Computer Science, University of Helsinki, P.O. Box 68, FIN-00014, Finland and Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FIN-00014, Finland;The Institute of Scientific and Industrial Research, Osaka University, Mihogaoka 8-1, Ibaraki, Osaka 567-0047, Japan;Human Genome Center, Institute of Medical Science, University of Tokyo, Shirokanedai 4-6-1, Minato-ku, Tokyo 108-8639, Japan

  • Venue:
  • Neural Networks
  • Year:
  • 2011

Abstract

Many statistical methods have been proposed to estimate causal models in classical situations with fewer variables than observations. However, modern datasets such as gene expression data increase the need for high-dimensional causal modeling in challenging situations with orders of magnitude more variables than observations. In this paper, we propose a method to find exogenous variables in a linear non-Gaussian causal model. The method requires much smaller sample sizes than conventional methods and works even with orders of magnitude more variables than observations. Exogenous variables work as triggers that activate causal chains in the model, and their identification leads to more efficient experimental designs and a better understanding of the causal mechanism. We present experiments with artificial data and real-world gene expression data to evaluate the method.
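
To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes. It is not the paper's estimation method. It assumes the usual acyclic form of a linear non-Gaussian model, x = Bx + e, with Laplace-distributed disturbances, and it treats a variable as exogenous when its row of B is all zeros (so that x_i = e_i). The function names, the edge probability, and the weight range are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_linear_non_gaussian(n_obs, n_vars, n_exogenous, edge_prob=0.05, seed=0):
    """Draw data from x = Bx + e with non-Gaussian (Laplace) disturbances.

    Illustrative assumptions: variables are already in causal order, so B is
    strictly lower triangular (acyclic), and the first `n_exogenous` variables
    (at least one) have no parents.
    """
    rng = np.random.default_rng(seed)

    # Random sparse connection strengths below the diagonal.
    weights = rng.uniform(0.5, 1.5, size=(n_vars, n_vars))
    mask = rng.random((n_vars, n_vars)) < edge_prob
    B = np.tril(weights * mask, k=-1)

    # Exogenous variables receive no incoming edges.
    B[:n_exogenous, :] = 0.0

    # Give every remaining variable at least one parent, so that the
    # exogenous set is exactly the first `n_exogenous` variables.
    for i in range(n_exogenous, n_vars):
        if not B[i].any():
            B[i, rng.integers(0, i)] = 1.0

    # Non-Gaussian disturbances, one row per observation.
    E = rng.laplace(size=(n_obs, n_vars))

    # x = Bx + e  is equivalent to  x = (I - B)^{-1} e.
    X = np.linalg.solve(np.eye(n_vars) - B, E.T).T
    return X, B, E


def exogenous_indices(B):
    """Indices of variables with no parents (all-zero rows of B)."""
    return np.flatnonzero(~B.any(axis=1))


# Far more variables (200) than observations (20), as in the abstract's setting.
X, B, E = simulate_linear_non_gaussian(n_obs=20, n_vars=200, n_exogenous=5)
print(X.shape)
print(exogenous_indices(B))
```

Here the exogenous set is read directly from the known matrix B. In the problem the abstract addresses, B is unknown and the exogenous variables must be inferred from X alone.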