We present methods that predict the presence and strength of conditional and unconditional dependencies (correlations) between two variables Y and Z that are never jointly measured on the same samples, using multiple data sets that measure a set of common variables. The algorithms are specializations of prior work on learning causal structures from overlapping variable sets. The same problem has also been addressed in the field of statistical matching. The proposed methods are applied to a wide range of domains and are shown to accurately predict the presence of thousands of dependencies. Compared against prototypical statistical matching algorithms, and within the scope of our experiments, the proposed algorithms make predictions that are better correlated with the sample estimates of the unknown parameters on test data; this is particularly the case when the number of commonly measured variables is low. The enabling idea behind the methods is to induce one or all causal models that are simultaneously consistent with (fit) all available data sets and prior knowledge, and then to reason with them. This allows constraints stemming from causal assumptions (e.g., the Causal Markov Condition, Faithfulness) to propagate. Several methods have been developed based on this idea, for which we propose the unifying name Integrative Causal Analysis (INCA). A contrived example is presented that demonstrates the theoretical potential for developing more general methods for co-analyzing heterogeneous data sets. The computational experiments with the novel methods provide evidence that causally inspired assumptions such as Faithfulness often hold to a good degree of approximation in many real systems and could be exploited for statistical inference. Code, scripts, and data are available at www.mensxmachina.org.
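To make the problem setting concrete, the following minimal sketch predicts the correlation between Y and Z from two data sets that share only a common variable X. It is not the INCA algorithms described above. It illustrates the kind of baseline used in statistical matching, under the conditional independence assumption that Y and Z are independent given X, in which case corr(Y, Z) = corr(X, Y) * corr(X, Z) for linear Gaussian data. The simulated system Y <- X -> Z, its coefficients, and all function names are hypothetical and chosen only for illustration. The sketch also reports the range of corr(Y, Z) values that any valid correlation matrix allows when that assumption is dropped.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def predict_unmeasured_correlation(r_xy, r_xz):
    """Point prediction of corr(Y, Z) assuming Y and Z are independent
    given X, plus the range allowed by positive semidefiniteness of the
    3x3 correlation matrix when that assumption is not made."""
    point = r_xy * r_xz
    slack = math.sqrt((1 - r_xy ** 2) * (1 - r_xz ** 2))
    return point, (point - slack, point + slack)


def simulate(n, rng):
    """Hypothetical system Y <- X -> Z with Gaussian noise (illustration only)."""
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        xs.append(x)
        ys.append(0.8 * x + rng.gauss(0, 1))
        zs.append(-0.6 * x + rng.gauss(0, 1))
    return xs, ys, zs


rng = random.Random(0)
# Data set 1 measures only (X, Y); data set 2 measures only (X, Z).
x1, y1, _ = simulate(5000, rng)
x2, _, z2 = simulate(5000, rng)
# Held-out test data in which Y and Z are jointly measured.
_, y_test, z_test = simulate(5000, rng)

r_xy = pearson(x1, y1)
r_xz = pearson(x2, z2)
predicted, (low, high) = predict_unmeasured_correlation(r_xy, r_xz)
observed = pearson(y_test, z_test)

print(f"predicted corr(Y, Z) = {predicted:.3f}, feasible range = [{low:.3f}, {high:.3f}]")
print(f"observed on test data = {observed:.3f}")
```

In this simulated structure the conditional independence assumption holds by construction, so the point prediction should track the test-set estimate closely. The wide feasible range shows how little the shared variable alone constrains the unmeasured correlation, which is the gap that additional assumptions such as Faithfulness are meant to narrow.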