Identifying significant edges in graphical models of molecular networks

Authors:
Marco Scutari; Radhakrishnan Nagarajan
Affiliations:
Genetics Institute, University College London, Darwin Building, Gower Street, WC1E 6BT London, United Kingdom;Division of Biomedical Informatics, Department of Biostatistics, College of Public Health, University of Kentucky, 725 Rose Street, Multidisciplinary Science Bldg, 230F, Lexington, KY 40536-0082, ...
Venue:
Artificial Intelligence in Medicine
Year:
2013

Abstract

Objective: Modelling the associations from high-throughput experimental molecular data has provided unprecedented insights into biological pathways and signalling mechanisms. Graphical models and networks have proven to be especially useful abstractions in this regard. Ad hoc thresholds are often used in conjunction with structure learning algorithms to determine significant associations. The present study overcomes this limitation by proposing a statistically motivated approach for identifying significant associations in a network.

Methods and materials: A new method is proposed that identifies significant associations in graphical models by estimating the threshold that minimises the L1 norm between the cumulative distribution function (CDF) of the observed edge confidences and that of its asymptotic counterpart. The effectiveness of the proposed method is demonstrated on popular synthetic data sets as well as publicly available experimental molecular data corresponding to gene and protein expression profiles.

Results: The improved performance of the proposed approach is demonstrated across the synthetic data sets using sensitivity, specificity and accuracy as performance metrics. The results are also demonstrated across varying sample sizes and three different structure learning algorithms with widely varying assumptions. In all cases, the proposed approach has specificity and accuracy close to 1, while sensitivity increases linearly in the logarithm of the sample size. The estimated threshold systematically outperforms common ad hoc ones in terms of sensitivity while maintaining comparable levels of specificity and accuracy. Networks reconstructed from the experimental data sets agree closely with the results reported in the original papers.

Conclusion: Current studies use structure learning algorithms in conjunction with ad hoc thresholds for identifying significant associations in graphical abstractions of biological pathways and signalling mechanisms. Such an ad hoc choice can have a pronounced effect on the biological significance attributed to the associations in the resulting network and on possible downstream analysis. The statistically motivated approach presented in this study has been shown to outperform ad hoc thresholds and is expected to reduce spurious conclusions of significant associations in such graphical abstractions.
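
The thresholding idea described under "Methods and materials" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not spell out the asymptotic CDF, so the sketch assumes it is a two-level step function: a fraction t of edges has confidence 0 and the rest has confidence 1, giving a CDF equal to t on [0, 1). The function name `estimate_threshold`, the grid-based integration, and the rule "an edge is significant if its confidence is strictly above the threshold" are all hypothetical choices made for this example.

```python
import numpy as np

def estimate_threshold(conf, n_grid=1000):
    """Estimate a significance threshold for edge confidences in [0, 1].

    Assumption (not stated in the abstract): the asymptotic CDF is a step
    function equal to a constant level t on [0, 1). We pick the level t that
    minimises the L1 distance to the empirical CDF, then map t back to a
    confidence value through the empirical quantile function.
    """
    conf = np.sort(np.asarray(conf, dtype=float))
    n = conf.size

    # Midpoint grid on (0, 1) used to approximate the L1 integral.
    grid = (np.arange(n_grid) + 0.5) / n_grid
    # Empirical CDF evaluated on the grid.
    f_hat = np.searchsorted(conf, grid, side="right") / n

    # Candidate levels are the values k/n that the empirical CDF takes.
    counts = np.unique(np.searchsorted(conf, conf, side="right"))
    dists = np.array([np.mean(np.abs(f_hat - k / n)) for k in counts])
    k_best = int(counts[np.argmin(dists)])

    # Empirical quantile at level k_best/n: the k_best-th smallest confidence.
    threshold = conf[k_best - 1]
    return threshold, k_best / n

# Toy edge confidences (made up for illustration, not data from the paper).
confidences = [0.02, 0.05, 0.05, 0.1, 0.1, 0.15, 0.9, 0.95, 1.0, 1.0]
threshold, level = estimate_threshold(confidences)
significant = [c for c in confidences if c > threshold]
```

Because the L1 distance to a constant level is minimised by a median of the empirical CDF values over the unit interval, the selected level tends to sit on the widest plateau of the empirical CDF, which separates the low-confidence edges from the high-confidence ones.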