Machine Learning
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)
Networks: An Introduction
Hi-index | 0.00 |
Genome-wide association studies (GWAS) have become a powerful and affordable tool to study the genetic variations associated with common human diseases. However, only few of the loci found are associated with a moderate or large increase in disease risk and therefore using GWAS findings to study the underlying biological mechanisms remains a challenge. One possible cause for the "missing heritability" is the gene-gene interactions or epistasis. Several methods have been developed and among them Random Forest (RF) is a popular one. RF has been successfully applied in many studies. However, it is also known to rely on marginal main effects. Meanwhile, networks have become a popular approach for characterizing the space of pairwise interactions systematically, which can be informative for classification problems. In this study, we compared the findings of Mutual Information Network (MIN) to that of RF and observed that the variables identified by the two methods overlap with differences. To integrate advantages of MIN into RF, we proposed a hybrid algorithm, MIN-guided RF (MINGRF), which overlays the neighborhood structure of MIN onto the growth of trees. After comparing MINGRF to the standard RF on a bladder cancer dataset, we conclude that MINGRF produces trees with a better accuracy at a smaller computational cost.