Motivation: Graphs or networks are common ways of depicting information. In biology in particular, many biological processes, such as regulatory networks and metabolic pathways, are represented by graphs. This a priori information, gathered over many years of biomedical research, is a useful supplement to standard numerical genomic data such as microarray gene-expression data. Incorporating the information encoded by known biological networks or graphs into the analysis of numerical data raises interesting statistical challenges. In this article, we introduce a network-constrained regularization procedure for linear regression analysis that incorporates the information from these graphs into the analysis of the numerical data, where the network is represented as a graph and its corresponding Laplacian matrix. We define a network-constrained penalty function that penalizes the L1-norm of the coefficients but encourages smoothness of the coefficients on the network.

Results: Simulation studies indicated that the method is quite effective in identifying genes and subnetworks that are related to disease and has higher sensitivity than commonly used procedures that do not use the pathway structure information. Application to a glioblastoma microarray gene-expression dataset identified subnetworks on several Kyoto Encyclopedia of Genes and Genomes (KEGG) transcriptional pathways that are related to survival from glioblastoma, many of which are supported by the published literature.

Conclusions: The proposed network-constrained regularization procedure efficiently utilizes the known pathway structures in identifying the relevant genes and subnetworks that might be related to phenotype in a general regression framework. As more biological networks are identified and documented in databases, the proposed method should find more applications in identifying the subnetworks that are related to diseases and other biological processes.

Contact: hongzhe@mail.med.upenn.edu
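
The penalty described in the abstract combines an L1 term on the coefficients with a smoothness term defined through the graph Laplacian. The following is a minimal sketch of one way such a penalty could be evaluated, not the authors' implementation. The abstract does not give the exact form, so several things here are assumptions: the use of the symmetric normalized Laplacian, the quadratic form `beta @ lap @ beta` as the smoothness term, the two tuning parameters `lam1` and `lam2`, and the function names themselves.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.

    Assumes `adj` is a symmetric adjacency matrix with a zero diagonal
    (undirected graph, no self-loops). Isolated nodes get an all-zero row.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    # Diagonal is 1 for connected nodes; off-diagonal is -a_uv / sqrt(d_u d_v).
    return np.diag(connected.astype(float)) - adj * np.outer(inv_sqrt, inv_sqrt)


def network_penalty(beta, lap, lam1, lam2):
    """Hypothetical penalty: lam1 * ||beta||_1 + lam2 * beta^T L beta."""
    beta = np.asarray(beta, dtype=float)
    l1_term = np.abs(beta).sum()          # encourages sparsity
    smooth_term = beta @ lap @ beta       # small when linked coefficients agree
    return lam1 * l1_term + lam2 * smooth_term


# Toy three-gene path graph: gene 0 - gene 1 - gene 2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
lap = normalized_laplacian(adj)
penalty = network_penalty([1.0, 0.0, 0.0], lap, lam1=0.5, lam2=2.0)
```

With the normalized Laplacian, the quadratic term equals the sum over edges of the squared difference between degree-scaled coefficients, so coefficients of linked genes are pushed toward each other while the L1 term still drives irrelevant coefficients to zero.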