Regularization and feature selection for networked features

Authors:
Hongliang Fei;Brian Quanz;Jun Huan
Affiliations:
University of Kansas, Lawrence, KS, USA;University of Kansas, Lawrence, KS, USA;University of Kansas, Lawrence, KS, USA
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Abstract

In the standard formalization of supervised learning problems, a datum is represented as a vector of features without prior knowledge about relationships among features. However, for many real-world problems, we have such prior knowledge about structural relationships among features. For instance, in microarray analysis, where genes are the features, the genes form biological pathways. Such prior knowledge should be incorporated to build more accurate and interpretable models, especially in applications with high dimensionality and low sample sizes. To incorporate these structural relationships efficiently, we have designed a classification model that uses an undirected graph to capture the relationships among features. Our method combines L1-norm and Laplacian-based L2-norm regularization with logistic regression. This approach enforces model sparsity and smoothness among features in order to identify a small subset of grouped features. We have derived efficient optimization algorithms based on coordinate descent for the new formulation. Through a comprehensive experimental study, we demonstrate the effectiveness of the proposed learning methods.
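
The kind of objective the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an unnormalized graph Laplacian L = D - A built from an unweighted edge list, labels in {-1, +1}, no intercept term, and an objective of the form (mean logistic loss) + lam1 * ||w||_1 + lam2 * w^T L w. For brevity it uses proximal gradient descent (soft-thresholding) rather than the coordinate descent algorithm the abstract mentions. All function names, parameter names, and default values are hypothetical.

```python
import math


def laplacian(n_features, edges):
    """Unnormalized Laplacian L = D - A of an undirected, unweighted feature graph."""
    L = [[0.0] * n_features for _ in range(n_features)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def _log1p_exp_neg(m):
    """Numerically stable log(1 + exp(-m))."""
    return max(-m, 0.0) + math.log1p(math.exp(-abs(m)))


def _sigmoid_neg(m):
    """Numerically stable 1 / (1 + exp(m))."""
    if m >= 0:
        e = math.exp(-m)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(m))


def objective(w, X, y, L, lam1, lam2):
    """Assumed objective: mean logistic loss + lam1 * ||w||_1 + lam2 * w^T L w."""
    p = len(w)
    loss = sum(_log1p_exp_neg(yi * _dot(w, xi)) for xi, yi in zip(X, y)) / len(X)
    l1 = sum(abs(wj) for wj in w)
    smooth = sum(w[i] * L[i][j] * w[j] for i in range(p) for j in range(p))
    return loss + lam1 * l1 + lam2 * smooth


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit(X, y, edges, lam1=0.05, lam2=0.1, step=0.1, n_iter=500):
    """Proximal gradient descent (a stand-in for coordinate descent)."""
    n, p = len(X), len(X[0])
    L = laplacian(p, edges)
    w = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth part: logistic loss plus the Laplacian penalty.
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            coef = -yi * _sigmoid_neg(yi * _dot(w, xi)) / n
            for j in range(p):
                grad[j] += coef * xi[j]
        for j in range(p):
            grad[j] += 2.0 * lam2 * _dot(L[j], w)
        # The L1 term is handled by soft-thresholding, which produces exact zeros.
        w = [soft_threshold(w[j] - step * grad[j], step * lam1) for j in range(p)]
    return w
```

In this sketch the L1 term drives uninformative weights to exactly zero, while the Laplacian term penalizes differences between the weights of features joined by an edge, which is what encourages connected features to be selected together.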