Feature grouping and selection over an undirected graph

Authors:
Sen Yang;Lei Yuan;Ying-Cheng Lai;Xiaotong Shen;Peter Wonka;Jieping Ye
Affiliations:
Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA;University of Minnesota, Minneapolis, MN, USA;Arizona State University, Tempe, AZ, USA;Arizona State University, Tempe, AZ, USA
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Citing 4
Cited 2

Multiple kernel learning, conic duality, and the SMO algorithm

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Network-constrained regularization and variable selection for analysis of genomic data

Bioinformatics
Group lasso with overlap and graph lasso

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Regularization and feature selection for networked features

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management

Sparse methods for biomedical data

ACM SIGKDD Explorations Newsletter
Speeding up large-scale learning with a social prior

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features has been considered to be promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation, when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which could introduce additional estimation bias. In this paper, we propose three new feature grouping and selection methods to resolve this issue. The first method employs a convex function to penalize the pairwise l∞ norm of connected regression/classification coefficients, achieving simultaneous feature grouping and selection. The second method improves the first one by utilizing a non-convex function to reduce the estimation bias. The third one is the extension of the second method using a truncated l1 regularization to further reduce the estimation bias. The proposed methods combine feature grouping and feature selection to enhance estimation accuracy. We employ the alternating direction method of multipliers (ADMM) and difference of convex functions (DC) programming to solve the proposed formulations. Our experimental results on synthetic data and two real datasets demonstrate the effectiveness of the proposed methods.