Multiclass sparse logistic regression for classification of multiple cancer types using gene expression data

  • Authors:
  • Yongdai Kim;Sunghoon Kwon;Seuck Heun Song

  • Affiliations:
  • Seoul National University, Korea;Seoul National University, Korea;Korea University, Korea

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2006

Quantified Score

Hi-index 0.03

Visualization

Abstract

Monitoring gene expression profiles is a novel approach to cancer diagnosis. Several studies have showed that the sparse logistic regression is a useful classification method for gene expression data. Not only does it give a sparse solution with high accuracy, it provides the user with explicit probabilities of classification apart from the class information. However, its optimal extension to more than two classes is not obvious. In this paper, we propose a multiclass extension of sparse logistic regression. Analysis of five publicly available gene expression data sets shows that the proposed method outperforms the standard multinomial logistic model in prediction accuracy as well as gene selectivity.