Multi-task Feature Selection Using the Multiple Inclusion Criterion (MIC)

  • Authors: Paramveer S. Dhillon; Brian Tomasik; Dean Foster; Lyle Ungar

  • Affiliations: CIS Department, University of Pennsylvania, Philadelphia, U.S.A. 19104; Computer Science Department, Swarthmore College, U.S.A. 19081; Statistics Department, University of Pennsylvania, Philadelphia, U.S.A. 19104; CIS Department, University of Pennsylvania, Philadelphia, U.S.A. 19104

  • Venue: ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
  • Year: 2009

Abstract

We address the problem of joint feature selection in multiple related classification or regression tasks. When selecting features for multiple tasks, one can usually "borrow strength" across the tasks to obtain a more sensitive criterion for deciding which features to select. We propose a novel method, the Multiple Inclusion Criterion (MIC), which modifies stepwise feature selection to more easily select features that are helpful across multiple tasks. Our approach allows each feature to be added to none, some, or all of the tasks. MIC is most beneficial for selecting a small set of predictive features from a large pool of potential features, as is common in genomic and biological datasets. Experimental results on such datasets show that MIC usually outperforms competing multi-task learning methods, not only in accuracy but also in building simpler and more interpretable models.
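The idea of stepwise selection in which each feature may enter none, some, or all tasks can be illustrated with a small sketch. The abstract does not state the scoring criterion, so the code below substitutes a BIC-style penalty as a stand-in: a cost per added coefficient plus a per-feature cost that is paid once and shared by every task that uses the feature. That shared cost is one simple way to let tasks "borrow strength". The function names (`rss`, `multitask_stepwise`) and both penalty values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit of y on an intercept plus X[:, cols]."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

def multitask_stepwise(X, Y, max_steps=20):
    """Greedy forward selection over (feature, task-subset) pairs.

    X has shape (n, p); Y has shape (n, h), one column per regression task.
    Returns a list of h lists holding the feature indices chosen for each task.
    """
    n, p = X.shape
    h = Y.shape[1]
    coef_cost = np.log(n)          # assumed BIC-style cost per added coefficient
    feature_cost = 2 * np.log(p)   # assumed cost paid once per feature, shared by tasks
    chosen = [[] for _ in range(h)]
    current = [rss(X, Y[:, k], []) for k in range(h)]
    used = set()
    for _ in range(max_steps):
        best_score, best_j, best_tasks, best_rss = 0.0, None, None, None
        for j in range(p):
            if j in used:
                continue
            new = [rss(X, Y[:, k], chosen[k] + [j]) for k in range(h)]
            # Net benefit of adding feature j to task k (fit improvement minus coefficient cost).
            gains = [n * np.log(current[k] / new[k]) - coef_cost for k in range(h)]
            # The feature enters only the tasks where it pays for its own coefficient.
            tasks = [k for k in range(h) if gains[k] > 0]
            score = sum(gains[k] for k in tasks) - feature_cost
            if score > best_score:
                best_score, best_j, best_tasks, best_rss = score, j, tasks, new
        if best_j is None:         # no feature improves the penalized criterion: stop
            break
        used.add(best_j)
        for k in best_tasks:
            chosen[k].append(best_j)
            current[k] = best_rss[k]
    return chosen

# Synthetic demo: feature 0 drives all four tasks, feature 1 drives task 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
Y = rng.normal(scale=0.5, size=(200, 4))
Y += X[:, [0]]
Y[:, 0] += 2.0 * X[:, 1]
print(multitask_stepwise(X, Y))
```

Because the per-feature cost is charged once regardless of how many tasks adopt the feature, a feature with modest benefit in several tasks can be admitted even when it would not justify that cost in any single task alone.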