Expectation propagation for Bayesian multi-task feature selection

  • Authors:
  • Daniel Hernández-Lobato; José Miguel Hernández-Lobato; Thibault Helleputte; Pierre Dupont

  • Affiliations:
  • ICTEAM institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium; Computer Science Department, Universidad Autónoma de Madrid, Madrid, Spain; ICTEAM institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium; ICTEAM institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium

  • Venue:
  • ECML PKDD'10: Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
  • Year:
  • 2010

Abstract

In this paper we propose a Bayesian model for multi-task feature selection. This model is based on a generalized spike-and-slab sparse prior distribution that enforces the selection of a common subset of features across several tasks. Since exact Bayesian inference in this model is intractable, approximate inference is performed through expectation propagation (EP). EP approximates the posterior distribution of the model using a parametric probability distribution. This posterior approximation is particularly useful for identifying the features that are relevant for prediction. We focus on problems for which the number of features d is significantly larger than the number of instances for each task. We propose an efficient parametrization of the EP algorithm whose computational complexity is linear in d. Experiments on several multi-task datasets show that the proposed model outperforms baseline approaches based on single-task learning or on pooling the data across all tasks, as well as two state-of-the-art multi-task learning approaches. Additional experiments confirm the stability of the proposed feature selection with respect to various sub-samplings of the training data.
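
To make the idea of a shared sparsity pattern concrete, the following is a minimal sketch of sampling task weights from a spike-and-slab prior in which one binary indicator per feature is shared by all tasks. It illustrates the general idea only and is not the paper's exact prior or its EP algorithm. The function name, the parameters `p_relevant` and `slab_std`, and the choice of a zero-mean Gaussian slab are assumptions made for this example.

```python
import random


def sample_shared_spike_and_slab(n_tasks, n_features, p_relevant=0.1,
                                 slab_std=1.0, seed=0):
    """Draw per-task weights whose zero pattern is shared across tasks.

    Hypothetical illustration: z[j] is a Bernoulli indicator for feature j,
    and every task uses the same z. This is an assumed simplification.
    """
    rng = random.Random(seed)
    # One binary indicator per feature, shared by every task.
    z = [rng.random() < p_relevant for _ in range(n_features)]
    # Selected features get a task-specific Gaussian weight (the slab);
    # unselected features are exactly zero in every task (the spike).
    weights = [
        [rng.gauss(0.0, slab_std) if z[j] else 0.0 for j in range(n_features)]
        for _ in range(n_tasks)
    ]
    return z, weights


z, W = sample_shared_spike_and_slab(n_tasks=3, n_features=20, p_relevant=0.25)
# Indices of the features selected jointly for all tasks.
support = [j for j, selected in enumerate(z) if selected]
```

Because the indicators are drawn once and reused, every task is nonzero on the same subset of features, while the weight values themselves still differ from task to task. Inference in such a model would go the other way, estimating the indicators from data, which is the role the abstract assigns to EP.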