In many prediction tasks, selecting relevant features is essential for achieving good generalization performance. Most feature selection algorithms consider all features to be a priori equally likely to be relevant. In this paper, we use transfer learning (learning on an ensemble of related tasks) to construct an informative prior on feature relevance. We assume that features themselves have meta-features that are predictive of their relevance to the prediction task, and model their relevance as a function of the meta-features using hyperparameters (called meta-priors). We present a convex optimization algorithm for simultaneously learning the meta-priors and feature weights from an ensemble of related prediction tasks that share a similar relevance structure. Our approach transfers the meta-priors among different tasks, which makes it possible to handle settings where tasks have non-overlapping features or where the relevance of the features varies across tasks. We show that learning feature relevance improves performance on two real data sets that illustrate such settings: (1) predicting ratings in a collaborative filtering task, and (2) distinguishing arguments of a verb in a sentence.