Joint covariate selection and joint subspace selection for multiple classification problems

Authors:
Guillaume Obozinski;Ben Taskar;Michael I. Jordan
Affiliations:
Department of Statistics, University of California at Berkeley, Berkeley, USA 94720-3860;Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA 19104-6389;Department of Statistics and Department of Electrical Engineering and Computer Science, University of California at Berkeley, Berkeley, USA 94720-3860
Venue:
Statistics and Computing
Year:
2010

Abstract

We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. By penalizing the sum of ℓ2 norms of the blocks of coefficients associated with each covariate across different classification problems, similar sparsity patterns in all models are encouraged. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time.
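
The block penalty described in the abstract can be illustrated with a small NumPy sketch. It assumes the coefficients are stored as a matrix W with one row per covariate and one column per classification problem, so the penalty is the sum of the row-wise ℓ2 norms. The function names and the toy matrix are hypothetical, and the shrinkage step shown is the standard proximal operator for this penalty, not the blockwise path-following scheme of the paper.

```python
import numpy as np

def l1_l2_penalty(W):
    """Sum over covariates (rows) of the l2 norm of each row across tasks (columns)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def block_soft_threshold(W, t):
    """Proximal operator of t * l1_l2_penalty: shrink every row's l2 norm by t,
    setting rows with norm <= t exactly to zero (assumed illustration only)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all-zero.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * W

# Toy coefficient matrix: 3 covariates (rows), 2 classification problems (columns).
W = np.array([[3.0, 4.0],   # row norm 5.0
              [0.3, 0.4],   # row norm 0.5
              [0.0, 0.0]])  # row norm 0.0

penalty = l1_l2_penalty(W)
W_shrunk = block_soft_threshold(W, 1.0)
# Covariates whose whole row survives are active for all problems at once.
active = np.flatnonzero(np.linalg.norm(W_shrunk, axis=1) > 0)
```

Because the shrinkage acts on whole rows, a covariate is either kept for every problem or dropped for every problem, which is the shared sparsity pattern the penalty is meant to encourage.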