On primal and dual sparsity of Markov networks
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Max-margin Markov networks (M3N) have shown great promise in structured prediction and relational learning. Due to the KKT conditions, the M3N enjoys dual sparsity. However, the existing M3N formulation does not enjoy primal sparsity, which is a desirable property for selecting significant features and reducing the risk of over-fitting. In this paper, we present an l1-norm regularized max-margin Markov network (l1-M3N), which enjoys dual and primal sparsity simultaneously. To learn an l1-M3N, we present three methods: projected sub-gradient, cutting-plane, and a novel EM-style algorithm based on an equivalence between the l1-M3N and an adaptive M3N. We perform extensive empirical studies on both synthetic and real data sets. Our experimental results show that: (1) l1-M3N can effectively select significant features; (2) l1-M3N performs as well as the pseudo-primal sparse Laplace M3N in prediction accuracy, while consistently outperforming other competing methods that enjoy either primal or dual sparsity; and (3) the EM-style algorithm is more robust than the other two in prediction accuracy and time efficiency.
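The projected sub-gradient approach can be illustrated concretely. The sketch below is not the paper's implementation: it uses multiclass classification as the simplest stand-in for a structured output space, 0/1 Hamming loss inside loss-augmented inference, and the sorting-based l1-ball projection of Duchi et al. (2008); the constrained (l1-ball) form stands in for the penalized l1 objective, to which it is equivalent for a suitable radius. The names `project_onto_l1_ball`, `eta`, and `radius` are illustrative, not from the paper.

```python
import numpy as np

def project_onto_l1_ball(v, z):
    """Euclidean projection of v onto {w : ||w||_1 <= z} (Duchi et al., 2008).

    Sorting-based O(n log n) variant; the hard l1 constraint is what
    produces exact zeros in the primal weights.
    """
    if np.abs(v).sum() <= z:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - z)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_subgradient_epoch(W, X, y, eta=0.1, radius=5.0):
    """One pass of projected sub-gradient on the structured hinge loss.

    W: (num_classes, num_features) weights; a fixed step size eta is used
    here for brevity, though a decaying schedule is more typical.
    """
    n_classes = W.shape[0]
    for x_i, y_i in zip(X, y):
        # Loss-augmented inference: argmax_y [Delta(y_i, y) + w . f(x_i, y)].
        # With Hamming (0/1) loss this adds 1 to every wrong label's score.
        scores = W @ x_i + (np.arange(n_classes) != y_i)
        y_hat = int(np.argmax(scores))
        if y_hat != y_i:
            # Sub-gradient of the hinge term: f(x_i, y_hat) - f(x_i, y_i).
            W[y_hat] -= eta * x_i
            W[y_i] += eta * x_i
        # Enforce the l1 constraint, the source of primal sparsity.
        W = project_onto_l1_ball(W.ravel(), radius).reshape(W.shape)
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only 2 of 20 features matter
    W = np.zeros((2, 20))
    for _ in range(20):
        W = projected_subgradient_epoch(W, X, y)
    print("nonzero weights:", np.count_nonzero(W), "of", W.size)
```

In the paper's setting, the argmax over classes would be replaced by max-product (loss-augmented) inference over the Markov network; the sub-gradient update and projection have the same shape.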