Primal sparse Max-margin Markov networks
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Sparsity is a desirable property in high-dimensional learning. The l1-norm regularization can lead to primal sparsity, while max-margin methods achieve dual sparsity. Combining these two approaches, an l1-norm max-margin Markov network (l1-M3N) can achieve both types of sparsity. This paper analyzes its connections to the Laplace max-margin Markov network (LapM3N), which inherits the dual sparsity of max-margin models but is only pseudo-primal sparse, and to a novel adaptive M3N (AdapM3N). We show that the l1-M3N is an extreme case of the LapM3N, and that the l1-M3N is equivalent to an AdapM3N. Based on this equivalence, we develop a robust EM-style algorithm for learning an l1-M3N. On both synthetic and real data sets, we demonstrate the advantages of models that are simultaneously (pseudo-) primal and dual sparse over models that enjoy only primal or only dual sparsity.
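The abstract's contrast between primal and dual sparsity hinges on a well-known property of l1 regularization: its proximal operator (soft-thresholding) sets small coefficients exactly to zero, which is what makes a primal-sparse model possible. The sketch below is purely illustrative of that mechanism, not the paper's EM-style algorithm; the weight values and threshold are hypothetical.

```python
def soft_threshold(w, lam):
    """Proximal operator of the l1-norm: shrinks w toward zero and
    sets any entry with |w| <= lam exactly to zero (primal sparsity)."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# Toy weight vector: large weights are shrunk, small ones zeroed out.
weights = [2.5, -0.3, 0.05, -1.7, 0.0]
lam = 0.5  # hypothetical regularization strength
sparse_weights = [soft_threshold(w, lam) for w in weights]
print(sparse_weights)  # -> [2.0, 0.0, 0.0, -1.2, 0.0]
```

An l2 (ridge) penalty, by contrast, only shrinks weights multiplicatively and never produces exact zeros, which is why the paper's pseudo-primal-sparse LapM3N is distinguished from the genuinely primal-sparse l1-M3N.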