Leading classification methods such as support vector machines (SVMs) and their variants achieve strong generalization performance by maximizing the margin of separation between data classes. While the maximum margin approach has achieved promising performance, this article identifies its sensitivity to affine transformations of the data and to directions with large data spread. Maximum margin solutions may be misled by the spread of the data and preferentially separate classes along directions of large spread. This article corrects these weaknesses by measuring margin not in an absolute sense but relative to the spread of the data in any projection direction. Maximum relative margin corresponds to a data-dependent regularization of the classification function, whereas maximum absolute margin corresponds to an ℓ2-norm constraint on the classification function. Interestingly, the proposed improvements require only simple extensions to existing maximum margin formulations and preserve the computational efficiency of SVMs. Through the maximization of relative margin, surprising performance gains are achieved on real-world problems such as digit and text classification, as well as on several other benchmark data sets. In addition, risk bounds based on Rademacher averages are derived for the new formulation.
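As a worked illustration of the relative margin idea, the sketch below contrasts a standard maximum (absolute) margin SVM with a relative-margin variant that additionally bounds the spread of the projected data. The notation (weights w, bias b, slacks xi_i, trade-off C, spread bound B) is assumed here for illustration only; the exact formulation in the article may differ.

% Standard maximum margin SVM (absolute margin): the margin is
% controlled through an l2-norm penalty on w alone.
\begin{align*}
\min_{w,\,b,\,\xi \ge 0} \quad & \tfrac{1}{2}\|w\|_2^2 + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \quad & y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i, \qquad i = 1, \dots, n.
\end{align*}

% Relative-margin sketch (assumed form): the same objective, plus a
% bound B > 1 on the magnitude of each projection w^T x_i + b, so the
% unit functional margin must be large relative to the spread of the
% data along the projection direction w.
\begin{align*}
\min_{w,\,b,\,\xi \ge 0} \quad & \tfrac{1}{2}\|w\|_2^2 + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \quad & y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i, \\
& \left| w^\top x_i + b \right| \le B, \qquad i = 1, \dots, n.
\end{align*}

Under these assumptions, both problems remain convex quadratic programs with linear constraints, which is consistent with the abstract's claim that the extension preserves the computational efficiency of SVMs; letting B grow without bound recovers the standard formulation.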