Learning to classify with missing and corrupted features

Authors:
Ofer Dekel;Ohad Shamir
Affiliations:
Microsoft Research, Redmond, WA;The Hebrew University, Jerusalem, Israel
Venue:
Proceedings of the 25th international conference on Machine learning
Year:
2008

Abstract

After a classifier is trained with a machine learning algorithm and deployed in a real-world system, it often faces noise that did not appear in the training data. In particular, a subset of the features may be missing or corrupted. We present two novel machine learning techniques that are robust to this type of classification-time noise. First, we solve an approximation to the learning problem using linear programming; we analyze the tightness of this approximation and prove statistical risk bounds for the approach. Second, we define an online-learning variant of the problem, address it with a modified Perceptron, and obtain a statistical learning algorithm via an online-to-batch technique. We conclude with a set of experiments that demonstrate the effectiveness of our algorithms.
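
The classification-time noise setting can be illustrated with a small sketch. This is not the paper's algorithm. It is a simplified illustration that assumes a linear classifier, an adversary that may zero out at most `k` features per example, and a plain Perceptron update applied to the worst-case corrupted example. The names `worst_case_deletion`, `margin`, and `robust_perceptron`, and the budget parameter `k`, are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple


def margin(w: Sequence[float], x: Sequence[float], y: int) -> float:
    """Signed margin y * <w, x> of a linear classifier on one example."""
    return y * sum(wj * xj for wj, xj in zip(w, x))


def worst_case_deletion(
    w: Sequence[float], x: Sequence[float], y: int, k: int
) -> List[float]:
    """Return a copy of x with up to k of the most helpful features zeroed.

    Assumption: a deleted feature is replaced by 0. Feature j "helps" when
    y * w[j] * x[j] > 0, so deleting it lowers the margin.
    """
    contrib = [y * wj * xj for wj, xj in zip(w, x)]
    # Indices with positive contribution, largest contribution first.
    helpful = sorted(
        (j for j, c in enumerate(contrib) if c > 0),
        key=lambda j: contrib[j],
        reverse=True,
    )
    corrupted = list(x)
    for j in helpful[:k]:
        corrupted[j] = 0.0
    return corrupted


def robust_perceptron(
    data: Sequence[Tuple[Sequence[float], int]], k: int, epochs: int = 10
) -> List[float]:
    """Perceptron that updates on the worst-case corrupted example."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            x_bad = worst_case_deletion(w, x, y, k)
            if margin(w, x_bad, y) <= 0:  # mistake under corruption
                w = [wj + y * xj for wj, xj in zip(w, x_bad)]
    return w


# Toy data with three redundant features; labels are +1 / -1.
toy = [([1.0, 1.0, 1.0], 1), ([-1.0, -1.0, -1.0], -1)]
w_robust = robust_perceptron(toy, k=1)
```

On this toy data the learned weights spread over all three redundant features, so removing any single feature still leaves a positive margin. A classifier that relied on one feature alone would fail as soon as that feature was deleted.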