Learning from incomplete data with infinite imputations

Authors:
Uwe Dick;Peter Haider;Tobias Scheffer
Affiliations:
Max Planck Institute for Computer Science, Saarbrücken, Germany;Max Planck Institute for Computer Science, Saarbrücken, Germany;Max Planck Institute for Computer Science, Saarbrücken, Germany
Venue:
Proceedings of the 25th international conference on Machine learning
Year:
2008

Citing 5

Incomplete-data classification using logistic regression
ICML '05 Proceedings of the 22nd international conference on Machine learning

Feature space perspectives for learning the kernel
Machine Learning

Second Order Cone Programming Approaches for Handling Missing and Uncertain Data
The Journal of Machine Learning Research

Quadratically gated mixture of experts for incomplete data classification
Proceedings of the 24th international conference on Machine learning

Learning convex combinations of continuously parameterized basic kernels
COLT'05 Proceedings of the 18th annual conference on Learning Theory

Cited 6

A Max-Margin Learning Algorithm with Additional Features
FAW '09 Proceedings of the 3rd International Workshop on Frontiers in Algorithmics

A Large Margin Classifier with Additional Features
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition

Concept Learning from (Very) Ambiguous Examples
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition

Classification with Incomplete Data Using Dirichlet Process Priors
The Journal of Machine Learning Research

Semiconducting bilinear deep learning for incomplete image recognition
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval

Information enhancement for data mining
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

Abstract

We address the problem of learning decision functions from training data in which some attribute values are unobserved. This problem can arise, for instance, when training data is aggregated from multiple sources, and some sources record only a subset of attributes. We derive a generic joint optimization problem in which the distribution governing the missing values is a free parameter. We show that the optimal solution concentrates the density mass on finitely many imputations, and provide a corresponding algorithm for learning from incomplete data. We report on empirical results on benchmark data, and on the email spam application that motivates our work.