Inductive learning models with missing values

Authors:
I. Fortes; L. Mora-López; R. Morales; F. Triguero
Affiliations:
Dept. Matemática Aplicada, E.T.S.I. Informática, Univ. Málaga, Campus Teatinos, Málaga 29071, Spain;Dept. Leng. y C. de la Computación, E.T.S.I. Informática, Univ. Málaga, Campus Teatinos, Málaga 29071, Spain;Dept. Leng. y C. de la Computación, E.T.S.I. Informática, Univ. Málaga, Campus Teatinos, Málaga 29071, Spain;Dept. Leng. y C. de la Computación, E.T.S.I. Informática, Univ. Málaga, Campus Teatinos, Málaga 29071, Spain
Venue:
Mathematical and Computer Modelling: An International Journal
Year:
2006



Abstract

This paper introduces a new approach to handling missing attribute values in inductive learning algorithms. Three fundamental issues are studied: the splitting criterion, the allocation of values to missing attribute values, and the prediction of new observations. A formal definition of the splitting criterion is given that takes missing attribute values into account and generalizes the classical definition. For the second issue, multiple values are assigned to each missing attribute value using a decision theory approach. Each of these values has an associated confidence and error parameter; the error parameter measures how close the value is to the original value of the attribute. Applying the splitting criterion to a training set, with or without missing attribute values, yields a decision tree. This tree can then be used to predict the class of an observation, again with or without missing attribute values. Crossing these two possibilities gives four perspectives. The three perspectives that involve missing attribute values are studied, and experimental results are presented.
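
The abstract does not state the paper's splitting criterion, so the sketch below does not reproduce it. As an illustration of what it means for a criterion to account for missing values while reducing to the classical definition, it uses a different, well-known rule: the C4.5-style information gain, computed on the observations whose attribute value is known and scaled by the fraction of such observations. The names `MISSING`, `entropy`, and `gain_with_missing` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

# Hypothetical marker for a missing attribute value (an assumption of this sketch).
MISSING = None


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_with_missing(values, labels):
    """C4.5-style information gain of one attribute with missing values.

    This is NOT the criterion defined in the paper; it is a standard rule
    used here only to illustrate the idea of a generalized criterion.
    """
    # Keep only the observations whose attribute value is known.
    known = [(v, y) for v, y in zip(values, labels) if v is not MISSING]
    if not known:
        return 0.0

    # Group the class labels of the known observations by attribute value.
    partitions = {}
    for v, y in known:
        partitions.setdefault(v, []).append(y)

    # Classical gain, computed on the known observations only.
    known_labels = [y for _, y in known]
    remainder = sum(
        len(part) / len(known) * entropy(part) for part in partitions.values()
    )
    classical_gain = entropy(known_labels) - remainder

    # Scale by the fraction of observations with a known value, so an
    # attribute that is mostly missing is penalized.
    fraction_known = len(known) / len(values)
    return fraction_known * classical_gain
```

When no value is missing, `fraction_known` equals 1 and the function returns the classical information gain, which is the sense in which such a rule generalizes the classical definition.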