Handling missing attribute values in preterm birth data sets

Authors:
Jerzy W. Grzymala-Busse;Linda K. Goodwin;Witold J. Grzymala-Busse;Xinqun Zheng
Affiliations:
Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS;Nursing Informatics Program, Duke University, Durham, NC;Filterlogix, Lawrence, KS;PC Sprint, Overland Park, KS
Venue:
RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
Year:
2005

Abstract

The objective of our research was to find the best approach to handling missing attribute values in preterm birth data sets provided by Duke University. Five strategies were used for filling in missing attribute values, based on most common values and closest fit for symbolic attributes, averages for numerical attributes, and a special approach that induces only certain rules from specified information using MLEM2. The final conclusion is that the best strategy was to use the global most common method for symbolic attributes and the global average method for numerical attributes.
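
The strategy the abstract reports as best can be illustrated with a minimal Python sketch. It is not the authors' implementation: the "?" missing-value marker, the function name fill_global, and the attribute names age and smoker are assumptions made for illustration only. Each missing symbolic value is replaced by the most common known value of that attribute over the whole data set, and each missing numerical value by the average of the known values of that attribute.

```python
from collections import Counter
from statistics import mean

MISSING = "?"  # assumed marker for a missing attribute value


def fill_global(rows, numeric_columns):
    """Return a copy of rows with missing values filled column by column.

    Numerical attributes get the global average of their known values;
    symbolic attributes get the global most common known value.
    """
    columns = list(rows[0].keys())
    fill = {}
    for col in columns:
        known = [r[col] for r in rows if r[col] != MISSING]
        if not known:
            # Nothing to estimate from: leave the marker in place.
            fill[col] = MISSING
        elif col in numeric_columns:
            fill[col] = mean(known)
        else:
            # Ties go to the value seen first (Counter ordering),
            # which is a choice of this sketch, not of the paper.
            fill[col] = Counter(known).most_common(1)[0][0]
    return [
        {c: (fill[c] if r[c] == MISSING else r[c]) for c in columns}
        for r in rows
    ]


# Hypothetical toy records, not taken from the preterm birth data.
rows = [
    {"age": 30, "smoker": "no"},
    {"age": MISSING, "smoker": "no"},
    {"age": 20, "smoker": MISSING},
    {"age": 40, "smoker": "yes"},
]
filled = fill_global(rows, numeric_columns={"age"})
```

In this toy input, the missing age is filled with the average of 30, 20, and 40, and the missing smoker value with "no", the most frequent known value. The input rows are left unmodified.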