Existing kNN (k-nearest-neighbor) imputation methods for missing data are built on the Minkowski distance or its variants, and have been shown to be generally efficient for numerical variables (features, or attributes). To handle heterogeneous (i.e., mixed-attribute) data, we propose a novel kNN imputation method that imputes missing data iteratively, named GkNN (gray kNN) imputation. GkNN selects the k nearest neighbors of each missing datum by computing the gray distance between the missing datum and all the training data, rather than a traditional distance metric such as the Euclidean distance. The gray distance can handle both numerical and categorical attributes. To improve effectiveness, GkNN treats all imputed instances (i.e., instances whose missing values have been imputed) as observed data and uses them, together with the complete instances (instances without missing values), to iteratively impute the remaining missing data. We evaluate the proposed approach experimentally and demonstrate that the gray distance is much better than the Minkowski distance at both capturing the proximity relationship (or nearness) of two instances and dealing with mixed attributes. Moreover, the experimental results show that the GkNN algorithm is much more efficient than existing kNN imputation methods.
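
The abstract does not give the formula for the gray distance, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard grey relational coefficient with distinguishing coefficient 0.5, numeric attributes already scaled to [0, 1] (so the global minimum and maximum differences are taken as 0 and 1), and a simple 0/1 mismatch for categorical attributes. The function names (`attribute_difference`, `grey_relational_grade`, `gknn_impute`), the use of `None` for missing values, and the mean/mode fill rules are all hypothetical choices made for illustration.

```python
from collections import Counter

RHO = 0.5  # distinguishing coefficient; 0.5 is a conventional choice (assumption)


def attribute_difference(a, b):
    """Absolute difference for numbers (assumed scaled to [0, 1]); 0/1 mismatch otherwise."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b)
    return 0.0 if a == b else 1.0


def grey_relational_grade(query, candidate, rho=RHO):
    """Mean grey relational coefficient over the attributes observed in `query`.

    Assumes global minimum and maximum differences of 0 and 1, so each
    coefficient reduces to rho / (difference + rho). Higher means nearer.
    """
    observed = [i for i, value in enumerate(query) if value is not None]
    if not observed:
        return 0.0
    coefficients = [
        rho / (attribute_difference(query[i], candidate[i]) + rho) for i in observed
    ]
    return sum(coefficients) / len(coefficients)


def gknn_impute(rows, k=3):
    """Fill None entries row by row; each completed row joins the donor pool."""
    pool = [list(row) for row in rows if None not in row]
    if not pool:
        raise ValueError("at least one complete instance is required")
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        # Rank donors by grey relational grade, nearest first, and keep k of them.
        neighbours = sorted(
            pool, key=lambda cand: grey_relational_grade(row, cand), reverse=True
        )[:k]
        filled = list(row)
        for i, value in enumerate(row):
            if value is None:
                donors = [n[i] for n in neighbours]
                if all(isinstance(d, (int, float)) for d in donors):
                    filled[i] = sum(donors) / len(donors)  # numeric: mean
                else:
                    filled[i] = Counter(donors).most_common(1)[0][0]  # categorical: mode
        pool.append(filled)  # imputed instance is treated as observed from now on
        result.append(filled)
    return result
```

Because a single grade covers numeric and categorical attributes alike, the same neighbor ranking serves mixed-attribute rows. Appending each filled row to the pool mirrors the abstract's statement that imputed instances are regarded as observed data for later imputations.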