Data preparation is an important step in mining incomplete data. To handle missing values, this paper introduces a new imputation approach called Shell Neighbors imputation (SNI). SNI fills in an incomplete instance (one with missing values) in a given dataset using only its left and right nearest neighbors with respect to each factor (attribute); these are referred to as its Shell Neighbors. The left and right nearest neighbors are selected from a set of nearest neighbors of the incomplete instance, and the size of this set is determined by cross-validation. SNI is then generalized to handle missing data in datasets with mixed attributes, for example, continuous and categorical attributes. Experiments evaluating the proposed approach demonstrate that the generalized SNI method outperforms the kNN imputation method in both imputation accuracy and classification accuracy.
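The selection step described above can be illustrated with a minimal Python sketch for continuous attributes. It is not the paper's implementation: the function names `shell_neighbor_impute` and `choose_k` are hypothetical, and the Euclidean distance over observed attributes, the averaging of the shell neighbors' values, and the leave-one-out form of cross-validation are all assumptions, since the abstract does not specify them.

```python
import math


def shell_neighbor_impute(complete_rows, query, target, k):
    """Estimate query[target] (given as None) from rows without missing values.

    Assumptions not stated in the abstract: Euclidean distance on the observed
    attributes, and the mean of the shell neighbors' target values as estimate.
    """
    if not complete_rows:
        raise ValueError("need at least one complete row")
    observed = [j for j, v in enumerate(query) if v is not None and j != target]

    def dist(row):
        return math.sqrt(sum((row[j] - query[j]) ** 2 for j in observed))

    # Step 1: the k nearest neighbors of the incomplete instance.
    neighbors = sorted(complete_rows, key=dist)[:k]

    # Step 2: for each observed attribute, keep only the closest neighbor on
    # the left (value <= query) and on the right (value >= query).
    shell = set()
    for j in observed:
        left = [i for i, r in enumerate(neighbors) if r[j] <= query[j]]
        right = [i for i, r in enumerate(neighbors) if r[j] >= query[j]]
        if left:
            shell.add(max(left, key=lambda i: neighbors[i][j]))
        if right:
            shell.add(min(right, key=lambda i: neighbors[i][j]))
    if not shell:  # no observed attributes: fall back to all k neighbors
        shell = set(range(len(neighbors)))

    # Step 3: aggregate the target values of the shell neighbors (assumed mean).
    values = [neighbors[i][target] for i in shell]
    return sum(values) / len(values)


def choose_k(rows, target, candidates):
    """Pick k by leave-one-out squared error on the complete rows (assumed CV scheme)."""
    def loo_error(k):
        total = 0.0
        for i, row in enumerate(rows):
            hidden = list(row)
            hidden[target] = None  # pretend the known value is missing
            rest = rows[:i] + rows[i + 1:]
            total += (shell_neighbor_impute(rest, hidden, target, k) - row[target]) ** 2
        return total
    return min(candidates, key=loo_error)


rows = [
    (1.0, 1.0, 12.0),
    (2.0, 5.0, 20.0),
    (4.0, 2.0, 40.0),
    (5.0, 6.0, 50.0),
    (9.0, 9.0, 90.0),
]
estimate = shell_neighbor_impute(rows, (3.0, 3.0, None), target=2, k=4)
best_k = choose_k(rows, target=2, candidates=[2, 3, 4])
```

In this toy data, the four nearest neighbors of the query include (1.0, 1.0, 12.0) and (5.0, 6.0, 50.0), but neither is the closest neighbor on either side of the query in any attribute. Only (2.0, 5.0, 20.0) and (4.0, 2.0, 40.0) form the shell, so the estimate is their mean, 30.0, whereas a plain kNN mean over all four neighbors would give 30.5.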