Methods for imputing missing data within the so-called least-squares approximation approach, a non-parametric, computationally efficient multidimensional technique, are compared experimentally. Contributions are made to each of the three components of the experimental setting: (a) the algorithms to be compared, (b) data generation, and (c) patterns of missing data. Specifically, "global" methods for least-squares data imputation are reviewed, and extensions to them based on the nearest neighbours (NN) approach are proposed. A conventional generator of mixtures of Gaussian distributions is analysed theoretically and then modified to scale clusters differently. Patterns of missing data are defined in terms of rows and columns according to three different mechanisms, referred to as Random missings, Restricted random missings, and Merged database. In the experiments, the NN-based versions almost always outperform their global counterparts. With the Random missings pattern, the winner is always the authors' two-stage method INI, which combines global and local imputation algorithms.
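To make the contrast between global and NN-based least-squares imputation concrete, here is a minimal Python sketch. It rests on assumptions the abstract does not confirm: the global step is a generic iterative low-rank SVD fit, the local step refits that model on the k rows closest to each incomplete row, and the mask generator simply drops entries uniformly at random. The function names (`random_missings`, `global_ls_impute`, `nn_ls_impute`) and all parameters are hypothetical; this is not the authors' INI method or any of their algorithms.

```python
import numpy as np

def random_missings(shape, proportion, rng):
    """Boolean mask marking a given proportion of entries as missing,
    chosen uniformly at random (an assumed reading of 'Random missings')."""
    n_missing = int(round(proportion * shape[0] * shape[1]))
    mask = np.zeros(shape[0] * shape[1], dtype=bool)
    mask[rng.choice(mask.size, size=n_missing, replace=False)] = True
    return mask.reshape(shape)

def global_ls_impute(X, mask, rank=1, n_iter=200, tol=1e-8):
    """Global sketch: alternate between a rank-`rank` SVD approximation of
    the completed matrix and refilling the missing cells from it."""
    X = np.array(X, dtype=float)
    observed = ~mask
    # Initial fill: column means of the observed entries (0 if none observed).
    counts = observed.sum(axis=0)
    col_means = np.where(observed, X, 0.0).sum(axis=0) / np.maximum(counts, 1)
    filled = np.where(mask, col_means, X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        new_filled = np.where(mask, approx, X)  # observed cells stay fixed
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    return filled

def nn_ls_impute(X, mask, k=5, rank=1):
    """Local sketch: for each incomplete row, fit the global model only on
    that row plus its k nearest neighbours."""
    X = np.array(X, dtype=float)
    observed = ~mask
    result = np.where(mask, np.nan, X)
    for i in np.flatnonzero(mask.any(axis=1)):
        # Mean squared difference over columns observed in both rows.
        shared = observed & observed[i]
        n_shared = shared.sum(axis=1)
        sq = (np.where(shared, X - X[i], 0.0) ** 2).sum(axis=1)
        dist = np.where(n_shared > 0, sq / np.maximum(n_shared, 1), np.inf)
        dist[i] = np.inf  # a row is not its own neighbour
        rows = np.concatenate(([i], np.argsort(dist)[:k]))
        local = global_ls_impute(X[rows], mask[rows], rank=rank)
        result[i, mask[i]] = local[0, mask[i]]
    return result
```

Both functions take the data matrix together with a boolean mask of the same shape, so the values stored at missing positions (for example NaN) are never read as data. The neighbour distance used here, computed over jointly observed columns, is only one possible choice.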