Expert Systems with Applications: An International Journal
One relevant problem in data quality is the presence of missing data. When missing data are abundant, effective ways of handling them could improve the performance of machine learning algorithms. One such treatment is imputation: imputation methods replace missing values with values estimated from the available data. This paper presents Corai, an imputation algorithm adapted from Co-training, a multi-view semi-supervised learning algorithm. Corai is compared with other imputation methods from the literature on three UCI data sets, with different levels of missingness inserted into up to three attributes. The comparison shows that Corai tends to perform well at higher percentages of missingness and with larger numbers of attributes containing missing values.
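The abstract does not describe how Corai works internally, so the following is only a minimal sketch of what a generic co-training-style imputation could look like, not the authors' algorithm. It assumes a single categorical attribute with missing values, two disjoint numeric feature views, a 1-nearest-neighbour learner per view, and nearest-neighbour distance as a stand-in for prediction confidence. The names `cotrain_impute`, `nearest`, `view_a`, `view_b`, and `per_round` are hypothetical.

```python
import math


def nearest(labeled, row, view):
    """Return (distance, label) of the labeled row closest to `row`,
    measuring Euclidean distance only on the feature indices in `view`."""
    best = None
    for feats, label in labeled:
        d = math.sqrt(sum((feats[i] - row[i]) ** 2 for i in view))
        if best is None or d < best[0]:
            best = (d, label)
    return best


def cotrain_impute(rows, target, view_a, view_b, per_round=1):
    """Fill in missing values (None) of column `target`.

    Assumption: rows with a known target act as the labeled set and rows
    with a missing target act as the unlabeled set. The two views take
    turns labeling the pending rows they are most confident about, and
    every newly labeled row joins the shared labeled pool.
    """
    rows = [list(r) for r in rows]  # work on a copy, leave the input intact
    labeled = [(r, r[target]) for r in rows if r[target] is not None]
    pending = [r for r in rows if r[target] is None]
    if not labeled:
        raise ValueError("need at least one row with a known target value")

    while pending:
        for view in (view_a, view_b):
            if not pending:
                break
            # Score every pending row with this view's 1-NN prediction.
            scored = [(nearest(labeled, r, view), i) for i, r in enumerate(pending)]
            scored.sort(key=lambda item: item[0][0])  # smallest distance first
            chosen = scored[:per_round]
            for (_, label), i in chosen:
                pending[i][target] = label           # impute the value
                labeled.append((pending[i], label))  # grow the labeled pool
            done = {i for _, i in chosen}
            pending = [r for i, r in enumerate(pending) if i not in done]
    return rows


# Toy data: columns 0-1 form one view, columns 2-3 the other,
# and column 4 is the attribute with missing values.
data = [
    [0.0, 0.1, 5.0, 5.1, "a"],
    [0.2, 0.0, 5.2, 4.9, "a"],
    [9.0, 9.1, 1.0, 1.2, "b"],
    [9.2, 8.9, 0.9, 1.0, "b"],
    [0.1, 0.2, 5.1, 5.0, None],
    [9.1, 9.0, 1.1, 1.1, None],
]
filled = cotrain_impute(data, target=4, view_a=(0, 1), view_b=(2, 3))
```

Using distance as confidence keeps the sketch dependency-free. A practical version would more likely use probabilistic classifiers per view and a confidence threshold.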