Missing Value Imputation Using a Semi-supervised Rank Aggregation Approach

  • Authors:
  • Edson T. Matsubara;Ronaldo C. Prati;Gustavo E. Batista;Maria C. Monard

  • Affiliations:
  • Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, Brazil, ZIP Code 13560-970 (all four authors)

  • Venue:
  • SBIA '08 Proceedings of the 19th Brazilian Symposium on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

One relevant problem in data quality is the presence of missing data. When missing data are abundant, effective ways of dealing with them could improve the performance of machine learning algorithms. Missing data can be treated by imputation: imputation methods replace the missing values with values estimated from the available data. This paper presents Corai, an imputation algorithm adapted from Co-training, a multi-view semi-supervised learning algorithm. Corai is compared with other imputation methods from the literature on three UCI data sets, with different levels of missingness inserted into up to three attributes. The comparison shows that Corai tends to perform well on data sets with higher percentages of missing values and more attributes with missing values.
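
To make the general idea of imputation concrete, the following is a minimal sketch of one simple strategy, mean imputation, applied to a single numeric attribute after a chosen fraction of its values has been artificially removed. It is not the Corai algorithm, and it is not necessarily one of the methods the paper compares against. The function names `insert_missing` and `mean_impute`, the sample values, and the 30% missingness level are all assumptions made for illustration.

```python
import random
from statistics import mean


def insert_missing(values, fraction, seed=0):
    """Return a copy of `values` with a fraction of entries replaced by None.

    Hypothetical helper: mimics inserting artificial missingness
    into one attribute of a data set.
    """
    rng = random.Random(seed)
    n_missing = int(round(fraction * len(values)))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def mean_impute(values):
    """Replace each None with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Illustrative attribute values (made up for this sketch).
attribute = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]

with_gaps = insert_missing(attribute, fraction=0.3)
imputed = mean_impute(with_gaps)
```

Observed values pass through unchanged, and only the removed entries are filled in. A method such as Corai would replace the single-statistic estimate used here with predictions learned from the available data.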