A SVM regression based approach to filling in missing values

  • Authors:
  • Feng Honghai;Chen Guoshun;Yin Cheng;Yang Bingru;Chen Yumei

  • Affiliations:
  • Urban & Rural Construction School, Hebei Agricultural University, Baoding, China;Ordnance Technology Institute, Shijiazhuang, Shijiazhuang, China;Modern Educational Center, Hebei Agricultural University, Baoding, China;Information Engineering School, University of Science and Technology Beijing, Beijing, China;Tian'e Chemical Fiber Company of Hebei Baoding, Baoding, China

  • Venue:
  • KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part III
  • Year:
  • 2005

Abstract

In the KDD process, filling in missing data typically requires a very large investment of time and energy: often 80% to 90% of a data analysis project is spent making the data reliable enough that the results can be trusted. In this paper, we propose an SVM regression based algorithm for filling in missing data. The roles of the attributes are swapped: the decision attribute (output attribute) is treated as a condition attribute (input attribute), the condition attribute with missing values is treated as the decision attribute, and SVM regression is then used to predict the missing condition attribute values. Experimental results on a SARS data set show that the SVM regression method has the highest precision. The method that fills in a missing value with the value from the example at minimum distance from the example with the missing value takes second place, and the mean and median methods have lower precision.
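The attribute swap described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes NumPy and scikit-learn's `SVR`, assumes that only one column has missing values (marked as NaN), and uses hypothetical function names (`svr_impute`, `nearest_impute`, `mean_impute`) and arbitrary hyperparameters (`C`, `epsilon`, RBF kernel) that do not come from the paper. The two baselines are included only to mirror the comparison the abstract mentions.

```python
import numpy as np
from sklearn.svm import SVR


def _split(data, target_col):
    """Return (float copy, missing-row mask, all columns except target_col)."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data[:, target_col])
    inputs = np.delete(data, target_col, axis=1)
    if np.isnan(inputs).any():
        # Assumption of this sketch: only the target column is incomplete.
        raise ValueError("other columns must not contain missing values")
    return data.copy(), missing, inputs


def svr_impute(data, target_col, C=10.0, epsilon=0.01):
    """Fill NaNs in `target_col` with SVR predictions.

    The incomplete condition attribute becomes the regression output;
    every other column, including the original decision attribute,
    becomes an input.
    """
    filled, missing, inputs = _split(data, target_col)
    if not missing.any():
        return filled
    # Standardize inputs using statistics of the complete rows only.
    mean = inputs[~missing].mean(axis=0)
    std = inputs[~missing].std(axis=0)
    std[std == 0] = 1.0
    scaled = (inputs - mean) / std
    model = SVR(kernel="rbf", C=C, epsilon=epsilon)
    model.fit(scaled[~missing], filled[~missing, target_col])
    filled[missing, target_col] = model.predict(scaled[missing])
    return filled


def nearest_impute(data, target_col):
    """Baseline: copy the value from the closest complete example."""
    filled, missing, inputs = _split(data, target_col)
    known_inputs = inputs[~missing]
    known_values = filled[~missing, target_col]
    for row in np.flatnonzero(missing):
        distances = np.linalg.norm(known_inputs - inputs[row], axis=1)
        filled[row, target_col] = known_values[np.argmin(distances)]
    return filled


def mean_impute(data, target_col):
    """Baseline: fill every missing entry with the column mean."""
    filled, missing, _ = _split(data, target_col)
    filled[missing, target_col] = np.nanmean(filled[:, target_col])
    return filled
```

Calling `svr_impute(table, target_col=0)` on a numeric array whose first column contains NaNs returns a copy with only those entries replaced. How the three methods rank on any particular data set depends on the data and the chosen hyperparameters.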