Information Sciences: an International Journal
In the KDD process, filling in missing data typically requires a very large investment of time and energy: often 80% to 90% of a data analysis project is spent making the data reliable enough for the results to be trusted. In this paper, we propose an SVM regression based algorithm for filling in missing data. The roles of the attributes are swapped: the decision attribute (output attribute) is treated as a condition attribute (input attribute), the condition attribute with missing values is treated as the decision attribute, and SVM regression is then used to predict the missing condition attribute values. Experimental results on a SARS data set show that the SVM regression method has the highest precision. The nearest-example method, which fills in a missing value with the value taken from the example at minimum distance to the example with the missing value, takes second place, and the mean and median methods have lower precision.
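The role swap described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn's `SVR` as the regressor, assumes that only one condition attribute has missing values (marked as NaN) while all other attributes are complete, and uses synthetic data in place of the SARS data set. The function name `svr_impute` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.svm import SVR


def svr_impute(X, y, col, **svr_params):
    """Fill NaNs in column `col` of X with SVR predictions.

    Roles are swapped as in the abstract: the decision attribute y joins
    the inputs, and the incomplete condition attribute becomes the target.
    Assumption: every other column of X, and y, contain no missing values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(X[:, col])

    # Inputs: the remaining condition attributes plus the decision attribute.
    inputs = np.column_stack([np.delete(X, col, axis=1), y])

    filled = X.copy()
    if missing.any() and (~missing).any():
        model = SVR(**svr_params)
        # Train only on examples where the target attribute is observed.
        model.fit(inputs[~missing], X[~missing, col])
        # Predict the attribute for the examples where it is missing.
        filled[missing, col] = model.predict(inputs[missing])
    return filled


# Synthetic illustration (not the SARS data): three condition attributes
# and a decision attribute that depends on them linearly.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(60, 3))
y = X_true @ np.array([1.0, -2.0, 0.5])

X_missing = X_true.copy()
X_missing[::10, 1] = np.nan  # hide every tenth value of attribute 1

X_filled = svr_impute(X_missing, y, col=1, kernel="linear", C=10.0)
```

The kernel and `C` value here are arbitrary choices for the synthetic data. On real data they would need tuning, and the input attributes would usually be scaled before fitting.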