Real-world data mining deals with noisy information sources, where data collection inaccuracy, device limitations, data transmission and discretization errors, or man-made perturbations frequently result in imprecise or vague data. Two common practices are either to apply data cleansing to improve data consistency or simply to treat the noisy data as a quality source and feed it directly into the data mining algorithms. Either approach may substantially degrade mining performance. In this paper, we consider an error-aware (EA) data mining design, which takes advantage of statistical error information (such as noise level and noise distribution) to improve data mining results. We assume that such noise knowledge is available in advance, and we propose a solution that incorporates it into the mining process. More specifically, we use noise knowledge to restore the original data distributions, which are then used to rectify the model built from noise-corrupted data. We materialize this concept in the proposed EA naive Bayes classification algorithm. Experimental comparisons on real-world datasets demonstrate the effectiveness of this design.
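The idea of restoring an original distribution from known noise statistics can be illustrated with a minimal sketch. It is not the paper's EA naive Bayes algorithm. It assumes one particular noise model that the abstract does not specify: a categorical attribute with k values, where each value is replaced by one of the other k - 1 values, chosen uniformly, with a known probability `noise_level`. The function names `corrupt_distribution` and `restore_distribution` are hypothetical.

```python
def corrupt_distribution(true_dist, noise_level):
    """Distribution observed after uniform symmetric noise (assumed model).

    A value is kept with probability 1 - noise_level and otherwise replaced
    by one of the other k - 1 values, chosen uniformly at random.
    """
    k = len(true_dist)
    flip = noise_level / (k - 1)   # chance of landing on one specific other value
    keep = 1.0 - noise_level       # chance of staying unchanged
    # observed_j = keep * t_j + flip * (1 - t_j)
    return [keep * t + flip * (1.0 - t) for t in true_dist]


def restore_distribution(observed_dist, noise_level):
    """Invert the assumed noise model to estimate the original distribution."""
    k = len(observed_dist)
    flip = noise_level / (k - 1)
    keep = 1.0 - noise_level
    denom = keep - flip
    if abs(denom) < 1e-12:
        # At this noise level every observed value is equally likely,
        # so the original distribution cannot be recovered.
        raise ValueError("noise level makes the distribution unrecoverable")
    # Solve observed_j = flip + (keep - flip) * t_j for t_j, then clip
    # small negative estimates (possible with sampled data) and renormalize.
    restored = [max((o - flip) / denom, 0.0) for o in observed_dist]
    total = sum(restored)
    return [r / total for r in restored]


# Illustrative values only, not data from the paper.
true_dist = [0.7, 0.2, 0.1]
observed_dist = corrupt_distribution(true_dist, noise_level=0.2)
restored_dist = restore_distribution(observed_dist, noise_level=0.2)
```

In a naive Bayes setting, a restored distribution of this kind could stand in for the class-conditional probability estimated from noisy counts, which is one way to read "rectify the model built from noise-corrupted data". With finite samples the inversion is only approximate, which is why the sketch clips and renormalizes.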