Real-world data is never perfect and often suffers from corruptions (noise) that may affect interpretations of the data, models built from the data, and decisions made based on the data. Noise can degrade system performance in terms of classification accuracy, the time needed to build a classifier, and the size of the classifier. Accordingly, most existing learning algorithms integrate various approaches to improve their ability to learn from noisy environments, but noise can still have serious negative effects. A more reasonable solution might be to employ preprocessing mechanisms that handle noisy instances before a learner is formed. Unfortunately, little research has systematically explored the impact of noise, especially from the noise-handling point of view. This has made various noise-processing techniques less significant, particularly when dealing with noise introduced in attributes. In this paper, we present a systematic evaluation of the effect of noise in machine learning. Rather than adopting a unified theory of noise to evaluate its impact, we divide noise into two categories, class noise and attribute noise, and analyze their effects on system performance separately. Because class noise has been widely addressed in existing research, we concentrate on attribute noise. We investigate the relationship between attribute noise and classification accuracy, the impact of noise in different attributes, and possible solutions for handling attribute noise. Our conclusions can guide interested readers in enhancing data quality by designing noise-handling mechanisms.
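To make the distinction between class noise and attribute noise concrete, the following minimal Python sketch corrupts a toy dataset in each of the two ways. The abstract does not specify how noise is injected, so the random-replacement scheme, the function names `add_class_noise` and `add_attribute_noise`, and the `rate` parameter are all assumptions made for illustration and should not be read as the paper's procedure.

```python
import random


def add_class_noise(labels, classes, rate, rng):
    """Hypothetical helper: with probability `rate`, replace a label
    with a different class chosen uniformly at random."""
    noisy = []
    for y in labels:
        if rng.random() < rate:
            others = [c for c in classes if c != y]
            noisy.append(rng.choice(others))
        else:
            noisy.append(y)
    return noisy


def add_attribute_noise(rows, attr_index, values, rate, rng):
    """Hypothetical helper: with probability `rate`, replace the value of
    one attribute (column `attr_index`) with a different value from `values`.
    The input rows are copied, not modified in place."""
    noisy = []
    for row in rows:
        new_row = list(row)
        if rng.random() < rate:
            others = [v for v in values if v != new_row[attr_index]]
            new_row[attr_index] = rng.choice(others)
        noisy.append(new_row)
    return noisy


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    rows = [["sunny", "hot"], ["rainy", "mild"], ["sunny", "mild"]]
    labels = ["no", "yes", "yes"]

    # Class noise corrupts the labels and leaves the attributes intact.
    print(add_class_noise(labels, ["yes", "no"], 0.3, rng))

    # Attribute noise corrupts one attribute and leaves the labels intact.
    print(add_attribute_noise(rows, 0, ["sunny", "rainy", "overcast"], 0.3, rng))
```

Applying the attribute-noise helper to one column at a time, at several noise rates, is one simple way to compare how sensitive a classifier is to noise in different attributes.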