We present a comprehensive suite of experiments on learning from imbalanced data. When classes are imbalanced, many learning algorithms can suffer reduced performance. Can data sampling be used to improve the performance of learners built from imbalanced data? Is the effectiveness of sampling related to the type of learner? Do the results change if the objective is to optimize different performance metrics? We address these and other issues in this work, showing that sampling will in many cases improve classifier performance.
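To make "data sampling" concrete, here is a minimal sketch of the two simplest forms, random undersampling and random oversampling. The abstract does not say which sampling techniques were evaluated, so this is an illustration under that assumption and not the authors' procedure. The function names `random_undersample` and `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect the feature rows belonging to each class label."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_undersample(X, y, seed=0):
    """Randomly discard rows so every class matches the smallest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement: larger classes lose rows.
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def random_oversample(X, y, seed=0):
    """Randomly duplicate rows so every class matches the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Keep all original rows, then add duplicates drawn with replacement.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Toy data (made up for illustration): 2 minority rows, 8 majority rows.
X = [[i] for i in range(10)]
y = [1] * 2 + [0] * 8

X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)
print(Counter(y_under))  # both classes reduced to the minority count
print(Counter(y_over))   # both classes raised to the majority count
```

Either function would be applied to the training split only, leaving the evaluation data at its original class distribution, so that any measured change in performance reflects the learner and not a resampled test set.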