In recent years, mining imbalanced data sets has received growing attention, both theoretical and practical. This paper introduces the importance of imbalanced data sets and their broad range of application domains in data mining, then summarizes the evaluation metrics and the existing methods used to assess and address the imbalance problem. The synthetic minority over-sampling technique (SMOTE) is one over-sampling method that addresses this problem. Building on SMOTE, this paper presents two new minority over-sampling methods, borderline-SMOTE1 and borderline-SMOTE2, in which only the minority examples near the borderline are over-sampled. Experiments show that, for the minority class, our approaches achieve a better TP rate and F-value than SMOTE and random over-sampling.
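
The abstract does not state how "near the borderline" is decided, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes a minority example counts as borderline when at least half, but not all, of its m nearest neighbours belong to the majority class, and that each synthetic example is a random interpolation between a borderline example and one of its k nearest minority neighbours, as in SMOTE. The function names and the parameters m, k, per_seed and seed are hypothetical.

```python
import math
import random


def nearest(point, pool, k):
    """Return indices into pool of the k points closest to point (Euclidean)."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def borderline_indices(X, y, minority=1, m=5):
    """Indices of minority examples assumed to lie near the class border.

    Assumed rule: among the m nearest neighbours (any class), the number of
    majority examples n satisfies m/2 <= n < m. If n == m the example is
    treated as noise and skipped; if n < m/2 it is treated as safe.
    """
    danger = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority:
            continue
        others = [j for j in range(len(X)) if j != i]
        nn = nearest(xi, [X[j] for j in others], m)
        n_major = sum(1 for t in nn if y[others[t]] != minority)
        if m / 2 <= n_major < m:
            danger.append(i)
    return danger


def oversample_borderline(X, y, minority=1, m=5, k=3, per_seed=1, seed=0):
    """Create synthetic minority points only around borderline examples."""
    rng = random.Random(seed)
    minority_idx = [i for i, yi in enumerate(y) if yi == minority]
    synthetic = []
    for i in borderline_indices(X, y, minority, m):
        mates = [j for j in minority_idx if j != i]
        if not mates:
            continue
        nn = nearest(X[i], [X[j] for j in mates], min(k, len(mates)))
        for _ in range(per_seed):
            j = mates[rng.choice(nn)]   # one of the nearest minority neighbours
            gap = rng.random()          # interpolation factor in [0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data: four minority points far from the border, two close to it.
X = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [0.3, 0.0], [5.0, 0.0], [5.05, 0.0],
     [5.1, 0.0], [5.2, 0.0], [6.0, 0.0], [6.1, 0.0], [6.2, 0.0]]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

danger = borderline_indices(X, y, minority=1, m=3)
new_points = oversample_borderline(X, y, minority=1, m=3, k=1, per_seed=2)
```

On the toy data, only the two minority points next to the majority cluster are selected, so every synthetic point falls between them, while the four minority points deep inside their own cluster generate nothing. Plain SMOTE would instead interpolate around all six minority points.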