Rare objects are often of great interest and great value. Until recently, however, rarity has not received much attention in the context of data mining. Now, as increasingly complex real-world problems are addressed, rarity and the related problem of imbalanced data are taking center stage. This article discusses the role that rare classes and rare cases play in data mining. The problems that can result from these two forms of rarity are described in detail, as are methods for addressing these problems. These descriptions draw on examples from existing research, so the article also serves as a survey of the literature on rarity in data mining. Finally, the article demonstrates that rare classes and rare cases are very similar phenomena: both forms of rarity are shown to cause similar problems during data mining and to benefit from the same remediation methods.