This paper is concerned with the class imbalance problem, which is known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly fewer observations of the target concept than of the other classes. Various real-world classification tasks, such as medical diagnosis, text categorization, and fraud detection, suffer from this phenomenon. Standard machine learning algorithms yield better prediction performance with balanced datasets. In this paper, we demonstrate that active learning is capable of solving the class imbalance problem by providing the learner with more balanced classes. We also propose an efficient way of selecting informative instances from a smaller pool of samples for active learning, which does not require a search through the entire dataset. The proposed method yields an efficient querying system and allows active learning to be applied to very large datasets. Our experimental results show that, with an early stopping criterion, active learning achieves a fast solution with competitive prediction performance in imbalanced data classification.
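
The idea of querying from a smaller pool rather than the entire dataset can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear decision function with hypothetical `weights` and `bias`, uses distance to the decision boundary as the informativeness measure, and the names `margin_distance`, `select_query`, and `pool_size` are chosen here only for illustration.

```python
import random


def margin_distance(weights, bias, x):
    """Absolute value of the linear decision function w.x + b (assumed model)."""
    return abs(sum(w * xi for w, xi in zip(weights, x)) + bias)


def select_query(weights, bias, data, unlabeled, pool_size, rng):
    """Return the index of the unlabeled instance closest to the boundary,
    searching only a random pool of at most `pool_size` candidates."""
    candidates = sorted(unlabeled)  # sorted for reproducible sampling
    pool = rng.sample(candidates, min(pool_size, len(candidates)))
    return min(pool, key=lambda i: margin_distance(weights, bias, data[i]))


# Toy data: 1000 two-dimensional points; the model parameters are made up.
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
weights, bias = (1.0, -0.5), 0.2
unlabeled = set(range(len(data)))

# Each query inspects 20 candidates instead of all 1000.
chosen = select_query(weights, bias, data, unlabeled, pool_size=20, rng=rng)
print(chosen, margin_distance(weights, bias, data[chosen]))
```

In an active learning loop, the chosen instance would be labeled, removed from `unlabeled`, and the model retrained before the next query. The cost of each query then depends on `pool_size` rather than on the size of the dataset, at the price of possibly missing the globally closest instance.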