There are many practical applications where learning from single-class examples is either the only possible solution or has a distinct performance advantage. The first case occurs when obtaining examples of a second class is difficult, e.g., classifying sites of "interest" based on web accesses. The second situation is exemplified by the gene knock-out experiments for understanding the Aryl Hydrocarbon Receptor signalling pathway that provided the data for the second task of the KDD Cup 2002, where one-class SVMs trained on the minority class significantly outperform models learnt using examples from both classes.

This paper explores the limits of supervised learning of a two-class discrimination from data with heavily unbalanced class proportions. We focus on the case of supervised learning with support vector machines. We consider the impact of both sampling and weighting as imbalance compensation techniques, and then extend the balancing to the extreme situation in which one of the classes is ignored completely and learning is accomplished using examples from a single class.

Our investigation with the KDD Cup 2002 data, as well as text benchmarks such as Reuters Newswire, shows that there is a consistent pattern of performance differences between one-class and two-class learning for all SVMs investigated, and that these patterns persist even with aggressive dimensionality reduction through automated feature selection. Using insight gained from this analysis, we generate synthetic data showing a similar pattern of performance.
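As a rough illustration of the three regimes mentioned above (weighting, sampling, and the extreme case of keeping a single class), the following sketch prepares a toy label set in each way. It is not the authors' procedure: the function names, the inverse-frequency weighting rule, and the 5-versus-95 toy data are all assumptions chosen for illustration, and no SVM is trained here.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: weight each class by n / (k * count),
    so that rarer classes receive proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def undersample_majority(examples, labels, seed=0):
    """Hypothetical helper: randomly discard examples from larger classes
    until every class has as many examples as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        balanced.extend((x, y) for x in rng.sample(xs, target))
    return balanced


def single_class_subset(examples, labels, keep):
    """Extreme re-balancing: ignore every class except `keep`."""
    return [x for x, y in zip(examples, labels) if y == keep]


# Toy data (assumed): 5 minority examples (+1) and 95 majority examples (-1).
examples = list(range(100))
labels = [1] * 5 + [-1] * 95

weights = balanced_class_weights(labels)
balanced = undersample_majority(examples, labels)
minority_only = single_class_subset(examples, labels, keep=1)
```

In this toy setting the minority class receives a weight of 10 and the majority class a weight of roughly 0.53, undersampling leaves five examples per class, and the single-class subset contains only the five minority examples. Those are the inputs a weighted two-class learner, a resampled two-class learner, and a one-class learner would respectively see.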