Many classification problems in information retrieval are one-class classification problems at heart: labeled data is available for only one class, the so-called target class, so common discrimination-based classification approaches, whether binary or multiclass, are not applicable. Achieving high effectiveness on one-class problems is difficult in general, and it becomes even more challenging when the target class data is multimodal, which is often the case. To address these concerns, we propose a cluster-based one-class ensemble that consists of four steps: (1) applying a clustering algorithm to the target class data, (2) training an individual one-class classifier for each of the identified clusters, (3) aggregating the decisions of the individual classifiers, and (4) selecting the best-fitting clustering model. We evaluate our approach on four datasets: an artificially generated dataset, a dataset compiled from a known multiclass text corpus, and two datasets related to one-class problems that have recently received much attention, namely authorship verification and quality flaw prediction. Our approach outperforms a one-class SVM on all four datasets.
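The four steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify the clustering algorithm, the base classifier, the aggregation rule, or the selection criterion. The sketch assumes plain k-means for step 1, a simple centroid-and-radius classifier as a stand-in for a one-class learner such as a one-class SVM in step 2, a logical OR over member decisions in step 3, and, for step 4, picking the candidate with the smallest summed squared radius among those that still accept a minimum share of held-out target data. All function, class, and parameter names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Step 1 (assumed): plain Lloyd's k-means; returns the non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(axis) / len(pts) for axis in zip(*pts)) if pts else centers[j]
            for j, pts in enumerate(clusters)
        ]
    return [pts for pts in clusters if pts]


class CentroidOneClass:
    """Step 2 (stand-in): accept points within a quantile radius of the centroid."""

    def __init__(self, pts, quantile=0.95):
        self.center = tuple(sum(axis) / len(pts) for axis in zip(*pts))
        dists = sorted(math.dist(p, self.center) for p in pts)
        self.radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]

    def accepts(self, p):
        return math.dist(p, self.center) <= self.radius


def fit_ensemble(points, k):
    """Train one classifier per cluster found in the target class data."""
    return [CentroidOneClass(pts) for pts in kmeans(points, k)]


def ensemble_accepts(ensemble, p):
    """Step 3 (assumed rule): a point is 'target' if any member accepts it."""
    return any(member.accepts(p) for member in ensemble)


def select_ensemble(train, held_out, ks=(1, 2, 3, 4), min_recall=0.8):
    """Step 4 (assumed criterion): tightest ensemble that keeps held-out recall."""
    best = None
    for k in ks:
        ensemble = fit_ensemble(train, k)
        recall = sum(ensemble_accepts(ensemble, p) for p in held_out) / len(held_out)
        spread = sum(member.radius ** 2 for member in ensemble)
        if recall >= min_recall and (best is None or spread < best[0]):
            best = (spread, k, ensemble)
    if best is None:
        raise ValueError("no candidate clustering reached min_recall")
    return best[1], best[2]


def two_blobs(n, seed):
    """Synthetic bimodal target class: two Gaussian blobs in 2D."""
    rng = random.Random(seed)
    return [(rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for c in (0.0, 10.0) for _ in range(n)]


train, held_out = two_blobs(100, seed=1), two_blobs(50, seed=2)
k, ensemble = select_ensemble(train, held_out)
print("selected k:", k)
print("accepts point between the modes:", ensemble_accepts(ensemble, (5.0, 5.0)))
```

On bimodal data like this, a single classifier (k = 1) has to cover both modes and therefore also the empty region between them, which is the failure mode a per-cluster ensemble is meant to avoid. In practice the stand-in classifier would be replaced by a stronger one-class learner, and the selection criterion by one suited to the task.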