RADAR: rare category detection via computation of boundary degree

Authors:
Hao Huang; Qinming He; Jiangfeng He; Lianhang Ma
Affiliations:
College of Computer Science and Technology, Zhejiang University, Hangzhou, China (all authors)
Venue:
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
Year:
2011

Abstract

Rare category detection is an open challenge in active learning. Its purpose is to select anomalies and have human experts label them. Unlike traditional anomaly detection, the task does not look for individual, isolated instances; instead, it selects interesting and useful anomalies from small, compact clusters. The goal is to find at least one representative data point from each rare class with as few label queries as possible. Previous work falls into three major groups: model-based, density-based, and clustering-based methods. The performance of these approaches is affected by the local densities of the rare classes. In this paper, we develop a density-insensitive method for rare category detection called RADAR. It uses reverse k-nearest neighbors to measure the boundary degree of each data point, and then selects examples with a high boundary degree for class-label querying. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our algorithm.
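
The abstract does not give the formula for boundary degree, so the following is only a minimal sketch of the reverse k-nearest-neighbor idea, not the RADAR algorithm itself. It counts, for each point, how many other points include it among their k nearest neighbors, and then ranks points under a hypothetical proxy: fewer reverse neighbors means more boundary-like. The function names `reverse_knn_counts` and `query_order`, the brute-force distance computation, and the proxy score are all assumptions made for illustration.

```python
import numpy as np


def reverse_knn_counts(X, k):
    """For each point, count how many other points list it among their k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Brute-force pairwise squared Euclidean distances (O(n^2) memory; fine for a sketch).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    # Row i holds the indices of the k nearest neighbors of point i.
    knn = np.argsort(d2, axis=1, kind="stable")[:, :k]
    # Point j gains one reverse neighbor each time it appears in some row of knn.
    return np.bincount(knn.ravel(), minlength=n)


def query_order(X, k):
    """Order points for label querying under an ASSUMED proxy score.

    Assumption (not from the paper): a low reverse k-NN count stands in
    for a high boundary degree, so such points are queried first.
    """
    counts = reverse_knn_counts(X, k)
    return np.argsort(counts, kind="stable")


if __name__ == "__main__":
    # Three nearby points and one far-away point, in one dimension.
    X = np.array([[0.0], [1.0], [2.5], [10.0]])
    print(reverse_knn_counts(X, k=1))
    print(query_order(X, k=1))
```

In this toy data set the far-away point is nobody's nearest neighbor, so it is ranked first for querying. A real implementation would need the paper's actual boundary-degree definition and a spatial index in place of the quadratic distance matrix.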