Annotated Minimum Volume Sets for Nonparametric Anomaly Discovery

  • Authors:
  • Clayton D. Scott; Eric D. Kolaczyk

  • Affiliations:
  • University of Michigan, Dept. of Elec. Eng. and Comp. Sci., Ann Arbor, MI 48105; Boston University, Dept. of Mathematics and Statistics, Boston, MA 02215

  • Venue:
  • SSP '07 Proceedings of the 2007 IEEE/SP 14th Workshop on Statistical Signal Processing
  • Year:
  • 2007

Abstract

We consider an anomaly detection problem in which a combination of typical and anomalous data is observed, and the anomalies in this particular dataset must be identified without recourse to labeled exemplars. Our goal is to produce an annotated ranking of the observations, indicating the relative priority with which each should be examined further as a possible anomaly, while making no assumptions about the distribution of the typical data. We propose a framework in which each observation is linked to a corresponding minimum volume set and, implicitly adopting a hypothesis testing perspective, each set is associated with a test. An inherent ordering of these sets yields a natural ranking, while the association of each test with a false discovery rate yields an appropriate annotation. The combination of minimum volume set methods with false discovery rate principles, in the context of data contaminated by anomalies, is new, and estimation of the key underlying quantities requires that a number of issues be addressed. We offer some solutions to the relevant estimation problems and illustrate the proposed methodology on synthetic and computer network traffic data.
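
To make the "ranking plus annotation" idea concrete, the following is a minimal sketch of the annotation step only. It assumes that a p-value is already available for each observation's test; obtaining those p-values from minimum volume sets under contamination is the estimation problem the abstract refers to, and it is not attempted here. The sketch uses the standard Benjamini-Hochberg adjustment as a stand-in for the false discovery rate annotation, which may differ from the estimator used in the paper. The function names `bh_q_values` and `annotated_ranking` are hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values).

    Assumption: a standard BH step-up adjustment, used here only as an
    illustrative false discovery rate annotation.
    """
    n = len(p_values)
    # Indices of the observations, sorted from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q


def annotated_ranking(p_values):
    """Return (index, p_value, q_value) tuples, most anomalous first.

    Assumption: a smaller p-value means the observation deserves higher
    priority for further examination.
    """
    q = bh_q_values(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    return [(i, p_values[i], q[i]) for i in order]
```

Calling `annotated_ranking([0.001, 0.5, 0.02, 0.9, 0.04])` lists the observations in the order 0, 2, 4, 1, 3, each paired with its q-value. An analyst could then examine observations from the top of the list down to whatever q-value threshold is acceptable.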