A survey on unsupervised outlier detection in high-dimensional numerical data
Statistical Analysis and Data Mining
Fast and reliable anomaly detection in categorical data
Proceedings of the 21st ACM international conference on Information and knowledge management
Mining irregularities in maritime container itineraries
Proceedings of the Joint EDBT/ICDT 2013 Workshops
Event stream database based architecture to detect network intrusion: (industry article)
Proceedings of the 7th ACM international conference on Distributed event-based systems
FLEAD: online frequency likelihood estimation anomaly detection for mobile sensing
Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication
Data Mining and Knowledge Discovery
Data Mining and Knowledge Discovery
A least-squares approach to anomaly detection in static and sequential data
Pattern Recognition Letters
Ensembles for unsupervised outlier detection: challenges and research questions a position paper
ACM SIGKDD Explorations Newsletter
A ranking-based algorithm for detection of outliers in categorical data
International Journal of Hybrid Intelligent Systems
Hi-index | 0.00 |
This survey attempts to provide a comprehensive and structured overview of the existing research for the problem of detecting anomalies in discrete/symbolic sequences. The objective is to provide a global understanding of the sequence anomaly detection problem and how existing techniques relate to each other. The key contribution of this survey is the classification of the existing research into three distinct categories, based on the problem formulation that they are trying to solve. These problem formulations are: 1) identifying anomalous sequences with respect to a database of normal sequences; 2) identifying an anomalous subsequence within a long sequence; and 3) identifying a pattern in a sequence whose frequency of occurrence is anomalous. We show how each of these problem formulations is characteristically distinct from each other and discuss their relevance in various application domains. We review techniques from many disparate and disconnected application domains that address each of these formulations. Within each problem formulation, we group techniques into categories based on the nature of the underlying algorithm. For each category, we provide a basic anomaly detection technique, and show how the existing techniques are variants of the basic technique. This approach shows how different techniques within a category are related or different from each other. Our categorization reveals new variants and combinations that have not been investigated before for anomaly detection. We also provide a discussion of relative strengths and weaknesses of different techniques. We show how techniques developed for one problem formulation can be adapted to solve a different formulation, thereby providing several novel adaptations to solve the different problem formulations. We also highlight the applicability of the techniques that handle discrete sequences to other related areas such as online anomaly detection and time series anomaly detection.