We are concerned with real-time change-point detection in time series. This technology has recently received considerable attention in data mining because it applies to a wide variety of important risk management problems, such as detecting failures of computer devices from computer performance data and detecting masqueraders or malicious executables from computer access logs. In this paper we propose a new method of real-time change-point detection that employs sequentially discounting normalized maximum likelihood (SDNML) coding. SDNML is a method for sequential data compression, which we newly develop in this paper. It attains the least code length for the sequence while gradually discounting the effect of past data as time goes on, so the compression adapts to non-stationary data sources. In our method, SDNML is used to learn the mechanism of a time series, and a change-point score at each time point is then measured in terms of the SDNML code length. We empirically demonstrate that our method significantly outperforms existing methods, such as the predictive-coding method and the hypothesis-testing method, in both detection accuracy and computational efficiency on artificial data sets. We further apply our method to a real security problem, malware detection, and empirically demonstrate that it is able to detect unseen security incidents at significantly early stages.
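The idea of scoring change points by a discounted sequential code length can be illustrated with a much simpler stand-in than SDNML. The sketch below is not the SDNML coding described above: it assumes a plain Gaussian plug-in model whose mean and variance are updated with a discounting rate, and it uses the negative log-likelihood of each new sample as its code length. The function names, the discounting rate `r`, the smoothing window, and the synthetic series are all hypothetical choices made for illustration.

```python
import math
import random


def discounted_code_lengths(xs, r=0.05, var_floor=1e-6):
    """Per-sample code length (in nats) under a discounted Gaussian model.

    Assumption: a Gaussian plug-in code stands in for SDNML here.
    Each sample is coded with the parameters learned from earlier samples,
    then the parameters are updated with discounting rate r.
    """
    mean, var = 0.0, 1.0  # assumed initial parameters
    lengths = []
    for x in xs:
        # Code length of x = negative log-density under the current model.
        lengths.append(0.5 * math.log(2 * math.pi * var)
                       + (x - mean) ** 2 / (2 * var))
        # Exponentially discounted updates: older samples lose weight.
        delta = x - mean
        mean += r * delta
        var = max((1 - r) * (var + r * delta * delta), var_floor)
    return lengths


def moving_average(values, window):
    """Trailing moving average; early entries average what is available."""
    out, total = [], 0.0
    for t, v in enumerate(values):
        total += v
        if t >= window:
            total -= values[t - window]
        out.append(total / min(t + 1, window))
    return out


def make_shifted_series(seed=0, n=200, shift=10.0):
    """Synthetic series (hypothetical): the mean jumps by `shift` at index n."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 1.0) for _ in range(n)]
            + [rng.gauss(shift, 1.0) for _ in range(n)])


def change_point_scores(xs, r=0.05, window=5):
    """Change-point score at each time: smoothed discounted code length."""
    return moving_average(discounted_code_lengths(xs, r=r), window)
```

Calling `change_point_scores(make_shifted_series())` yields one score per time point. Samples that the current model compresses poorly receive long code lengths, so the score rises shortly after the mean shift and falls again once the discounted estimates have adapted to the new regime.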