LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Outlier detection for high dimensional data
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Mining top-n local outliers in large databases
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Infominer: mining surprising periodic patterns
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
A new approach to analyzing gene expression time series data
Proceedings of the sixth annual international conference on Computational biology
Findout: finding outliers in very large datasets
Knowledge and Information Systems
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Fast Outlier Detection in High Dimensional Spaces
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Mining Deviants in a Time Series Database
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Finding surprising patterns in a time series database in linear time and space
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
A symbolic representation of time series, with implications for streaming algorithms
DMKD '03 Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets
IEEE Transactions on Knowledge and Data Engineering
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Translation-invariant mixture models for curve clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Online novelty detection on temporal sequences
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining Surprising Periodic Patterns
Data Mining and Knowledge Discovery
RDF: A Density-Based Outlier Detection Method using Vertical Data Representation
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Feature bagging for outlier detection
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Modeling Multiple Time Series for Anomaly Detection
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Assumption-free anomaly detection in time series
SSDBM'2005 Proceedings of the 17th international conference on Scientific and statistical database management
Outlier detection by sampling with accuracy guarantees
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
SAXually Explicit Images: Finding Unusual Shapes
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Shift-invariant grouped multi-task learning for Gaussian processes
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
Searching and mining trillions of time series subsequences under dynamic time warping
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Artificial Intelligence in Medicine
ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on ACM SIGKDD 2012
Data mining a trillion time series subsequences under dynamic time warping
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Hi-index | 0.00 |
Catalogs of periodic variable stars contain large numbers of periodic light-curves (photometric time series data from the astrophysics domain). Separating anomalous objects from well-known classes is an important step towards the discovery of new classes of astronomical objects. Most anomaly detection methods for time series data assume either a single continuous time series or a set of time series whose periods are aligned. Light-curve data precludes the use of these methods as the periods of any given pair of light-curves may be out of sync. One may use an existing anomaly detection method if, prior to similarity calculation, one performs the costly act of aligning two light-curves, an operation that scales poorly to massive data sets. This paper presents PCAD, an unsupervised anomaly detection method for large sets of unsynchronized periodic time-series data, that outputs a ranked list of both global and local anomalies. It calculates its anomaly score for each light-curve in relation to a set of centroids produced by a modified k-means clustering algorithm. Our method is able to scale to large data sets through the use of sampling. We validate our method on both light-curve data and other time series data sets. We demonstrate its effectiveness at finding known anomalies, and discuss the effect of sample size and number of centroids on our results. We compare our method to naive solutions and existing time series anomaly detection methods for unphased data, and show that PCAD's reported anomalies are comparable to or better than all other methods. Finally, astrophysicists on our team have verified that PCAD finds true anomalies that might be indicative of novel astrophysical phenomena.