A minimum volume (MV) set, at level $\alpha$, is a set $G_\alpha^*$ having minimum volume among all those sets containing at least $\alpha$ probability mass. MV sets provide a natural notion of the 'central mass' of a distribution and, as such, have recently become popular as a tool for the detection of anomalies in multivariate data. Motivated by the fact that anomaly detection problems frequently arise in settings with temporally indexed measurements, we propose here a new method for the estimation of MV sets from dependent data. Our method is based on the concept of complexity-penalized estimation, extending recent work of Scott and Nowak for the case of independent and identically distributed measurements, and has both desirable theoretical properties and a practical implementation. Of particular note is the fact that, for a large class of stochastic processes, choice of an appropriate complexity penalty reduces to the selection of a single tuning parameter, which represents the data dependency of the underlying stochastic process. While in reality the dependence structure is unknown, we offer a data-dependent method for selecting this parameter, based on subsampling principles. Our work is motivated by and illustrated through an application to the detection of anomalous traffic levels in Internet traffic time series.
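To make the definition of an MV set concrete, the following is a minimal sketch of an empirical MV set in one dimension. It assumes that "volume" means interval length and that candidate sets are restricted to single intervals. It is not the complexity-penalized estimator proposed in the paper, and it ignores temporal dependence entirely; the function names `shortest_interval` and `is_anomalous` are hypothetical and chosen only for illustration.

```python
import math


def shortest_interval(samples, alpha):
    """Return (lo, hi), the shortest closed interval that contains at
    least ceil(alpha * n) of the n sample points.

    Illustrative assumption: volume is interval length and the
    candidate sets are single intervals with sample points as endpoints.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(alpha * n))  # number of points the interval must cover

    # Slide a window of k consecutive order statistics and keep the narrowest.
    best = (xs[0], xs[k - 1])
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


def is_anomalous(x, interval):
    """Flag x as anomalous when it falls outside the estimated set."""
    lo, hi = interval
    return not (lo <= x <= hi)


if __name__ == "__main__":
    data = [0, 10, 11, 12, 20, 30]
    region = shortest_interval(data, alpha=0.5)
    print(region, is_anomalous(30, region), is_anomalous(11, region))
```

Points falling outside the returned interval are the ones an MV-set-based detector would flag. With dependent data, the empirical mass of a set is a less reliable estimate of its true mass, which is the situation the complexity penalty and its tuning parameter are meant to address.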