Machine Learning
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination
The Journal of Machine Learning Research
Anomaly detection in data streams requires a signal of an unusual event, but an actionable response requires diagnostics. Consequently, an important task is to isolate the few key attributes that contribute to the signal from among a large collection. We introduce this contributors problem to the machine learning community and present a solution for monitoring in modern systems (with nonlinear reference conditions, high dimensions, categorical attributes, missing data, and so forth). The objective is to identify the attributes that contribute to a signal for an individual anomaly, for multiple anomalies, or for several groups of anomalies. Although the task is related to feature selection, the extreme sparseness of anomalies calls for scores designed specifically for the contributors problem. Statistical criteria are provided so that decision rules and false alarms can be addressed quantitatively, and the method can be computed quickly. Comparisons are made to traditional contribution plots.