Anomaly Detection in Dynamic Systems Using Weak Estimators

Authors:
Justin Zhan; B. John Oommen; Johanna Crisostomo
Affiliations:
North Carolina A&T State University; Carleton University; Carnegie Mellon University
Venue:
ACM Transactions on Internet Technology (TOIT)
Year:
2011

Abstract

Anomaly detection involves identifying observations that deviate from the normal behavior of a system. One of the ways to achieve this is by identifying the phenomena that characterize “normal” observations. Subsequently, based on the characteristics of data learned from the “normal” observations, new observations are classified as being either “normal” or not.

Most state-of-the-art approaches, especially those which belong to the family of parameterized statistical schemes, work under the assumption that the underlying distributions of the observations are stationary. That is, they assume that the distributions that are learned during the training (or learning) phase, though unknown, are not time-varying. They further assume that the same distributions are relevant even as new observations are encountered.

Although such a “stationarity” assumption is relevant for many applications, there are some anomaly detection problems where stationarity cannot be assumed. For example, in network monitoring, the patterns which are learned to represent normal behavior may change over time due to several factors such as network infrastructure expansion, new services, growth of user population, and so on. Similarly, in meteorology, identifying anomalous temperature patterns involves taking into account seasonal changes of normal observations. Detecting anomalies or outliers under these circumstances introduces several challenges. Indeed, the ability to adapt to changes in nonstationary environments is necessary so that anomalous observations can be identified even with changes in what would otherwise be classified as “normal” behavior.

In this article we propose to apply a family of weak estimators for anomaly detection in dynamic environments. In particular, we apply this theory to spam email detection. Our experimental results demonstrate that our proposal is both feasible and effective for the detection of such anomalous emails.
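
The abstract does not spell out the estimator itself. As a rough illustration only, the Python sketch below shows one binomial weak estimator of the kind associated with stochastic-learning-based weak estimation: a multiplicative update governed by a learning parameter. The update rule shown is an assumption about the general technique, not a confirmed description of this article's method. The names `slwe_update`, `track`, and `lam`, the value 0.9, the initial estimate 0.5, and the toy stream are all hypothetical.

```python
def slwe_update(p, observed, lam=0.9):
    """Return the updated estimate of P(x = 1) after one binary observation.

    Assumed update rule (multiplicative, binomial case):
      observed == 1: p <- 1 - lam * (1 - p)
      observed == 0: p <- lam * p
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if observed:
        return 1.0 - lam * (1.0 - p)
    return lam * p


def track(stream, lam=0.9, p0=0.5):
    """Apply the update to each 0/1 observation and return every estimate."""
    estimates = []
    p = p0
    for x in stream:
        p = slwe_update(p, x, lam)
        estimates.append(p)
    return estimates


# Toy nonstationary stream (hypothetical): the "normal" value flips halfway.
stream = [1] * 50 + [0] * 50
estimates = track(stream)
print(round(estimates[49], 3), round(estimates[-1], 3))
```

Because each update discounts the previous estimate geometrically, the estimate in this sketch moves close to 1 during the first half of the stream and then drifts toward 0 after the change, rather than settling on the long-run average. That is the property a nonstationary setting calls for. A smaller `lam` would adapt faster at the cost of a noisier estimate.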