On the Effects of Learning Set Corruption in Anomaly-Based Detection of Web Defacements

Authors:
Eric Medvet;Alberto Bartoli
Affiliations:
DEEI, University of Trieste, Via Valerio, Trieste; DEEI, University of Trieste, Via Valerio, Trieste
Venue:
DIMVA '07 Proceedings of the 4th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Year:
2007

Citing 10
Cited 2

Abstract

Anomaly detection is a commonly used approach for constructing intrusion detection systems. A key requirement is that the data used for building the resource profile are indeed attack-free, but this issue is often skipped or taken for granted. In this work we consider the problem of corruption in the learning data, with respect to a specific detection system, i.e., a web site integrity checker. We used corrupted learning sets and observed their impact on performance (in terms of false positives and false negatives). This analysis enabled us to gain important insights into this rather unexplored issue. Based on this analysis we also present a procedure for detecting whether a learning set is corrupted. We evaluated the performance of our proposal and obtained very good results up to a corruption rate close to 50%. Our experiments are based on collections of real data and consider three different flavors of anomaly detection.