Feature based techniques for auto-detection of novel email worms

Authors:
Mohammad M. Masud;Latifur Khan;Bhavani Thuraisingham
Affiliations:
Department of Computer Science, The University of Texas at Dallas, Richardson, Texas;Department of Computer Science, The University of Texas at Dallas, Richardson, Texas;Department of Computer Science, The University of Texas at Dallas, Richardson, Texas
Venue:
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2007

Abstract

This work applies data mining techniques to the detection of email worms using a feature-based approach. The features are extracted through statistical and behavioral analyses of emails sent over a certain period of time. Because the number of features extracted in this way is too large, our goal is to select the best set of features that can efficiently distinguish between normal and viral emails using classification techniques. First, we apply Principal Component Analysis (PCA) to reduce the high dimensionality of the data and to find a projected, optimal set of attributes. We observe that applying PCA to a benchmark dataset improves the accuracy of detecting novel worms. Second, we apply the J48 decision tree algorithm to determine the relative importance of features based on information gain. We identify a subset of features, along with a set of classification rules, that performs better in detecting novel worms than either the original set of features or the PCA-reduced features. Finally, we compare our results with published results and discuss our plans to extend this work.
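To illustrate the idea of ranking features by information gain, the following is a minimal sketch in plain Python, not the authors' implementation and not J48 itself. It scores each numeric feature by the best single-threshold split, whereas a decision tree learner such as J48 builds a full tree and derives rules from it. The feature names, the toy rows, and the helper functions (`entropy`, `best_split_gain`, `rank_features`) are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_split_gain(values, labels):
    """Highest information gain over binary splits `value <= threshold`."""
    base = entropy(labels)
    distinct = sorted(set(values))
    best = 0.0
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        remainder = (
            len(left) * entropy(left) + len(right) * entropy(right)
        ) / len(labels)
        best = max(best, base - remainder)
    return best


def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted by decreasing gain."""
    gains = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        gains.append((name, best_split_gain(column, labels)))
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one row per email sender, invented for this sketch.
FEATURE_NAMES = ["attachments_per_email", "mean_words_in_body", "emails_per_hour"]
ROWS = [
    [0, 120, 2],
    [0, 80, 3],
    [1, 95, 1],
    [0, 110, 4],
    [1, 100, 30],
    [1, 90, 45],
    [1, 105, 38],
    [1, 85, 50],
]
LABELS = ["normal"] * 4 + ["viral"] * 4

ranking = rank_features(ROWS, LABELS, FEATURE_NAMES)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the hypothetical `emails_per_hour` feature separates the two classes perfectly, so it ranks first. Keeping only the top-ranked features is one simple way to obtain a reduced feature subset of the kind the abstract describes.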