This work applies data mining techniques to the detection of email worms, using a feature-based approach. The features are extracted through statistical and behavioral analysis of emails sent over a given period of time. Because the number of features extracted this way is too large, our goal is to select the set of features that best distinguishes normal from viral emails when used with classification techniques. First, we apply Principal Component Analysis (PCA) to reduce the high dimensionality of the data and to find a projected, reduced set of attributes. We observe that applying PCA to a benchmark dataset improves the accuracy of detecting novel worms. Second, we apply the J48 decision tree algorithm to determine the relative importance of features based on information gain. This lets us identify a subset of features, along with a set of classification rules, that detects novel worms better than either the original feature set or the PCA-reduced features. Finally, we compare our results with published results and discuss our plans to extend this work.
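To make the information-gain step concrete, the following is a minimal sketch of ranking numeric features by the entropy reduction of their best single threshold split. It is not the authors' implementation and not J48 itself, which builds a full tree and adds pruning. The feature names, the toy rows, and the helper names (`entropy`, `information_gain`, `best_gain`, `rank_features`) are all assumptions invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_gain(values, labels):
    """Highest gain over midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return 0.0  # a constant feature cannot split the data
    return max(information_gain(values, labels, t) for t in candidates)


def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    scores = {
        name: best_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-sender features and toy values, not taken from the paper.
FEATURE_NAMES = ["attachments_per_mail", "mean_words_in_body"]
ROWS = [
    [0, 120],
    [1, 80],
    [0, 100],
    [3, 90],
    [4, 110],
    [5, 95],
]
LABELS = ["normal", "normal", "normal", "viral", "viral", "viral"]

RANKING = rank_features(ROWS, LABELS, FEATURE_NAMES)
for name, gain in RANKING:
    print(f"{name}: {gain:.3f}")
```

In this toy data the attachment count separates the two classes with a single threshold, so it ranks first. On real data, a ranking like this would only suggest which features to keep; the selected subset still has to be validated with a classifier on held-out worms.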