Targeting spam control on middleboxes: Spam detection based on layer-3 e-mail content classification

Authors:
Muhammad N. Marsono; M. Watheq El-Kharashi; Fayez Gebali
Affiliations:
Faculty of Electrical Engineering, Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia; Department of Computer and Systems Engineering, Ain Shams University, Cairo 11517, Egypt; Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada V8W 3P6
Venue:
Computer Networks: The International Journal of Computer and Telecommunications Networking
Year:
2009

Abstract

This paper proposes a spam detection technique that classifies e-mail contents at the packet level (layer 3). Our proposal targets spam control implementations on middleboxes. E-mails are first pre-classified (pre-detected) for spam on a per-packet basis, without the need for reassembly. This, in turn, allows fast e-mail class estimation (spam detection) at receiving e-mail servers, supporting more effective spam handling of both inbound and outbound (relayed) e-mails. The naive Bayes classification technique is adapted to support both pre-classification and fast e-mail class estimation on a per-packet basis. We focus on evaluating the accuracy of spam detection at layer 3 under the constraints of processing byte streams over the network, including packet re-ordering, fragmentation, overlapped bytes, and different packet sizes. Results show that the proposed layer-3 classification technique yields a false positive rate below 0.5%, approximately equal to that attained at layer 7. This indicates that classifying e-mails at the packet level can differentiate non-spam from spam with high confidence, making spam control on middleboxes viable.
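The abstract does not give the adapted classifier itself, so the following is only a minimal sketch of why naive Bayes lends itself to per-packet scoring: under the independence assumption, the log-likelihood ratio of a message is a sum of per-token terms, so each packet payload can be scored on its own and the partial scores added in any arrival order. The class `PacketNaiveBayes`, its methods, the whitespace tokenizer, and the toy training data are all hypothetical and are not taken from the paper. The sketch also ignores tokens that straddle packet boundaries, fragmentation, and overlapped bytes, which the paper treats as constraints.

```python
import math
from collections import Counter


class PacketNaiveBayes:
    """Toy multinomial naive Bayes scored per packet payload (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Whitespace tokenization is a simplifying assumption.
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def _log_prob(self, token, label, vocab_size):
        total = sum(self.counts[label].values())
        return math.log(
            (self.counts[label][token] + self.alpha)
            / (total + self.alpha * vocab_size)
        )

    def packet_llr(self, payload):
        """Return log P(payload | spam) - log P(payload | ham) for one packet."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        score = 0.0
        for token in payload.lower().split():
            if token not in vocab:
                continue  # skip tokens never seen in training
            score += self._log_prob(token, "spam", len(vocab))
            score -= self._log_prob(token, "ham", len(vocab))
        return score

    def classify(self, packet_llrs):
        """Combine per-packet scores with the class prior; order does not matter."""
        prior = math.log(self.docs["spam"] / self.docs["ham"])
        return "spam" if prior + sum(packet_llrs) > 0 else "ham"


# Toy training data, invented purely for illustration.
nb = PacketNaiveBayes()
nb.train("win free money now", "spam")
nb.train("free prize claim now", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("project report attached for review", "ham")

# Two payloads of one message, scored independently as they arrive.
spam_packets = ["claim your free", "prize money now"]
spam_scores = [nb.packet_llr(p) for p in spam_packets]
verdict_in_order = nb.classify(spam_scores)
verdict_reordered = nb.classify(list(reversed(spam_scores)))
```

Because only a running sum per message needs to be kept, a design along these lines would need no reassembly buffer; the final class estimate at the receiving server reduces to comparing the accumulated score against a threshold.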