Concentration based feature construction approach for spam detection

Authors:
Ying Tan;Chao Deng;Guangchen Ruan
Affiliations:
Key Laboratory of Machine Perception and Intelligence and Department of Machine Intelligence, School of Electronics Engineering and Computer Science, Peking University, MOE, Beijing, P. R. China (all authors)
Venue:
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Year:
2009

Abstract

Inspired by the human immune system, this paper proposes a concentration based feature construction (CFC) approach for spam detection that uses a two-element concentration vector as the feature vector. In the CFC approach, 'self' and 'non-self' concentrations are computed from 'self' and 'non-self' gene libraries, respectively, and are then combined into a two-element vector that characterizes the e-mail efficiently. As a result, designing the classifier amounts to establishing a mapping from two real-valued inputs to one binary output. Classification of the e-mail is treated as an optimization problem that minimizes a formulated cost function, and a clonal particle swarm optimization (CPSO) algorithm, proposed by the first author, is employed for this purpose. Several classifiers, including a linear discriminant, multi-layer neural networks and a support vector machine, are used to verify the effectiveness and robustness of the CFC approach. Experimental results demonstrate that the proposed CFC approach is not only very fast but also achieves 97% and 99% accuracy on the PU1 and Ling corpora, respectively, using only a two-element concentration feature vector.
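
The idea of a two-element concentration vector can be illustrated with a minimal Python sketch. The abstract does not give the exact concentration formula or how the gene libraries are built, so everything here is an assumption made for illustration: each concentration is taken to be the fraction of distinct words in the message that appear in the corresponding gene library, the libraries are tiny hand-written sets, and the names `tokenize`, `concentration_vector` and `linear_classify` are hypothetical. The fixed-weight linear rule merely stands in for the classifiers named in the abstract; it is not the CPSO-optimized classifier.

```python
from typing import Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase, whitespace-split tokenization; returns the distinct words."""
    return set(text.lower().split())


def concentration_vector(
    text: str, self_genes: Set[str], nonself_genes: Set[str]
) -> Tuple[float, float]:
    """Return (self concentration, non-self concentration) for one message.

    Assumption: a concentration is the share of the message's distinct
    words that occur in the given gene library.
    """
    words = tokenize(text)
    if not words:
        return (0.0, 0.0)
    self_conc = len(words & self_genes) / len(words)
    nonself_conc = len(words & nonself_genes) / len(words)
    return (self_conc, nonself_conc)


def linear_classify(
    features: Tuple[float, float],
    w_self: float = -1.0,
    w_nonself: float = 1.0,
    bias: float = 0.0,
) -> int:
    """Map two real-valued inputs to one binary output (1 = spam, 0 = legitimate).

    The weights are illustrative placeholders, not values from the paper.
    """
    score = w_self * features[0] + w_nonself * features[1] + bias
    return 1 if score > 0 else 0


if __name__ == "__main__":
    # Hypothetical gene libraries; the paper builds its own from training corpora.
    self_genes = {"meeting", "report", "project"}
    nonself_genes = {"free", "winner", "prize"}

    for message in ("Free prize for the winner", "project meeting report attached"):
        vec = concentration_vector(message, self_genes, nonself_genes)
        print(message, vec, linear_classify(vec))
```

Because every message is reduced to two numbers, any classifier placed on top only has to learn a decision boundary in a two-dimensional plane, which is consistent with the speed the abstract reports.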