This paper presents a method for feature selection and classification of email spam messages. Feature selection is performed in two steps: an initial selection based on the entropy of the features, followed by a fine-tuning selection using a genetic algorithm. For classification, a Radial Basis Function Network is used to ensure a robust classification rate even when the cluster structure is complex. The results show that two-level feature selection achieves better accuracy than one-stage selection, and that using a lemmatizer or a stop-word list gives only minimal classification improvement. The proposed method achieves 96-97% average accuracy using only 20 features out of 15000.
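The first, entropy-based selection step can be illustrated with a minimal sketch. The abstract does not give the exact entropy criterion, so this example assumes information gain over binary term presence (the reduction in class-label entropy from knowing whether a term occurs in a message). The function names, the representation of messages as token sets, and the toy data are all hypothetical and not taken from the paper. The genetic-algorithm fine-tuning and the Radial Basis Function Network are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs.

    Assumption: each message in `docs` is a set of tokens.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Rank the vocabulary by information gain and keep the best k terms."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Toy data (hypothetical): four messages as token sets.
docs = [
    {"free", "money", "now"},
    {"free", "offer"},
    {"meeting", "tomorrow"},
    {"project", "meeting", "notes"},
]
labels = ["spam", "spam", "ham", "ham"]

selected = top_k_terms(docs, labels, k=2)
```

In a setting like the one described, `k` would be chosen larger than the final feature count, leaving the second stage to search for the best small subset (for example, 20 features) among the terms that survive this ranking.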