Email spam continues to grow rapidly, and many spam detection algorithms have been developed in response. Most of these algorithms, however, share several shortcomings. To address them, we present a structure-free self-adaptive piecewise hashing algorithm (SFSPH) together with a variant, SFSPH-S, which is much faster than SFSPH but less accurate. Both are based on extremum characteristic theory, the Rabin fingerprint algorithm, and optimization theory. We designed several experiments to evaluate the algorithms' accuracy, speed, and robustness, comparing them with the well-known DSC algorithm and the Email Remove-duplicate Algorithm Based on SHA-1 (ERABS). The experiments demonstrate that our algorithms perform well and accurately for spam filtering.
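The abstract does not give the details of SFSPH, so the following is only a minimal sketch of the general idea behind piecewise hashing with extremum-based boundaries, not the authors' algorithm. It assumes a plain polynomial rolling hash as a stand-in for a Rabin fingerprint, and it assumes that a piece boundary is placed wherever a window hash is a strict local maximum. All names and parameters (`window_hashes`, `cut_points`, `split_pieces`, `similarity`, `width`, `radius`) are hypothetical.

```python
import hashlib


def window_hashes(data: bytes, width: int = 8) -> list[int]:
    """Polynomial rolling hash of every `width`-byte window.

    Assumption: a simple stand-in for a Rabin fingerprint.
    """
    base, mod = 257, (1 << 61) - 1
    if len(data) < width:
        return []
    top = pow(base, width - 1, mod)  # weight of the byte leaving the window
    h = 0
    for b in data[:width]:
        h = (h * base + b) % mod
    out = [h]
    for i in range(width, len(data)):
        h = ((h - data[i - width] * top) * base + data[i]) % mod
        out.append(h)
    return out


def cut_points(hashes: list[int], radius: int = 16) -> list[int]:
    """Indices whose hash is a strict maximum among neighbours within `radius`."""
    cuts = []
    for j, h in enumerate(hashes):
        lo, hi = max(0, j - radius), min(len(hashes), j + radius + 1)
        if all(h > hashes[k] for k in range(lo, hi) if k != j):
            cuts.append(j)
    return cuts


def split_pieces(data: bytes, width: int = 8, radius: int = 16) -> list[bytes]:
    """Split `data` at the end of each window selected by `cut_points`."""
    ends = [j + width for j in cut_points(window_hashes(data, width), radius)]
    pieces, start = [], 0
    for end in ends + [len(data)]:
        if end > start:
            pieces.append(data[start:end])
            start = end
    return pieces


def similarity(a: bytes, b: bytes) -> float:
    """Jaccard similarity between the sets of SHA-1 piece digests."""
    da = {hashlib.sha1(p).hexdigest() for p in split_pieces(a)}
    db = {hashlib.sha1(p).hexdigest() for p in split_pieces(b)}
    if not da and not db:
        return 1.0
    return len(da & db) / len(da | db)
```

Because boundaries depend only on local content, an edit to one part of a message changes the pieces around that edit while pieces farther away tend to keep their digests. A filter built on this sketch could flag a message as a near-duplicate of known spam when `similarity` exceeds a chosen threshold; the threshold, `width`, and `radius` would all need tuning on real data.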