This article investigates internet commerce security applications of a novel combined method that uses unsupervised consensus clustering algorithms together with supervised classification methods. First, a variety of independent clustering algorithms are applied to a randomized sample of the data. Second, several consensus functions and sophisticated algorithms are used to combine these independent clusterings into one final consensus clustering. Third, the consensus clustering of the randomized sample is used as a training set to train several fast supervised classification algorithms. Finally, these fast classification algorithms are used to classify the whole large data set. One advantage of this approach is that it allows domain experts to contribute by adjusting the training set created by consensus clustering. We apply this approach to profiling phishing emails selected from a very large data set supplied by the industry partners of the Centre for Informatics and Applied Optimization. Our experiments compare the performance of several classification algorithms incorporated in this scheme.
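The four steps of the scheme can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy two-dimensional data, plain k-means with different seeds as the independent clusterings, a co-association (pairwise agreement) vote as the consensus function, and a nearest-centroid rule as the fast classifier. All function names are hypothetical.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean(members)
    return labels

def consensus(labelings, threshold=0.5):
    """Assumed consensus function: link two points when more than
    `threshold` of the base clusterings co-cluster them, then take
    connected components (union-find) as the consensus clusters."""
    n = len(labelings[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            agree = sum(l[i] == l[j] for l in labelings) / len(labelings)
            if agree > threshold:
                parent[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

def train_centroids(points, labels):
    """Fast classifier (assumed): one centroid per consensus cluster."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    return {l: mean(ps) for l, ps in groups.items()}

def classify(p, centroids):
    """Assign a point to the label of its nearest centroid."""
    return min(centroids, key=lambda l: dist2(p, centroids[l]))

# Toy data: two well-separated grids of 100 points each.
grid = [(i * 0.1, j * 0.1) for i in range(10) for j in range(10)]
data = grid + [(x + 10.0, y + 10.0) for x, y in grid]

# Step 1: independent clusterings of a randomized sample.
sample = random.Random(0).sample(data, 30)
labelings = [kmeans(sample, k=2, seed=s) for s in range(1, 6)]

# Step 2: combine them into one consensus clustering.
sample_labels = consensus(labelings)

# Step 3: train the fast classifier on the consensus-labelled sample.
centroids = train_centroids(sample, sample_labels)

# Step 4: classify the whole data set.
predicted = [classify(p, centroids) for p in data]
```

In this sketch, a domain expert's adjustment of the training set would correspond to editing `sample_labels` between steps 2 and 3, before the classifier is trained.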