ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
A semi-supervised clustering algorithm based on the traditional k-means algorithm is proposed for network anomaly detection. We improve the original algorithm in three main respects. First, the number of clusters is determined automatically by merging and splitting clusters. Second, a small portion of labeled samples is used to supervise the clustering process during the merging and splitting stage. Third, we modify the algorithm to process symbolic attribute values directly. Experimental results on the KDD 99 intrusion detection datasets show that our algorithm achieves a high detection rate while maintaining a low false positive rate.
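The abstract does not give the distance measure or the split criterion, so the following is only a minimal sketch of how two of the ideas might look in practice. It assumes a k-prototypes-style dissimilarity (squared Euclidean distance on numeric fields plus a weighted mismatch count on symbolic fields) and a simple rule that flags a cluster for splitting when its labeled members disagree. The names `mixed_distance`, `nearest_center`, `should_split`, and the weight `gamma` are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def mixed_distance(a: Sequence, b: Sequence,
                   numeric_idx: Sequence[int],
                   symbolic_idx: Sequence[int],
                   gamma: float = 1.0) -> float:
    """Assumed dissimilarity: squared Euclidean distance on numeric fields
    plus gamma times the number of mismatching symbolic fields."""
    numeric = sum((float(a[i]) - float(b[i])) ** 2 for i in numeric_idx)
    mismatches = sum(1 for i in symbolic_idx if a[i] != b[i])
    return numeric + gamma * mismatches


def nearest_center(x: Sequence, centers: Sequence[Sequence],
                   numeric_idx: Sequence[int],
                   symbolic_idx: Sequence[int],
                   gamma: float = 1.0) -> int:
    """Index of the center closest to x under mixed_distance."""
    dists = [mixed_distance(x, c, numeric_idx, symbolic_idx, gamma)
             for c in centers]
    return dists.index(min(dists))


def should_split(labels: Sequence[Optional[str]]) -> bool:
    """Assumed supervision rule: a cluster is a split candidate when its
    labeled members (non-None entries) carry more than one class."""
    known = {label for label in labels if label is not None}
    return len(known) > 1


# Toy records: two numeric fields followed by two symbolic fields.
records = [
    (0.1, 0.2, "tcp", "http"),
    (0.2, 0.1, "tcp", "http"),
    (0.9, 0.8, "icmp", "ecr_i"),
]
centers = [(0.15, 0.15, "tcp", "http"), (0.9, 0.9, "icmp", "ecr_i")]
assignments = [nearest_center(r, centers, [0, 1], [2, 3]) for r in records]
```

In this sketch symbolic values are compared directly, without one-hot encoding, and unlabeled samples (marked `None`) never trigger a split on their own; only conflicting labels do.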