Network anomaly detection based on semi-supervised clustering

Authors:
Wei Xiaotao;Huang Houkuan;Tian Shengfeng
Affiliations:
School of Software, Beijing Jiaotong University, Beijing, China;School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China;School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
Venue:
SMO'07 Proceedings of the 7th WSEAS International Conference on Simulation, Modelling and Optimization
Year:
2007

Abstract

We propose a semi-supervised clustering algorithm, based on the traditional k-means algorithm, for network anomaly detection. We improve the original algorithm in three main respects. First, the number of clusters is determined automatically by merging and splitting clusters. Second, a small set of labeled samples is used to supervise the clustering process during the merging and splitting stage. Third, we modify the algorithm to process symbolic attribute values directly. Experimental results on the KDD 99 intrusion detection datasets show that our algorithm achieves a high detection rate while maintaining a low false positive rate.
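The abstract does not give the distance measure or the split criterion, so the following is only an illustrative sketch of how a k-means variant might handle symbolic attributes and use labeled samples, not the authors' implementation. It assumes a k-prototypes-style distance (squared Euclidean on numeric fields plus a weighted mismatch count on symbolic fields), a center built from means and modes, and a simple rule that flags a cluster for splitting when its labeled members disagree. The names `mixed_distance`, `center`, `needs_split`, and the weight `gamma` are hypothetical.

```python
from collections import Counter


def mixed_distance(a, b, numeric_idx, symbolic_idx, gamma=1.0):
    """Squared Euclidean distance on numeric fields plus gamma times the
    number of mismatched symbolic fields (assumed, k-prototypes style)."""
    num = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    sym = sum(1 for i in symbolic_idx if a[i] != b[i])
    return num + gamma * sym


def center(records, numeric_idx, symbolic_idx):
    """Cluster center: mean for numeric fields, mode for symbolic fields."""
    c = list(records[0])
    for i in numeric_idx:
        c[i] = sum(r[i] for r in records) / len(records)
    for i in symbolic_idx:
        c[i] = Counter(r[i] for r in records).most_common(1)[0][0]
    return tuple(c)


def needs_split(labels):
    """Assumed rule: a cluster is a split candidate when its labeled
    members carry more than one class. Unlabeled members are None."""
    known = {label for label in labels if label is not None}
    return len(known) > 1


# Toy connection records: (duration, protocol)
records = [(1.0, "tcp"), (3.0, "tcp"), (2.0, "udp")]
proto_center = center(records, numeric_idx=[0], symbolic_idx=[1])
dist = mixed_distance((1.0, "tcp"), (3.0, "udp"), [0], [1])
```

In this toy setup the symbolic field `protocol` is compared by equality rather than being mapped to a number first, which is one common way to process symbolic values directly; the weight `gamma` balances the two parts of the distance and would need tuning on real data.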