Clustering techniques group transactions by relevance, and cluster analysis is one of the primary data analysis methods. Clustering can be carried out in two ways: hierarchical clustering and partition clustering. Hierarchical clustering uses the structure and the data values, whereas partition clustering uses data similarity factors to partition the transactions into small groups. The K-means algorithm is one of the most widely used clustering algorithms. Its local cluster accuracy is high, but it does not take inter-cluster relationships into account. K-means requires the cluster count as its major input, and the system chooses random transactions as the initial centroid of each cluster. Cluster accuracy depends on this initial centroid estimation process. Random transaction-based centroid selection may choose similar transactions; in that case, cluster accuracy is limited because the centroid values lie too close to one another. The proposed system is designed to improve the K-means clustering algorithm with efficient centroid estimation models. Three centroid estimation models are proposed: random selection with distance management, a mean distance model, and an inter-cluster distance model. Cosine distance and Euclidean distance are used to estimate the similarity between transactions, and each of the three centroid estimation models is tested with both distance measures. Precision, recall, and a fitness measure are used to test the cluster accuracy levels. The system is developed in Java with an Oracle database.
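The abstract does not spell out how the centroid estimation models work, so the following Java sketch is only one plausible reading of "random selection with distance management": candidates are drawn at random, and a candidate is accepted only if it lies at least a threshold distance from every centroid already chosen. The class name `CentroidSketch`, the method `pickCentroids`, and the parameters `minDistance` and `seed` are assumptions made for illustration, not names from the paper. The two distance helpers follow the standard definitions of Euclidean distance and cosine distance (one minus cosine similarity).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; names and thresholds are hypothetical.
class CentroidSketch {

    /** Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Cosine distance, defined here as 1 - cosine similarity.
     * Assumption: returns 1.0 when either vector is all zeros.
     */
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0;
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /**
     * Assumed interpretation of "random selection with distance management":
     * pick random transactions, but reject any candidate closer than
     * minDistance (Euclidean) to a centroid that was already accepted.
     * May return fewer than k centroids if the threshold is too strict.
     */
    static List<double[]> pickCentroids(double[][] data, int k,
                                        double minDistance, long seed) {
        Random rng = new Random(seed);
        List<double[]> centroids = new ArrayList<>();
        int attempts = 0;
        int maxAttempts = 100 * data.length; // guard against endless retries
        while (centroids.size() < k && attempts < maxAttempts) {
            attempts++;
            double[] candidate = data[rng.nextInt(data.length)];
            boolean farEnough = true;
            for (double[] c : centroids) {
                if (euclidean(candidate, c) < minDistance) {
                    farEnough = false;
                    break;
                }
            }
            if (farEnough) {
                centroids.add(candidate);
            }
        }
        return centroids;
    }
}
```

In this sketch the rejection test uses Euclidean distance; `cosineDistance` could be substituted to mirror the second distance measure that the abstract mentions. How the paper itself sets the distance threshold is not stated, so `minDistance` is left as a caller-supplied value.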