With the vast amount of information available online, finding documents relevant to a given query requires the user to read through many titles and snippets. This search time can be reduced by grouping the search results into clusters, so that the user can pick the relevant cluster at a glance from the cluster labels. This paper introduces a new method of search results clustering that applies an optimised K-means algorithm to features built from the terms in each result's URL, title tag and meta tag. The K-means algorithm is optimised by selecting the initial centroids with the scale factor method. The proposed method is compared with existing snippet clustering algorithms in terms of intra-cluster distance and inter-cluster distance. Results show that the proposed method produces higher-quality clusters than the existing methods.
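The pipeline described above (URL, title and meta terms as features, K-means clustering, then intra- and inter-cluster distance as quality measures) can be illustrated with a minimal Python sketch. Several parts are assumptions, since the abstract does not specify them: the toy `results` data and every function name are hypothetical, features are plain length-normalised term counts, and the two distance measures use one common definition (mean point-to-centroid distance and mean centroid-to-centroid distance). The scale factor method for choosing initial centroids is not described here, so the sketch substitutes a simple farthest-first seeding as a stand-in; it is not the paper's method.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def featurize(results):
    """Build unit-length term-count vectors from URL, title and meta terms."""
    docs = [Counter(tokenize(r["url"]) + tokenize(r["title"]) + tokenize(r["meta"]))
            for r in results]
    vocab = sorted(set().union(*docs))
    vectors = []
    for d in docs:
        v = [float(d[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def seed_farthest_first(vectors, k):
    """Stand-in seeding (NOT the scale factor method): start from the first
    vector, then repeatedly add the vector farthest from the chosen seeds."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(dist(v, c) for c in centroids)))
    return centroids

def kmeans(vectors, k, iterations=20):
    """Lloyd's iterations: assign points to the nearest centroid, then recompute."""
    centroids = seed_farthest_first(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist(v, centroids[j])) for v in vectors]
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def intra_cluster_distance(vectors, labels, centroids):
    """Mean distance from each point to its own centroid (lower means tighter)."""
    return sum(dist(v, centroids[lab]) for v, lab in zip(vectors, labels)) / len(vectors)

def inter_cluster_distance(centroids):
    """Mean pairwise distance between centroids (higher means better separated)."""
    pairs = [(i, j) for i in range(len(centroids)) for j in range(i + 1, len(centroids))]
    return sum(dist(centroids[i], centroids[j]) for i, j in pairs) / len(pairs)

# Hypothetical search results for the ambiguous query "jaguar".
results = [
    {"url": "https://example.com/cars/jaguar-xf", "title": "Jaguar XF sedan review",
     "meta": "car review engine price"},
    {"url": "https://example.com/cars/jaguar-dealer", "title": "Jaguar car dealer",
     "meta": "buy car price dealer"},
    {"url": "https://example.org/wildlife/jaguar", "title": "Jaguar big cat facts",
     "meta": "wildlife animal habitat rainforest"},
    {"url": "https://example.org/wildlife/jaguar-habitat", "title": "Jaguar habitat",
     "meta": "animal rainforest wildlife conservation"},
]

vectors = featurize(results)
labels, centroids = kmeans(vectors, k=2)
intra = intra_cluster_distance(vectors, labels, centroids)
inter = inter_cluster_distance(centroids)
print(labels, round(intra, 3), round(inter, 3))
```

On this toy input the two car pages and the two wildlife pages fall into separate clusters, and the intra-cluster distance is smaller than the inter-cluster distance, which is the direction the comparison in the abstract treats as better. Swapping `seed_farthest_first` for another seeding routine is the only change needed to test a different initialisation strategy.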