Optimised K-means for web search

Authors:
S. Poomagal;T. Hamsapriya
Affiliations:
Department of Computer and Information Sciences, PSG College of Technology, Peelamedu, Coimbatore - 641004, India;Department of Information Technology, PSG College of Technology, Peelamedu, Coimbatore - 641004, India
Venue:
International Journal of Advanced Intelligence Paradigms
Year:
2012

Abstract

With the vast amount of information available online, finding documents relevant to a given query requires the user to go through many titles and snippets. This search time can be reduced by grouping search results into clusters so that the user can select the relevant cluster at a glance from the cluster labels. This paper introduces a new method of search results clustering that applies an optimised K-means algorithm, using terms from the URL, title tag and meta tag as features. The K-means algorithm is optimised by selecting the initial centroids using the scale factor method. The proposed method is compared with existing snippet clustering algorithms in terms of intra-cluster distance and inter-cluster distance. Results show that the proposed method produces higher-quality clusters than the existing methods.
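
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the scale factor method, so the initial centroids here are simply supplied by the caller. The term-count vectors, the Euclidean distance, the exact definitions of the two quality measures, the sample search results, and every function name below are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def terms(url, title, meta):
    """Lowercased alphabetic terms drawn from the URL, title tag and meta tag."""
    return re.findall(r"[a-z]+", f"{url} {title} {meta}".lower())

def vectorize(docs):
    """Turn each (url, title, meta) triple into a term-count vector (assumed weighting)."""
    bags = [Counter(terms(*d)) for d in docs]
    vocab = sorted(set().union(*bags))
    return [[float(bag[t]) for t in vocab] for bag in bags]

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from caller-supplied initial centroids."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def intra_cluster_distance(points, labels, centroids):
    """Mean distance from each point to its own centroid (lower is tighter)."""
    return sum(dist(p, centroids[lab]) for p, lab in zip(points, labels)) / len(points)

def inter_cluster_distance(centroids):
    """Mean pairwise distance between centroids (needs at least two clusters)."""
    pairs = [(i, j) for i in range(len(centroids)) for j in range(i + 1, len(centroids))]
    return sum(dist(centroids[i], centroids[j]) for i, j in pairs) / len(pairs)

# Hypothetical search results for the ambiguous query "jaguar".
docs = [
    ("http://example.com/jaguar-cars", "Jaguar cars", "luxury car dealer"),
    ("http://example.com/jaguar-car-review", "Jaguar car review", "car road test"),
    ("http://example.org/jaguar-animal", "Jaguar animal facts", "big cat wildlife"),
    ("http://example.org/jaguar-habitat", "Jaguar cat habitat", "wildlife rainforest"),
]
points = vectorize(docs)
# Placeholder seeding: one document from each sense of the query.
labels, centroids = kmeans(points, [points[0], points[2]])
intra = intra_cluster_distance(points, labels, centroids)
inter = inter_cluster_distance(centroids)
print(labels, round(intra, 3), round(inter, 3))
```

Under these assumptions, a good clustering is one with a small intra-cluster distance and a large inter-cluster distance. Any seeding strategy, including the paper's scale factor method, could be substituted by changing only the second argument to `kmeans`.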