Content free clustering for search engine query log

  • Authors:
  • Mehdi Hosseini;Hassan Abolhassani;Mohsn Sayyadi Harikandeh

  • Affiliations:
  • Sharif University of Technology, Web Intelligence Research Laboratory, Tehran, Iran;Sharif University of Technology, Web Intelligence Research Laboratory, Tehran, Iran;Sharif University of Technology, Computer Engineering Department, Tehran, Iran

  • Venue:
  • SMO'07 Proceedings of the 7th WSEAS International Conference on Simulation, Modelling and Optimization
  • Year:
  • 2007

Abstract

Web query clustering is widely used by web information systems, with applications including page ranking in web search, personalization of search results, and web query expansion. In this paper we present a new content-free method for clustering web query logs. In our approach, we first construct a bipartite graph of the queries and visited URLs in a query log. Noisy user selections link most query clusters together, producing a few huge connected components. To eliminate such noisy links, all queries and related URLs are projected into a reduced-dimensional space by applying singular value decomposition. Finally, a clustering algorithm is applied within each pruned connected component in the new space. The method has been evaluated on a real-world data set, and comparison with existing approaches shows promising improvements.
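
The first step of the approach, and the problem that motivates the later pruning, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `connected_components`, the union-find strategy, and the toy click log with its example URLs are all assumptions made for illustration. The sketch builds the query-URL bipartite graph implicitly from (query, URL) click pairs and groups queries by connected component, showing how a single noisy click merges two otherwise separate query clusters. The singular value decomposition and the final clustering step are not shown.

```python
from collections import defaultdict


def connected_components(clicks):
    """Group queries by connected component of the query-URL bipartite graph.

    clicks: iterable of (query, url) pairs, one per logged click.
    Returns a sorted list of sorted query lists, one list per component.
    """
    parent = {}  # union-find forest over query nodes and URL nodes

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for query, url in clicks:
        # Tag each node so a query string can never collide with a URL string.
        root_q, root_u = find(("q", query)), find(("u", url))
        if root_q != root_u:
            parent[root_q] = root_u  # a click is an edge: merge both sides

    groups = defaultdict(set)
    for node in list(parent):
        kind, name = node
        if kind == "q":  # report only the query side of the graph
            groups[find(node)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical click log: two topics that share no URL.
clean_log = [
    ("jaguar car", "cars.example/jaguar"),
    ("jaguar price", "cars.example/jaguar"),
    ("big cats", "zoo.example/cats"),
    ("jaguar habitat", "zoo.example/cats"),
]
# One stray click on an unrelated result links the two topics.
noisy_log = clean_log + [("jaguar price", "zoo.example/cats")]

print(connected_components(clean_log))
print(connected_components(noisy_log))
```

On the clean log the two topics stay in separate components, while the single extra click in the noisy log collapses all four queries into one. This is the effect the abstract attributes to noisy user selections, and it is why connectivity alone is not enough to separate query clusters.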