Clustering of search engine keywords using access logs

Authors:
Shingo Otsuka;Masaru Kitsuregawa
Affiliations:
Institute of Industrial Science, The University of Tokyo, Meguro-ku, Tokyo, Japan;Institute of Industrial Science, The University of Tokyo, Meguro-ku, Tokyo, Japan
Venue:
DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Year:
2006

Abstract

It has become possible for users to obtain many kinds of information simply by entering search keywords that represent the topic they are interested in. However, users cannot always come up with appropriate search keywords. In this paper, we study methods of clustering search keywords using Web access logs (called panel logs), which are URL histories collected from Japanese users (called panels) who were selected without statistical bias, in a manner similar to surveys of TV audience ratings. Existing systems extract related search keywords based on the set of URLs viewed by users after they enter their original search keywords. In contrast, we propose two novel methods of clustering search keywords. One is based on Web communities (sets of similar Web pages); the other is based on the set of nouns obtained by morphological analysis of Web pages. According to the evaluation results, our proposed methods extract more related search keywords than the URL-based method.
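
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which similarity measure or clustering algorithm the paper uses, so Jaccard similarity, single-link grouping, the threshold of 0.5, the function names, and the toy data below are all assumptions made for illustration. Each keyword is represented by a feature set (the URLs viewed after the search, or the nouns extracted from the viewed pages), and keywords whose feature sets overlap enough are placed in the same cluster.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_keywords(features, threshold):
    """Group keywords by single-link clustering over their feature sets.

    features maps each keyword to a set of features (URLs, community IDs,
    or nouns). Two keywords are linked when their Jaccard similarity is at
    least `threshold`; clusters are the connected components of those links.
    """
    parent = {kw: kw for kw in features}

    def find(kw):
        # Union-find root lookup with path halving.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]
            kw = parent[kw]
        return kw

    for k1, k2 in combinations(features, 2):
        if jaccard(features[k1], features[k2]) >= threshold:
            parent[find(k1)] = find(k2)

    clusters = {}
    for kw in features:
        clusters.setdefault(find(kw), set()).add(kw)
    return list(clusters.values())


# Hypothetical toy data: the same three keywords described two ways.
# URL features: the users happened to visit different pages.
by_url = {
    "cheap flights": {"a.example/fares"},
    "airline tickets": {"b.example/book"},
    "pasta recipe": {"c.example/pasta"},
}
# Noun features: nouns taken from the text of the visited pages.
by_noun = {
    "cheap flights": {"flight", "fare", "airport"},
    "airline tickets": {"flight", "fare", "ticket"},
    "pasta recipe": {"pasta", "sauce"},
}

url_clusters = cluster_keywords(by_url, threshold=0.5)
noun_clusters = cluster_keywords(by_noun, threshold=0.5)
print(len(url_clusters), len(noun_clusters))
```

In this toy data the two travel keywords share no URL, so URL-based features leave every keyword in its own cluster, while the noun-based features group "cheap flights" with "airline tickets". This mirrors, in a simplified way, why page-content or community features might relate keywords that exact URL overlap misses.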