Users, Queries and Documents: A Unified Representation for Web Mining

Authors:
Michelangelo Diligenti; Marco Gori; Marco Maggini
Venue:
WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Year:
2009

Abstract

The collective feedback of the users of an Information Retrieval system has proven useful in many tasks. A popular approach in the literature is to process the logs stored by Internet Service Providers (ISPs), Intranet proxies or Web search engines to extract a query-document bipartite graph. In this paper, we propose a richer data structure that preserves most of the information available in the logs, including query refinements, page visits and search activity. In particular, we represent query refinements as separate transitions between the corresponding query nodes in the graph, and we augment the graph by associating one node with each user. Users are linked to the queries they have issued and to the documents they have visited. The resulting data structure is a complete representation of the collective search activity performed by the users of a search engine or of an Intranet. The experimental results show that this more powerful representation can be used to improve the quality of query clustering and to discover query suggestions.
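
As an illustration of the kind of structure the abstract describes, the following is a minimal Python sketch that builds a graph with user, query and document nodes from a simplified log. It is not the authors' implementation. The `LogEvent` record, the `build_graph` function, the edge-count weighting and the rule that two consecutive distinct queries by the same user count as a refinement are all assumptions made for this example; the abstract does not specify how sessions or weights are defined.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEvent:
    """One simplified log record (hypothetical format, not from the paper)."""
    user: str
    query: str
    clicked_doc: Optional[str] = None


def build_graph(events):
    """Build a directed, count-weighted edge map over typed nodes.

    Nodes are (kind, name) tuples with kind in {"u", "q", "d"}.
    Edge kinds: user->query, query->query (refinement),
    query->document (click) and user->document (visit).
    """
    edges = defaultdict(int)
    last_query = {}  # most recent query seen for each user
    for ev in events:
        u, q = ("u", ev.user), ("q", ev.query)
        edges[(u, q)] += 1  # one count per log record for this user and query
        prev = last_query.get(ev.user)
        # Assumption: a change of query by the same user is a refinement.
        if prev is not None and prev != ev.query:
            edges[(("q", prev), q)] += 1
        last_query[ev.user] = ev.query
        if ev.clicked_doc is not None:
            d = ("d", ev.clicked_doc)
            edges[(q, d)] += 1  # classic query-document click edge
            edges[(u, d)] += 1  # the user visited the document
    return dict(edges)


# Toy log with made-up users, queries and a placeholder document name.
log = [
    LogEvent("alice", "jaguar"),
    LogEvent("alice", "jaguar car", "cars.example"),
    LogEvent("bob", "jaguar car", "cars.example"),
]
graph = build_graph(log)
for (src, dst), weight in sorted(graph.items()):
    print(src, "->", dst, weight)
```

Dropping the user nodes and the query-to-query edges from this map leaves the plain query-document bipartite graph that the abstract contrasts with the richer representation.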