Study on the Click Context of Web Search Users for Reliability Analysis

Authors:
Rongwei Cen;Yiqun Liu;Min Zhang;Liyun Ru;Shaoping Ma
Affiliations:
State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ... (same affiliation listed for all five authors)
Venue:
AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Year:
2009

Citing (15):
Assessing agreement on classification tasks: the kappa statistic. Computational Linguistics
Discovery of Web Robot Sessions Based on their Navigational Patterns. Data Mining and Knowledge Discovery
Accurately interpreting clickthrough data as implicit feedback. Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Modeling User Search Behavior. LA-WEB '05 Proceedings of the Third Latin American Web Congress
Learning user interaction models for predicting web search result preferences. SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Improving web search ranking by incorporating user behavior information. SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Web search clickstreams. Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Improving web spam classification using rank-time features. AIRWeb '07 Proceedings of the 3rd international workshop on Adversarial information retrieval on the web
Extracting semantic relations from query logs. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
An experimental comparison of click position-bias models. WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Using the wisdom of the crowds for keyword generation. Proceedings of the 17th international conference on World Wide Web
Characterizing typical and atypical user sessions in clickstreams. Proceedings of the 17th international conference on World Wide Web
Are click-through data adequate for learning web search rankings? Proceedings of the 17th ACM conference on Information and knowledge management
Efficient multiple-click models in web search. Proceedings of the Second ACM International Conference on Web Search and Data Mining
Generating labels from clicks. Proceedings of the Second ACM International Conference on Web Search and Data Mining

Cited by (1):
Empirical Study on Rare Query Characteristics. WI-IAT '11 Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01

Abstract

User behavior analysis has been shown to be important for the optimization and evaluation of Web search, and it has become a major area of research in both information retrieval and knowledge management. This paper studies the reliability of users' search behavior, based on large-scale query and click-through logs collected from commercial search engines. Reliability is defined in probabilistic terms, and the context of user click behavior on search results is analyzed in terms of relevance. Five features, namely query number, click entropy, first click ratio, last click ratio, and rank position, are proposed and studied to separate reliable user clicks from the others. Experimental results show that the proposed method evaluates the reliability of user behavior effectively: the AUC value of the ROC curve is 0.792, and the algorithm retains 92.8% of relevant clicks when filtering out 40% of low-quality clicks.
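
The abstract names click entropy as a feature and reports an AUC value, but it does not give the paper's formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes click entropy is the Shannon entropy of the click distribution over result URLs for a single query, and it computes AUC as the probability that a randomly chosen reliable click receives a higher score than a randomly chosen unreliable one. The function names `click_entropy` and `auc` and all example inputs are hypothetical.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of clicks over URLs for one query.

    Assumed definition; the abstract does not state the paper's formula.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum -p * log2(p) over the observed click shares of each URL.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a reliable (positive) click, 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under this assumed definition, a query whose clicks all land on one URL has entropy 0, while clicks spread evenly over four URLs give 2 bits. The pairwise AUC is quadratic in the number of clicks, which is acceptable for illustration but would need a rank-based formulation for logs at the scale the abstract describes.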