Study on the Click Context of Web Search Users for Reliability Analysis

  • Authors:
  • Rongwei Cen;Yiqun Liu;Min Zhang;Liyun Ru;Shaoping Ma

  • Affiliations:
  • State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, ...

  • Venue:
  • AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

User behavior information analysis has been shown important for optimization and evaluation of Web search and has become one of the major areas in both information retrieval and knowledge management researches. This paper focuses on users' searching behavior reliability study based on large scale query and click-through logs collected from commercial search engines. The concept of reliability is defined in a probabilistic notion. The context of user click behavior on search results is analyzed in terms of relevance. Five features, namely query number, click entropy, first click ratio, last click ratio, and rank position, are proposed and studied to separate reliable user clicks from the others. Experimental results show that the proposed method evaluates the reliability of user behavior effectively. The AUC value of the ROC curve is 0.792, and the algorithm maintains 92.8% relevant clicks when filtering out 40% low quality clicks.