Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Accurately interpreting clickthrough data as implicit feedback
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
User performance versus precision measures for simple search tasks
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
How well does result relevance predict session satisfaction?
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
The relationship between IR effectiveness measures and user satisfaction
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
An experimental comparison of click position-bias models
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
A new interpretation of average precision
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness
ACM Transactions on Information Systems (TOIS)
A dynamic bayesian network click model for web search ranking
Proceedings of the 18th international conference on World wide web
Expected reciprocal rank for graded relevance
Proceedings of the 18th ACM conference on Information and knowledge management
Click-based evidence for decaying weight distributions in search effectiveness metrics
Information Retrieval
A user behavior model for average precision and its generalization to graded judgments
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Expected browsing utility for web search evaluation
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
System effectiveness, user models, and user utility: a conceptual framework for investigation
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
What deliberately degrading search quality tells us about discount functions
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Discounted cumulative gain and user decision models
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Simulating simple user behavior for system effectiveness evaluation
Proceedings of the 20th ACM international conference on Information and knowledge management
Relative effect of spam and irrelevant documents on user interaction with search engines
Proceedings of the 20th ACM international conference on Information and knowledge management
Time-based calibration of effectiveness measures
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 4th Information Interaction in Context Symposium
Stochastic simulation of time-biased gain
Proceedings of the 21st ACM international conference on Information and knowledge management
Models and metrics: IR evaluation as a user process
Proceedings of the Seventeenth Australasian Document Computing Symposium
Choices in batch information retrieval evaluation
Proceedings of the 18th Australasian Document Computing Symposium
Hi-index | 0.00 |
Retrieval system effectiveness can be measured in two quite different ways: by monitoring the behavior of users and gathering data about the ease and accuracy with which they accomplish certain specified information-seeking tasks; or by using numeric effectiveness metrics to score system runs in reference to a set of relevance judgments. In the second approach, the effectiveness metric is chosen in the belief that user task performance, if it were to be measured by the first approach, should be linked to the score provided by the metric. This work explores that link, by analyzing the assumptions and implications of a number of effectiveness metrics, and exploring how these relate to observable user behaviors. Data recorded as part of a user study included user self-assessment of search task difficulty; gaze position; and click activity. Our results show that user behavior is influenced by a blend of many factors, including the extent to which relevant documents are encountered, the stage of the search process, and task difficulty. These insights can be used to guide development of batch effectiveness metrics.