Information retrieval effectiveness evaluation typically takes one of two forms: batch experiments based on static test collections, or lab studies measuring actual users interacting with a system. Test collection experiments are sometimes viewed as introducing too many simplifying assumptions to accurately predict the usefulness of a system to its users. As a result, there is great interest in creating test collections and measures that better model user behavior. One line of research involves developing measures that include a parameterized user model; choosing a parameter value simulates a particular type of user. We propose that these measures offer an opportunity to more accurately simulate the variance due to user behavior, and thus to analyze system effectiveness with respect to a simulated user population. We introduce a Bayesian procedure for producing sampling distributions from click data, and show how to use statistical tools to quantify the effects of variance due to parameter selection.
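The idea can be illustrated with a minimal sketch. Here we assume rank-biased precision (RBP) as the parameterized measure, with its persistence parameter drawn from a Beta posterior fit to hypothetical click continue/stop counts (a uniform Beta(1, 1) prior); each draw simulates one user, and the resulting score distribution reflects variance due to parameter selection. The relevance vector and click counts below are illustrative, not from the paper.

```python
import random

def rbp(rels, p):
    """Rank-biased precision: expected utility for a user who
    moves from one result to the next with persistence p."""
    return (1 - p) * sum(r * p ** i for i, r in enumerate(rels))

def sample_p(continues, stops, rng):
    """Draw a persistence value from a Beta posterior whose counts
    come from observed continue/stop events (Beta(1, 1) prior)."""
    return rng.betavariate(continues + 1, stops + 1)

def rbp_distribution(rels, continues, stops, n=10000, seed=0):
    """Simulate a user population: each draw is one simulated
    user's persistence, yielding a distribution of RBP scores
    rather than a single point estimate."""
    rng = random.Random(seed)
    return [rbp(rels, sample_p(continues, stops, rng)) for _ in range(n)]

# Hypothetical binary relevance vector and click counts.
scores = rbp_distribution([1, 0, 1, 1, 0], continues=80, stops=20)
mean_score = sum(scores) / len(scores)
```

Standard statistical tools can then be applied to `scores` (e.g., credible intervals, or variance decomposition across systems) to quantify how much of the observed effectiveness difference is attributable to user variability.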