Evaluation by comparing result sets in context

  • Authors:
  • Paul Thomas; David Hawking

  • Affiliations:
  • Australian National University, Canberra, Australia; CSIRO ICT Centre, Canberra, Australia

  • Venue:
  • CIKM '06: Proceedings of the 15th ACM International Conference on Information and Knowledge Management

  • Year:
  • 2006

Abstract

Familiar evaluation methodologies for information retrieval (IR) are not well suited to the task of comparing systems in many real settings. These systems, and the methods used to evaluate them, must support contextual, interactive retrieval over changing, heterogeneous data collections, including private and confidential information.

We have implemented a comparison tool which can be inserted into the natural IR process. It provides a familiar search interface, presents a small number of result sets in side-by-side panels, elicits searcher judgments, and logs interaction events. The tool permits study of real information needs as they occur, uses the documents actually available at the time of the search, and records judgments that take into account the instantaneous needs of the searcher.

We have validated our proposed evaluation approach and explored potential biases by comparing different whole-of-Web search facilities using a Web-based version of the tool. In four experiments, one with supplied queries in the laboratory and three with real queries in the workplace, subjects showed no discernible left-right bias and were able to reliably distinguish between high- and low-quality result sets. We found that judgments were strongly predicted by simple implicit measures.

Following validation, we undertook a case study comparing two leading whole-of-Web search engines. The approach is now being used in several ongoing investigations.
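To make the methodology concrete, the sketch below illustrates the core mechanics the abstract describes: randomizing which system's result set appears in the left or right panel (so any positional preference averages out across sessions), and logging timestamped interaction events alongside the searcher's side-by-side judgment. This is a minimal illustrative sketch, not the authors' implementation; all names (`assign_panels`, `log_event`, the JSON log format, and the session identifiers) are hypothetical.

```python
import json
import random
import time

def assign_panels(results_a, results_b):
    """Randomly assign each system's result set to the left or right
    panel, so left-right bias can be measured and averaged out."""
    if random.random() < 0.5:
        return {"left": ("A", results_a), "right": ("B", results_b)}
    return {"left": ("B", results_b), "right": ("A", results_a)}

def log_event(log_file, session_id, event, detail=None):
    """Append one timestamped interaction event (assignment, click,
    judgment) as a JSON line."""
    record = {"session": session_id, "time": time.time(),
              "event": event, "detail": detail}
    log_file.write(json.dumps(record) + "\n")

# Example session: assign panels, then log a click and a final judgment.
if __name__ == "__main__":
    panels = assign_panels(["doc1", "doc2"], ["doc3", "doc4"])
    with open("events.log", "a") as log_file:
        log_event(log_file, "s01", "panels_assigned",
                  {side: system for side, (system, _) in panels.items()})
        log_event(log_file, "s01", "click", {"side": "left", "rank": 1})
        # The judgment is stored with the underlying system, not just the
        # panel side, so preferences can be analyzed independently of
        # screen position.
        preferred_side = "left"
        log_event(log_file, "s01", "judgment",
                  {"side": preferred_side,
                   "system": panels[preferred_side][0]})
```

Logging clicks and other implicit events in the same stream as explicit judgments is what allows the kind of analysis the abstract reports, where simple implicit measures are tested as predictors of the searcher's stated preference.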