We introduce a method for evaluating the relevance of all visible components of a Web search results page, in the context of that page. Unlike Cranfield-style evaluation methods, our approach recognizes that a user's initial search interaction is with the results page produced by a search system, not with the landing pages linked from it. Our key contribution is that the method allows us to investigate aspects of component relevance that are difficult or impossible to judge in isolation; such contextual aspects include component-level information redundancy and cross-component coherence. We report on how the method complements traditional document relevance measurement and on its support for comparative relevance assessment across multiple search engines. We also study possible issues in applying the method, including brand presentation effects, inter-judge agreement, and comparisons with document-based relevance judgments. Our findings show that the method is useful for evaluating the dominant user experience in interacting with search systems.
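As a concrete illustration of how contextual, component-level judgments might be aggregated and checked for consistency, the sketch below models a results page as a list of judged components, computes a position-discounted page-level score, and measures inter-judge agreement with Cohen's kappa. Everything here is an assumption for illustration: the `Component` representation, the DCG-like discount in `page_score`, and the choice of kappa are not taken from the paper.

```python
# Minimal sketch of SERP-level scoring from contextual component judgments.
# All names and formulas are illustrative assumptions, not the paper's method.
from dataclasses import dataclass
from collections import Counter
import math


@dataclass
class Component:
    """One visible SERP component, judged in the context of the full page."""
    kind: str              # e.g. "organic", "answer", "image_block"
    position: int          # 1-based position on the page
    judgments: list[int]   # graded contextual relevance, one label per judge


def page_score(components: list[Component]) -> float:
    """Aggregate mean per-component judgments with a log position discount
    (a DCG-like convention; the paper may weight components differently)."""
    score = 0.0
    for c in components:
        mean_rel = sum(c.judgments) / len(c.judgments)
        score += mean_rel / math.log2(c.position + 1)
    return score


def cohen_kappa(a: list[int], b: list[int]) -> float:
    """Cohen's kappa between two judges' labels over the same components."""
    assert len(a) == len(b) and a
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[lbl] * cb[lbl] for lbl in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0


if __name__ == "__main__":
    serp = [
        Component("answer", 1, judgments=[3, 3]),
        Component("organic", 2, judgments=[2, 3]),
        Component("organic", 3, judgments=[1, 1]),  # judged low: redundant
                                                    # with the answer above?
    ]
    print(f"page score: {page_score(serp):.3f}")
    judge1 = [c.judgments[0] for c in serp]
    judge2 = [c.judgments[1] for c in serp]
    print(f"inter-judge kappa: {cohen_kappa(judge1, judge2):.3f}")
```

Running this toy example prints a page score around 5.08 and a kappa of 0.5 for the two hypothetical judges; a low kappa on contextual judgments would surface exactly the kind of inter-judge disagreement the abstract flags as a possible issue with the method.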