A search log-based approach to evaluation
ECDL '10: Proceedings of the 14th European Conference on Research and Advanced Technology for Digital Libraries
Evaluation is needed to benchmark and improve systems. In information retrieval (IR), evaluation centers on the test collection: a fixed set of documents, a set of queries representing user information needs, and judgments of which documents are relevant to each query. Much of this evaluation is uniform, i.e. there is a single test collection and every query is processed in the same way by a system. But does one size fit all? Queries are created by different users in different contexts. This paper presents a method to contextualize IR evaluation using search logs. We study search log files in the archival domain, in particular the retrieval of archival finding aids encoded in the popular Encoded Archival Description (EAD) standard. We analyze various aspects of searching behavior in the log and use them to define searcher stereotypes. Focusing on two stereotypes, novice and expert users, we automatically derive queries and pseudo-relevance judgments from the interaction data in the log files. We investigate how these can be used for context-sensitive system evaluation tailored to each user stereotype. Our findings are in line with, and complement, prior user studies of archival users. The results also show that satisfying the demands of expert users is harder than satisfying those of novices, as experts have more challenging information seeking needs, but also that the relative IR performance between the two user groups is consistent regardless of which system is chosen.
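As a concrete illustration of the derivation step described above, the following Python sketch groups a search log into sessions, classifies each session as novice or expert, and treats clicked documents as pseudo-relevance judgments for the query that produced them. The tab-separated log format and the classification heuristics (average query length, operator use) are assumptions made for illustration; the paper's actual features and thresholds come from its own log analysis.

import re
from collections import defaultdict

# A minimal sketch of the log-mining step described in the abstract.
# The tab-separated log format, field layout, and the novice/expert
# heuristics below are illustrative assumptions, not the paper's rules.

EXPERT_MIN_AVG_TERMS = 3  # assumed: experts issue longer queries on average
OPERATOR_PATTERN = re.compile(r'"|\bAND\b|\bOR\b|\bNOT\b')  # assumed expertise signal

def classify_session(queries):
    """Label a session 'expert' or 'novice' using assumed behavioral heuristics."""
    avg_terms = sum(len(q.split()) for q in queries) / len(queries)
    uses_operators = any(OPERATOR_PATTERN.search(q) for q in queries)
    return "expert" if avg_terms >= EXPERT_MIN_AVG_TERMS or uses_operators else "novice"

def derive_topics_and_judgments(log_lines):
    """Yield (stereotype, query, clicked_docs) triples from raw log lines.

    Each input line is assumed to be: session_id <TAB> query <TAB> clicked_doc_id,
    with clicked_doc_id empty when the query produced no click. Clicked
    documents serve as pseudo-relevance judgments for their query.
    """
    sessions = defaultdict(list)
    for line in log_lines:
        session_id, query, doc_id = line.rstrip("\n").split("\t")
        sessions[session_id].append((query, doc_id))

    for events in sessions.values():
        stereotype = classify_session([q for q, _ in events])
        clicks = defaultdict(set)
        for query, doc_id in events:
            if doc_id:
                clicks[query].add(doc_id)
        for query, docs in clicks.items():
            yield stereotype, query, docs

# Example: split the derived topics into per-stereotype evaluation sets.
log = [
    "s1\tead finding aid amsterdam\tdoc42",
    's1\t"notarial archives" AND 17th century\tdoc7',
    "s2\tamsterdam\t",
    "s2\tamsterdam archive\tdoc42",
]
by_group = defaultdict(list)
for stereotype, query, docs in derive_topics_and_judgments(log):
    by_group[stereotype].append((query, sorted(docs)))
print(dict(by_group))

In a full evaluation, the per-stereotype (query, judged documents) pairs would feed a standard test-collection evaluation, run once per user group, so that system performance can be compared separately for novices and experts.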