We describe a large-scale evaluation of four interactive question answering systems with real users. The purpose of the evaluation was to develop evaluation methods and metrics for interactive QA systems. We present our evaluation as a case study, discussing the design and administration of its components and the effectiveness of several evaluation techniques with respect to their validity and discriminatory power. Our goal is to provide a roadmap for others conducting evaluations of their own systems, and to put forward a research agenda for interactive QA evaluation.
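As an illustration of what "discriminatory power" can mean in practice, the sketch below runs a paired sign test over per-user scores for two systems, a common way in IR evaluation to check whether a metric separates systems. The data and helper name are hypothetical, not taken from the study described above.

```python
# Minimal sketch (hypothetical data): paired sign test as a probe of a
# metric's discriminatory power between two interactive QA systems.
from math import comb

def sign_test_p(scores_a, scores_b):
    """Two-sided sign test p-value for paired score lists (ties dropped)."""
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins + losses
    if n == 0:
        return 1.0  # all ties: no evidence either way
    k = min(wins, losses)
    # Two-sided tail probability under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical per-user satisfaction scores for two systems.
sys_a = [4, 5, 3, 4, 5, 4, 5, 4]
sys_b = [3, 4, 3, 2, 4, 3, 4, 3]
p = sign_test_p(sys_a, sys_b)
print(f"sign-test p = {p:.4f}")  # a small p suggests the metric discriminates
```

A metric that yields small p-values on system pairs that genuinely differ, while staying non-significant on equivalent ones, is the informal sense in which one evaluation technique has more discriminatory power than another.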