Evaluation by comparing result sets in context
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Question answering systems increasingly need to deal with complex information needs that require more than simple factoid answers. The evaluation of such systems is usually carried out using precision- or recall-based system performance metrics. Previous work has demonstrated that when users are shown two search result lists side-by-side, they can reliably differentiate between the qualities of the lists. We investigate the consistency between this user-based approach and system-oriented metrics in the question answering environment. Our initial results indicate that the two methodologies show a high level of disagreement.
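To make the comparison concrete, here is a minimal illustrative sketch (not taken from the paper) of how a system-oriented preference between two result lists can be derived from precision- and recall-based scoring, which could then be checked against a user's side-by-side preference. The helper names, the use of F1 as the combined score, and the data are all assumptions for illustration.

```python
# Illustrative sketch: derive a system-oriented preference between two
# result lists from precision/recall (combined via F1), as one might do
# before comparing against user side-by-side judgements.

def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one result list."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

def system_preference(list_a, list_b, relevant):
    """Prefer the list with the higher F1; 'tie' if equal."""
    fa = f1(*precision_recall(list_a, relevant))
    fb = f1(*precision_recall(list_b, relevant))
    return "A" if fa > fb else "B" if fb > fa else "tie"

# Hypothetical relevance judgements for one question.
relevant = {"d1", "d2", "d3", "d4"}
pref = system_preference(["d1", "d2", "d9"],
                         ["d1", "d5", "d6", "d7"],
                         relevant)
print(pref)  # → A
```

A study of the kind described above would tabulate how often this metric-derived preference agrees with the preference users express when shown the two lists side by side.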