Information retrieval (IR) evaluation scores are generally designed to measure how effectively relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have focused primarily on aspects of precision and recall, and although the two are often discussed as equally important, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores for comparing different IR systems on this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of both recall and the user's search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of its performance shows that PRES is well suited to measuring the retrieval effectiveness of systems from a recall-focused perspective while taking into account the user's expected search effort.
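As a rough illustration of how a metric can combine recall with expected search effort, the sketch below implements the commonly cited PRES formulation, PRES = 1 − ((Σrᵢ/n − (n+1)/2) / N_max), where rᵢ are the ranks of the n relevant documents and N_max is the maximum number of results the user is willing to examine; relevant documents not retrieved within N_max are assigned worst-case ranks immediately after the cut-off. This reconstruction is an assumption based on the published metric, not a definition given in the abstract above, and the function and variable names are illustrative.

```python
def pres(relevant_ranks, n_relevant, n_max):
    """Sketch of PRES under the assumed formulation:
    PRES = 1 - ((sum(r_i)/n - (n+1)/2) / N_max).

    relevant_ranks: 1-based ranks at which relevant documents were retrieved.
    n_relevant:     total number of relevant documents for the topic.
    n_max:          maximum number of results the user will examine.
    """
    # Keep only relevant documents retrieved within the cut-off.
    ranks = sorted(r for r in relevant_ranks if r <= n_max)
    # Worst-case assumption: each missed relevant document would only be
    # found just after the cut-off, at ranks n_max+1, n_max+2, ...
    missed = n_relevant - len(ranks)
    ranks += [n_max + i for i in range(1, missed + 1)]
    mean_rank = sum(ranks) / n_relevant
    return 1.0 - (mean_rank - (n_relevant + 1) / 2.0) / n_max


# A perfect run (all relevant documents at the top ranks) scores 1.0;
# a run that retrieves no relevant documents within the cut-off scores 0.0.
print(pres([1, 2, 3], n_relevant=3, n_max=100))  # -> 1.0
print(pres([], n_relevant=3, n_max=100))         # -> 0.0
```

Under this formulation the score degrades smoothly as relevant documents appear deeper in the ranking, so a system that finds all relevant documents late is penalised relative to one that finds them early, while a system that misses them entirely within N_max scores lowest.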