Retrievability is a measure of access that quantifies how easily documents can be found using a retrieval system. Such a measure is of particular interest in the patent domain: if a retrieval system makes some patents hard to find, a patent searcher may miss important and relevant patents simply because of the retrieval system. In this paper, we describe measures of retrievability and show how they can be applied to measure the overall access to a collection under a given retrieval system. We then identify three features of best-match retrieval models that are hypothesized to improve access to all documents in the collection: sensitivity to term frequency, length normalization, and convexity. Since patent searchers tend to favor Boolean models over best-match models, we propose hybrid retrieval models that incorporate these features while preserving the desirable aspects of the traditional Boolean model. An empirical study conducted on four large patent corpora demonstrates that these hybrid models provide better access to the corpus of patents than the traditional Boolean model.
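To make the idea of a retrievability measure concrete, the sketch below shows one common way such a score can be computed. It is an illustration only, not necessarily the formulation used in the paper. It assumes that a document's retrievability is the number of queries for which it appears in the top `cutoff` results, that all queries are weighted equally, and that overall access to the collection is summarized with a Gini coefficient over these scores (0 means every document is equally retrievable; values near 1 mean access is concentrated on a few documents). The function names `retrievability` and `gini` and the toy rankings are hypothetical.

```python
from collections import Counter


def retrievability(ranked_lists, doc_ids, cutoff):
    """Count, per document, the queries that return it in the top `cutoff`.

    Assumption: every query has equal weight and a document counts as
    retrievable for a query only if it is ranked within the cutoff.
    """
    counts = Counter()
    for ranking in ranked_lists:
        for doc_id in ranking[:cutoff]:
            counts[doc_id] += 1
    # Documents that were never retrieved get an explicit score of 0.
    return {doc_id: counts.get(doc_id, 0) for doc_id in doc_ids}


def gini(values):
    """Gini coefficient of non-negative scores (0 = perfectly equal access)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy example: three query result lists over a four-document collection.
rankings = [
    ["d1", "d2", "d3"],
    ["d1", "d3", "d2"],
    ["d2", "d1", "d4"],
]
docs = ["d1", "d2", "d3", "d4"]

scores = retrievability(rankings, docs, cutoff=2)
inequality = gini(scores.values())
print(scores, round(inequality, 3))
```

Under these assumptions, comparing the Gini coefficient obtained with two retrieval models over the same query set gives a rough indication of which model spreads access more evenly across the collection.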