The effectiveness of metasearch data fusion procedures depends crucially on the properties of the distributions of common documents, that is, documents retrieved by more than one search engine. Because we usually know neither how different search engines assign relevance scores nor how similar these assignments are, the documents shared by the individual ranked lists are the only basis for combining search results. It is therefore important to study the properties of these distributions. One such property is the Overlap Property (OP) of documents retrieved by different search engines. According to OP, the overlap between relevant documents is usually greater than the overlap between non-relevant ones. Although OP has been repeatedly observed and discussed, no theoretical explanation of this empirical property has been given. This paper presents a formal study of the properties of common document distributions. In particular, a necessary and sufficient condition for OP is derived, and it is proved that OP should hold under practically arbitrary circumstances.
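To make the Overlap Property concrete, the following minimal sketch compares the overlap among relevant documents with the overlap among non-relevant documents for two result lists. The function name `overlap_rates`, the toy document identifiers, and the relevance judgments are all hypothetical. The overlap measure used here (twice the number of shared documents divided by the total number retrieved by both engines) is one common choice and is an assumption; the abstract does not specify which measure the paper uses.

```python
def overlap_rates(list_a, list_b, relevant):
    """Return (relevant_overlap, nonrelevant_overlap) for two result lists.

    Overlap is measured as 2 * |X & Y| / (|X| + |Y|), an assumed
    Dice-style coefficient, not necessarily the paper's definition.
    """
    a, b = set(list_a), set(list_b)

    # Split each engine's results into relevant and non-relevant parts.
    rel_a, rel_b = a & relevant, b & relevant
    non_a, non_b = a - relevant, b - relevant

    def dice(x, y):
        total = len(x) + len(y)
        return 2 * len(x & y) / total if total else 0.0

    return dice(rel_a, rel_b), dice(non_a, non_b)


# Toy data (hypothetical): two engines and a set of known relevant documents.
engine_a = ["d1", "d2", "d3", "d4", "d5", "d6"]
engine_b = ["d2", "d3", "d7", "d4", "d8", "d9"]
relevant = {"d1", "d2", "d3"}

r_overlap, n_overlap = overlap_rates(engine_a, engine_b, relevant)
print(f"relevant overlap:     {r_overlap:.3f}")
print(f"non-relevant overlap: {n_overlap:.3f}")
```

In this toy example the two engines share two of their relevant documents but only one of their non-relevant ones, so the relevant overlap exceeds the non-relevant overlap, which is the pattern OP describes. The example only illustrates the definition; it says nothing about why OP holds in general.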