Fast intersection algorithms for sorted sequences

Authors:
Ricardo Baeza-Yates; Alejandro Salinger
Affiliations:
Yahoo! Research, Barcelona, Spain; Dept. of Computer Science, Univ. of Waterloo, Canada
Venue:
Algorithms and Applications
Year:
2010

Abstract

This paper presents and analyzes a simple intersection algorithm for sorted sequences that is fast on average. It is related to the multiple searching problem and to merging. We present worst-case and average-case analyses, showing that in the worst case the complexity adapts to the size of the smaller list. In the average case, it performs fewer comparisons than the total number of elements in both inputs, n and m, when n = αm (α > 1), achieving O(m log(n/m)) complexity. The algorithm is motivated by its application to fast query processing in Web search engines, where large intersections, or differences, must be computed quickly. In this setting, we show experimentally that the algorithm is faster than previous solutions.
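The abstract does not spell out the algorithm's steps. As a rough illustration only, the following Python sketch shows one divide-and-conquer scheme that fits the description (a combination of searching and merging): take the median of the shorter range, binary search for it in the longer range, and recurse on the two halves. The function names `intersect_sorted` and `_intersect` are hypothetical, the lists are assumed to be sorted and free of duplicates, and nothing here is taken from the authors' own implementation.

```python
from bisect import bisect_left


def _intersect(a, a_lo, a_hi, b, b_lo, b_hi, out):
    """Intersect the half-open ranges a[a_lo:a_hi] and b[b_lo:b_hi]."""
    if a_lo >= a_hi or b_lo >= b_hi:
        return  # one side is empty, so nothing can match
    # Always probe with the median of the shorter range.
    if a_hi - a_lo > b_hi - b_lo:
        a, a_lo, a_hi, b, b_lo, b_hi = b, b_lo, b_hi, a, a_lo, a_hi
    mid = (a_lo + a_hi) // 2
    pivot = a[mid]
    # Binary search for the pivot inside the longer range.
    pos = bisect_left(b, pivot, b_lo, b_hi)
    found = pos < b_hi and b[pos] == pivot
    # Elements smaller than the pivot can only match to the left of pos.
    _intersect(a, a_lo, mid, b, b_lo, pos, out)
    if found:
        out.append(pivot)
    # Elements larger than the pivot can only match to the right of pos.
    _intersect(a, mid + 1, a_hi, b, pos + 1 if found else pos, b_hi, out)


def intersect_sorted(x, y):
    """Return the sorted intersection of two sorted, duplicate-free lists."""
    out = []
    _intersect(x, 0, len(x), y, 0, len(y), out)
    return out
```

For example, `intersect_sorted([1, 3, 4, 7, 9], [2, 3, 5, 7, 8, 9, 11])` returns `[3, 7, 9]`. Because the recursion visits the left half, then the pivot, then the right half, the output stays sorted without a final sorting pass.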