Exponential time improvement for min-wise based algorithms
Information and Computation
Exponential time improvement for min-wise based algorithms
Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Faster upper bounding of intersection sizes
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
A common technique for processing conjunctive queries is to first match each predicate separately using an index lookup, and then compute the intersection of the resulting rowid lists, via an AND-tree. The performance of this technique depends crucially on the order of lists in this tree: it is important to compute early the intersections that will produce small results. But this optimization is hard to do when the data or predicates have correlation. We present a new algorithm for ordering the lists in an AND-tree tree by sampling the intermediate intersection sizes. We prove that our algorithm is near-optimal and validate its effectiveness experimentally on datasets with a variety of distributions.