Range selection and median: tight cell probe lower bounds and adaptive data structures

  • Authors:
  • Allan Grønlund Jørgensen; Kasper Green Larsen

  • Affiliations:
  • Aarhus University, Denmark; Aarhus University, Denmark

  • Venue:
  • Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms
  • Year:
  • 2011

Abstract

Range selection is the problem of preprocessing an input array A of n unique integers, such that given a query (i, j, k), one can report the k'th smallest integer in the subarray A[i], A[i + 1], ..., A[j]. In this paper we consider static data structures in the word-RAM model for range selection and several natural special cases thereof. The first special case is known as range median, which arises when k is fixed to ⌊(j - i + 1)/2⌋. The second case, denoted prefix selection, arises when i is fixed to 0. Finally, we also consider the bounded rank prefix selection problem and the fixed rank range selection problem. In the former, data structures must support prefix selection queries under the assumption that k ≤ κ for some value κ ≤ n given at construction time, while in the latter, data structures must support range selection queries where k is fixed beforehand for all queries. We prove cell probe lower bounds for range selection, prefix selection and range median, stating that any data structure that uses S words of space needs Ω(log n / log(Sw/n)) time to answer a query. In particular, any data structure that uses n log^{O(1)} n space needs Ω(log n / log log n) time to answer a query, and any data structure that supports queries in constant time needs n^{1+Ω(1)} space. For data structures that use n log^{O(1)} n space, this matches the best known upper bound. Additionally, we present a linear space data structure that supports range selection queries in O(log k / log log n + log log n) time. Finally, we prove that any data structure that uses S space needs Ω(log κ / log(Sw/n)) time to answer a bounded rank prefix selection query and Ω(log k / log(Sw/n)) time to answer a fixed rank range selection query. This shows that our data structure is optimal except for small values of k.
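
To make the query definitions concrete, the following is a minimal Python sketch of a naive reference implementation. It is not the data structure from the paper: it does no preprocessing and simply sorts the queried subarray, so each query costs O(m log m) time for a subarray of length m. The function names are hypothetical, and the sketch assumes 0-based indices i and j (both inclusive) and a 1-based rank k. Clamping the median rank to 1 for a single-element subarray is also an assumption of this sketch, since ⌊(j - i + 1)/2⌋ is 0 in that case.

```python
from typing import Sequence


def range_select(a: Sequence[int], i: int, j: int, k: int) -> int:
    """Return the k'th smallest (1-based k) element of a[i..j], ends inclusive.

    Naive baseline: sorts the subarray on every query.
    """
    if not (0 <= i <= j < len(a)):
        raise IndexError("need 0 <= i <= j < len(a)")
    if not (1 <= k <= j - i + 1):
        raise ValueError("need 1 <= k <= j - i + 1")
    return sorted(a[i : j + 1])[k - 1]


def range_median(a: Sequence[int], i: int, j: int) -> int:
    """Range median: rank k fixed to floor((j - i + 1) / 2).

    Assumption of this sketch: clamp k to 1 so that a single-element
    subarray still has a valid rank.
    """
    k = max(1, (j - i + 1) // 2)
    return range_select(a, i, j, k)


def prefix_select(a: Sequence[int], j: int, k: int) -> int:
    """Prefix selection: range selection with i fixed to 0."""
    return range_select(a, 0, j, k)
```

For example, with `a = [5, 1, 4, 2, 3]`, the call `range_select(a, 1, 3, 2)` looks at the subarray 1, 4, 2 and returns 2. A baseline like this is mainly useful as a correctness oracle when testing a faster structure.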