No sorting? Better searching!

  • Authors: Gianni Franceschini; Roberto Grossi

  • Affiliations: University of Pisa, Pisa, Italy; University of Pisa, Pisa, Italy

  • Venue: ACM Transactions on Algorithms (TALG)
  • Year: 2008

Abstract

Questions about order versus disorder in systems and models have fascinated scientists over the years. In computer science, order is intimately related to sorting, commonly understood as the task of arranging keys in increasing or decreasing order with respect to an underlying total order relation. A sorted organization is amenable to searching a set of n keys, since each search requires Θ(log n) comparisons in the worst case, which is optimal if the cost of a single comparison can be considered constant. Nevertheless, we prove that disorder implicitly provides more information than order does. For the general case of searching an array of multidimensional keys whose comparison cost is proportional to their length (and hence cannot be considered constant), we demonstrate that “suitable” disorder gives better bounds than those derivable by using the natural lexicographic order. We start from previous work by Andersson et al. [2001], who proved that Θ(k log log n / log log(4 + k log log n / log n) + k + log n) character comparisons (or probes) is the tight bound for searching a plain sorted array of n keys, each of length k, arranged in lexicographic order. We describe a novel permutation of the n keys that is different from the sorted order. When keys are kept “unsorted” in the array according to this permutation, the complexity of searching drops to Θ(k + log n) character comparisons (or probes) in the worst case, which is optimal among all possible permutations, up to a constant factor. Consequently, disorder carries more information than order does; this fact was not observable before, since both of these bounds are Θ(log n) when k = O(1). More implications are discussed in the article, including searching in the bit-probe model.
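To make the cost model in the abstract concrete, the following minimal sketch counts character probes during a plain binary search over a lexicographically sorted array of equal-length keys. It is only an illustration of why comparison cost grows with key length k: this naive search can spend up to k probes per step, so O(k log n) in total. It is neither the optimal sorted-array algorithm of Andersson et al. nor the permutation described in the article, and the names `compare_counting` and `search_sorted` are hypothetical.

```python
from typing import List, Tuple


def compare_counting(x: str, y: str) -> Tuple[int, int]:
    """Compare two equal-length keys character by character.

    Returns (sign, probes): sign is -1, 0, or 1 as x is less than, equal to,
    or greater than y; probes is the number of character comparisons made.
    """
    probes = 0
    for a, b in zip(x, y):
        probes += 1
        if a != b:
            return (-1 if a < b else 1), probes
    return 0, probes


def search_sorted(keys: List[str], x: str) -> Tuple[int, int]:
    """Naive binary search in a lexicographically sorted array.

    Assumes every key (and x) has the same length k.
    Returns (index, total_probes); index is -1 if x is absent.
    """
    lo, hi = 0, len(keys) - 1
    total = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        sign, probes = compare_counting(x, keys[mid])
        total += probes
        if sign == 0:
            return mid, total
        if sign < 0:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1, total


if __name__ == "__main__":
    keys = sorted(["aaab", "aaac", "aaba", "abaa", "baaa"])
    print(search_sorted(keys, "abaa"))
```

Each iteration restarts the comparison from the first character, which is where the multiplicative k log n cost comes from; the bounds quoted in the abstract concern how far this cost can be reduced for sorted and for suitably permuted arrays.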