Sorting streamed multisets

  • Authors: Travis Gagie
  • Affiliations: Dipartimento di Informatica, Università del Piemonte Orientale, Alessandria, AL, Italy
  • Venue: Information Processing Letters
  • Year: 2008

Abstract

Sorting is a classic problem and one to which many others reduce easily. In the streaming model, however, we are allowed only one pass over the input and sublinear memory, so in general we cannot sort. In this paper we show that, to determine the sorted order of a multiset $s$ of size $n$ containing $\sigma \geq 2$ distinct elements using one pass and $o(n \log \sigma)$ bits of memory, it is generally necessary and sufficient that its entropy $H = o(\log \sigma)$. Specifically, if $s = \{s_1, \ldots, s_n\}$ and $s_{i_1}, \ldots, s_{i_n}$ is the stable sort of $s$, then we can compute $i_1, \ldots, i_n$ in one pass using $O((H + 1) n)$ time and $O(H n)$ bits of memory, with a simple combination of classic techniques. On the other hand, in the worst case it takes that much memory to compute any sorted ordering of $s$ in one pass.
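
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that H denotes the zeroth-order empirical entropy of the multiset (the abstract does not spell out the definition), and it computes the stable-sort indices in one pass by keeping a list of positions for each distinct element. The function names empirical_entropy and stable_sort_indices are hypothetical. The plain Python dictionary and lists used here do not achieve the O(Hn)-bit bound; they only illustrate what the output i_1, ..., i_n looks like.

```python
from collections import Counter
from math import log2


def empirical_entropy(items):
    """Zeroth-order empirical entropy, in bits per element (assumed meaning of H)."""
    n = len(items)
    counts = Counter(items)
    # Sum over distinct elements of (c/n) * log2(n/c).
    return sum((c / n) * log2(n / c) for c in counts.values())


def stable_sort_indices(stream):
    """Return 1-based indices i_1, ..., i_n such that s[i_1], ..., s[i_n] is the stable sort.

    The stream is read exactly once. Positions are grouped by element value,
    so equal elements keep their arrival order. Memory here is NOT compressed;
    this is an illustration only.
    """
    positions = {}
    for i, x in enumerate(stream, start=1):
        positions.setdefault(x, []).append(i)

    order = []
    # Only the distinct values are sorted, not the whole stream.
    for key in sorted(positions):
        order.extend(positions[key])
    return order


if __name__ == "__main__":
    s = "banana"
    print(stable_sort_indices(s))  # [2, 4, 6, 1, 3, 5]
    print(empirical_entropy(s))
```

In this sketch, a low-entropy multiset has few distinct elements or a highly skewed distribution, so the per-element position lists are long and regular; that regularity is what a compressed representation could exploit.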