External selection

Authors:
Jop F. Sibeyn
Affiliations:
Max-Planck-Institut für Informatik, Saarbrücken, Germany
Venue:
STACS'99 Proceedings of the 16th annual conference on Theoretical aspects of computer science
Year:
1999

Abstract

Sequential selection was solved in linear time by Blum et al. Running their algorithm on a problem of size N with N > M, where M is the size of the main memory, yields an algorithm that reads and writes O(N) elements, while the number of comparisons is also bounded by O(N). This is asymptotically optimal, but the constants are so large that in practice sorting is faster for most values of M and N. This paper provides the first detailed study of the external selection problem. A randomized algorithm of a conventional type is close to optimal in all respects. Our deterministic algorithm is largely the same, except that it first builds an index structure over all the elements. This effort is not wasted: the index structure allows elements to be retrieved directly, so a second scan through all the data is not needed. The index structure can also be used for repeated selections and can be extended over time. For a problem of size N, the deterministic algorithm reads N + o(N) elements and writes only o(N) elements, and is thereby optimal to within lower-order terms.
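
The abstract does not spell out the randomized algorithm, so the following is only a generic sample-and-filter sketch of what a "conventional" randomized external selection might look like, not the paper's method. It assumes a hypothetical `scan` callable that returns a fresh iterator over the data (standing in for one sequential pass over the disk) and a `memory` parameter playing the role of M. The sample size, the slack around the pivots, and the full-sort fallback are illustrative choices, and the fallback deliberately ignores the memory limit so that the sketch always terminates.

```python
import random


def external_select(scan, n, k, memory, rng=None, max_attempts=8):
    """Return the element of rank k (0-based) among the n elements of scan().

    scan   -- hypothetical callable returning a fresh iterator over the data;
              each call stands in for one sequential pass over the disk.
    memory -- assumed number of elements that fit in main memory (M).
    """
    rng = rng or random.Random()
    if n <= memory:                       # small input: sort in memory
        return sorted(scan())[k]

    s = max(2, memory // 2)               # sample size (illustrative choice)
    for _ in range(max_attempts):
        # Pass 1: reservoir-sample s elements from the stream.
        sample = []
        for i, x in enumerate(scan()):
            if i < s:
                sample.append(x)
            else:
                j = rng.randint(0, i)     # inclusive on both ends
                if j < s:
                    sample[j] = x
        sample.sort()

        # Choose two pivots that should bracket rank k, with some slack.
        pos = k * s // n
        slack = max(1, int(s ** 0.5))
        lo = sample[max(0, pos - slack)]
        hi = sample[min(s - 1, pos + slack)]

        # Pass 2: count elements below/equal to the pivots and keep only
        # the candidates strictly between them.
        below = eq_lo = eq_hi = 0
        middle = []
        overflow = False
        for x in scan():
            if x < lo:
                below += 1
            elif x == lo:
                eq_lo += 1
            elif x < hi:
                middle.append(x)
                if len(middle) > memory:  # candidates do not fit: give up
                    overflow = True
                    break
            elif x == hi:
                eq_hi += 1
        if overflow or k < below:
            continue                      # pivots missed the rank: retry

        r = k - below
        if r < eq_lo:
            return lo
        r -= eq_lo
        if r < len(middle):
            return sorted(middle)[r]
        r -= len(middle)
        if r < eq_hi:
            return hi
        # Otherwise rank k lies above hi: retry with a new sample.

    # Fallback for the sketch only: ignores the memory limit.
    return sorted(scan())[k]
```

Each successful attempt costs two passes over the data. The deterministic algorithm described above is said to avoid the second pass by consulting its index structure instead, which this sketch does not attempt to model.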