Data structures for maintaining set partitions

  • Authors: Michael A. Bender; Saurabh Sethia; Steven S. Skiena

  • Affiliations: Department of Computer Science, SUNY Stony Brook, Stony Brook, New York; School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon; Department of Computer Science, SUNY Stony Brook, Stony Brook, New York

  • Venue: Random Structures & Algorithms
  • Year: 2004

Abstract

Efficiently maintaining the partition induced by a set of features is an important problem in building decision-tree classifiers. To identify a small set of discriminating features, we must be able to add and remove specific features efficiently and to determine the effect of these changes on the induced classification or partition. In this paper we introduce a variety of randomized and deterministic data structures that support these operations on both general and geometrically induced set partitions. We give both Monte Carlo and Las Vegas data structures that achieve near-optimal time bounds and are practical to implement. We then provide a faster solution to this problem in the geometric setting. Finally, we present a data structure that efficiently estimates the number of partitions separating elements.
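
To make the add/remove operations concrete, the following is a minimal Python sketch of one possible Monte Carlo approach. It is an illustration under stated assumptions, not the authors' data structure: the abstract does not describe the construction, so the class name `PartitionSketch`, its methods, and the XOR-fingerprint technique are all hypothetical choices made here. Each feature is assumed to be a subset of the elements; two elements lie in the same block of the induced partition when they belong to exactly the same active features. Each active feature receives a random 64-bit key, and each element stores the XOR of the keys of the features that contain it, so equal fingerprints indicate the same block with high probability.

```python
import random


class PartitionSketch:
    """Hypothetical Monte Carlo sketch (not the paper's structure).

    Elements are the integers 0..n-1. Elements with equal fingerprints are
    reported as lying in the same block; a false "same block" answer is
    possible with small probability when random keys happen to collide.
    """

    def __init__(self, n, seed=0):
        self.rng = random.Random(seed)
        # fingerprint[e] = XOR of the keys of all active features containing e
        self.fingerprint = [0] * n
        # feature name -> (random key, frozen set of member elements)
        self.active = {}

    def add_feature(self, name, members):
        if name in self.active:
            raise KeyError(f"feature {name!r} is already active")
        members = frozenset(members)  # drop duplicates so each element is XORed once
        key = self.rng.getrandbits(64)
        self.active[name] = (key, members)
        for e in members:
            self.fingerprint[e] ^= key

    def remove_feature(self, name):
        # XOR is its own inverse, so removal repeats the same update.
        key, members = self.active.pop(name)
        for e in members:
            self.fingerprint[e] ^= key

    def same_block(self, a, b):
        return self.fingerprint[a] == self.fingerprint[b]

    def num_blocks(self):
        return len(set(self.fingerprint))
```

In this sketch, adding or removing a feature costs time proportional to the size of the feature, and `same_block` takes constant time. Because answers rest on random fingerprints, it is Monte Carlo in flavor; a Las Vegas variant would need to verify its answers, which this sketch does not attempt.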