A simple optimal representation for balanced parentheses

  • Authors:
  • Richard F. Geary; Naila Rahman; Rajeev Raman; Venkatesh Raman

  • Affiliations:
  • Department of Computer Science, University of Leicester, Leicester LE1 7RH, UK; Department of Computer Science, University of Leicester, Leicester LE1 7RH, UK; Department of Computer Science, University of Leicester, Leicester LE1 7RH, UK; Institute of Mathematical Sciences, Chennai 600 113, India

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2006

Abstract

We consider succinct, or highly space-efficient, representations of a (static) string consisting of n pairs of balanced parentheses, which support natural operations such as finding the matching parenthesis for a given parenthesis, or finding the pair of parentheses that most tightly encloses a given pair. This problem was considered by Jacobson [Space-efficient static trees and graphs, in: Proc. of the 30th FOCS, 1989, pp. 549-554] and by Munro and Raman [Succinct representation of balanced parentheses and static trees, SIAM J. Comput. 31 (2001) 762-776], who gave O(n)-bit and 2n+o(n)-bit representations, respectively, that support the above operations in O(1) time on the RAM model of computation. This data structure is a fundamental tool in succinct representations, and has applications in representing suffix trees, ordinal trees, planar graphs and permutations. We consider the practical performance of parenthesis representations. First, we give a new 2n+o(n)-bit representation that supports all the above operations in O(1) time. This representation is conceptually simpler, its space bound has a smaller o(n) term, and it has a simple and uniform o(n) time and space construction algorithm. We implement our data structure and a variant of Jacobson's, and evaluate their practical performance (speed and memory usage) when used in a succinct representation of trees derived from XML documents. As a baseline, we compare our representations against a widely used implementation of the standard DOM (document object model) representation of XML documents. Both succinct representations use orders of magnitude less space than DOM, and tree traversal operations are usually only slightly slower than in DOM.
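
To make the operations named in the abstract concrete, the following is a minimal Python sketch of their semantics on a plain string of parentheses. It is a naive linear-scan reference, not the 2n+o(n)-bit, O(1)-time structure the paper describes; the function names `find_close`, `find_open` and `enclose` are illustrative assumptions and are not taken from the abstract.

```python
def find_close(s: str, i: int) -> int:
    """Index of the ')' matching the '(' at index i (naive O(n) scan)."""
    if s[i] != "(":
        raise ValueError("index i must hold an opening parenthesis")
    depth = 0
    for j in range(i, len(s)):
        depth += 1 if s[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("string is not balanced")


def find_open(s: str, i: int) -> int:
    """Index of the '(' matching the ')' at index i (naive O(n) scan)."""
    if s[i] != ")":
        raise ValueError("index i must hold a closing parenthesis")
    depth = 0
    for j in range(i, -1, -1):
        depth += 1 if s[j] == ")" else -1
        if depth == 0:
            return j
    raise ValueError("string is not balanced")


def enclose(s: str, i: int):
    """Tightest pair enclosing the pair that opens at index i.

    Returns (open_index, close_index), or None if the pair is outermost.
    """
    if s[i] != "(":
        raise ValueError("index i must hold an opening parenthesis")
    depth = 0  # number of complete sibling pairs currently being skipped
    for j in range(i - 1, -1, -1):
        if s[j] == ")":
            depth += 1
        elif depth == 0:
            # An unmatched '(' to the left is the enclosing opener.
            return j, find_close(s, j)
        else:
            depth -= 1
    return None


s = "(()(()))"
print(find_close(s, 3))  # matching ')' for the '(' at index 3
print(enclose(s, 4))     # tightest pair around the pair opening at index 4
```

Read as a tree encoding, each matched pair is a node and `enclose` returns its parent, which is why these operations suffice for the tree traversals mentioned in the abstract. A succinct representation answers the same queries without scanning, using auxiliary o(n)-bit directories on top of the 2n-bit string.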