The onion technique: indexing for linear optimization queries

  • Authors:
  • Yuan-Chi Chang;Lawrence Bergman;Vittorio Castelli;Chung-Sheng Li;Ming-Ling Lo;John R. Smith

  • Affiliations:
  • Data Management, IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY (all authors)

  • Venue:
  • SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
  • Year:
  • 2000

Abstract

This paper describes the Onion technique, a special indexing structure for linear optimization queries. Linear optimization queries ask for the top-N records that maximize or minimize a linearly weighted sum of record attribute values. Such queries appear in many applications employing linear models and are an effective way to summarize representative cases, such as the top-50 ranked colleges. Onion indexing is based on a geometric property of the convex hull, which guarantees that the optimal value of a linear function over a set of records can always be found at one or more of the hull's vertices. Onion indexing uses this property to construct convex hulls in layers, with outer layers geometrically enclosing inner layers. A data record is indexed by its layer number or, equivalently, its depth in the layered convex hull. Queries with linear weightings issued at run time are evaluated from the outermost layer inwards. We show experimentally that Onion indexing achieves orders-of-magnitude speedup over a sequential linear scan when N is small compared to the cardinality of the set. The Onion technique also enables progressive retrieval, in which ranked results are processed and returned incrementally. Furthermore, the proposed indexing can be extended into a hierarchical organization of data to accommodate both global and local queries.
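
The layered construction and outside-in evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes records with only two numeric attributes, peels layers with a standard monotone-chain convex hull, and answers a top-N maximization query by scoring every record in the first N layers, relying on the property that the k-th ranked record lies within the first k layers. The names `hull_indices`, `build_onion`, `score`, and `top_n` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_indices(points: Sequence[Point], ids: Sequence[int]) -> List[int]:
    """Indices of the convex hull vertices of points[ids] (monotone chain)."""
    order = sorted(ids, key=lambda i: points[i])
    if len(order) <= 2:
        return list(order)

    def half(seq) -> List[int]:
        chain: List[int] = []
        for i in seq:
            # Drop points that do not make a strict counter-clockwise turn.
            while len(chain) >= 2 and _cross(
                points[chain[-2]], points[chain[-1]], points[i]
            ) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    lower = half(order)
    upper = half(reversed(order))
    return lower[:-1] + upper[:-1]


def build_onion(points: Sequence[Point]) -> List[List[int]]:
    """Peel convex hulls repeatedly; layers[0] is the outermost layer."""
    remaining = list(range(len(points)))
    layers: List[List[int]] = []
    while remaining:
        layer = hull_indices(points, remaining)
        layers.append(layer)
        on_layer = set(layer)
        remaining = [i for i in remaining if i not in on_layer]
    return layers


def score(points: Sequence[Point], weights: Point, i: int) -> float:
    """Linearly weighted sum of the attributes of record i."""
    return weights[0] * points[i][0] + weights[1] * points[i][1]


def top_n(points, layers, weights: Point, n: int) -> List[int]:
    """Top-n record indices by score, reading only the first n layers."""
    candidates = [i for layer in layers[:n] for i in layer]
    candidates.sort(key=lambda i: score(points, weights, i), reverse=True)
    return candidates[:n]


# Illustrative data: a square, a triangle inside it, and one central point.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (3, 1), (2, 3), (2, 2)]
layers = build_onion(points)
print(top_n(points, layers, (2.0, 1.0), 2))  # prints [2, 1]
```

A minimization query can be expressed in this sketch by negating the weights. Reading whole layers is a simplification chosen for brevity; it touches more records than a progressive, layer-by-layer evaluation would.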