Accessing data cubes along complex dimensions

  • Authors:
  • Yuping Yang; Mukesh Singhal

  • Affiliations:
  • Department of Computer and Information Science, The Ohio State University, Columbus, OH; 4201 Wilson Blvd., Rm. 1145, National Science Foundation, Arlington, VA

  • Venue:
  • Proceedings of the 2nd ACM international workshop on Data warehousing and OLAP
  • Year:
  • 1999

Abstract

In a data warehouse, data cubes are accessed through their dimensions. When a dimension is numerical, its values can be clustered or sorted, so fast access methods such as binary search or B+ trees can be applied. Complex attributes, such as the keyword sets of document contents, are not easily sorted or clustered, although it is highly desirable to search documents through their keyword sets.

Signature indices are known for their ability to search along complex attributes. We propose a new indexing structure, the dimensional signature index (DSI), for fast query processing in data cubes. DSI is particularly suitable for accessing data in data cubes through complex dimensions.

Through a mathematical analysis, we found that if one signature index (feature index) is built for each dimension of the data cube, if the total size of all feature indices equals the size of one large signature index built for the entire data cube as a flat file, and if a query involves all dimensions of the data cube, then the search cost over all the feature indices is the same as the search cost in the large signature index.

The significance of this result is that a query usually does not involve all dimensions of a data cube. With one feature index per dimension, only the feature indices involved in the query predicates need to be accessed. On average, this yields significantly faster query execution than using a single large signature file for the entire data cube.

The DSI scheme does not exclude the use of other fast signature index schemes. Each feature index in DSI can itself use any of the previously proposed fast signature indices (S-trees, multi-level, frame-sliced, etc.) to achieve even faster access.
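
The idea of one feature index per dimension can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no construction details, so the sketch assumes plain superimposed coding, and the class name `DimensionalSignatureIndex`, the function `signature`, the signature width, the number of bits set per value, and the SHA-256-based hash are all hypothetical choices made for illustration.

```python
import hashlib

SIG_BITS = 64        # assumed signature width per dimension (not from the paper)
BITS_PER_VALUE = 3   # assumed number of bits set per value (not from the paper)


def signature(values, bits=SIG_BITS, k=BITS_PER_VALUE):
    """Superimpose (bitwise OR) k hashed bit positions for each value in a set."""
    sig = 0
    for v in values:
        digest = hashlib.sha256(str(v).encode("utf-8")).digest()
        for i in range(k):
            # Two digest bytes per bit position; an illustrative hash choice.
            pos = int.from_bytes(digest[2 * i:2 * i + 2], "big") % bits
            sig |= 1 << pos
    return sig


class DimensionalSignatureIndex:
    """Hypothetical sketch: one list of signatures (feature index) per dimension."""

    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.records = []
        self.feature = {d: [] for d in self.dimensions}

    def add(self, record):
        """Store a record whose value in every dimension is a set of items."""
        self.records.append(record)
        for d in self.dimensions:
            self.feature[d].append(signature(record[d]))

    def query(self, predicates):
        """Find records containing all queried items in each queried dimension.

        Only the feature indices named in `predicates` are scanned.
        Returns (matching records, list of dimensions whose index was read).
        """
        candidates = list(range(len(self.records)))
        touched = []
        for d, wanted in predicates.items():
            touched.append(d)
            qsig = signature(wanted)
            sigs = self.feature[d]
            # Keep a candidate only if its signature covers every query bit.
            candidates = [i for i in candidates if (sigs[i] & qsig) == qsig]
        # Signatures can yield false drops, so verify against the stored data.
        hits = [
            self.records[i]
            for i in candidates
            if all(set(w) <= set(self.records[i][d]) for d, w in predicates.items())
        ]
        return hits, touched
```

In this sketch, a query such as `idx.query({"keywords": {"olap"}})` on an index built over the dimensions `keywords` and `authors` scans only the `keywords` feature index and leaves the `authors` index untouched, which is the source of the savings described above. Each per-dimension list could in principle be replaced by a faster signature structure without changing the interface.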