Efficient OLAP with UDFs

Authors:
Zhibo Chen;Carlos Ordonez
Affiliations:
University of Houston, Houston, TX, USA;University of Houston, Houston, TX, USA
Venue:
Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
Year:
2008

Abstract

Since the early 1990s, On-Line Analytical Processing (OLAP) has been a well-studied research topic, with work focused on implementation outside the database, either in OLAP servers or entirely within client computers. Our approach computes and stores OLAP cubes using User-Defined Functions (UDFs) inside a database management system. UDFs let users write their own code, which can then be called like any other standard SQL function. By generating OLAP cubes within a UDF, we are able to create the entire lattice in main memory. The UDF also gives the user more control over the generation process than standard OLAP functions such as the CUBE operator. We introduce a data structure that not only creates an OLAP lattice efficiently in main memory, but can also be adapted to generate association rule itemsets with minimal change. We show experimentally, using one real dataset and one synthetic dataset, that the UDF approach is more efficient than SQL. We also present several experiments showing that generating association rule itemsets with the UDF approach is comparable to a SQL approach. Overall, we show that techniques such as OLAP and association rules can be efficiently pushed into UDFs and, in most cases, perform better than standard SQL functions.
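
To make the idea of building the whole lattice in main memory more concrete, the following is a minimal Python sketch, not the authors' data structure or UDF code. It assumes a small in-memory table of dictionaries and a SUM aggregate; the function names `build_lattice` and `itemset_counts`, the sample column names, and the use of nested dictionaries are all hypothetical choices made for illustration. The first function fills every cuboid of the lattice in a single pass over the rows, and the second shows how a similar subset enumeration could count itemset support.

```python
from collections import defaultdict
from itertools import combinations


def build_lattice(rows, dims, measure):
    """Aggregate SUM(measure) for every subset of dims in one pass.

    Returns {cuboid: {group_key: total}}, where cuboid is a tuple of
    dimension names and group_key is the tuple of their values.
    """
    # A lattice over d dimensions has 2^d cuboids (all subsets of dims).
    cuboids = [c for k in range(len(dims) + 1) for c in combinations(dims, k)]
    lattice = {c: defaultdict(float) for c in cuboids}
    for row in rows:
        for cuboid in cuboids:
            key = tuple(row[d] for d in cuboid)
            lattice[cuboid][key] += row[measure]
    return lattice


def itemset_counts(transactions, max_size):
    """Count support for every itemset of up to max_size items.

    Illustrative adaptation only: each transaction contributes to all of
    its item subsets, much as each row contributes to all cuboids above.
    """
    counts = defaultdict(int)
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


# Hypothetical sample data, used only to exercise the sketch.
rows = [
    {"store": "A", "product": "x", "sales": 10.0},
    {"store": "A", "product": "y", "sales": 5.0},
    {"store": "B", "product": "x", "sales": 7.0},
]
lattice = build_lattice(rows, ("store", "product"), "sales")
grand_total = lattice[()][()]            # apex cuboid: no grouping columns
sales_by_store = dict(lattice[("store",)])
```

This brute-force enumeration is exponential in the number of dimensions and in `max_size`, so it only illustrates the shape of the computation; an efficient implementation would need pruning and a more compact structure.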