OLAP over continuous domains via density-based hierarchical clustering

Authors:
Michelangelo Ceci;Alfredo Cuzzocrea;Donato Malerba
Affiliations:
Dipartimento di Informatica, Università degli Studi di Bari "Aldo Moro", Bari, Italy;ICAR-CNR and University of Calabria, Rende, Cosenza, Italy;Dipartimento di Informatica, Università degli Studi di Bari "Aldo Moro", Bari, Italy
Venue:
KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part II
Year:
2011

Abstract

In traditional OLAP systems, roll-up and drill-down operations over data cubes operate along fixed hierarchies defined on discrete attributes that play the role of dimensions. In recent years, however, a tendency has emerged to treat continuous attributes as dimensions as well, so that hierarchy members become continuous too; this is driven mostly by emerging application scenarios such as sensor and data stream management tools. A clear advantage of this approach is that it avoids defining an ad hoc discretization hierarchy in advance along each OLAP dimension. Following this trend, in this paper we propose a novel method for effectively and efficiently supporting roll-up and drill-down operations over OLAP data cubes with continuous dimensions via a density-based hierarchical clustering algorithm. The algorithm hierarchically clusters dimension instances while also taking fact-table measures into account, so that the resulting clusters are better suited to the intended analysis. Experiments on two well-known multidimensional datasets show the advantages of the proposed solution.
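
To make the idea of rolling up and drilling down along a continuous dimension more concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes a single continuous dimension, a SUM aggregate, and a simple gap-based notion of density: facts whose sorted dimension values are separated by at most eps fall into the same cluster. The names gap_clusters and build_hierarchy and the eps thresholds are hypothetical. Unlike the method described in the abstract, this sketch ignores the measure values when forming clusters.

```python
from typing import List, Tuple

# A fact is (continuous dimension value, measure). This layout is an
# assumption made for illustration only.
Fact = Tuple[float, float]


def gap_clusters(facts: List[Fact], eps: float) -> List[List[Fact]]:
    """Chain facts whose sorted dimension values are at most eps apart."""
    clusters: List[List[Fact]] = []
    for fact in sorted(facts):
        # Extend the current cluster if this value is close to its last member.
        if clusters and fact[0] - clusters[-1][-1][0] <= eps:
            clusters[-1].append(fact)
        else:
            clusters.append([fact])
    return clusters


def build_hierarchy(
    facts: List[Fact], eps_levels: List[float]
) -> List[List[Tuple[float, float, float]]]:
    """Return one level per eps, coarsest first.

    Each level is a list of (low, high, summed measure) intervals.
    Because a smaller eps can only split existing chains, every finer
    interval is nested inside exactly one coarser interval.
    """
    levels = []
    for eps in sorted(eps_levels, reverse=True):
        level = []
        for cluster in gap_clusters(facts, eps):
            low, high = cluster[0][0], cluster[-1][0]
            total = sum(measure for _, measure in cluster)
            level.append((low, high, total))
        levels.append(level)
    return levels


if __name__ == "__main__":
    facts = [(1.0, 10.0), (1.2, 5.0), (1.5, 7.0),
             (4.0, 3.0), (4.1, 2.0), (9.0, 8.0)]
    for depth, level in enumerate(build_hierarchy(facts, [3.0, 0.5])):
        print(depth, level)
```

In this sketch, moving from a later level to an earlier one corresponds to a roll-up (intervals merge and their sums add up), and the reverse direction corresponds to a drill-down. No discretization hierarchy has to be fixed in advance; the intervals come from the data.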