Enhanced clustering of complex database objects in the ClustCube framework

Authors:
Alfredo Cuzzocrea;Paolo Serafino
Affiliations:
ICAR-CNR & University of Calabria, Rende, Cosenza, Italy;University of Calabria, Rende, Cosenza, Italy
Venue:
Proceedings of the fifteenth international workshop on Data warehousing and OLAP
Year:
2012

Citing 11

Towards on-line analytical mining in large databases (ACM SIGMOD Record)
Clustering methods for large databases: from the past to the future (SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data)
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals (Data Mining and Knowledge Discovery)
BIRCH: A New Data Clustering Algorithm and Its Applications (Data Mining and Knowledge Discovery)
CLARANS: A Method for Clustering Objects for Spatial Data Mining (IEEE Transactions on Knowledge and Data Engineering)
Automatic Subspace Clustering of High Dimensional Data (Data Mining and Knowledge Discovery)
Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach (IEEE Transactions on Knowledge and Data Engineering)
Comparing clusterings: an information based distance (Journal of Multivariate Analysis)
CrossClus: user-guided multi-relational clustering (Data Mining and Knowledge Discovery)
Enabling OLAP in mobile environments via intelligent data cube compression techniques (Journal of Intelligent Information Systems)
ClustCube: an OLAP-based framework for clustering and mining complex database objects (Proceedings of the 2011 ACM Symposium on Applied Computing)

Cited 1

DOLAP 2012 workshop summary (Proceedings of the 21st ACM international conference on Information and knowledge management)

Abstract

This paper significantly extends our previous work [1], which introduced ClustCube, an OLAP-based framework for clustering and mining complex database objects extracted from distributed database settings. We make two novel contributions over [1]. First, we propose a tree-based distance function over complex objects that takes into account the tree-like structure these objects typically have in distributed database settings. This distance improves on the simpler distance presented in [1], which is based on low-level fields. Second, we present a comprehensive experimental evaluation of the ClustCube algorithms for computing ClustCube cubes, covering both performance and accuracy metrics, on a well-known benchmark data set and in comparison with a state-of-the-art subspace clustering algorithm for high-dimensional data. The results obtained clearly demonstrate the superiority of our approach.