Towards on-line analytical mining in large databases
ACM SIGMOD Record
Clustering methods for large databases: from the past to the future
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery
BIRCH: A New Data Clustering Algorithm and Its Applications
Data Mining and Knowledge Discovery
CLARANS: A Method for Clustering Objects for Spatial Data Mining
IEEE Transactions on Knowledge and Data Engineering
Automatic Subspace Clustering of High Dimensional Data
Data Mining and Knowledge Discovery
Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach
IEEE Transactions on Knowledge and Data Engineering
Comparing clusterings: an information based distance
Journal of Multivariate Analysis
CrossClus: user-guided multi-relational clustering
Data Mining and Knowledge Discovery
Enabling OLAP in mobile environments via intelligent data cube compression techniques
Journal of Intelligent Information Systems
ClustCube: an OLAP-based framework for clustering and mining complex database objects
Proceedings of the 2011 ACM Symposium on Applied Computing
Proceedings of the 21st ACM international conference on Information and knowledge management
This paper significantly extends our previous research contribution [1], which introduced ClustCube, an OLAP-based framework for clustering and mining complex database objects extracted from distributed database settings. It makes two novel contributions over [1]. First, we propose a tree-based distance function over complex objects that takes into account the tree-like structure these objects typically have in distributed database settings; this distance improves on the simpler one presented in [1], which is based on low-level fields. Second, we present a comprehensive experimental evaluation of the ClustCube algorithms for computing ClustCube cubes, covering both performance and accuracy metrics, on a well-known benchmark data set and in comparison with a state-of-the-art subspace clustering algorithm for high-dimensional data. The experimental results demonstrate the superiority of our approach.