Mining top-K multidimensional gradients

Authors:
Ronnie Alves;Orlando Belo;Joel Ribeiro
Affiliations:
Department of Informatics, School of Engineering, University of Minho, Braga, Portugal;Department of Informatics, School of Engineering, University of Minho, Braga, Portugal;Department of Informatics, School of Engineering, University of Minho, Braga, Portugal
Venue:
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Year:
2007

Citing 11
Cited 0

The onion technique: indexing for linear optimization queries. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
i3: intelligent, interactive investigation of OLAP data cubes. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient computation of Iceberg cubes with complex measures. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
PREFER: a system for the efficient execution of multi-parametric ranked queries. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Top-k selection queries over relational databases: Mapping strategies and performance evaluation. ACM Transactions on Database Systems (TODS)
Cubegrades: Generalizing Association Rules. Data Mining and Knowledge Discovery
Discovery-Driven Exploration of OLAP Data Cubes. EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Intelligent Rollups in Multidimensional OLAP Data. Proceedings of the 27th International Conference on Very Large Data Bases
Mining Constrained Gradients in Large Databases. IEEE Transactions on Knowledge and Data Engineering
Answering top-k queries with multi-dimensional selections: the ranking cube approach. VLDB '06 Proceedings of the 32nd international conference on Very large data bases
High-dimensional OLAP: a minimal cubing approach. VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

Abstract

Several business applications, such as market basket analysis, clickstream analysis, fraud detection and churn migration analysis, demand gradient data analysis. By employing gradient data analysis one is able to identify trends and outliers and to answer "what-if" questions over large databases. Gradient queries were first introduced by Imielinski et al. [1] as the cubegrade problem. The main idea is to detect interesting changes in a multidimensional space (MDS): changes in a set of measures (aggregates) are associated with changes in sector characteristics (dimensions). An MDS contains a huge number of cells, which poses a great challenge for mining gradient cells in a reasonable time. Dong et al. [2] proposed gradient constraints to reduce the computational costs involved in such queries. Even when such constraints are used on large databases, the number of interesting cases to evaluate is still large. In this work, we are interested in exploring the best cases (Top-K cells) of interesting multidimensional gradients. There are several studies on Top-K queries, but preference queries with multidimensional selection were introduced quite recently by Dong et al. [9]. Furthermore, traditional Top-K methods work well in the presence of convex functions, whereas gradients are non-convex. We have revisited iceberg cubing for complex measures, since it is the basis for mining gradient cells. We also propose a gradient-based cubing strategy to evaluate interesting gradient regions in the MDS. Thus, the main challenge is to find maximum gradient regions (MGRs) that maximize the task of mining Top-K gradient cells. Our performance study indicates that our strategy is effective at finding the most interesting gradients in multidimensional space.
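
To make the notion of a Top-K gradient cell concrete, the following is a minimal brute-force sketch and not the gradient-based cubing strategy of the paper. It rests on several assumptions that do not come from the abstract: the measure is an average, a gradient is taken as the ratio between the averages of two cells, a candidate cell is compared with a probe cell only when the two differ in exactly one dimension (a direct parent, child or sibling), and the names `build_cube`, `top_k_gradients` and the sample `sales` rows are hypothetical.

```python
from collections import defaultdict
from heapq import nlargest
from itertools import product

ALL = "*"  # marks a dimension that has been rolled up


def build_cube(rows, n_dims):
    """Return {cell: [sum, count]} for every cell of the full cube."""
    cube = defaultdict(lambda: [0.0, 0])
    for *dims, measure in rows:
        # Each row contributes to 2^n_dims cells: every dimension is
        # either kept or rolled up to ALL.
        for mask in product((False, True), repeat=n_dims):
            cell = tuple(d if keep else ALL for d, keep in zip(dims, mask))
            cube[cell][0] += measure
            cube[cell][1] += 1
    return cube


def differs_in_one_dim(probe, cand):
    """True if the two cells differ in exactly one dimension."""
    return sum(p != c for p, c in zip(probe, cand)) == 1


def top_k_gradients(rows, n_dims, k, min_support=1):
    """Naive Top-K: enumerate all cell pairs and keep the k largest ratios."""
    cube = build_cube(rows, n_dims)
    # Iceberg-style significance constraint (assumed): drop sparse cells.
    avg = {cell: s / c for cell, (s, c) in cube.items() if c >= min_support}
    pairs = [
        (avg[cand] / avg[probe], probe, cand)
        for probe, cand in product(avg, repeat=2)
        if avg[probe] > 0 and differs_in_one_dim(probe, cand)
    ]
    return nlargest(k, pairs, key=lambda t: t[0])


# Hypothetical data: (city, product, sale amount).
sales = [
    ("Braga", "shoes", 10.0),
    ("Braga", "hats", 20.0),
    ("Porto", "shoes", 40.0),
    ("Porto", "hats", 20.0),
]
top = top_k_gradients(sales, n_dims=2, k=2)
for ratio, probe, cand in top:
    print(probe, "->", cand, "ratio", ratio)
```

The sketch materializes every cell and compares every pair, so its cost grows quickly with the number of dimensions and distinct values. That cost is the motivation the abstract gives for constraining and ranking the search instead of enumerating it.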