Mining Constrained Gradients in Large Databases

Authors:
Guozhu Dong;Jiawei Han;Joyce M. W. Lam;Jian Pei;Ke Wang;Wei Zou
Affiliations:
IEEE;IEEE;-;IEEE Computer Society;-;-
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2004

Citing 20
Cited 15

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Implementing data cubes efficiently

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An overview of data warehousing and OLAP technology

ACM SIGMOD Record
An array-based algorithm for simultaneous multidimensional aggregates

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Exploratory mining and pruning optimizations of constrained associations rules

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Bottom-up computation of sparse and Iceberg CUBE

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Efficient mining of emerging patterns: discovering trends and differences

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
A statistical theory for quantitative association rules

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient computation of Iceberg cubes with complex measures

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals

Data Mining and Knowledge Discovery
Cubegrades: Generalizing Association Rules

Data Mining and Knowledge Discovery
Discovery-Driven Exploration of OLAP Data Cubes

EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Mining Frequent Item Sets with Convertible Constraints

Proceedings of the 17th International Conference on Data Engineering
Fast Computation of Sparse Datacubes

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Computing Iceberg Queries Efficiently

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Intelligent Rollups in Multidimensional OLAP Data

Proceedings of the 27th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Efficient Mining of Constrained Correlated Sets

ICDE '00 Proceedings of the 16th International Conference on Data Engineering

Enhanced mining of association rules from data cubes

DOLAP '06 Proceedings of the 9th ACM international workshop on Data warehousing and OLAP
Topological approaches to covering rough sets

Information Sciences: an International Journal
Mining significant change patterns in multidimensional spaces

International Journal of Business Intelligence and Data Mining
Strategies for complex data cube queries

Applied Intelligence
Reduction about approximation spaces of covering generalized rough sets

International Journal of Approximate Reasoning
gPrune: a constraint pushing framework for graph pattern mining

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Techniques for finding similarity knowledge in OLAP reports

Expert Systems with Applications: An International Journal
Extracting semantics in OLAP databases using emerging cubes

Information Sciences: an International Journal
Multidimensional cyclic graph approach: Representing a data cube without common sub-graphs

Information Sciences: an International Journal
Binary relation based rough sets

FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
Multi knowledge based rough approximations and applications

Knowledge-Based Systems
Efficient computation of multi-feature data cubes

KSEM'06 Proceedings of the First international conference on Knowledge Science, Engineering and Management
Mining top-K multidimensional gradients

DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Related family: A new method for attribute reduction of covering information systems

Information Sciences: an International Journal
Constrained Cube Lattices for Multidimensional Database Mining

International Journal of Data Warehousing and Mining


Abstract

Many data analysis tasks can be viewed as search or mining in a multidimensional space (MDS). In such MDSs, dimensions capture potentially important factors for given applications, and cells represent combinations of values for the factors. To systematically analyze data in MDS, an interesting notion called "cubegrade" was recently introduced by Imielinski et al. It focuses on the notable changes in measures in MDS by comparing a cell (which we refer to as the probe cell) with its gradient cells, namely, its ancestors, descendants, and siblings. We call such queries gradient analysis queries (GQs). Since an MDS can contain billions of cells, it is important to answer GQs efficiently. In this study, we focus on developing efficient methods for mining GQs constrained by certain (weakly) antimonotone constraints. Instead of conducting an independent gradient-cell search once per probe cell, which is inefficient because much of the work is repeated, we propose an efficient algorithm, LiveSet-Driven. This algorithm finds all good gradient-probe cell pairs in one search pass. It applies measure-value analysis and dimension-match analysis in a set-oriented manner to achieve bidirectional pruning between the set of hopeful probe cells and the set of hopeful gradient cells. Moreover, it adopts a hypertree structure and an H-cubing method to compress data and to maximize sharing of computation. Our performance study shows that this algorithm is efficient and scalable. In addition to data cubes, we extend our study to another important scenario: mining constrained gradients in transactional databases where each item is associated with some measures such as price. Such transactional databases can be viewed as sparse MDSs where items represent dimensions, although their characteristics differ significantly from those of data cubes. We outline efficient mining methods for this problem in this paper.
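To make the notion of a gradient analysis query concrete, the following is a minimal brute-force sketch in Python. It is not the LiveSet-Driven algorithm and uses neither the hypertree nor H-cubing; it is the naive approach of checking each probe cell against every candidate gradient cell, which the abstract describes as inefficient. Several details are assumptions made for illustration only: the function names (`build_cells`, `relation`, `gradient_pairs`), the use of a minimum count as the antimonotone significance constraint, the use of a ratio of average measures as the gradient constraint, and the reading of "sibling" as a cell that differs from the probe in exactly one non-aggregated dimension value.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # marks a dimension that has been aggregated away


def build_cells(rows, n_dims):
    """Aggregate every cell of the cube: cell -> (count, sum of measure)."""
    agg = defaultdict(lambda: [0, 0.0])
    for *dims, measure in rows:
        # Each row contributes to all 2^n generalizations of its dimension values.
        for mask in product((False, True), repeat=n_dims):
            cell = tuple(ALL if m else d for d, m in zip(dims, mask))
            agg[cell][0] += 1
            agg[cell][1] += measure
    return {cell: (count, total) for cell, (count, total) in agg.items()}


def relation(probe, other):
    """Classify `other` relative to `probe` (assumed definitions, see lead-in)."""
    if probe == other:
        return None
    diff = [(p, o) for p, o in zip(probe, other) if p != o]
    if all(o == ALL for _, o in diff):
        return "ancestor"    # other generalizes probe
    if all(p == ALL for p, _ in diff):
        return "descendant"  # other specializes probe
    if len(diff) == 1 and ALL not in diff[0]:
        return "sibling"     # one concrete dimension value differs
    return None


def gradient_pairs(cells, min_count, min_ratio):
    """Return (gradient, probe, relation) triples satisfying both constraints."""
    # Significance constraint (assumed): count >= min_count. It is antimonotone,
    # because a cell that fails it cannot have a descendant that passes.
    avg = {c: total / n for c, (n, total) in cells.items() if n >= min_count}
    pairs = []
    for probe, probe_avg in avg.items():
        for grad, grad_avg in avg.items():
            rel = relation(probe, grad)
            # Gradient constraint (assumed): ratio of average measures.
            if rel and probe_avg > 0 and grad_avg / probe_avg >= min_ratio:
                pairs.append((grad, probe, rel))
    return pairs


# Hypothetical toy data: (year, city, measure).
rows = [
    ("2000", "Vancouver", 100.0),
    ("2000", "Toronto", 300.0),
    ("2001", "Vancouver", 200.0),
    ("2001", "Toronto", 300.0),
]
cells = build_cells(rows, n_dims=2)
pairs = gradient_pairs(cells, min_count=2, min_ratio=1.4)
```

The nested loop over all pairs of significant cells is exactly the repeated work that a set-oriented, single-pass method aims to avoid; the sketch is useful only for checking small examples by hand.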