Optimizing relational store for e-catalog queries: a data mining approach

Authors:
Min Wang; X. Sean Wang
Affiliations:
IBM T. J. Watson Research Center, Hawthorne, NY; George Mason University, Fairfax, Virginia
Venue:
Proceedings of the 2002 ACM symposium on Applied computing
Year:
2002

Abstract

Database management systems are frequently used in electronic commerce to provide electronic product catalogs (e-catalogs) that let users search for products of interest by placing constraints on attributes. The intuitively straightforward representation stores the whole e-catalog in one table, which is conceptually easy to maintain and query. However, for any e-commerce business with a reasonably large number of products and product types, the e-catalog usually involves a large number of attributes because of the great variety of products and, at the same time, contains a large number of null values because each product has values for only a relatively small number of attributes. Because of these properties, the single-table representation does not work well in current relational database systems. Techniques have been proposed in the literature to deal with this problem, namely binary and vertical schemas. However, these techniques fail to take advantage of inherent properties of realistic e-catalogs that could be exploited for better performance. This paper proposes a novel decomposition method for e-catalogs based on association rule discovery, a data mining technique. The method discovers groups of attributes that frequently appear together, i.e., that are frequently used together to describe products, and generates schemas that contain these groups. The paper also reports experimental results showing the efficiency of the method.
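The idea of discovering attribute groups that frequently appear together can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a plain level-wise (Apriori-style) frequent-itemset search in which each product is treated as the set of attributes for which it has a non-null value. The toy catalog, the function name `frequent_attribute_groups`, and the `min_support` and `max_size` parameters are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy e-catalog: each product lists only its non-null attributes.
catalog = [
    {"screen_size": 42, "resolution": "1080p"},
    {"screen_size": 50, "resolution": "4K"},
    {"size": "M", "color": "blue"},
    {"size": "L", "color": "red"},
    {"screen_size": 24, "resolution": "1080p", "color": "black"},
]


def frequent_attribute_groups(products, min_support, max_size=3):
    """Return {attribute set: support} for sets of attributes that are
    non-null together in at least `min_support` (a fraction) of products.

    Illustrative level-wise search, not the method from the paper.
    """
    n = len(products)
    # One "transaction" per product: the attributes that carry a value.
    transactions = [
        frozenset(a for a, v in p.items() if v is not None) for p in products
    ]

    # Level 1: single attributes.
    counts = Counter(frozenset([a]) for t in transactions for a in t)
    current = {s for s, c in counts.items() if c / n >= min_support}
    frequent = {s: counts[s] / n for s in current}

    size = 1
    while current and size < max_size:
        size += 1
        items = sorted(set().union(*current))
        # Keep a candidate only if all of its (size-1)-subsets are frequent.
        candidates = {
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {c for c in candidates if counts[c] / n >= min_support}
        frequent.update({c: counts[c] / n for c in current})
    return frequent


groups = frequent_attribute_groups(catalog, min_support=0.4)
for attrs, support in sorted(groups.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))):
    print(sorted(attrs), support)
```

On this toy data the only frequent pairs are `{screen_size, resolution}` and `{size, color}`. Under the assumption that each frequent group becomes a candidate table, a decomposition along these lines would store the two groups separately and so avoid most of the nulls that a single wide table would hold.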