Discovering critical edge sequences in E-commerce catalogs

Authors:
Kaushik Dutta;Debra VanderMeer;Anindya Datta;Krithi Ramamritham
Affiliations:
Georgia Institute of Technology, Atlanta, GA;Georgia Institute of Technology, Atlanta, GA;Georgia Institute of Technology, Atlanta, GA;University of Massachusetts-Amherst and IIT-Bombay
Venue:
Proceedings of the 3rd ACM conference on Electronic Commerce
Year:
2001

Abstract

Web sites allow the collection of vast amounts of navigational data -- clickstreams of user traversals through the site. These massive data stores offer the tantalizing possibility of uncovering interesting patterns within the dataset. For e-businesses, always looking for an edge in the hyper-competitive online marketplace, this possibility is of particular interest. Especially valuable to e-businesses is the discovery of Critical Edge Sequences (CESs), which denote frequently traversed subpaths in the catalog. CESs can be used to improve site performance and site management, increase the effectiveness of advertising on the site, and gather additional knowledge of customer interest patterns on the site.

Using traditional graph-based and web mining strategies to find CESs could turn out to be expensive in both space and time. In this paper, we propose a method to compute the most popular paths between node pairs in a catalog, which are then used to discover CESs. Our method is both space-efficient and accurate, providing a vast reduction in the storage requirement with minimal impact on accuracy. This algorithm, executed off-line in batch mode, is also practical with respect to running time. As a variant of single-source shortest-path, it runs in log-linear time.
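
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of treating "most popular path" as a single-source shortest-path problem; it is not the authors' method. It assumes (our assumption, not stated in the abstract) that the popularity of a path is the product of per-edge transition probabilities estimated from clickstream traversal counts. Taking the negative logarithm of each probability turns the most probable path into a shortest path with non-negative weights, which Dijkstra's algorithm with a binary heap solves in O((V + E) log V) time. The function name `most_popular_paths` and the `edge_counts` input format are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def most_popular_paths(edge_counts, source):
    """Return {node: (path, probability)} for the most probable path
    from `source` to every reachable node.

    edge_counts maps (u, v) -> number of observed traversals of the
    link u -> v (a hypothetical input format for this sketch).
    """
    # Estimate transition probabilities from outgoing traversal counts.
    out_totals = defaultdict(int)
    for (u, _v), count in edge_counts.items():
        out_totals[u] += count

    # Edge weight = -log(probability), so weights are >= 0 and
    # minimizing the weight sum maximizes the probability product.
    graph = defaultdict(list)
    for (u, v), count in edge_counts.items():
        if count > 0:
            graph[u].append((v, -math.log(count / out_totals[u])))

    # Standard Dijkstra with a binary heap and lazy deletion.
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))

    # Rebuild each path by following parent pointers back to the source.
    paths = {}
    for node in dist:
        path = []
        cur = node
        while cur is not None:
            path.append(cur)
            cur = parent[cur]
        paths[node] = (path[::-1], math.exp(-dist[node]))
    return paths
```

Running this once per source node yields most popular paths for all node pairs. How the paper then derives CESs from those paths, and how it achieves its storage reduction, is not described in the abstract and is not modeled here.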