Active Mining in a Distributed Setting

Authors:
Srinivasan Parthasarathy;Sandhya Dwarkadas;Mitsunori Ogihara
Affiliations:
-;-;-
Venue:
Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
Year:
1999

Citing 13
Cited 0

A course in density estimation

A course in density estimation
C4.5: programs for machine learning

C4.5: programs for machine learning
Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Finding interesting rules from large sets of discovered association rules

CIKM '94 Proceedings of the third international conference on Information and knowledge management
Fast discovery of association rules

Advances in knowledge discovery and data mining
Discovering Patterns from Large and Dynamic Sequential Data

Journal of Intelligent Information Systems
Exploratory mining and pruning optimizations of constrained associations rules

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Efficient enumeration of frequent sequences

Proceedings of the seventh international conference on Information and knowledge management
A fast distributed algorithm for mining association rules

DIS '96 Proceedings of the fourth international conference on on Parallel and distributed information systems
Parallel Algorithms for Discovery of Association Rules

Data Mining and Knowledge Discovery
Online Generation of Association Rules

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
SPRINT: A Scalable Parallel Classifier for Data Mining

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases

Quantified Score

Hi-index	0.00

Visualization

Abstract

Most current work in data mining assumes that the data is static, and a database update requires re-mining both the old and new data. In this article, we propose an alternative approach. We outline a general strategy by which data mining algorithms can be made active -- i.e., maintain valid mined information in the presence of user interaction and database updates. We describe a runtime framework that allows efficient caching and sharing of data among clients and servers. We then demonstrate how existing algorithms for four key mining tasks: Discretization, AssociationMining, Sequence Mining, and Similarity Discovery, can be re-architected so that they maintain valid mined information across i) database updates, and ii) user interactions in a client-server setting, while minimizing the amount of data re-accessed.