Active Mining in a Distributed Setting

  • Authors:
  • Srinivasan Parthasarathy;Sandhya Dwarkadas;Mitsunori Ogihara

  • Venue:
  • Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
  • Year:
  • 1999

Abstract

Most current work in data mining assumes that the data is static, so a database update requires re-mining both the old and the new data. In this article, we propose an alternative approach. We outline a general strategy by which data mining algorithms can be made active, i.e., able to maintain valid mined information in the presence of user interaction and database updates. We describe a runtime framework that allows efficient caching and sharing of data among clients and servers. We then demonstrate how existing algorithms for four key mining tasks (Discretization, Association Mining, Sequence Mining, and Similarity Discovery) can be re-architected so that they maintain valid mined information across (i) database updates and (ii) user interactions in a client-server setting, while minimizing the amount of data re-accessed.
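
To make the idea of maintaining mined information across database updates concrete, the following is a minimal sketch for the association mining case. It is not the authors' algorithm or framework: the class `IncrementalItemsetCounter`, its methods, and the `max_size` and `min_support` parameters are hypothetical names chosen for illustration. The sketch assumes that updates are insert-only and that itemsets are small enough to count exhaustively. It keeps a running support count per itemset, so an update scans only the newly added transactions rather than the old data.

```python
from collections import Counter
from itertools import combinations


class IncrementalItemsetCounter:
    """Hypothetical helper: running support counts for itemsets up to max_size."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.counts = Counter()        # itemset (sorted tuple) -> support count
        self.num_transactions = 0

    def add_transactions(self, transactions):
        # Scan only the newly arrived transactions; old data is never re-read.
        for transaction in transactions:
            items = sorted(set(transaction))
            for size in range(1, self.max_size + 1):
                for itemset in combinations(items, size):
                    self.counts[itemset] += 1
            self.num_transactions += 1

    def frequent_itemsets(self, min_support):
        # min_support is a fraction of all transactions seen so far.
        if self.num_transactions == 0:
            return {}
        threshold = min_support * self.num_transactions
        return {
            itemset: count
            for itemset, count in self.counts.items()
            if count >= threshold
        }


# Usage: an initial load followed by an update, with no rescan of the first batch.
counter = IncrementalItemsetCounter(max_size=2)
counter.add_transactions([["bread", "milk"], ["bread", "beer"]])
counter.add_transactions([["bread", "milk", "beer"]])
frequent = counter.frequent_itemsets(min_support=0.6)
```

Counting every candidate itemset trades memory for the ability to answer queries at any support threshold after an update. A practical system would bound that state, for example by tracking only itemsets near the frequency border.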