Distributed Web Mining Using Bayesian Networks from Multiple Data Streams

  • Authors:
  • R. Chen;Krishnamoorthy Sivakumar;Hillol Kargupta

  • Venue:
  • ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
  • Year:
  • 2001

Abstract

We present a collective approach to mine Bayesian networks from distributed heterogeneous web-log data streams. In this approach we first learn a local Bayesian network at each site using the local data. Then each site identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learnt at the central site using the data transmitted from the local sites. The local and central Bayesian networks are combined to obtain a collective Bayesian network that models the entire data. We applied this technique to mine multiple data streams where data centralization is difficult because of large response time and scalability issues. Experimental results and theoretical justification that demonstrate the feasibility of our approach are presented.
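
The selection step described above, in which each site picks the observations it will transmit, might look roughly like the following sketch. It rests on assumptions that the abstract does not state: the local Bayesian network is replaced by a much simpler stand-in (independent binary variables with smoothed frequencies), and "most likely to be evidence of coupling" is approximated by "lowest likelihood under the local model". The function names `fit_local_model`, `log_likelihood`, and `select_for_transmission`, the parameter `k`, and the toy data are all hypothetical.

```python
import math


def fit_local_model(rows):
    """Estimate P(x_j = 1) for each binary variable, with Laplace smoothing.

    Assumption: variables are treated as independent, a simplified
    stand-in for a learned local Bayesian network.
    """
    n = len(rows)
    n_vars = len(rows[0])
    return [(sum(r[j] for r in rows) + 1) / (n + 2) for j in range(n_vars)]


def log_likelihood(row, probs):
    """Log-likelihood of one observation under the stand-in local model."""
    return sum(math.log(p if x == 1 else 1.0 - p) for x, p in zip(row, probs))


def select_for_transmission(rows, probs, k):
    """Return the indices of the k observations the local model explains worst.

    Assumption: a low local likelihood is used as a proxy for evidence of
    coupling with variables observed at other sites.
    """
    ranked = sorted(range(len(rows)), key=lambda i: log_likelihood(rows[i], probs))
    return sorted(ranked[:k])


# Toy local data at one site: each tuple is one observation of two binary variables.
site_a = [(1, 1), (1, 1), (1, 1), (1, 0), (1, 1), (0, 0), (1, 1), (1, 1)]

probs = fit_local_model(site_a)
selected = select_for_transmission(site_a, probs, k=2)
print(selected)  # indices of the observations this site would send to the central site
```

In this sketch only the selected indices (and the matching rows) would leave the site, which is the point of the approach: the central site learns its network from a small subset of the data rather than from the full streams.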