Distributed frequent items detection on uncertain data

  • Authors:
  • Shuang Wang;Guoren Wang;Jitong Chen

  • Affiliations:
  • Software College, Northeastern University, Shenyang, China and College of Information Science and Engineering, Northeastern University, Shenyang, China;College of Information Science and Engineering, Northeastern University, Shenyang, China;Software College, Northeastern University, Shenyang, China

  • Venue:
  • ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
  • Year:
  • 2010

Abstract

Frequent items detection is a valuable technique in many applications, such as network monitoring, network intrusion detection, and worm virus detection. The technique has been well studied on deterministic databases. However, it is a new task on emerging uncertain databases, especially in distributed environments. In this paper, a new definition of frequent items on uncertain data is proposed. Based on this definition, a polynomial-time algorithm is proposed that can efficiently answer the queries in a centralized environment. Furthermore, this work designs communication-efficient algorithms for retrieving the top-k items with the largest probability from distributed sites. In each round of transmission, the algorithms compute an upper bound and filter out, as far as possible, the data that have no chance of influencing the query result. Extensive experiments show that the algorithms process the queries correctly and reduce communication cost efficiently on various data sets.
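
The abstract does not state the paper's definition of a frequent item, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that each occurrence of an item exists independently with a known probability, and that an item's score is the probability that its count reaches a given threshold. Under those assumptions the score can be computed in polynomial time by dynamic programming, and a centralized top-k query simply ranks items by that score. The names `prob_count_at_least` and `top_k_items` are hypothetical.

```python
import heapq


def prob_count_at_least(probs, threshold):
    """Probability that at least `threshold` of the occurrences exist.

    Assumption (not from the paper): occurrences are independent and
    probs[i] is the existence probability of the i-th occurrence.
    Runs in O(len(probs) * threshold) time.
    """
    if threshold <= 0:
        return 1.0
    # dist[c] = probability that exactly c occurrences exist so far, c < threshold.
    dist = [1.0] + [0.0] * (threshold - 1)
    reached = 0.0  # probability mass that has already hit the threshold
    for p in probs:
        # Mass sitting at threshold - 1 crosses the threshold if this occurrence exists.
        reached += dist[threshold - 1] * p
        for c in range(threshold - 1, 0, -1):
            dist[c] = dist[c] * (1 - p) + dist[c - 1] * p
        dist[0] *= 1 - p
    return reached


def top_k_items(data, threshold, k):
    """Return the k (item, score) pairs with the largest score.

    `data` maps each item to the list of its occurrence probabilities.
    """
    scored = ((item, prob_count_at_least(ps, threshold)) for item, ps in data.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    example = {"a": [0.9, 0.9], "b": [0.5, 0.5], "c": [0.1]}
    for item, score in top_k_items(example, threshold=2, k=2):
        print(item, round(score, 4))
```

The sketch covers only the centralized case. The distributed algorithms described in the abstract additionally bound each site's contribution per round so that data which cannot change the result is never transmitted; that part is not reproduced here because the abstract gives no details of the bound.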