Many emerging applications collect large amounts of uncertain data from multiple sources in a distributed manner. Previous work on querying uncertain data in distributed environments has focused only on ranking and skyline queries; join queries have not been addressed despite their importance in databases. In this paper, we address the distributed probabilistic threshold join query, which retrieves, from distributed sites, the results that satisfy the join condition and whose combined probabilities meet the threshold requirement. We propose a new kind of Bloom filter, the Probability Bloom Filter (PBF), to represent sets with a probabilistic attribute, and design a PBF-based Bloomjoin algorithm that executes distributed probabilistic threshold join queries with low communication cost. Furthermore, we provide a theoretical analysis of the network cost of our algorithm and validate it by simulation. The experimental results show that, in most scenarios, our algorithm reduces network cost compared with the original Bloomjoin algorithm.
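
The abstract does not specify how a PBF is laid out, so the following is only an illustrative sketch of the general idea, not the paper's construction. It assumes a hypothetical filter in which each cell stores the largest probability of any key hashed to it (instead of a single bit), and it assumes that the combined probability of a joined pair is the product of the two tuple probabilities. The names `ProbabilityBloomFilter`, `upper_bound`, and `pbf_bloomjoin` are invented for this example.

```python
import hashlib


class ProbabilityBloomFilter:
    """Hypothetical PBF sketch: each cell keeps the maximum probability
    of any key hashed to it. This layout is an assumption, not the
    design described in the paper."""

    def __init__(self, num_cells, num_hashes):
        self.cells = [0.0] * num_cells
        self.num_hashes = num_hashes

    def _positions(self, key):
        # Derive num_hashes cell positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.cells)

    def add(self, key, prob):
        for pos in self._positions(key):
            self.cells[pos] = max(self.cells[pos], prob)

    def upper_bound(self, key):
        # Never below the true maximum probability of an inserted key;
        # 0.0 means the key is definitely absent.
        return min(self.cells[pos] for pos in self._positions(key))


def pbf_bloomjoin(r_tuples, s_tuples, threshold, num_cells=1024, num_hashes=3):
    """Simulate a two-site join. Tuples are (join_key, probability) pairs.

    Site R builds the filter and sends it to site S. Site S ships only
    tuples that could still meet the threshold. Site R then performs the
    exact join and threshold check on what it received.
    """
    pbf = ProbabilityBloomFilter(num_cells, num_hashes)
    for key, prob in r_tuples:
        pbf.add(key, prob)

    # Pruning at site S: assumed combined probability is a product.
    shipped = [(k, p) for k, p in s_tuples
               if p * pbf.upper_bound(k) >= threshold]

    # Exact evaluation at site R removes any false positives.
    results = [(k, pr * ps)
               for k, pr in r_tuples
               for ks, ps in shipped
               if k == ks and pr * ps >= threshold]
    return shipped, results


if __name__ == "__main__":
    r = [("a", 0.9), ("b", 0.2)]
    s = [("a", 0.8), ("b", 0.9), ("c", 0.95)]
    shipped, results = pbf_bloomjoin(r, s, threshold=0.5)
    print("shipped from S:", shipped)
    print("join results:", results)
```

Under these assumptions, the filter can prune a tuple at the remote site either because its key is absent or because even the best possible partner would leave the pair below the threshold, whereas a plain bit-based Bloomjoin can prune only on key absence. Hash collisions may cause extra tuples to be shipped, but the exact check at the receiving site keeps the final result correct.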