Query Processing in a Fragmented Relational Distributed System: Mermaid
IEEE Transactions on Software Engineering - Annals of discrete mathematics, 24
A Symmetric Fragment and Replicate Algorithm for Distributed Joins
IEEE Transactions on Parallel and Distributed Systems
The state of the art in distributed query processing
ACM Computing Surveys (CSUR)
A Heuristic Approach to Distributed Query Processing
VLDB '82 Proceedings of the 8th International Conference on Very Large Data Bases
ICDCS '01 Proceedings of the 21st International Conference on Distributed Computing Systems
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Query Processing in Distributed Database System
IEEE Transactions on Software Engineering
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
A comparison of join algorithms for log processing in MapReduce
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Managing large-scale data in the cloud offers many advantages, and more and more companies are migrating their data into cloud data management systems. Join queries, however, remain a challenging research problem in the cloud. To answer a join query, data must be transferred among different nodes. The arrangement of data transmission and local data processing is known as the distribution strategy for a query. If the strategy is poorly chosen, the transmission cost (the network workload between servers and the transmission delay) will be very high. Existing cloud systems either do not support join queries or use MapReduce to support only some simple join queries. This paper studies the problem of using redundant data to optimize join queries in a cloud environment. Two novel algorithms, a Set Cover based algorithm (SC) and a Minimum Element based algorithm (ME), are proposed to reduce the data transmission cost. Experimental results demonstrate that the proposed methods greatly reduce the data transmission cost compared with the naive method, and that the resulting strategies are very close to optimal.
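To illustrate how redundant copies can reduce transmission, the following minimal sketch applies the textbook greedy set cover heuristic to pick a small set of nodes that together hold every fragment a join needs. It is not the SC algorithm proposed in the paper, whose details are not given here. The function name `greedy_cover`, the node names, and the fragment labels are all hypothetical, and the sketch assumes that contacting fewer nodes is a reasonable proxy for lower transmission cost.

```python
from typing import Dict, List, Set


def greedy_cover(needed: Set[str], replicas: Dict[str, Set[str]]) -> List[str]:
    """Greedily choose nodes until every needed fragment is held by a chosen node.

    `replicas` maps a node name to the set of fragments stored on that node.
    This is the classic greedy set cover heuristic, used here only as an
    illustration; it is an assumption, not the paper's SC algorithm.
    """
    uncovered = set(needed)
    chosen: List[str] = []
    while uncovered:
        # Pick the node holding the most still-uncovered fragments.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(replicas), key=lambda n: len(replicas[n] & uncovered))
        gain = replicas[best] & uncovered
        if not gain:
            raise ValueError(f"no node holds fragments: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical placement: fragments of relations R and S, partly replicated.
replicas = {
    "node_a": {"R1", "R2", "S1"},
    "node_b": {"R2", "S2"},
    "node_c": {"S1", "S2", "R3"},
}
plan = greedy_cover({"R1", "R2", "R3", "S1", "S2"}, replicas)
print(plan)  # ['node_a', 'node_c']
```

Because `R2`, `S1`, and `S2` each exist on two nodes in this made-up placement, the join can be served by two nodes rather than three. A cost-aware variant would weight each node by the bytes it must ship instead of counting fragments.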