Achieving Communication Efficiency through Push-Pull Partitioning of Semantic Spaces to Disseminate Dynamic Information

Authors:
Amitabha Bagchi;Amitabh Chaudhary;Michael T. Goodrich;Chen Li;Michal Shmueli-Scheuer
Affiliations:
-;-;IEEE;IEEE;-
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2006


Abstract

Many database applications that need to disseminate dynamic information from a server to various clients can suffer from heavy communication costs. Data caching at a client can help mitigate these costs, particularly when individual PUSH-PULL decisions are made for the different semantic regions in the data space. The server is responsible for notifying the client about updates in the PUSH regions. The client needs to contact the server for queries that ask for data in the PULL regions. We call the idea of partitioning the data space into PUSH-PULL regions to minimize communication cost "data gerrymandering." In this paper, we present solutions to technical challenges in adopting this simple but powerful idea. We give a provably optimal-cost dynamic programming algorithm for gerrymandering on a single query attribute. We propose a family of efficient heuristics for gerrymandering on multiple query attributes. We handle the dynamic case in which the workloads of queries and updates evolve over time. We validate our methods through extensive experiments on real and synthetic data sets.
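
To make the single-attribute case concrete, the following is a minimal sketch of a dynamic program that labels cells along one query attribute as PUSH or PULL. It is not the paper's algorithm; the abstract does not give the cost model, so everything here is an assumption made for illustration: the attribute is pre-split into cells, a PUSH cell costs its number of updates (one server notification per update), a PULL cell costs its number of queries (one server round trip per query), and the client may hold at most `max_regions` contiguous regions. The names `gerrymander_1d`, `queries`, `updates`, and `max_regions` are hypothetical.

```python
from math import inf

LABELS = ("PUSH", "PULL")


def gerrymander_1d(queries, updates, max_regions):
    """Label each cell PUSH or PULL to minimize total assumed cost.

    Assumed cost model (not taken from the paper):
      - a PUSH cell costs updates[i] (server notifies the client),
      - a PULL cell costs queries[i] (client contacts the server),
      - at most max_regions contiguous same-label regions are allowed.
    Returns (total_cost, list_of_labels).
    """
    if len(queries) != len(updates):
        raise ValueError("queries and updates must have the same length")
    if max_regions < 1:
        raise ValueError("max_regions must be at least 1")
    n = len(queries)
    if n == 0:
        return 0, []

    cell_cost = (updates, queries)  # index 0 = PUSH, index 1 = PULL

    # dp[i][r][s]: minimum cost of cells 0..i using r + 1 regions,
    # with cell i carrying label s.
    dp = [[[inf, inf] for _ in range(max_regions)] for _ in range(n)]
    back = [[[None, None] for _ in range(max_regions)] for _ in range(n)]
    for s in (0, 1):
        dp[0][0][s] = cell_cost[s][0]

    for i in range(1, n):
        for r in range(max_regions):
            for s in (0, 1):
                # Option 1: extend the current region with the same label.
                best, prev = dp[i - 1][r][s], (r, s)
                # Option 2: open a new region with the opposite label.
                if r > 0 and dp[i - 1][r - 1][1 - s] < best:
                    best, prev = dp[i - 1][r - 1][1 - s], (r - 1, 1 - s)
                if best < inf:
                    dp[i][r][s] = best + cell_cost[s][i]
                    back[i][r][s] = prev

    # Choose the cheapest final state, then walk the back-pointers.
    total, r, s = min(
        (dp[n - 1][r][s], r, s) for r in range(max_regions) for s in (0, 1)
    )
    labels = [None] * n
    for i in range(n - 1, -1, -1):
        labels[i] = LABELS[s]
        if i > 0:
            r, s = back[i][r][s]
    return total, labels


if __name__ == "__main__":
    q = [9, 8, 1, 1, 7]  # hypothetical query counts per cell
    u = [2, 3, 6, 5, 4]  # hypothetical update counts per cell
    print(gerrymander_1d(q, u, max_regions=2))
```

The region budget is what makes the problem non-trivial in this sketch: without it, each cell could simply take the cheaper of its two labels independently. The table has `n * max_regions * 2` states with constant work per state, so the running time is linear in the number of cells for a fixed budget.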