Modern scientific repositories are growing rapidly in size, and scientists increasingly want query results to reflect the latest data. Current scientific middleware cache systems, however, assume that repositories are static, so they cannot answer queries with the latest data. Instead, queries are routed to the repository until the data at the cache is refreshed. In data-intensive scientific disciplines such as astronomy, indiscriminate query routing or data refreshing often results in runaway network costs. This severely affects the performance and scalability of the repositories and makes poor use of the cache system. We present Delta, a dynamic data middleware cache system for rapidly growing scientific repositories. Delta's key component is a decision framework that adaptively decouples data objects: it keeps some data objects at the cache when they are heavily queried, and keeps others at the repository when they are heavily updated. Our algorithm profiles the incoming workload to search for an optimal data decoupling that reduces network costs. It leverages formal concepts from the network flow problem and is robust to evolving scientific workloads. We evaluate the efficacy of Delta through a prototype implementation, running query traces collected from a real astronomy survey.
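To make the decoupling idea concrete, the following is a minimal sketch of how a cache-versus-repository decision can be posed as a minimum cut in a flow network. It is an illustration under stated assumptions, not Delta's actual algorithm: the function name `min_cut_decoupling`, the cost model (a query pays its result-shipping cost unless every object it touches is cached, and a cached object pays its update-shipping cost), and the example numbers are all hypothetical.

```python
from collections import defaultdict, deque


def min_cut_decoupling(query_cost, query_objects, update_cost):
    """Choose which objects to cache by solving a minimum cut.

    Assumed (hypothetical) cost model:
      query_cost[q]    : network cost of answering query q at the repository
      query_objects[q] : objects q touches; q runs at the cache only if all are cached
      update_cost[o]   : network cost of shipping updates for o to the cache
    Every object named in query_objects is assumed to appear in update_cost.
    Returns (set of cached objects, total network cost).
    """
    inf = float("inf")
    source, sink = "source", "sink"
    cap = defaultdict(lambda: defaultdict(float))  # residual capacities

    for q, cost in query_cost.items():
        cap[source][("q", q)] += cost          # cut => query goes to the repository
        cap[("q", q)][source] += 0.0
        for o in query_objects[q]:
            cap[("q", q)][("o", o)] = inf      # a cached query forces its objects
            cap[("o", o)][("q", q)] += 0.0
    for o, cost in update_cost.items():
        cap[("o", o)][sink] += cost            # cut => object is kept at the cache
        cap[sink][("o", o)] += 0.0

    def reachable_from_source():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt, residual in cap[node].items():
                if residual > 0 and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return parent

    total = 0.0
    while True:  # Edmonds-Karp: augment along shortest paths until none remain
        parent = reachable_from_source()
        if sink not in parent:
            break
        path, node = [], sink
        while parent[node] is not None:
            path.append((parent[node], node))
            node = parent[node]
        push = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= push
            cap[b][a] += push
        total += push

    # Objects still reachable from the source lie on the "cache" side of the cut.
    cached = {o for o in update_cost if ("o", o) in parent}
    return cached, total


if __name__ == "__main__":
    cached, cost = min_cut_decoupling(
        query_cost={"q1": 10, "q2": 3},
        query_objects={"q1": ["A"], "q2": ["B"]},
        update_cost={"A": 4, "B": 8},
    )
    print(cached, cost)
```

In this toy workload, object A is queried heavily relative to its update cost and ends up at the cache, while object B is updated heavily relative to its query cost and stays at the repository. Re-running the cut as the profiled costs change is one simple way such a decision could adapt to an evolving workload.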