Scientific database federations are geographically distributed and network bound. Thus, they could benefit from proxy caching. However, existing caching techniques are not suitable for their workloads, which compare and join large data sets. Existing techniques reduce parallelism by conducting distributed queries in a single cache and lose the data reduction benefits of performing selections at each database. We develop the bypass-yield formulation of caching, which reduces network traffic in wide-area database federations, while preserving parallelism and data reduction. Bypass-yield caching is altruistic; caches minimize the overall network traffic generated by the federation, rather than focusing on local performance. We present an adaptive, workload-driven algorithm for managing a bypass-yield cache. We also develop on-line algorithms that make no assumptions about workload: a k-competitive deterministic algorithm and a randomized algorithm with minimal space complexity. We verify the efficacy of bypass-yield caching by running workload traces collected from the Sloan Digital Sky Survey through a prototype implementation.
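The bypass decision described above can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a simple rent-or-buy rule in which the cache bypasses queries to the database (shipping only the query result, its "yield") until the bytes shipped for an object equal the object's size, and only then loads the object. The class name `BypassCache`, its fields, and the absence of any eviction policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BypassCache:
    """Illustrative rent-or-buy bypass cache (hypothetical, no eviction)."""
    capacity: int                                   # bytes the cache can hold
    used: int = 0                                   # bytes currently resident
    resident: set = field(default_factory=set)      # names of cached objects
    bypassed: dict = field(default_factory=dict)    # bytes shipped per object so far
    network_bytes: int = 0                          # total wide-area traffic

    def query(self, obj: str, obj_size: int, yield_bytes: int) -> str:
        # Served locally: no wide-area traffic.
        if obj in self.resident:
            return "hit"

        shipped = self.bypassed.get(obj, 0) + yield_bytes
        # Assumed rule: load once bypassing has cost as much as loading would.
        if shipped >= obj_size and self.used + obj_size <= self.capacity:
            self.resident.add(obj)
            self.used += obj_size
            self.bypassed.pop(obj, None)
            self.network_bytes += obj_size          # pay to fetch the whole object
            return "load"

        # Otherwise evaluate the query at the database and ship only the result.
        self.bypassed[obj] = shipped
        self.network_bytes += yield_bytes
        return "bypass"


cache = BypassCache(capacity=100)
decisions = [cache.query("photo_table", obj_size=60, yield_bytes=20) for _ in range(4)]
print(decisions, cache.network_bytes)
```

With a 60-byte object and 20-byte results, the sketch bypasses twice, loads on the third query, and serves the fourth locally. Bypassed queries still run at the source database, so the selection happens there and only the reduced result crosses the network.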