Data-intensive applications that analyze large datasets often require large amounts of compute and storage resources, and data locality can be crucial to achieving high throughput and performance. We propose a "data diffusion" approach that acquires compute and storage resources dynamically, replicates data in response to demand, and schedules computations close to the data. As demand increases, more resources are acquired, allowing faster responses to subsequent requests that refer to the same data; when demand drops, resources are released. Depending on workload and resource characteristics, this approach can provide the benefits of dedicated hardware without the associated high costs. To explore the feasibility of data diffusion, we offer both a theoretical and an empirical analysis. We define an abstract model for data diffusion, introduce new scheduling policies with heuristics to optimize real-world performance, and develop a competitive online cache eviction policy. We also present extensive experiments that explore the benefits of dynamically expanding and contracting resources based on load, improving system responsiveness while keeping wasted resources small. Across three diverse workloads, we show performance improvements of one to two orders of magnitude over parallel file systems, with throughput approaching 80 Gb/s on a modest cluster of 200 processors. We also compare data diffusion with a best-case model for active storage, contrasting the pull model found in data diffusion with the push model found in active storage.
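The three mechanisms named in the abstract (dynamic resource acquisition, demand-driven replication, and scheduling computations near cached data) can be illustrated with a toy sketch. This is not the scheduling or eviction policy developed in the paper: the names `Worker`, `Pool`, `tasks_per_worker`, and `cache_slots` are hypothetical, plain LRU eviction stands in for the paper's competitive online policy, and load is counted per batch rather than measured.

```python
import math
from collections import OrderedDict


class Worker:
    """Hypothetical compute node with a small local cache (plain LRU)."""

    def __init__(self, name, cache_slots):
        self.name = name
        self.cache_slots = cache_slots
        self.cache = OrderedDict()  # file name -> True, least recent first

    def run(self, file_name):
        """Process one task; return True if the file was already cached."""
        hit = file_name in self.cache
        if hit:
            self.cache.move_to_end(file_name)
        else:
            # Miss: pull the file from shared storage, evicting if full.
            if len(self.cache) >= self.cache_slots:
                self.cache.popitem(last=False)
            self.cache[file_name] = True
        return hit


class Pool:
    """Hypothetical pool that grows and shrinks with demand."""

    def __init__(self, max_workers, cache_slots, tasks_per_worker):
        self.max_workers = max_workers
        self.cache_slots = cache_slots
        self.tasks_per_worker = tasks_per_worker
        self.workers = []
        self.created = 0

    def resize(self, demand):
        """Acquire workers when demand rises; release them when it drops."""
        target = min(self.max_workers,
                     math.ceil(demand / self.tasks_per_worker))
        while len(self.workers) < target:
            self.created += 1
            self.workers.append(Worker(f"w{self.created}", self.cache_slots))
        while len(self.workers) > target:
            self.workers.pop()  # a released worker takes its cache with it

    def dispatch(self, batch):
        """Schedule one batch of file requests; return the cache-hit count."""
        self.resize(len(batch))
        load = {w.name: 0 for w in self.workers}
        hits = 0
        for file_name in batch:
            # Prefer a worker that already holds the file and has capacity.
            holders = [w for w in self.workers
                       if file_name in w.cache
                       and load[w.name] < self.tasks_per_worker]
            # Otherwise the least-loaded worker pulls its own copy, which
            # replicates a popular file onto another node.
            candidates = holders or self.workers
            worker = min(candidates, key=lambda w: load[w.name])
            load[worker.name] += 1
            hits += int(worker.run(file_name))
        return hits


if __name__ == "__main__":
    pool = Pool(max_workers=4, cache_slots=2, tasks_per_worker=2)
    for batch in (["a", "b"] * 2, ["a", "b"] * 4, []):
        hits = pool.dispatch(batch)
        print(len(batch), "tasks,", len(pool.workers), "workers,", hits, "hits")
```

In this sketch the pull model appears as the cache-miss path in `Worker.run`: data moves to a node only when a task scheduled there asks for it. A push-model system would instead place data on nodes ahead of the requests.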