Today's scientific applications have huge data requirements, which continue to grow drastically every year. These data are generally accessed by many users from all across the globe. As a result, huge amounts of data must be moved across wide area networks to complete the computation cycle, which raises the problem of efficient and reliable data placement. The current approach to this problem is either to place data manually or to employ simple scripts that have no automation or fault tolerance capabilities. Our goal is to make data placement activities first-class citizens in the Grid, just like computational jobs. They will be queued, scheduled, monitored, managed, and even checkpointed. More importantly, the system will ensure that they complete successfully and without any human interaction. We also believe that data placement jobs should be treated differently from computational jobs, since they may have different semantics and different characteristics. For this purpose, we have developed Stork, a scheduler for data placement activities in the Grid.
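To make the idea of queued, monitored, and automatically retried data placement jobs concrete, here is a minimal sketch in Python. It is not Stork's interface or implementation: the names `PlacementJob`, `run_queue`, and the `transfer` callback are hypothetical, and the retry-until-limit policy is an assumption chosen only to illustrate fault tolerance without human interaction.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class PlacementJob:
    """Hypothetical data placement request: move data from src to dest."""
    src: str
    dest: str
    max_attempts: int = 3   # assumed retry limit, not taken from the source
    attempts: int = 0
    status: str = "queued"  # queued -> done | failed


def run_queue(jobs: List[PlacementJob],
              transfer: Callable[[str, str], None]) -> List[PlacementJob]:
    """Run every job, re-queueing failed transfers until max_attempts is hit."""
    queue: Deque[PlacementJob] = deque(jobs)
    finished: List[PlacementJob] = []
    while queue:
        job = queue.popleft()
        job.attempts += 1
        try:
            transfer(job.src, job.dest)
            job.status = "done"
            finished.append(job)
        except OSError:
            if job.attempts < job.max_attempts:
                queue.append(job)  # retry later without human intervention
            else:
                job.status = "failed"
                finished.append(job)
    return finished
```

A caller would pass any function that performs the actual transfer and raises `OSError` on failure; each returned job records its final status and how many attempts it needed. Treating placement as its own job type, separate from computation, is what lets a policy like this retry limit be tuned for transfers specifically.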