File transfer is very common in modern distributed computing environments. Protocols such as HTTP and FTP are designed for downloading files from and uploading files to servers, and tools such as "secure copy" transfer files between hosts securely. This paper considers file transfer in the context of connecting distributed applications, where the output of a data producer on one node is the input of a data consumer on another node. A common paradigm in grid environments is to use intermediate files as the medium that connects the computational phases of a workflow. The Distributed File Streamer (DFS), as its name implies, instead uses data streaming to couple distributed applications. Rather than waiting for the producer's output to be transferred completely to the consumer node, DFS streams the data over the network directly to the consumer program, managing the data flow efficiently and providing a framework for partial file consumption. This paper describes the architecture of the DFS framework, analyzes its performance model, and presents results from several examples that demonstrate the advantages of DFS over the traditional file-based approach.
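The difference between file-based coupling and streaming can be illustrated with a minimal sketch. It is not the DFS implementation or its API: it uses a local socket pair and two threads to stand in for a producer and a consumer on different nodes, and the names `producer`, `consumer`, `run_pipeline`, and `CHUNK_SIZE` are hypothetical. It only shows the general idea that the consumer can begin working on the first chunk before the producer has finished writing.

```python
import socket
import threading

CHUNK_SIZE = 4096  # hypothetical read size, not taken from the paper


def producer(sock, chunks, first_chunk_consumed, result):
    """Send output chunk by chunk instead of writing an intermediate file."""
    with sock:
        for i, chunk in enumerate(chunks):
            sock.sendall(chunk)
            if i == 0:
                # Pause until the consumer has handled some data, to show
                # that consumption overlaps with production.
                result["overlapped"] = first_chunk_consumed.wait(timeout=5)
    # Leaving the with-block closes the socket, which marks end of stream.


def consumer(sock, first_chunk_consumed, result):
    """Process data as it arrives; 'processing' here is just counting bytes."""
    total = 0
    with sock:
        while True:
            data = sock.recv(CHUNK_SIZE)
            if not data:  # empty read: the producer closed its end
                break
            total += len(data)
            first_chunk_consumed.set()
    result["bytes"] = total


def run_pipeline(chunks):
    """Couple a producer and a consumer through a stream (assumed setup)."""
    producer_end, consumer_end = socket.socketpair()
    first_chunk_consumed = threading.Event()
    result = {}
    threads = [
        threading.Thread(
            target=producer,
            args=(producer_end, chunks, first_chunk_consumed, result),
        ),
        threading.Thread(
            target=consumer,
            args=(consumer_end, first_chunk_consumed, result),
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(run_pipeline([b"x" * 1024] * 3))
```

In this sketch the `overlapped` flag records that the consumer had already received data while the producer still had chunks left to send, which is the behavior that file-based coupling cannot provide until the whole file has been transferred.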