Design of a Framework for Data-Intensive Wide-Area Applications

Authors:
Michael D. Beynon; Tahsin Kurc; Alan Sussman; Joel Saltz
Venue:
HCW '00 Proceedings of the 9th Heterogeneous Computing Workshop
Year:
2000

Abstract

Applications that use collections of very large, distributed datasets have become an increasingly important part of science and engineering. As high-performance wide-area networks become more pervasive, there is interest in making collective use of distributed computational and data resources. Recent work has converged on the notion of the Grid, which attempts to present a heterogeneous collection of distributed resources uniformly. Current Grid research covers many areas, from low-level infrastructure issues to high-level application concerns. However, providing support for efficient exploration and processing of very large scientific datasets stored in distributed archival storage systems remains a challenging research issue.

We have initiated an effort that focuses on developing efficient data-intensive applications in a Grid environment. In this paper, we present a framework, called filter-stream programming, that represents the processing units of a data-intensive application as a set of filters, which are designed to be efficient in their use of memory and scratch space. We describe a prototype infrastructure that supports execution of applications using the proposed framework. We present the implementation of two applications using the filter-stream programming framework, and discuss experimental results demonstrating the effects of heterogeneous resources on application performance.
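
The filter-stream idea in the abstract can be illustrated with a minimal sketch. It is not the paper's prototype infrastructure or its API: the names `read_filter`, `select_filter`, `sum_filter`, and `run_pipeline` are hypothetical, Python generators stand in for streams, and everything runs in one process rather than across Grid resources. The sketch only shows the general shape: each filter consumes fixed-size buffers from an input stream and emits buffers on an output stream, so no filter needs to hold the whole dataset in memory.

```python
from typing import Iterator, List

Buffer = List[int]  # assumption: a stream carries fixed-size buffers of integers


def read_filter(data: List[int], buffer_size: int) -> Iterator[Buffer]:
    """Source filter (hypothetical): emits the dataset in buffers of at most buffer_size items."""
    for start in range(0, len(data), buffer_size):
        yield data[start:start + buffer_size]


def select_filter(stream: Iterator[Buffer], threshold: int) -> Iterator[Buffer]:
    """Intermediate filter (hypothetical): keeps items >= threshold, one buffer at a time."""
    for buf in stream:
        kept = [x for x in buf if x >= threshold]
        if kept:  # skip buffers that became empty after selection
            yield kept


def sum_filter(stream: Iterator[Buffer]) -> int:
    """Sink filter (hypothetical): reduces the stream to a single running total."""
    total = 0
    for buf in stream:
        total += sum(buf)
    return total


def run_pipeline(data: List[int], buffer_size: int, threshold: int) -> int:
    """Connects the three filters with streams: read -> select -> sum."""
    return sum_filter(select_filter(read_filter(data, buffer_size), threshold))
```

Because each filter touches only one buffer at a time, the memory footprint of a stage depends on the buffer size rather than on the dataset size. In a distributed setting, each filter could in principle be placed on a different host with the streams carried over the network, which this single-process sketch does not attempt.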