With the emergence of distributed resources and grid technologies, there is a need for higher-level informatics infrastructures that allow scientists to easily create and execute meaningful data integration and analysis processes that take advantage of the distributed nature of the available resources. These resources typically include heterogeneous data sources, computational resources for task execution, and various application-specific services. The effort of the high-performance computing community has so far focused mainly on delivering low-level informatics infrastructures that meet the basic needs of grid applications. Such infrastructures are essential but do not directly help end users create generic and reusable applications.

In this paper, we present the Discovery Net architecture for building grid-based knowledge discovery applications. Our architecture enables the creation of high-level, reusable, distributed application workflows that use a variety of common types of distributed resources. It is built on top of standard protocols and standard infrastructures such as Globus, but also defines its own protocols, such as the Discovery Process Mark-up Language for data flow management. We discuss an implementation of our architecture and evaluate it by building a real-time genome annotation environment on top of it.