This paper presents the rationale for a new architecture to support a significant increase in the scale of data integration and data mining. It proposes combining (1) data mining and (2) data access and integration in a single framework; we call the combined activity DMI. The architecture supports the enactment of DMI processes across heterogeneous, distributed data resources and data mining services. It posits a useful division between the facilities that support the definition of DMI processes and the computational infrastructure that enacts them. Communication between these two divisions is restricted to requests submitted to gateway services in a canonical DMI language. Larger-scale processes are enabled by incremental refinement of DMI-process definitions, often by recomposing lower-level definitions. Autonomous evolution of data resources and services is supported by types and descriptions, which will allow inconsistencies to be detected and adaptations to be inserted semi-automatically. These architectural ideas are being evaluated in a feasibility study involving an application scenario and representatives of the community.