In this paper we describe Maestro, a dataflow computation framework for Ibis, our Java-based grid middleware. The novelty of Maestro is that it is a self-organizing peer-to-peer system: it distributes the tasks in a flow over the available nodes based on local decisions made on each node, without any central coordination. As a result, the computations are more scalable, more resilient against failing nodes, and less sensitive to communication latencies. Maestro uses a task distribution approach based on reinforcement learning, a learning mechanism in which the positive outcome of a choice makes it more likely that the same choice will be repeated in the future. Maestro selects the most efficient node for each stage in the computation based on the observed computation and communication times. To ensure agility, the selection decisions are made as late as possible without letting the nodes fall idle. Using this task distribution algorithm, the nodes can be used efficiently, even in a heterogeneous system with failure-prone nodes communicating through high-latency connections.
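The idea of selecting a node from observed computation and communication times can be illustrated with a minimal sketch. This is not Maestro's actual algorithm or API: the class `NodeSelector`, its methods, the learning rate `alpha`, and the use of an exponential moving average are all assumptions made for illustration. Each node keeps its own table of time estimates, updates an estimate whenever a task completes, and greedily picks the node with the lowest estimate.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of locally learned node selection.
 * Names and the update rule are illustrative assumptions,
 * not taken from the Maestro implementation.
 */
public class NodeSelector {
    // Estimated total time (computation + communication) per node, in seconds.
    private final Map<String, Double> estimate = new HashMap<>();
    // Assumed learning rate: how strongly a new observation shifts the estimate.
    private final double alpha;

    public NodeSelector(double alpha) {
        this.alpha = alpha;
    }

    /** Register a node with an initial time estimate. */
    public void addNode(String node, double initialEstimate) {
        estimate.put(node, initialEstimate);
    }

    /** Forget a node, for example after it has failed. */
    public void removeNode(String node) {
        estimate.remove(node);
    }

    /** Blend an observed task time into the running estimate for a known node. */
    public void observe(String node, double seconds) {
        estimate.computeIfPresent(node, (k, old) -> old + alpha * (seconds - old));
    }

    /** Return the node with the lowest estimated time, or null if no node is known. */
    public String select() {
        String best = null;
        double bestTime = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : estimate.entrySet()) {
            if (e.getValue() < bestTime) {
                bestTime = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    /** Current estimate for a registered node. */
    public double estimateFor(String node) {
        return estimate.get(node);
    }
}
```

In this sketch a node that finishes tasks quickly keeps a low estimate and is chosen again, which mirrors the reinforcement idea described above. A purely greedy choice never revisits nodes whose estimates have gone stale, so a practical scheduler would presumably also need some form of exploration, which is omitted here.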