The MONARC toolset for simulating large network-distributed processing systems

  • Authors:
  • Iosif C. Legrand;Harvey B. Newman

  • Affiliations:
  • California Institute of Technology, Pasadena, CA;California Institute of Technology, Pasadena, CA

  • Venue:
  • Proceedings of the 32nd conference on Winter simulation
  • Year:
  • 2000

Abstract

The next generation of High Energy Physics experiments envisages the use of network-distributed, Petabyte-scale data handling and computing systems of unprecedented complexity. The general concept is that of a "Data Grid Hierarchy" in which the central facility at the European Laboratory for Particle Physics (CERN) in Geneva will interact with national "Tier1 (National) Regional Centers" situated in the US, Europe, and Asia, and coherently manage the tasks shared by and distributed amongst them. CERN and the Tier1 Centers will further communicate and task-share with the Tier2 Regional Centers, Tier3 centers serving individual universities or research groups, and thousands of "Tier4" desktops and small servers.

The design and optimization of systems with this level of complexity requires a realistic description and modeling of the data access patterns, the data flow across the local and wide area networks, and the scheduling and workload presented by hundreds of jobs running concurrently on large scale distributed systems exchanging very large amounts of data.

The simulation toolset developed within the "Models Of Networked Analysis at Regional Centers" (MONARC) project provides a design and optimization framework for large scale distributed systems that is efficient in both code and execution time. A process-oriented approach to discrete event simulation has been adopted because it is well suited to describing various activities running concurrently, as well as the stochastic arrival patterns typical of this class of simulations. Threaded objects, or "Active Objects", provide a natural way to map the specific behavior of distributed data processing (and the required flows of data across the networks) into the simulation program.

This simulation program is based on Java2(™) technology because it supports the methods and techniques needed to develop an efficient and flexible distributed, process-oriented simulation. This includes a convenient set of interactive graphical presentation and analysis tools, which are essential for the development and effective use of the simulation system.

The design elements, status and features of the MONARC simulation tool are presented. The program allows realistic modeling of complex data access patterns by multiple concurrent users in large scale computing systems in a wide range of possible architectures. A comparison between queuing theory and realistic client-server measurements is also presented.
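
To make the idea of process-oriented discrete event simulation with threaded "Active Objects" concrete, the following is a minimal sketch in Java. It is not taken from the MONARC code: the names `ActiveObjectSim`, `Engine`, `ActiveObject`, `Job` and `hold` are hypothetical, and the handoff scheme (one thread per active object, with semaphores ensuring that exactly one thread runs at a time) is an assumption about how such an engine could be built. Each job describes its own activity as straight-line code and calls `hold(dt)` to let simulated time advance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Semaphore;

/** Illustrative sketch only; all names are hypothetical, not MONARC's API. */
class ActiveObjectSim {

    /** A pending wake-up of one active object at a simulated time. */
    static class Wakeup implements Comparable<Wakeup> {
        final double time; final long seq; final ActiveObject target;
        Wakeup(double time, long seq, ActiveObject target) {
            this.time = time; this.seq = seq; this.target = target;
        }
        public int compareTo(Wakeup o) {
            int c = Double.compare(time, o.time);
            return c != 0 ? c : Long.compare(seq, o.seq); // FIFO on equal times
        }
    }

    /** Owns the simulated clock and the agenda of future wake-ups. */
    static class Engine {
        private double now = 0.0;
        private long seq = 0;
        private final PriorityQueue<Wakeup> agenda = new PriorityQueue<>();
        // Released by an active object when it blocks in hold() or finishes.
        private final Semaphore control = new Semaphore(0);
        final List<String> log = new ArrayList<>();

        double now() { return now; }

        void start(ActiveObject a) {
            a.engine = this;
            agenda.add(new Wakeup(now, seq++, a));
        }

        void run() throws InterruptedException {
            while (!agenda.isEmpty()) {
                Wakeup w = agenda.poll();
                now = w.time;                      // advance simulated time
                if (!w.target.started) {
                    w.target.started = true;
                    w.target.thread.start();       // first activation
                } else {
                    w.target.resume.release();     // resume after hold()
                }
                control.acquire();                 // wait until it yields back
            }
        }
    }

    /** A threaded simulation process; subclasses describe behavior in body(). */
    abstract static class ActiveObject {
        Engine engine;
        boolean started;
        final Semaphore resume = new Semaphore(0);
        final Thread thread = new Thread(() -> {
            try {
                body();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                engine.control.release();          // hand control back when done
            }
        });

        /** Suspend this process for dt units of simulated time. */
        protected void hold(double dt) throws InterruptedException {
            engine.agenda.add(new Wakeup(engine.now + dt, engine.seq++, this));
            engine.control.release();
            resume.acquire();
        }

        protected abstract void body() throws InterruptedException;
    }

    /** A job that performs a fixed number of processing steps. */
    static class Job extends ActiveObject {
        final String name; final int steps; final double stepTime;
        Job(String name, int steps, double stepTime) {
            this.name = name; this.steps = steps; this.stepTime = stepTime;
        }
        protected void body() throws InterruptedException {
            for (int i = 0; i < steps; i++) {
                hold(stepTime);                    // "processing" takes simulated time
                engine.log.add(name + "@" + engine.now());
            }
        }
    }

    /** Runs two concurrent jobs and returns the completion log. */
    static List<String> simulate() {
        Engine engine = new Engine();
        engine.start(new Job("A", 2, 2.0));
        engine.start(new Job("B", 3, 1.5));
        try {
            engine.run();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return engine.log;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this sketch the two jobs interleave purely according to simulated time, regardless of how the operating system schedules their threads, because the engine wakes only one active object at a time. A fuller model would add shared resources such as CPUs, network links and data servers that jobs queue for, which is where the stochastic arrival patterns mentioned in the abstract would come into play.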