On the effectiveness of runtime techniques to reduce memory sharing overheads in distributed Java implementations

  • Authors:
  • Marcelo Lobosco; Orlando Loques; Claudio L. de Amorim

  • Affiliations:
  • Departamento de Ciência de Computação, Universidade Federal de Juiz de Fora, Brazil; Instituto de Computação, Universidade Federal Fluminense, Rua Passo da Pátria, 156, Bloco E, 3o Andar, São Domingos, Niterói, CEP: 24210-240, Brazil; Laboratório de Computação Paralela, PESC, COPPE, UFRJ, Brazil

  • Venue:
  • Concurrency and Computation: Practice & Experience
  • Year:
  • 2008

Abstract

Distributed Java virtual machine (dJVM) systems enable concurrent Java applications to transparently run on clusters of commodity computers. This is achieved by supporting Java's shared-memory model over multiple JVMs distributed across the cluster's computer nodes. In this work, we describe and evaluate selective dynamic diffing and lazy home allocation, two new runtime techniques that enable dJVMs to efficiently support memory sharing across the cluster. Specifically, the two proposed techniques can contribute to reduce the overheads due to message traffic, extra memory space, and high latency of remote memory accesses that such dJVM systems require for implementing their memory-coherence protocol either in isolation or in combination. In order to evaluate the performance-related benefits of dynamic diffing and lazy home allocation, we implemented both techniques in Cooperative JVM (CoJVM), a basic dJVM system we developed in previous work. In subsequent work, we carried out performance comparisons between the basic CoJVM and modified CoJVM versions for five representative concurrent Java applications (matrix multiply, LU, Radix, fast Fourier transform, and SOR) using our proposed techniques. Our experimental results showed that dynamic diffing and lazy home allocation significantly reduced memory sharing overheads. The reduction resulted in considerable gains in CoJVM system's performance, ranging from 9% up to 20%, in four out of the five applications, with resulting speedups varying from 6.5 up to 8.1 for an 8-node cluster of computers. Copyright © 2007 John Wiley & Sons, Ltd.