High-frequency simulations of global seismic wave propagation using SPECFEM3D_GLOBE on 62K processors

  • Authors:
  • Laura Carrington; Dimitri Komatitsch; Michael Laurenzano; Mustafa M Tikir; David Michéa; Nicolas Le Goff; Allan Snavely; Jeroen Tromp

  • Affiliations:
  • San Diego Supercomputer Center, La Jolla, CA; Université de Pau, Pau, France and Institut Universitaire de France, Paris, France; San Diego Supercomputer Center, La Jolla, CA; San Diego Supercomputer Center, La Jolla, CA; Université de Pau, Pau, France; Université de Pau, Pau, France; San Diego Supercomputer Center, La Jolla, CA; California Institute of Technology, Pasadena, CA

  • Venue:
  • Proceedings of the 2008 ACM/IEEE conference on Supercomputing
  • Year:
  • 2008

Abstract

SPECFEM3D_GLOBE is a spectral-element application enabling the simulation of global seismic wave propagation in 3D anelastic, anisotropic, rotating and self-gravitating Earth models at unprecedented resolution. A fundamental challenge in global seismology is to model the propagation of waves with periods between 1 and 2 seconds, the highest-frequency signals that can propagate clear across the Earth. These waves help reveal the 3D structure of the Earth's deep interior and can be compared to seismographic recordings. We broke the 2-second barrier using the 62K-processor Ranger system at TACC. Indeed, we broke the barrier using just half of Ranger, reaching a shortest period of 1.84 seconds with a sustained 28.7 Tflops on 32K processors. We obtained similar results on the XT4 Franklin system at NERSC and the XT4 Kraken system at the University of Tennessee, Knoxville, while a similar run on the 28K-processor Jaguar system at ORNL, which has better memory bandwidth per processor, sustained 35.7 Tflops (a higher flops rate) with a shortest period of 1.94 seconds. Thus we have enabled a powerful new tool for seismic wave simulation, one that operates in the same frequency regimes as nature; in seismology there is no need to pursue much shorter periods because higher-frequency signals do not propagate across the entire globe. We employed performance modeling methods to identify performance bottlenecks and worked through issues of parallel I/O and scalability. Improved mesh design and numbering result in excellent load balancing and few cache misses. The primary achievements are not just the scalability and the high teraflops number, but a historic step towards understanding the physics and chemistry of the Earth's interior at unprecedented resolution.
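
The two sustained rates quoted in the abstract are easier to compare per processor. The following is a minimal sketch of that arithmetic, not part of the paper. The function name `per_processor_gflops` is hypothetical, and reading "K" as 1024 is an assumption (the ratio between the two systems does not depend on it).

```python
def per_processor_gflops(sustained_tflops: float, processors: int) -> float:
    """Convert an aggregate sustained rate in Tflops to Gflops per processor."""
    return sustained_tflops * 1e3 / processors


# Assumption: "K" in the abstract means 1024 processors.
K = 1024

# Figures taken from the abstract.
ranger = per_processor_gflops(28.7, 32 * K)  # half of Ranger, 1.84 s period
jaguar = per_processor_gflops(35.7, 28 * K)  # Jaguar, 1.94 s period
ratio = jaguar / ranger

print(f"Ranger: {ranger:.3f} Gflops per processor")
print(f"Jaguar: {jaguar:.3f} Gflops per processor")
print(f"Jaguar / Ranger: {ratio:.2f}")
```

Under these assumptions the Jaguar run sustains roughly 1.4 times the per-processor rate of the Ranger run, which is consistent with the abstract's remark about Jaguar's better memory bandwidth per processor.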