Performance of a parallel algebraic multilevel preconditioner for stabilized finite element semiconductor device modeling

  • Authors:
  • Paul T. Lin;John N. Shadid;Marzio Sala;Raymond S. Tuminaro;Gary L. Hennigan;Robert J. Hoekstra

  • Affiliations:
  • Sandia National Laboratories, P.O. Box 5800 MS 0316, Albuquerque, NM 87185-0316, USA;Sandia National Laboratories, P.O. Box 5800 MS 0316, Albuquerque, NM 87185-0316, USA;BMW-Sauber, Hinwil, Switzerland;Sandia National Laboratories, P.O. Box 969 MS 9159, Livermore, CA 94551-9159, USA;Sandia National Laboratories, P.O. Box 5800 MS 0316, Albuquerque, NM 87185-0316, USA;Sandia National Laboratories, P.O. Box 5800 MS 0316, Albuquerque, NM 87185-0316, USA

  • Venue:
  • Journal of Computational Physics
  • Year:
  • 2009

Abstract

This study presents results for the large-scale parallel performance of an algebraic multilevel preconditioner for the solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully coupled Newton-Krylov nonlinear solution method employed for this system of equations. The coupled system comprises a source-term-dominated Poisson equation for the electric potential and two convection-diffusion-reaction equations for the electron and hole concentrations. The governing PDEs are discretized in space by a stabilized finite element method. The discrete system is solved with a fully implicit time integrator, a fully coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear solver. The algebraic multilevel preconditioner is based on aggressive-coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and for parameter variations of smoothers based on incomplete factorizations. Parallel scalability results are presented for the solution of up to 10^8 unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system.
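The linear-solver stage described in the abstract (restarted GMRES with an incomplete-factorization-based preconditioner) can be illustrated with a small serial sketch. This is not the authors' code or their multilevel method: it assumes SciPy, uses a single-level incomplete LU factorization as a stand-in for the algebraic multilevel preconditioner, and applies it to a hypothetical 1D convection-diffusion-reaction model problem. The function names, grid size, and coefficients are all illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def convection_diffusion_matrix(n, diffusion=1.0, velocity=50.0, reaction=1.0):
    """Hypothetical model problem: central-difference discretization of
    -D u'' + v u' + r u on (0, 1) with homogeneous Dirichlet boundaries."""
    h = 1.0 / (n + 1)
    lower = -diffusion / h**2 - velocity / (2.0 * h)
    diag = 2.0 * diffusion / h**2 + reaction
    upper = -diffusion / h**2 + velocity / (2.0 * h)
    # CSC format is what SciPy's incomplete LU expects.
    return sp.diags([lower, diag, upper], offsets=[-1, 0, 1],
                    shape=(n, n), format="csc")


def solve_with_ilu_gmres(A, b, restart=30, drop_tol=1e-4, fill_factor=10.0):
    """Restarted GMRES preconditioned by a single-level incomplete LU.

    The ILU here only stands in for the paper's multilevel preconditioner;
    in a multilevel method it would play the role of a smoother.
    """
    ilu = spla.spilu(A, drop_tol=drop_tol, fill_factor=fill_factor)
    # Wrap the ILU solve as an operator approximating the inverse of A.
    M = spla.LinearOperator(A.shape, matvec=ilu.solve, dtype=A.dtype)
    x, info = spla.gmres(A, b, M=M, restart=restart, maxiter=200)
    return x, info


n = 400  # illustrative problem size, far below the paper's 10^8 unknowns
A = convection_diffusion_matrix(n)
b = np.ones(n)
x, info = solve_with_ilu_gmres(A, b)

# info == 0 signals that GMRES reported convergence.
rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print("GMRES info:", info, "relative residual:", rel_residual)
```

In a fully coupled Newton-Krylov setting, a solve of this kind would be repeated at each Newton step with the current Jacobian in place of `A`. The `restart`, `drop_tol`, and `fill_factor` arguments correspond loosely to the kind of solver and smoother parameters whose variation the abstract mentions.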