Parallel finite element computation of incompressible flows

  • Authors:
  • Suresh Behara; Sanjay Mittal

  • Affiliations:
  • Department of Aerospace Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, UP, India; Department of Aerospace Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, UP, India

  • Venue:
  • Parallel Computing
  • Year:
  • 2009

Abstract

A stabilized finite element formulation for three-dimensional unsteady incompressible flows is implemented on a distributed-memory parallel computer. A matrix-free version of the GMRES algorithm is utilized to solve the equation systems in an implicit manner. The scalability of the computations on a 64-processor Linux cluster is evaluated for moderate- to large-size problems. A method is proposed for estimating the speedup for large-scale problems, where computation on a single processor is not possible. Superlinear speedup is observed, perhaps for the first time, for a large-scale problem that is associated with more than 44 million nodes and 176 million equations. The performance of the various subactivities of the program is monitored to investigate the cause. It is found that the formation of the RHS vector and the preconditioner achieves a very high level of superlinear speedup as the number of processors increases. As a result, even though the network time for interprocessor communication grows with the number of processors, an overall superlinear speedup is realized for large-scale problems. The superlinear speedup is attributed to cache-related effects. A comparison between the performance of the matrix and matrix-free versions of the GMRES algorithm is carried out. It is found that for large-scale applications the matrix-free version outperforms its counterpart for reasonable dimensions of the Krylov subspace. The effect of mesh partitioning on the scalability is also studied. Partitioning significantly reduces the communication time, which improves the overall speedup. The parallel implementation is utilized to study the wake instabilities in flow past a stationary circular cylinder at Re=150, 200 and 300. The Re=150 flow is found to be two-dimensional, while mode-A and mode-B instabilities are observed at Re=200 and 300, respectively. The Re=300 flow is associated with a low-frequency modulation in addition to the vortex shedding frequency.
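
The abstract does not say how the speedup is estimated when a single-processor run is infeasible. One common convention, shown here only as a sketch and not necessarily the authors' method, is to measure speedup relative to the smallest processor count p0 that can hold the problem and to assume ideal scaling up to p0, so that S(p) = p0 * T(p0) / T(p). Speedup is superlinear when S(p) > p. The function name and the timings below are hypothetical and chosen purely for illustration; they are not results from the paper.

```python
def relative_speedup(times, base_procs):
    """Estimate speedup relative to a baseline processor count.

    times      -- dict mapping processor count -> wall-clock time (seconds)
    base_procs -- smallest processor count on which the problem fits;
                  ideal scaling is ASSUMED up to this count
    Returns a dict mapping processor count -> (speedup, efficiency).
    """
    t_base = times[base_procs]
    result = {}
    for procs, t in sorted(times.items()):
        speedup = base_procs * t_base / t   # S(p) = p0 * T(p0) / T(p)
        efficiency = speedup / procs        # > 1 indicates superlinear speedup
        result[procs] = (speedup, efficiency)
    return result


# Hypothetical timings, invented for illustration (not from the paper).
timings = {8: 1000.0, 16: 480.0, 32: 230.0, 64: 110.0}
table = relative_speedup(timings, base_procs=8)

for procs, (speedup, efficiency) in table.items():
    note = "superlinear" if speedup > procs else ""
    print(f"{procs:3d} procs: speedup {speedup:6.2f}, efficiency {efficiency:4.2f} {note}")
```

Because the baseline is itself assumed to scale ideally, an estimate of this kind is only as good as that assumption; cache effects of the sort the abstract mentions could already be present at p0.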