The robustness of AGMG has been assessed on a large test suite of discrete second-order elliptic PDEs, comprising:
System sizes: from 5×10⁵ to 3×10⁷ unknowns
Number of nonzero entries per row: from 5 to 74
First 2 graphics: Total wall clock time to solve the linear system in microseconds per unknown or microseconds per nonzero entry in the system matrix - vs - problem index
(problems ordered by increasing number of nonzero entries per row)
Performed on a desktop workstation with Intel XEON E5-2620 at 2.10GHz, 2017.
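As a minimal sketch of how the normalized timings described above can be derived from a raw measurement, the following Python snippet divides a wall clock time by the number of unknowns and by the number of nonzero entries. The function name and the sample figures are hypothetical and serve as illustration only; they are not measured AGMG results.

```python
def normalized_times(wall_seconds, n_unknowns, n_nonzeros):
    """Return (microseconds per unknown, microseconds per nonzero entry).

    Hypothetical helper: it only rescales a measured wall clock time
    by the size of the linear system.
    """
    if n_unknowns <= 0 or n_nonzeros <= 0:
        raise ValueError("problem sizes must be positive")
    micro = wall_seconds * 1e6  # seconds -> microseconds
    return micro / n_unknowns, micro / n_nonzeros


# Illustrative numbers only (assumed, not taken from the test suite):
# a system with 10^6 unknowns and 7 nonzero entries per row, solved in 2 s.
per_unknown, per_nonzero = normalized_times(
    wall_seconds=2.0, n_unknowns=1_000_000, n_nonzeros=7_000_000
)
print(f"{per_unknown:.3f} microseconds per unknown")
print(f"{per_nonzero:.3f} microseconds per nonzero entry")
```

Normalizing by the number of nonzero entries makes problems with very different sparsity (5 to 74 entries per row) comparable on a single plot.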
AGMG has been compared against several other state-of-the-art linear system solvers; see here for technical details.
Next 8 graphics: Total wall clock time to solve the linear system in microseconds per unknown - vs - number of unknowns
Performed on a computing node with Intel XEON L5420 processors at 2.50GHz, 2012.
Note that both scales are logarithmic.
Finite Difference     
Poisson equation in an L-shaped domain (2D)
Poisson equation in a cube (3D)
Poisson equation in a cube (3D)
Cubic (p3) Finite Element
Convection-diffusion equation in square (2D)
Poisson equation in a cube (3D)
Convection-diffusion equation in a cube (3D)
AGMG has also been tested on some truly large scale HPC systems.
Last 4 graphics: Total wall clock time in seconds to solve the linear system stemming from the 7-point finite difference discretization of the Poisson equation in a cube.
For all weak scaling plots, the x-scale (problem size) is logarithmic whereas the y-scale (time) is not. Both scales are logarithmic for the strong scaling plot.
Top right figure: 373248 cores represent more than 80% of the whole JUQUEEN machine, which was ranked eighth in the TOP500 supercomputer list of November 2013.
Weak scalability results on Intel Farm (CURIE, 2014): time as a function of the number of unknowns (fixed problem size per core).
Weak scalability results on Cray XE6 (HERMIT, 2014): time as a function of the number of unknowns (for different problem sizes per core).
Weak scalability results on IBM BG/Q (JUQUEEN, 2014): time as a function of the number of unknowns (fixed problem size per core).
Strong scalability results on Cray XE6 (HERMIT, 2014): time as a function of the number of cores (fixed problem size: 87.5×10⁶ unknowns).
We acknowledge PRACE for awarding us access to resources CURIE (Intel farm at CEA, France), JUQUEEN (IBM BG/Q at Juelich, Germany) and HERMIT (Cray XE6 at HLRS, Stuttgart, Germany).