====== Vectorized incompressible Navier Stokes Solver (IncompressibleNSVec) ======
Since March 2019, this solver is included in the Elmer distribution. It has more or less the functionality of the legacy Navier-Stokes solver, i.e., it is able to use any linear solution procedure (including the library version of the block preconditioner). The difference and the improvement in performance come from the vectorized assembly routines, which use a general way of writing SIMD-enabled bilinear forms. Since this relies on OpenMP SIMD instructions, it is essential to have OpenMP enabled in your compilation.
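A minimal solver-section sketch for the Solver Input File (SIF) is given below. The module and subroutine names (''IncompressibleNSVec'' and ''IncompressibleNSSolver'') and the chosen linear-solver keywords are assumptions here and should be checked against your Elmer installation; likewise, OpenMP support is usually switched on at build time with the CMake option ''-DWITH_OpenMP:BOOL=TRUE''.

<code>
Solver 1
  Equation = "Stokes-Vec"
  ! assumed module/subroutine names of the vectorized solver
  Procedure = "IncompressibleNSVec" "IncompressibleNSSolver"

  ! any linear solution procedure may be used; a direct solver is shown here
  Linear System Solver = Direct
  Linear System Direct Method = MUMPS

  Nonlinear System Max Iterations = 50
  Nonlinear System Convergence Tolerance = 1.0e-6
  Nonlinear System Newton After Iterations = 5
  Nonlinear System Newton After Tolerance = 1.0e-3
End
</code>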
  
==== Performance improvements ====
Testing on a Skylake high-end consumer PC with a quad-core CPU, the new solver reduced the computing time by two-thirds for the 10 km ISMIP-HOM C experiments run on a 30 x 30 x 15 (13,500) node mesh partitioned into 4 partitions. As the same (direct) linear solver was used, this gain was achieved solely in the assembly part. The vectorization greatly increases the memory throughput and hence the performance.