Howdy fellow Elmer/Ice Users,

I have a short question for experienced HPC Elmer users. The HPC centre I am running on has recently introduced a new many-core cluster with KNL CPUs. I have compiled Elmer/Ice with its dependencies and it runs smoothly. However, compared with the older cluster, which uses "Haswell" nodes, the 3D Stokes simulations with a direct solver (MUMPS in this case) are slower by a factor of about 3-4. Even with an iterative solver, the runs are about 40% slower than on the "Haswell" CPUs.
As I understand it, the KNL CPUs run at a lower clock frequency (1.3 GHz) and, unlike the "Haswell" CPUs, do not have any L3 cache. I guess this will certainly affect performance. Before I dive into profiling to find out where the bottlenecks are on these new CPUs, I was wondering whether other Elmer/Ice users have had similar issues, and whether performance can be improved by using compiler flags tailored to the new cluster architecture when compiling Elmer/Ice or MUMPS. I would also be interested to know whether, for example, gcc compilers work better with Elmer than Intel compilers. Any help or comments are much appreciated.

Cheers, Clemens
