Error when using mpi with SaveGridData solver (grid.dat.names access issue?)

Numerical methods and mathematical models of Elmer
bendixen
Posts: 8
Joined: 13 Aug 2020, 14:14

Post by bendixen »

I’m running Windows 10 with the latest version of Elmer (Version 9.0, compiled 2023-01-09).

I have experienced some issues when using the SaveGridData solver in simulations run with MPI. At seemingly random times I get error messages such as:
-----------------------------------------------
At line 6635 of file D:/Elmerbuilder/elmerfem/fem/src/ParticleUtils.F90
Fortran runtime error: End of record
-----------------------------------------------
Or
-----------------------------------------------
At line 6388 of file D:/ElmerBuilder/elmerfem/fem/src/ParticleUtils.F90 (unit = 10, file = 'grid.dat.names')
Fortran runtime error: Bad file descriptor
--------------------------------------------------

The last lines in the command window before the job is aborted are:
-------------------------------------------------
CreateListForSaving: Field variables for saving
Job aborted
----------------------------------------------------
It happens more frequently the more processors I use:
1 core: never
2 cores: can’t remember having seen it
4 cores: occasionally
8 cores: very often
16 cores: almost always

My feeling is that it happens more frequently when running on a faster machine. In all cases the .vtu file for each process is written successfully, as is the .pvtu file, and the data therein is valid and correct.

My suspicion is that the SaveGridData solver on each processor tries to write the same file, grid.dat.names, and that these file accesses sometimes collide and cause this error. If this is correct, is there any way to suppress the generation of grid.dat.names? Its contents do not change between my simulations anyway, and I know what’s in every row of the data file.
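A common remedy for this kind of collision is to let a single designated process write the shared file while the others skip it. The snippet below is only a minimal Python sketch of that pattern, not Elmer's code: the function name, the `rank` argument (standing in for the MPI rank) and the line layout of the names file are all hypothetical.

```python
import os
import tempfile


def write_names_file(rank, path, names):
    """Write a shared names listing from one designated process only.

    `rank` stands in for the MPI rank of the calling process; the
    "index: name" layout is a made-up example format.
    Returns True if this process wrote the file, False otherwise.
    """
    if rank != 0:
        # All other processes leave the shared file alone.
        return False

    # Write to a temporary file in the same directory, then rename it,
    # so that no reader ever sees a partially written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        for index, name in enumerate(names, start=1):
            handle.write(f"{index}: {name}\n")
    os.replace(tmp_path, path)
    return True
```

With this guard every process can still call the routine at the same point in the run, but only rank 0 touches the file, so the per-process .vtu output is unaffected.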
kevinarden
Posts: 2237
Joined: 25 Jan 2019, 01:28

Re: Error when using mpi with SaveGridData solver (grid.dat.names access issue?)

Post by kevinarden »

perhaps
Partition Numbering = Logical True
Optionally adds the number of partitions to the filename. This makes the benchmarking more
convenient since each case may use the same command file without conflicts.
You may have to merge the files back together manually at the end.
raback
Site Admin
Posts: 4812
Joined: 22 Aug 2009, 11:57
Location: Espoo, Finland

Re: Error when using mpi with SaveGridData solver (grid.dat.names access issue?)

Post by raback »

Thanks for reporting. Indeed, every process was eager to write the same file. I just made a tentative fix on the "devel" branch. -Peter
bendixen
Posts: 8
Joined: 13 Aug 2020, 14:14

Re: Error when using mpi with SaveGridData solver (grid.dat.names access issue?)

Post by bendixen »

Hi Peter,

Thanks for trying to solve my issue. I was away for a while, so I only had time to test the solution now, using the latest build (Version: 9.0 Rev: Release, Compiled: 2023-03-14).
Unfortunately, there are still problems with the SaveGridData solver; it now fails every time(!) with this error message (one report from each process):

---------------------------------------------
At line 6641 of file D:/ElmerBuilder/elmerfem/fem/src/ParticleUtils.F90
Fortran runtime error:
End of record
---------------------------------------------

I really hope you can solve the problem.

br Carsten
raback
Site Admin
Posts: 4812
Joined: 22 Aug 2009, 11:57
Location: Espoo, Finland

Re: Error when using mpi with SaveGridData solver (grid.dat.names access issue?)

Post by raback »

Hi Carsten,

Some time ago most of the strings were changed to allocatables. Maybe this caused the issue here. This is just a tentative fix; I hope it solves the issue.
https://github.com/ElmerCSC/elmerfem/co ... a7b877f683

-Peter