Re: elmersolver_mpi doesn't exit on completion
Posted: 06 Oct 2020, 18:51
by raback
Ok, it seems everybody finished but some partition was left hanging...
Re: elmersolver_mpi doesn't exit on completion
Posted: 07 Oct 2020, 09:58
by Romuald
Exactly.
Re: elmersolver_mpi doesn't exit on completion
Posted: 07 Oct 2020, 10:10
by Romuald
To complement my previous report, here is what happened when I tried with 5 nodes this time.
The attached image DuringTheRun.png shows the memory and CPU usage during the run. The second image, AfterTheRun.png, shows that in this case 3 nodes did not terminate, even though "Partn: The end" was printed for all nodes.
The CPU usage of these late nodes is zero, and their memory usage has dropped significantly. I hope this helps.
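To check from a captured log which partitions actually reported completion, something like the following could be used. It is only a minimal sketch: the line format `Part<n>: The end` (with `<n>` a partition number) is an assumption based on the message quoted above, and the function names are hypothetical.

```python
import re

# ASSUMPTION: each partition prints a line such as "Part3: The end".
# Adjust the pattern to whatever the solver really writes.
END_PATTERN = re.compile(r"Part(\d+):\s*The end")


def partitions_reporting_end(log_text):
    """Return the set of partition numbers that printed the end marker."""
    return {int(m.group(1)) for m in END_PATTERN.finditer(log_text)}


def missing_partitions(log_text, n_partitions):
    """Return the sorted partition numbers (0..n-1) with no end marker."""
    seen = partitions_reporting_end(log_text)
    return sorted(set(range(n_partitions)) - seen)
```

If `missing_partitions` returns an empty list while processes are still alive, the hang happens after the solver has finished its own work, which matches the observation above.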
Romuald
Re: elmersolver_mpi doesn't exit on completion
Posted: 07 Oct 2020, 16:38
by Romuald
I am still investigating the case.
Instead of killing the late node(s) via the task manager, I pressed CTRL-C in the console running the batch. mpiexec reported the following with 4 nodes:
and with 5 nodes:
Code:
mpiexec aborting job...
job aborted:
[ranks] message
[0] job terminated by the user
[1-4] terminated
---- error analysis -----
[0] on DACMPCG158
ctrl-c was hit. job aborted by the user.
It seems that node 0 is systematically the one that does not terminate.
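As a stopgap for batch runs, the manual kill could be automated with a small wrapper. The sketch below is not a fix for the underlying problem: it assumes that every partition prints a line containing "The end", and the function name and grace period are hypothetical. It waits for the expected number of markers, gives the job a grace period to exit on its own, and kills it otherwise.

```python
import subprocess
import sys

END_MARKER = "The end"  # ASSUMPTION: text printed once per partition


def run_with_grace(cmd, expected_markers, grace_seconds=30.0):
    """Run cmd; once all end markers are seen, kill it if it fails to exit.

    Returns "exited" if the process ended by itself, "killed" otherwise.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    seen = 0
    for line in proc.stdout:
        sys.stdout.write(line)  # echo solver output to the console
        if END_MARKER in line:
            seen += 1
            if seen >= expected_markers:
                break  # all partitions reported; stop reading
    try:
        proc.wait(timeout=grace_seconds)
        status = "exited"
    except subprocess.TimeoutExpired:
        proc.kill()  # same effect as killing the launcher by hand
        proc.wait()
        status = "killed"
    proc.stdout.close()
    return status
```

It could be called as, for example, `run_with_grace(["mpiexec", "-n", "4", "ElmerSolver_mpi"], 4)`, where the command line is only illustrative. Killing the launcher may not clean up every rank process, so the task manager is still worth checking afterwards.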
Re: elmersolver_mpi doesn't exit on completion
Posted: 07 Oct 2020, 18:06
by raback
Hi
I added FLUSH and STOP to the code. It should not matter, but it is worth testing...
You can find fresh installers here:
https://www.nic.funet.fi/pub/sci/physic ... n/windows/
-Peter
Re: elmersolver_mpi doesn't exit on completion
Posted: 07 Oct 2020, 18:37
by Romuald
Thanks.
I will let you know tomorrow what I get.
Romuald
Re: elmersolver_mpi doesn't exit on completion
Posted: 08 Oct 2020, 18:04
by Romuald
Hi
It seems that no new installers have been compiled since 07/10, 3-4 pm.
I will check tomorrow.
Best,
Romuald
Re: elmersolver_mpi doesn't exit on completion
Posted: 13 Oct 2020, 18:49
by sslone
I tested the updated version.
FLUSH and STOP did not change anything, as was probably expected. Jobs run and then stall as before. This time all 9 subprocesses hung (I was using 8 processors), rather than 8 closing and one staying.
-Scott
Re: elmersolver_mpi doesn't exit on completion
Posted: 23 Oct 2020, 20:15
by Romuald
For information, my first tests were with MS-MPI 10.0, but I also tried the older versions 7.1 and 8.0. The problem remains.
That is very weird.
R
Re: elmersolver_mpi doesn't exit on completion
Posted: 24 Oct 2020, 16:40
by mark smith
Hi All
Since originally posting in 2016, I can confirm that I haven't found a solution other than running the solver under Linux (Ubuntu). If a solution is found for Windows, I would still be interested.
Regards
Mark