Dear LIGGGHTS users!
When I run LIGGGHTS with more than one processor I get the following errors. With a single processor there are no errors, only warnings.
1) Errors with more than one processor:
[ram@localhost stress]$ mpirun -np 4 lmp_l20 at step 2427, growing array...done!
[localhost.localdomain:7801] *** An error occurred in MPI_Wait
[localhost.localdomain:7801] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:7801] *** MPI_ERR_TRUNCATE: message truncated
[localhost.localdomain:7801] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 7802 on
node localhost.localdomain exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[localhost.localdomain:07799] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[localhost.localdomain:07799] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
2) Warnings with a single processor:
[ram@localhost stress]$ mpirun -np 1 lmp_l20 at step 2427, growing array...done!
3000 500 0.00093410233 2.5099248e-07 0.012410998
INFO: Maxmimum number of particle-tri contacts >40 at step 3056, growing array...done!
INFO: Maxmimum number of particle-tri contacts >60 at step 3060, growing array...done!
INFO: Maxmimum number of particle-tri contacts >80 at step 3060, growing array...done!
INFO: Maxmimum number of particle-tri contacts >100 at step 3060, growing array...done!
INFO: Maxmimum number of particle-tri contacts >120 at step 3064, growing array...done!
INFO: Maxmimum number of particle-tri contacts >140 at step 3064, growing array...done!
INFO: Maxmimum number of particle-tri contacts >160 at step 3068, growing array...done!
INFO: Maxmimum number of particle-tri contacts >180 at step 3068, growing array...done!
INFO: Maxmimum number of particle-tri contacts >200 at step 3068, growing array...done!
4000 500 0.0042296621 1.5791056e-05 0.75635234
Loop time of 357.064 on 1 procs for 3000 steps with 500 atoms
Pair time (%) = 0.0262652 (0.00735589)
Neigh time (%) = 217.193 (60.8273)
Comm time (%) = 0.0101499 (0.00284259)
Outpt time (%) = 0.248974 (0.0697282)
Other time (%) = 139.586 (39.0928)
Nlocal: 500 ave 500 max 500 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 0 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 3 ave 3 max 3 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 3
Ave neighs/atom = 0.006
Neighbor list builds = 1185
Dangerous builds = 929
Please can anybody figure out the problem?
thanks...
Ram
ckloss | Sun, 05/16/2010 - 12:39
Re: MPI crash
Hi Ram,
Without more info it's impossible to say what is going wrong, but it might be related to a known bug.
If you send me the input script by mail, I will have a look.
Christoph
raguelmoon | Mon, 05/17/2010 - 09:17
input file and mpi crash errors
Here are the crash errors for multiple procs:
[ram@localhost stress]$ mpirun -np 4 lmp_fedora20 at step 4122, growing array...done!
[localhost.localdomain:6446] *** An error occurred in MPI_Wait
[localhost.localdomain:6446] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:6446] *** MPI_ERR_TRUNCATE: message truncated
[localhost.localdomain:6446] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 6446 on
node localhost.localdomain exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
With this script (which runs on a single proc without errors):
(emailed to you)
Ram
ckloss | Wed, 05/19/2010 - 18:23
Re: MPI crash
Hi Ram,
Try version 1.0.2 (already released). Let's see if that helps.
Christoph
raguelmoon | Wed, 05/19/2010 - 18:50
Hi Christoph, There is fault
Hi Christoph,
There is a fault in the new version. When I build LIGGGHTS, the build runs for a while, then gets stuck in the middle and loops indefinitely:
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move.cpp > fix_move.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_meshGran.cpp > fix_meshGran.d
make[1]: Leaving directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Entering directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Warning: File `mech_param_gran.cpp' has modification time 4e+03 s in the future
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M mech_param_gran.cpp > mech_param_gran.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_tri_neighlist.cpp > fix_tri_neighlist.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move_tri.cpp > fix_move_tri.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move.cpp > fix_move.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_meshGran.cpp > fix_meshGran.d
make[1]: Leaving directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Entering directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Warning: File `mech_param_gran.cpp' has modification time 4e+03 s in the future
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M mech_param_gran.cpp > mech_param_gran.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_tri_neighlist.cpp > fix_tri_neighlist.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move_tri.cpp > fix_move_tri.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move.cpp > fix_move.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_meshGran.cpp > fix_meshGran.d
make[1]: Leaving directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Entering directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Warning: File `mech_param_gran.cpp' has modification time 4e+03 s in the future
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M mech_param_gran.cpp > mech_param_gran.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_tri_neighlist.cpp > fix_tri_neighlist.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move_tri.cpp > fix_move_tri.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move.cpp > fix_move.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_meshGran.cpp > fix_meshGran.d
make[1]: Leaving directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Entering directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Warning: File `mech_param_gran.cpp' has modification time 4e+03 s in the future
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M mech_param_gran.cpp > mech_param_gran.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_tri_neighlist.cpp > fix_tri_neighlist.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move_tri.cpp > fix_move_tri.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move.cpp > fix_move.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_meshGran.cpp > fix_meshGran.d
make[1]: Leaving directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Entering directory `/home/ram/Desktop/liggghts/src/Obj_fedora'
make[1]: Warning: File `mech_param_gran.cpp' has modification time 4e+03 s in the future
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M mech_param_gran.cpp > mech_param_gran.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_tri_neighlist.cpp > fix_tri_neighlist.d
mpic++ -g -O -DLAMMPS_GZIP -DMPICH_IGNORE_CXX_SEEK -DFFT_NONE -M fix_move_tri.cpp > fix_move_tri.d
Please can you suggest some solutions...
Ram
ckloss | Wed, 05/19/2010 - 22:41
Re: Version 1.0.2
Hi Ram,
This is not a bug but a clock skew, probably because the posting is fresh and we are in different time zones.
This issue should vanish after a few hours. If you don't want to wait, you can use the "touch" command to set the file modification date/time to your system clock.
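A minimal sketch of that "touch" approach, assuming a POSIX-style shell with find and mktemp available. The function names list_future_files and fix_future_mtimes are made up for illustration and are not part of LIGGGHTS or make:

```shell
# Hypothetical helpers (not part of LIGGGHTS): inspect and reset file
# modification times that lie in the future relative to the system clock.

# Print every file under directory $1 whose modification time is later
# than "now". A freshly created temp file serves as the "now" reference.
list_future_files() {
    ref=$(mktemp)
    find "$1" -type f -newer "$ref" -print
    rm -f "$ref"
}

# Set the modification time of every file under directory $1 to the
# current system time, which is what plain "touch" does by default.
fix_future_mtimes() {
    find "$1" -type f -exec touch {} +
}
```

For example, with the source directory from the make output above, one could run `list_future_files ~/Desktop/liggghts/src` to see the affected files, then `fix_future_mtimes ~/Desktop/liggghts/src`, and start make again.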
Christoph
raguelmoon | Thu, 05/20/2010 - 16:35
Hi, On single proc, new
Hi,
On a single proc, the new LIGGGHTS works fine as before, but on the cluster it slows down my computer and everything stops working. I have to restart my PC.
Ram
ckloss | Thu, 05/20/2010 - 16:50
Re: Hi, On single proc, new
Hi Ram,
I can advise you to check your simulation parameters (stability) and, with your low number of particles, to do a serial calculation.
Christoph