Hi everyone
I'm trying to run a model that includes the multicontact model by Brodu et al. (2015) on my university's cluster.
LIGGGHTS 3.8.0 is installed on the cluster with Singularity.
The problem is that the model does not run on the cluster, although it works on my laptop. So I would like to ask whether the multicontact model cannot be run on clusters, or whether something additional needs to be installed. It seems that the MPI communication does not work well with this model on the cluster.
Any comments would help me a lot.
Best regards.
mschramm | Tue, 11/27/2018 - 02:33
Error
What is the error message that you get?
LFrie | Mon, 08/22/2022 - 13:02
Hey all,
I am also trying to run the code and I get this error:
""
Import and parallelization of mesh piston_m containing 2 triangle(s) successful
*** An error occurred in MPI_Wait
*** reported by process [522584065,1]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
""
Kind regards
mschramm | Thu, 09/01/2022 - 17:30
Cluster Support
Hello,
Have you asked your local support to see if the program was started correctly?
How did you start LIGGGHTS on the cluster?
LFrie | Thu, 09/01/2022 - 19:58
Hey thanks for replying.
It was run on my desktop PC, with
mpirun -np 8 /home../lmp_auto -in in.script
LFrie | Sat, 09/03/2022 - 15:05
valgrind
Small update: if you try to run it with
mpirun -np 6 valgrind /home/../src/lmp_auto < in.hydrogel_multicontact
you get this info:
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
LFrie | Sat, 09/03/2022 - 18:10
valgrind(2)
The problem is this one:
Invalid write of size 8
Address 0x12134a98 is 8 bytes after a block of size 77,120 alloc'd
Any ideas here?
....................
==25016== Invalid write of size 8
==25016== at 0xA42A65: LAMMPS_NS::FixContactPropertyAtom::pack_comm(int, int*, double*, int, int*) (fix_contact_property_atom.cpp:393)
==25016== by 0xA0FA38: LAMMPS_NS::Comm::forward_comm_variable_fix(LAMMPS_NS::Fix*) (comm.cpp:1452)
==25016== by 0xA428A8: LAMMPS_NS::FixContactPropertyAtom::do_forward_comm() (fix_contact_property_atom.cpp:372)
==25016== by 0x9C82B4: LAMMPS_NS::FixMultiContactHalfSpace::pre_force(int) (fix_multicontact_halfspace.cpp:406)
==25016== by 0x9C7526: LAMMPS_NS::FixMultiContactHalfSpace::setup_pre_force(int) (fix_multicontact_halfspace.cpp:222)
==25016== by 0x42A642: LAMMPS_NS::Modify::call_method_on_fixes(void (LAMMPS_NS::Fix::*)(int), int, int*&, int&) (modify.cpp:1526)
==25016== by 0x4249FE: LAMMPS_NS::Modify::setup_pre_force(int) (modify.cpp:377)
==25016== by 0x55DA3A: LAMMPS_NS::Verlet::setup() (verlet.cpp:174)
==25016== by 0x648C18: LAMMPS_NS::Run::command(int, char**, long) (run.cpp:212)
==25016== by 0x4D1F99: void LAMMPS_NS::Input::command_creator(LAMMPS_NS::LAMMPS*, int, char**) (input.cpp:662)
==25016== by 0x4CBC62: LAMMPS_NS::Input::execute_command() (input.cpp:645)
==25016== by 0x4C9EA5: LAMMPS_NS::Input::file() (input.cpp:256)
==25016== Address 0x12134a98 is 8 bytes after a block of size 77,120 alloc'd
==25016== at 0x4C2DE96: malloc (vg_replace_malloc.c:309)
==25016== by 0x51A022: LAMMPS_NS::Memory::smalloc(long, char const*) (memory.cpp:71)
==25016== by 0x41C672: double* LAMMPS_NS::Memory::create(double*&, int, char const*) (memory.h:79)
==25016== by 0xA11349: LAMMPS_NS::Comm::grow_send(int, int) (comm.cpp:1868)
==25016== by 0xA0EDBF: LAMMPS_NS::Comm::borders() (comm.cpp:1261)
==25016== by 0x55D91E: LAMMPS_NS::Verlet::setup() (verlet.cpp:161)
==25016== by 0x648C18: LAMMPS_NS::Run::command(int, char**, long) (run.cpp:212)
==25016== by 0x4D1F99: void LAMMPS_NS::Input::command_creator(LAMMPS_NS::LAMMPS*, int, char**) (input.cpp:662)
==25016== by 0x4CBC62: LAMMPS_NS::Input::execute_command() (input.cpp:645)
==25016== by 0x4C9EA5: LAMMPS_NS::Input::file() (input.cpp:256)
==25016== by 0x6E7820: main (main.cpp:100)
==25016==
mschramm | Sun, 09/11/2022 - 20:00
Do you have a min example
Hello,
Could you provide a min example that ends with the same error?
LFrie | Tue, 09/13/2022 - 09:27
min example
atom_style sphere
atom_modify map array
boundary p p p
newton off
hard_particles yes
communicate single vel yes
processors * 1 1
units si
read_data data/input.data
neighbor 0.0001 bin
neigh_modify delay 0
#Material properties required for new pair styles
## density, q=I/mr^2, k1, k2, kc, kt, kr, ko, .. mus, mud, mur, mu0, 8 gamma_n, gamma_t, gamma_r, gamma_o, gamma_b, gamma_br, phi_f, psi
# 2000 0.4 1e5 1e5 0 2e4 0e4 0e4 .. 0.5 0.5 0 0 .8. 1546.26490773707 309.252981547414 0 0 154.6 30.9 0.5 0
fix m1 all property/global youngsModulus peratomtype 2.58e7
fix m2 all property/global poissonsRatio peratomtype 0.3
#fix m33 all property/global kn peratomtypepair 2 0.4e8 0.4e8 0.4e8 0.4e8
fix m3 all property/global LoadingStiffness peratomtypepair 1 0.6e7
fix m4 all property/global UnloadingStiffness peratomtypepair 1 150
fix m5 all property/global coefficientPlasticityDepth peratomtypepair 1 1.0
#fix m44 all property/global kt peratomtypepair 2 100 100 100 100
fix m6 all property/global gamman peratomtypepair 1 100
fix m7 all property/global gammat peratomtypepair 1 100
fix m27 all property/global coefficientRestitution peratomtypepair 1 0.352
fix m8 all property/global coefficientFriction peratomtypepair 1 0.561
fix m9 all property/global coefficientAdhesionStiffness peratomtypepair 1 1
fix m10 all property/global pullOffForce peratomtypepair 1 0
fix m11 all property/global FluidViscosity peratomtypepair 1 0.5
fix m12 all property/global coeffFrictionStiffness peratomtypepair 1 0
fix m13 all property/global FrictionViscosity peratomtypepair 1 0.01
fix m144 all property/global coefficientRollingFriction peratomtypepair 1 0.3
# fix 1 all viscous 0.004 #scale 1 4.5
#New pair style
# AM: or no history?
#pair_style gran model hertz/stiffness tangential history
#pair_style gran model luding tangential tan_luding rolling_friction cdt surface multicontact#rolling_friction luding #rolling_friction luding
pair_style gran model hertz tangential history rolling_friction cdt surface multicontact#rolling_friction luding #rolling_friction luding
pair_coeff * *
#variable dt equal 0.000000001
variable dt equal 0.00000000000001
#0.0000000001
timestep ${dt}
fix gravi all gravity 9.81 vector 0.0 0.0 -1.0
fix 1 all continuum/weighted kernel_radius 0.01 compute stress
####compute overall stress tensor of the assembly and z component for the stress-strain curve
compute Temp all temp
compute 4 all pressure Temp virial
variable stressXX equal c_4[1]
variable stressYY equal c_4[2]
variable stressZZ equal c_4[3]
variable trace equal abs(c_4[1]+c_4[2]+c_4[3]+c_4[4]+c_4[5]+c_4[6])
#### Store final cell length for strain calculations
variable tmp equal "lz"
variable L0 equal ${tmp}
print "Initial Length, L0: ${L0}"
variable strain equal (lz-${L0})/${L0}
variable s equal v_strain
fix mc all multicontact/halfspace geometric_prefactor 1.0
#fix 1 all continuum/weighted kernel_radius 0.01 compute stress
#apply nve integration to all particles
fix integr all nve/sphere
fix ts_check all check/timestep/gran 1000 0.1 0.1
#output settings, include total thermal energy
compute 1 all erotate/sphere
thermo_style custom step atoms ke vol cpuremain cpu
thermo 1000
compute_modify thermo_temp dynamic yes
#insert the first particles so that dump is not empty
#run 600000
dump dmp all custom/vtk 10000 post/hydrogel_default*.vtk id type type x y z ix iy iz vx vy vz fx fy fz omegax omegay omegaz radius
fix 5 all print 500 "$s ${stressZZ} " file Stress_Strain_default.txt screen no title " Strain [-], Stress [MPa]" #${stressXX} ${stressYY} ${press} ${trace}
#fix press all deform 1 z delta 0 -1.0e-3 # negative strain application along z axis
fix move all deform 1 z erate -24000000 units box remap x
run 2370000
unfix move
fix move all deform 1 z erate 24000000 units box remap x
run 2370000
// Data file being read (data/input.data) //
LIGGGHTS data file
8 atoms
1 atom types
-0.000105 0.000305 xlo xhi
-0.000105 0.000305 ylo yhi
-0.000105 0.0003 zlo zhi
Atoms
1 1 0.0002 1541.1 0.0 0.0 0.0
2 1 0.0002 1541.1 0.0 0.0 0.0002
3 1 0.0002 1541.1 0.0 0.0002 0.0
4 1 0.0002 1541.1 0.0 0.0002 0.0002
5 1 0.0002 1541.1 0.0002 0.0002 0.0002
6 1 0.0002 1541.1 0.0002 0.0002 0.0
7 1 0.0002 1541.1 0.0002 0.0 0.0002
8 1 0.0002 1541.1 0.0002 0.0 0.0
LFrie | Tue, 09/13/2022 - 09:31
min example II
A different error appears when you run the original example from
https://github.com/CFDEMproject/LIGGGHTS-PUBLIC/tree/master/examples/LIG...
and execute
mpirun -np 6 valgrind /home/../src/lmp_auto < in.hydrogel_multicontact
mschramm | Tue, 09/13/2022 - 17:12
why valgrind?
Hello,
I will take a look at the input script some time this week but wanted to know why you are calling valgrind?
It tests for memory leaks and will give false positives when using outside libraries.
It will also drastically slow down the execution of your script.
LFrie | Wed, 09/14/2022 - 18:45
since it is not running
Hello,
I used valgrind because I cannot run the script (uploaded here), either in serial or in parallel.
The example from here
https://github.com/CFDEMproject/LIGGGHTS-PUBLIC/tree/master/examples/LIG...
in.hydrogel_multicontact
runs only in serial, not in parallel.
Thank you for replying.
mschramm | Thu, 09/15/2022 - 01:55
Missing fix code in surface_multicontact
Hello,
I was able to get the seg fault.
I haven't fully tracked it down, but it is due to the surface multicontact model (I think it is in "fix_multicontact_halfspace").
Do you require the multicontact surface model over the default model?
LFrie | Thu, 09/15/2022 - 11:57
yes the problem is in fix_multicontact_halfspace
In particular, the problem is related to do_forward_comm in fix_multicontact_halfspace.cpp (lines 405-406 in my code):
...
for (contact_property_atom = contact_property_atom_vector.begin(); contact_property_atom < contact_property_atom_vector.end(); contact_property_atom++)
(*contact_property_atom)->do_forward_comm();
...
There is also this post related to this error:
https://www.cfdem.com/forums/multicontact-model-issues
mschramm | Thu, 09/15/2022 - 19:10
Disabled on LIGGGHTS-PFM
Hello,
I tried the hydrogel example using LIGGGHTS-PFM, and it does not support the multicontact surface model at all. This probably means that it is not a quick fix...
LFrie | Fri, 09/16/2022 - 10:23
hydrogel multi example is running in parallel
Only if you add store_force_contact_stress yes to this command:
fix piston all wall/gran model hertz tangential history surface multicontact store_force_contact_stress yes mesh n_meshes 1 meshes piston_m
I still do not know why, and with periodic boundaries it still does not work.