LIGGGHTS IN CLUSTER WITH SINGULARITY

Submitted by zumack on Sun, 11/25/2018 - 19:30

Hi everyone

I'm trying to run a model that includes the multicontact model by Brodu et al. (2015) on my university's cluster.
The cluster has LIGGGHTS 3.8.0 installed with Singularity.
The problem is that the model does not run on the cluster, although it works on my laptop. So I want to ask whether the multicontact model is not available for running on clusters, or whether something additional needs to be installed. It seems that MPI communication does not work well with this model on the cluster.

Any comments would help me a lot.

Best regards.

LFrie | Mon, 08/22/2022 - 13:02

Hey all,
I am also trying to run the code and am getting this error.
""
Import and parallelization of mesh piston_m containing 2 triangle(s) successful
*** An error occurred in MPI_Wait
*** reported by process [522584065,1]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
""
Kind regards

LFrie | Thu, 09/01/2022 - 19:58

Hey, thanks for replying.
It was done on my desktop PC, with
mpirun -np 8 /home../lmp_auto -in in.script

LFrie | Sat, 09/03/2022 - 15:05

Small update: if you try to run it with
mpirun -np 6 valgrind /home/../src/lmp_auto < in.hydrogel_multicontact
you get this info:

mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

LFrie | Sat, 09/03/2022 - 18:10

The problem is this one:
Invalid write of size 8
Address 0x12134a98 is 8 bytes after a block of size 77,120 alloc'd

Any ideas here?

....................

==25016== Invalid write of size 8
==25016== at 0xA42A65: LAMMPS_NS::FixContactPropertyAtom::pack_comm(int, int*, double*, int, int*) (fix_contact_property_atom.cpp:393)
==25016== by 0xA0FA38: LAMMPS_NS::Comm::forward_comm_variable_fix(LAMMPS_NS::Fix*) (comm.cpp:1452)
==25016== by 0xA428A8: LAMMPS_NS::FixContactPropertyAtom::do_forward_comm() (fix_contact_property_atom.cpp:372)
==25016== by 0x9C82B4: LAMMPS_NS::FixMultiContactHalfSpace::pre_force(int) (fix_multicontact_halfspace.cpp:406)
==25016== by 0x9C7526: LAMMPS_NS::FixMultiContactHalfSpace::setup_pre_force(int) (fix_multicontact_halfspace.cpp:222)
==25016== by 0x42A642: LAMMPS_NS::Modify::call_method_on_fixes(void (LAMMPS_NS::Fix::*)(int), int, int*&, int&) (modify.cpp:1526)
==25016== by 0x4249FE: LAMMPS_NS::Modify::setup_pre_force(int) (modify.cpp:377)
==25016== by 0x55DA3A: LAMMPS_NS::Verlet::setup() (verlet.cpp:174)
==25016== by 0x648C18: LAMMPS_NS::Run::command(int, char**, long) (run.cpp:212)
==25016== by 0x4D1F99: void LAMMPS_NS::Input::command_creator(LAMMPS_NS::LAMMPS*, int, char**) (input.cpp:662)
==25016== by 0x4CBC62: LAMMPS_NS::Input::execute_command() (input.cpp:645)
==25016== by 0x4C9EA5: LAMMPS_NS::Input::file() (input.cpp:256)
==25016== Address 0x12134a98 is 8 bytes after a block of size 77,120 alloc'd
==25016== at 0x4C2DE96: malloc (vg_replace_malloc.c:309)
==25016== by 0x51A022: LAMMPS_NS::Memory::smalloc(long, char const*) (memory.cpp:71)
==25016== by 0x41C672: double* LAMMPS_NS::Memory::create(double*&, int, char const*) (memory.h:79)
==25016== by 0xA11349: LAMMPS_NS::Comm::grow_send(int, int) (comm.cpp:1868)
==25016== by 0xA0EDBF: LAMMPS_NS::Comm::borders() (comm.cpp:1261)
==25016== by 0x55D91E: LAMMPS_NS::Verlet::setup() (verlet.cpp:161)
==25016== by 0x648C18: LAMMPS_NS::Run::command(int, char**, long) (run.cpp:212)
==25016== by 0x4D1F99: void LAMMPS_NS::Input::command_creator(LAMMPS_NS::LAMMPS*, int, char**) (input.cpp:662)
==25016== by 0x4CBC62: LAMMPS_NS::Input::execute_command() (input.cpp:645)
==25016== by 0x4C9EA5: LAMMPS_NS::Input::file() (input.cpp:256)
==25016== by 0x6E7820: main (main.cpp:100)
==25016==
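
A note on reading the report above: the block of 77,120 bytes was allocated by Comm::grow_send (called from Comm::borders), and the invalid write happened later in FixContactPropertyAtom::pack_comm. With 8-byte doubles, 77,120 bytes hold 9,640 values (indices 0 to 9,639), and a write starting 8 bytes after the block lands at index 9,641. This is consistent with, though not proof of, a send buffer that is too small for the variable-size contact data being packed. The sketch below is not LIGGGHTS code. Both function names and the data layout are hypothetical and only illustrate the arithmetic and the general "size the buffer before packing" pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LIGGGHTS): index of the double that starts
// `bytes_past_end` bytes after the end of a block of `block_bytes` bytes.
std::size_t offending_index(std::size_t block_bytes, std::size_t bytes_past_end) {
    return (block_bytes + bytes_past_end) / sizeof(double);
}

// Hypothetical stand-in for a variable-size pack routine. Each listed particle
// contributes one count value followed by its own values. The required size is
// computed first and the buffer is grown before anything is written.
std::size_t pack_variable(const std::vector<int>& list,
                          const std::vector<std::vector<double>>& per_atom,
                          std::vector<double>& buf) {
    std::size_t needed = 0;
    for (int i : list) needed += 1 + per_atom.at(i).size();

    if (buf.size() < needed) buf.resize(needed);  // grow before packing

    std::size_t m = 0;
    for (int i : list) {
        buf[m++] = static_cast<double>(per_atom[i].size());  // number of values
        for (double v : per_atom[i]) buf[m++] = v;           // the values
    }
    assert(m == needed);  // packed size matches the size computed up front
    return m;
}
```

On a platform with 8-byte doubles, offending_index(77120, 8) evaluates to 9641, which matches the address valgrind reports. Whether the real code needs the same kind of pre-sizing would have to be checked in comm.cpp and fix_contact_property_atom.cpp.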

LFrie | Tue, 09/13/2022 - 09:27

atom_style sphere
atom_modify map array
boundary p p p
newton off
hard_particles yes

communicate single vel yes
processors * 1 1

units si

read_data data/input.data

neighbor 0.0001 bin
neigh_modify delay 0

#Material properties required for new pair styles

## density, q=I/mr^2, k1, k2, kc, kt, kr, ko, .. mus, mud, mur, mu0, 8 gamma_n, gamma_t, gamma_r, gamma_o, gamma_b, gamma_br, phi_f, psi
# 2000 0.4 1e5 1e5 0 2e4 0e4 0e4 .. 0.5 0.5 0 0 .8. 1546.26490773707 309.252981547414 0 0 154.6 30.9 0.5 0

fix m1 all property/global youngsModulus peratomtype 2.58e7
fix m2 all property/global poissonsRatio peratomtype 0.3
#fix m33 all property/global kn peratomtypepair 2 0.4e8 0.4e8 0.4e8 0.4e8
fix m3 all property/global LoadingStiffness peratomtypepair 1 0.6e7
fix m4 all property/global UnloadingStiffness peratomtypepair 1 150
fix m5 all property/global coefficientPlasticityDepth peratomtypepair 1 1.0
#fix m44 all property/global kt peratomtypepair 2 100 100 100 100
fix m6 all property/global gamman peratomtypepair 1 100
fix m7 all property/global gammat peratomtypepair 1 100
fix m27 all property/global coefficientRestitution peratomtypepair 1 0.352
fix m8 all property/global coefficientFriction peratomtypepair 1 0.561
fix m9 all property/global coefficientAdhesionStiffness peratomtypepair 1 1
fix m10 all property/global pullOffForce peratomtypepair 1 0
fix m11 all property/global FluidViscosity peratomtypepair 1 0.5
fix m12 all property/global coeffFrictionStiffness peratomtypepair 1 0
fix m13 all property/global FrictionViscosity peratomtypepair 1 0.01
fix m144 all property/global coefficientRollingFriction peratomtypepair 1 0.3
# fix 1 all viscous 0.004 #scale 1 4.5
#New pair style
# AM: or no history?
#pair_style gran model hertz/stiffness tangential history
#pair_style gran model luding tangential tan_luding rolling_friction cdt surface multicontact#rolling_friction luding #rolling_friction luding
pair_style gran model hertz tangential history rolling_friction cdt surface multicontact#rolling_friction luding #rolling_friction luding
pair_coeff * *

#variable dt equal 0.000000001
variable dt equal 0.00000000000001
#0.0000000001
timestep ${dt}

fix gravi all gravity 9.81 vector 0.0 0.0 -1.0

fix 1 all continuum/weighted kernel_radius 0.01 compute stress

####compute overall stress tensor of the assembly and z component for the stress-strain curve
compute Temp all temp
compute 4 all pressure Temp virial
variable stressXX equal c_4[1]
variable stressYY equal c_4[2]
variable stressZZ equal c_4[3]
variable trace equal abs(c_4[1]+c_4[2]+c_4[3]+c_4[4]+c_4[5]+c_4[6])

#### Store final cell length for strain calculations
variable tmp equal "lz"
variable L0 equal ${tmp}
print "Initial Length, L0: ${L0}"

variable strain equal (lz-${L0})/${L0}
variable s equal v_strain

fix mc all multicontact/halfspace geometric_prefactor 1.0
#fix 1 all continuum/weighted kernel_radius 0.01 compute stress
#apply nve integration to all particles
fix integr all nve/sphere
fix ts_check all check/timestep/gran 1000 0.1 0.1

#output settings, include total thermal energy
compute 1 all erotate/sphere
thermo_style custom step atoms ke vol cpuremain cpu
thermo 1000
compute_modify thermo_temp dynamic yes

#insert the first particles so that dump is not empty

#run 600000
dump dmp all custom/vtk 10000 post/hydrogel_default*.vtk id type type x y z ix iy iz vx vy vz fx fy fz omegax omegay omegaz radius
fix 5 all print 500 "$s ${stressZZ} " file Stress_Strain_default.txt screen no title " Strain [-], Stress [MPa]" #${stressXX} ${stressYY} ${press} ${trace}
#fix press all deform 1 z delta 0 -1.0e-3 # negative strain application along z axis
fix move all deform 1 z erate -24000000 units box remap x
run 2370000
unfix move
fix move all deform 1 z erate 24000000 units box remap x
run 2370000

// Data file being read //

LIGGGHTS data file

8 atoms

1 atom types

-0.000105 0.000305 xlo xhi
-0.000105 0.000305 ylo yhi
-0.000105 0.0003 zlo zhi

Atoms

1 1 0.0002 1541.1 0.0 0.0 0.0
2 1 0.0002 1541.1 0.0 0.0 0.0002
3 1 0.0002 1541.1 0.0 0.0002 0.0
4 1 0.0002 1541.1 0.0 0.0002 0.0002
5 1 0.0002 1541.1 0.0002 0.0002 0.0002
6 1 0.0002 1541.1 0.0002 0.0002 0.0
7 1 0.0002 1541.1 0.0002 0.0 0.0002
8 1 0.0002 1541.1 0.0002 0.0 0.0

mschramm | Tue, 09/13/2022 - 17:12

Hello,
I will take a look at the input script some time this week but wanted to know why you are calling valgrind?
This tests for memory leaks and will give false positives when using outside libraries.
It will also drastically slow down the execution of your script.

LFrie | Thu, 09/15/2022 - 11:57

In particular, the problem is related to do_forward_comm in fix_multicontact_halfspace.cpp (lines 405-406 in my code):
...
for (contact_property_atom = contact_property_atom_vector.begin(); contact_property_atom < contact_property_atom_vector.end(); contact_property_atom++)
(*contact_property_atom)->do_forward_comm();
...

Also, there is this post related to this error:
https://www.cfdem.com/forums/multicontact-model-issues

LFrie | Fri, 09/16/2022 - 10:23

It only works if you add store_force_contact_stress yes to this command:

fix piston all wall/gran model hertz tangential history surface multicontact store_force_contact_stress yes mesh n_meshes 1 meshes piston_m

I still do not know why, and with periodic boundaries it is still not working.