liggghts simulation stops without error message

Submitted by kodai on Tue, 09/21/2021 - 19:03

Hello,
I am trying to simulate particles inside a granulator. I launched an experimental design containing 3125 simulations, and approximately half of them stop before reaching the final time step without showing any error message.
I have attached the LIGGGHTS script and the simulation log.
This happens to me very often and I am stuck on this problem, so if anyone can help, please do.

This was the error message from SLURM:
Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7fc56bf2159c)
==== backtrace (tid: 841536) ====
0 0x00000000000212fe ucs_debug_print_backtrace() debug/debug.c:653
1 0x0000000000012b20 .annobin_sigaction.c() sigaction.c:0
2 0x0000000000a04f0a LAMMPS_NS::Neighbor::bin_atoms() ???:0
3 0x0000000000956541 LAMMPS_NS::Neighbor::granular_bin_no_newton() ???:0
4 0x00000000009fddbb LAMMPS_NS::Neighbor::build() ???:0
5 0x000000000059e780 LAMMPS_NS::Verlet::run() ???:0
6 0x00000000009173bf LAMMPS_NS::Run::command() ???:0
7 0x00000000004a1f43 LAMMPS_NS::Input::command_creator() ???:0
8 0x000000000049fda2 LAMMPS_NS::Input::execute_command() ???:0
9 0x00000000004a04e5 LAMMPS_NS::Input::file() ???:0
10 0x0000000000414f4c main() ???:0
11 0x00000000000237b3 __libc_start_main() ???:0
12 0x000000000041532e _start() ???:0
=================================

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.

I have attached the STL input files.
Note: change the extension to tar.xz to open the STL files.

Attachments (size):
in.1732.txt (14.49 KB)
in.2435.txt (14.5 KB)
in.2481.txt (14.5 KB)
stl_files.tar_.gz (14.04 KB)

mschramm | Tue, 09/28/2021 - 15:34

Hello,
To make the script run faster, I changed your particle dimensions by a factor of 10 and adjusted the particle insertion rate accordingly; 6000 particles were inserted.
With this, I found no issue in the script and it ran to completion.

I did notice that, before the particle sizes were changed, your script consumed 4 GB of memory per core, so with the original 32 cores specified in the script that would be 128 GB of memory. This is high, and I would suggest manually setting the neighbor bin sizes via the neigh_modify command.
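As a rough sketch of what that could look like in the input script (the skin and binsize values below are placeholders, not tuned numbers; they should be scaled to your smallest particle diameter and cutoff):

```
# Hypothetical values -- scale these to your smallest particle diameter.
neighbor        0.001 bin                 # skin distance, bin-style neighbor lists
neigh_modify    delay 0 binsize 0.001     # override the default bin edge length
                                          # (normally derived from cutoff + skin);
                                          # tuning this controls the number of
                                          # neighbor bins and their memory footprint
```

Checking the "Memory usage per processor" line that LIGGGHTS prints at the start of a run before and after such a change is a quick way to see whether the bin settings are helping.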