cfdemSolverIB Segfault in buildLabelHashSet for high resolution

Submitted by shepherd on Tue, 04/23/2024 - 00:31

Hello,

I keep getting a segmentation fault once the resolution approaches 40 cells per diameter. Below 30 cells per diameter everything runs fine. The problem occurs both on my laptop and on the cluster at my university. I'm running this with cfdem-pfm, though I recall having what initially appeared to be the same problem with cfdem-public. I've narrowed the issue down to the buildLabelHashSet function in IBVoidFraction.C. To determine this, I put Info statements immediately before and after the call:


Info << "Approaching buildLabelHashSet, line 174 IBVoidFraction.C" << endl;
buildLabelHashSet(index, minPeriodicParticlePos, particleCenterCellID, hashSett, true);
Info << "Passed buildLabelHashSet, line 174 IBVoidFraction.C" << endl;

and when I run the case with a fine resolution, I get this error message:

Total # of neighbors = 0
Ave neighs/atom = 0
Neighbor list builds = 0
Dangerous builds = 0
LIGGGHTS finished
nr particles = 1
- findCell()
findCell done.
- setvoidFraction()
Approaching buildLabelHashSet, line 174 IBVoidFraction.C
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 3 with PID 0 on node (computer info) exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

I'm not quite sure what to do about it. Any assistance is appreciated.

Thank you.

shepherd | Thu, 05/30/2024 - 17:21

I have resolved the issue. It is caused by the recursive function overflowing the stack. The fix that appears to be working at the moment, following Valgrind's instructions, is to change the command on line 631 of etc/functions.sh to

mpirun -np $nrProcs $debugMode --main-stacksize=xx $solverName -parallel 2>&1 | tee -a $logpath/$logfileName

where xx is the stack size in bytes (e.g. xx=33554432 for 32 MB). To convert, multiply the size in MB by 1024 × 1024.

shepherd | Thu, 05/30/2024 - 17:46

Correction: the solution I posted only works in debug mode, because --main-stacksize is a Valgrind option. I am still seeking a solution that works outside of debug mode.
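
Since the overflow comes from recursion depth rather than from the flag itself, one direction that does not depend on the stack limit at all is to replace the recursion with an explicit work list stored on the heap. The following is only a minimal sketch of that idea on a hypothetical structured grid. Grid, insideSphere and collectCells are illustrative names, not CFDEMcoupling or OpenFOAM code, and the real buildLabelHashSet works on the mesh's own connectivity and handles periodic particle positions, none of which is reproduced here.

```cpp
#include <array>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the mesh: an n x n x n block of unit cells.
struct Grid
{
    int n;

    int label(int i, int j, int k) const { return (k * n + j) * n + i; }

    std::array<int, 3> ijk(int cell) const
    {
        assert(cell >= 0 && cell < n * n * n);
        return {cell % n, (cell / n) % n, cell / (n * n)};
    }

    // Face neighbours of a cell (up to six).
    std::vector<int> neighbours(int cell) const
    {
        const auto c = ijk(cell);
        std::vector<int> out;
        for (int axis = 0; axis < 3; ++axis)
        {
            for (int step : {-1, 1})
            {
                auto nb = c;
                nb[axis] += step;
                if (nb[axis] >= 0 && nb[axis] < n)
                {
                    out.push_back(label(nb[0], nb[1], nb[2]));
                }
            }
        }
        return out;
    }
};

// True when the centre of `cell` lies within `radius` of the grid centre.
// Assumption: the particle sits at the centre of the block.
bool insideSphere(const Grid& g, int cell, double radius)
{
    const auto c = g.ijk(cell);
    const double mid = 0.5 * g.n;
    double d2 = 0.0;
    for (int a = 0; a < 3; ++a)
    {
        const double d = (c[a] + 0.5) - mid;
        d2 += d * d;
    }
    return d2 <= radius * radius;
}

// Collects every cell inside the sphere that is reachable from startCell.
// The pending list lives on the heap, so the call stack stays at a fixed
// depth no matter how many cells the particle covers.
std::unordered_set<int> collectCells(const Grid& g, int startCell, double radius)
{
    std::unordered_set<int> found;
    std::vector<int> pending;

    if (insideSphere(g, startCell, radius))
    {
        found.insert(startCell);
        pending.push_back(startCell);
    }

    while (!pending.empty())
    {
        const int cell = pending.back();
        pending.pop_back();

        for (int nb : g.neighbours(cell))
        {
            // insert().second is true only the first time a label is seen.
            if (insideSphere(g, nb, radius) && found.insert(nb).second)
            {
                pending.push_back(nb);
            }
        }
    }
    return found;
}
```

For a case resembling 40 cells per diameter, this could be called as collectCells(Grid{50}, Grid{50}.label(25, 25, 25), 20.0). Memory use grows with the number of cells covered, but it is taken from the heap instead of the per-process stack, so it should behave the same with and without debug mode.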