Error while running LIGGGHTS run script on cluster computer

Submitted by Roesch on Wed, 03/11/2020 - 10:31

Hello!

My CFDEM simulation runs on my local computer without any issues. But when I run it on our cluster, I get the following error while the LIGGGHTS run script is executing:
# Physical boundaries #
#####################################################
fix reactor all mesh/surface file ../DEM/STL/walls.stl heal auto_remove_duplicates precision 1e-7 type 1 curvature 1e-6
[0] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[0] #1 Foam::sigFpe::sigHandler(int) at ??:?
[0] #2 ? in "/usr/lib64/libc.so.6"
[0] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[0] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[0] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[0] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[0] #7 LAMMPS_NS::Input::execute_command() at ??:?
[0] #8 LAMMPS_NS::Input::file() at ??:?
[0] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[0] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[0] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[1] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[1] #1 Foam::sigFpe::sigHandler(int) at ??:?
[1] #2 ? in "/usr/lib64/libc.so.6"
[1] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[1] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[1] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[1] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[1] #7 LAMMPS_NS::Input::execute_command() at ??:?
[1] #8 LAMMPS_NS::Input::file() at ??:?
[1] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[1] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[1] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[1] #12 Foam::dataExchangeModel::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[1] #13 Foam::cfdemCloud::cfdemCloud(Foam::fvMesh const&) at ??:?
[1] #14 main at ??:?
[1] #15 __libc_start_main in "/usr/lib64/libc.so.6"
[1] #16 ? at ??:?
[2] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[2] #1 Foam::sigFpe::sigHandler(int) at ??:?
[2] #2 ? in "/usr/lib64/libc.so.6"
[2] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[2] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[2] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[2] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[2] #7 LAMMPS_NS::Input::execute_command() at ??:?
[2] #8 LAMMPS_NS::Input::file() at ??:?
[2] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[2] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[2] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[3] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[3] #1 Foam::sigFpe::sigHandler(int) at ??:?
[3] #2 ? in "/usr/lib64/libc.so.6"
[3] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[3] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[3] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[3] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[3] #7 LAMMPS_NS::Input::execute_command() at ??:?
[3] #8 LAMMPS_NS::Input::file() at ??:?
[3] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[3] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[3] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[3] #12 Foam::dataExchangeModel::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[3] #13 Foam::cfdemCloud::cfdemCloud(Foam::fvMesh const&) at ??:?
[3] #14 main at ??:?
[3] #15 __libc_start_main in "/usr/lib64/libc.so.6"
[3] #16 ?
[4] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[4] #1 Foam::sigFpe::sigHandler(int) at ??:?
[4] #2 ? in "/usr/lib64/libc.so.6"
[4] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[4] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[4] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[4] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[4] #7 LAMMPS_NS::Input::execute_command() at ??:?
[4] #8 LAMMPS_NS::Input::file() at ??:?
[4] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[4] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[4] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[5] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[5] #1 Foam::sigFpe::sigHandler(int) at ??:?
[5] #2 ? in "/usr/lib64/libc.so.6"
[5] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[5] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[5] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[5] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[5] #7 LAMMPS_NS::Input::execute_command() at ??:?
[5] #8 LAMMPS_NS::Input::file() at ??:?
[5] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[5] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[5] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&)
[6] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[6] #1 Foam::sigFpe::sigHandler(int) at ??:?
[6] #2 ? in "/usr/lib64/libc.so.6"
[6] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[6] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[6] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[6] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[6] #7 LAMMPS_NS::Input::execute_command() at ??:?
[6] #8 LAMMPS_NS::Input::file() at ??:?
[6] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[6] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[6] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[7] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[7] #1 Foam::sigFpe::sigHandler(int) at ??:?
[7] #2 ? in "/usr/lib64/libc.so.6"
[7] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[7] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[7] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[7] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[7] #7 LAMMPS_NS::Input::execute_command() at ??:?
[7] #8 LAMMPS_NS::Input::file() at ??:?
[7] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[7] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[7] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[8] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[8] #1 Foam::sigFpe::sigHandler(int) at ??:?
[8] #2 ? in "/usr/lib64/libc.so.6"
[8] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[8] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[8] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[8] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[8] #7 LAMMPS_NS::Input::execute_command() at ??:?
[8] #8 LAMMPS_NS::Input::file() at ??:?
[8] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[8] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[8] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[9] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[9] #1 Foam::sigFpe::sigHandler(int) at ??:?
[9] #2 ? in "/usr/lib64/libc.so.6"
[9] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[9] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[9] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[9] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[9] #7 LAMMPS_NS::Input::execute_command() at ??:?
[9] #8 LAMMPS_NS::Input::file() at ??:?
[9] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[9] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[9] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[10] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[10] #1 Foam::sigFpe::sigHandler(int) at ??:?
[10] #2 ? in "/usr/lib64/libc.so.6"
[10] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[10] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[10] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[10] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[10] #7 LAMMPS_NS::Input::execute_command() at ??:?
[10] #8 LAMMPS_NS::Input::file() at ??:?
[10] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[10] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[10] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[11] #0 Foam::error::printStack(Foam::Ostream&) at ??:?
[11] #1 Foam::sigFpe::sigHandler(int) at ??:?
[11] #2 ? in "/usr/lib64/libc.so.6"
[11] #3 LAMMPS_NS::SurfaceMesh<3, 5>::addElement(double**, int) at ??:?
[11] #4 LAMMPS_NS::TrackingMesh<3>::popElemFromBuffer(double*, int, bool, bool, bool) at ??:?
[11] #5 LAMMPS_NS::MultiNodeMeshParallel<3>::restart(double*) at ??:?
[11] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*) at ??:?
[11] #7 LAMMPS_NS::Input::execute_command() at ??:?
[11] #8 LAMMPS_NS::Input::file() at ??:?
[11] #9 LAMMPS_NS::Input::file(char const*) at ??:?
[11] #10 Foam::twoWayMPI::twoWayMPI(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[11] #11 Foam::dataExchangeModel::adddictionaryConstructorToTable::New(Foam::dictionary const&, Foam::cfdemCloud&) at ??:?
[node227:94474] *** Process received signal ***
[node227:94474] Signal: Floating point exception (8)
[node227:94474] Signal code: (-6)
[node227:94474] Failing at address: 0x27860001710a
[node227:94474] [ 0] /usr/lib64/libc.so.6(+0x35250)[0x2b47b2976250]
[node227:94474] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b47b29761d7]
[node227:94474] [ 2] /usr/lib64/libc.so.6(+0x35250)[0x2b47b2976250]
[node227:94474] [ 3] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS11SurfaceMeshILi3ELi5EE10addElementEPPdi+0x3ea)[0x2b47b7c0eada]
[node227:94474] [ 4] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS12TrackingMeshILi3EE17popElemFromBufferEPdibbb+0xfac)[0x2b47b7c0e14c]
[node227:94474] [ 5] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS21MultiNodeMeshParallelILi3EE7restartEPd+0x116)[0x2b47b7bffd46]
[node227:94474] [ 6] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS6Modify7add_fixEiPPcS1_+0x973)[0x2b47b7ae0f23]
[node227:94474] [ 7] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS5Input15execute_commandEv+0x1a07)[0x2b47b7aa2447]
[node227:94474] [ 8] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS5Input4fileEv+0x169)[0x2b47b7aa03a9]
[node227:94474] [ 9] /home/y0072912/LIGGGHTS/LIGGGHTS-iPAT/src-build/libliggghts.so(_ZN9LAMMPS_NS5Input4fileEPKc+0x6a)[0x2b47b7aa01ca]
[node227:94474] [10] /home/y0072912/CFDEM/CFDEMcoupling-iPAT/platforms/linux64GccDPInt32Opt/lib/liblagrangianCFDEM-19.02-4.x.so(_ZN4Foam9twoWayMPIC1ERKNS_10dictionaryERNS_10cfdemCloudE+0x2a7)[0x2b47b100cdf7]
[node227:94474] [11] /home/y0072912/CFDEM/CFDEMcoupling-iPAT/platforms/linux64GccDPInt32Opt/lib/liblagrangianCFDEM-19.02-4.x.so(_ZN4Foam17dataExchangeModel31adddictionaryConstructorToTableINS_9twoWayMPIEE3NewERKNS_10dictionaryERNS_10cfdemCloudE+0x2e)[0x2b47b100d67e]
[node227:94474] [12] /home/y0072912/CFDEM/CFDEMcoupling-iPAT/platforms/linux64GccDPInt32Opt/lib/liblagrangianCFDEM-19.02-4.x.so(_ZN4Foam17dataExchangeModel3NewERKNS_10dictionaryERNS_10cfdemCloudE+0x22d)[0x2b47b100198d]
[node227:94474] [13] /home/y0072912/CFDEM/CFDEMcoupling-iPAT/platforms/linux64GccDPInt32Opt/lib/liblagrangianCFDEM-19.02-4.x.so(_ZN4Foam10cfdemCloudC2ERKNS_6fvMeshE+0xd3d)[0x2b47b0e7524d]
[node227:94474] [14] cfdemSolverPisoIB(main+0xd51)[0x4246a1]
[node227:94474] [15] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b47b2962b35]
[node227:94474] [16] cfdemSolverPisoIB[0x42a0bb]
[node227:94474] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 94474 on node node227 exited on signal 8 (Floating point exception).
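
When several MPI ranks write their stack traces to one stream, the raw output can interleave, even in the middle of a line. The following is a minimal sketch, not part of LIGGGHTS or CFDEM, for checking which frame numbers each rank reported in such a raw log. It assumes the OpenFOAM-style tags `[rank] #frame` seen in this trace; the function name `frames_per_rank` and the sample string are made up for illustration.

```python
import re
from collections import defaultdict

# Matches rank/frame tags such as "[11] #4". Open MPI lines like
# "[node227:94474] [ 8]" do not match because they carry no "#".
TAG = re.compile(r"\[(\d+)\] #(\d+)")


def frames_per_rank(log_text):
    """Return {rank: sorted frame numbers} for every tag found in log_text."""
    seen = defaultdict(set)
    for rank, frame in TAG.findall(log_text):
        seen[int(rank)].add(int(frame))
    return {rank: sorted(frames) for rank, frames in sorted(seen.items())}


# Illustrative fragment in the style of raw, interleaved output.
sample = (
    "[8] #7 LAMMPS_NS::Input::execute_command()"
    "[2] #7 LAMMPS_NS::Input::execute_command()"
    "[0] #7 LAMMPS_NS::Input::execute_command() at ??:?\n"
    "[11] #[5] #6 LAMMPS_NS::Modify::add_fix(int, char**, char*)\n"
)
print(frames_per_rank(sample))  # {0: [7], 2: [7], 5: [6], 8: [7]}
```

A tag that was itself torn apart (such as `[11] #` in the sample) is skipped, so the result is a lower bound on what each rank printed. If the cluster uses Open MPI, `mpirun --tag-output` may also help, since it prefixes each output line with the rank that wrote it.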

I compiled the same version of CFDEM on both my local computer and the cluster. I also tried the same number and distribution of processors on both. I don't know where the problem lies.

I have attached the log from my local computer for comparison.
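
One way to use the two logs is to find the first line where the local run and the cluster run diverge. The sketch below is only an illustration under stated assumptions: `first_difference` is a hypothetical helper, and the short lists stand in for the real log contents, which are not reproduced here.

```python
from itertools import zip_longest


def first_difference(local_lines, cluster_lines):
    """Return (line_number, local_line, cluster_line) for the first mismatch.

    Returns None if both logs are identical. A missing line in the
    shorter log is reported as None.
    """
    pairs = zip_longest(local_lines, cluster_lines)
    for number, (local, cluster) in enumerate(pairs, start=1):
        if local != cluster:
            return number, local, cluster
    return None


# Placeholder lines, not taken from the actual logs.
local = ["step A", "step B", "step C"]
cluster = ["step A", "step B"]
print(first_difference(local, cluster))  # (3, 'step C', None)
```

With real files, the lists could come from `open(path).read().splitlines()`. Lines that legitimately differ between machines, such as host names or timings, would need to be filtered out first.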

Attachment: log_liggghts.run_local.txt (13.59 KB)

Lowered | Mon, 03/23/2020 - 12:41

Hi,
can you provide more information about the cluster system you are using and how you submit jobs to it? Some clusters require some kind of batch script ...

Kind regards
Henrik