LIGGGHTS Won't Run on Multiple Grid Nodes (SGE)

Submitted by spen_CM on Fri, 06/21/2019 - 17:42

Hi all,

We are trying to run a LIGGGHTS simulation across multiple 64-core nodes in a grid managed by Sun Grid Engine (SGE). We launch with mpirun and request 128 cores with qsub, in a parallel environment that uses the fill_up allocation rule.

When we pass -np $NSLOTS to mpirun, the scheduler reports that the grid slots are properly allocated and LIGGGHTS sets up a processor layout of the right size, but checking core usage shows that all of the work is being done by threads on one machine. Without the -np $NSLOTS flag, LIGGGHTS exits with an error saying that the number of physical processors differs from the requested number of processors.

Does anyone have experience running LIGGGHTS simulations with MPI on SGE, and any pointers?
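For anyone trying to reproduce this, here is a minimal sketch of a check that could go at the top of the job script to compare what SGE granted with where mpirun actually places processes. It assumes SGE exports PE_HOSTFILE in its usual "host slots queue processor-range" layout and that mpirun is on the PATH inside the job; the function name summarize_pe_hostfile is only illustrative.

```shell
#!/bin/sh
# Sketch: summarize the hosts and slots SGE granted to this job.
# Assumes the usual PE_HOSTFILE layout: "<host> <slots> <queue> <processor range>".

summarize_pe_hostfile() {
    # $1: path to a PE hostfile; prints one line per host, then a total.
    awk '{ printf "%s slots=%s\n", $1, $2; total += $2 }
         END { printf "total slots=%d hosts=%d\n", total, NR }' "$1"
}

# Inside an SGE parallel job, PE_HOSTFILE and NSLOTS are set by the scheduler.
# Outside a job this block is skipped, so the script is harmless to run locally.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
    # Start one trivial process per slot and count how many land on each host.
    mpirun -np "$NSLOTS" hostname | sort | uniq -c
fi
```

If the first listing shows two hosts with 64 slots each but the second shows all 128 processes on a single hostname, then mpirun is not picking up the allocation from the scheduler, which would match the core usage described above.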