Scalability issues and 'Outpt time'

Submitted by AnBrito on Wed, 11/27/2019 - 11:07

Hello and good morning/afternoon,
I have a few questions about the timings LIGGGHTS prints to the screen/log file at the end of a run, as well as some scalability issues I'm facing.
I'm running granular simulations of around 500,000 particles discharging from a hopper. I ran the simulation for 10 and 100 timesteps with different numbers of cores to gauge scalability. I used the most recent LIGGGHTS version on Ubuntu 18.04 LTS.
The results were not what I expected: the total simulation time (as reported in the LIGGGHTS log file) did not scale linearly on a log-log plot. (totaltime.png)
A quick search through the log file shows that the "Outpt time" grows from 0.1% of the total time to 60% as the number of cores increases! Subtracting the output time from the total time gives better results: the scaling is linear up to 16 cores. (nooutpt.png)
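To make the "total time minus output time" comparison concrete, here is a minimal Python sketch of that arithmetic. The function name `scaling_table` and the numbers in `example_runs` are hypothetical placeholders, not my measured timings; the output fraction stands for the "Outpt time" percentage from the log divided by 100.

```python
def scaling_table(runs):
    """Compute speedup and parallel efficiency with the output time removed.

    runs: list of (cores, total_time_s, output_fraction) tuples, ordered so
    that the first entry is the baseline (smallest core count).
    """
    base_cores, base_total, base_frac = runs[0]
    base_adjusted = base_total * (1.0 - base_frac)
    rows = []
    for cores, total, frac in runs:
        # Total time with the reported output share subtracted.
        adjusted = total * (1.0 - frac)
        # Speedup relative to the baseline run.
        speedup = base_adjusted / adjusted
        # Efficiency: speedup divided by the increase in core count.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, adjusted, speedup, efficiency))
    return rows


# Hypothetical placeholder values, for illustration only.
example_runs = [(1, 100.0, 0.001), (2, 60.0, 0.10), (4, 40.0, 0.30)]

for cores, adjusted, speedup, efficiency in scaling_table(example_runs):
    print(f"{cores:3d} cores: adjusted {adjusted:8.2f} s, "
          f"speedup {speedup:5.2f}, efficiency {efficiency:5.2f}")
```

An efficiency that stays close to 1 corresponds to the straight line on the log-log plot; a drop below that marks where the scaling stagnates.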
Questions:
1) What is "Outpt time"? I assume it has something to do with interprocessor communication, but I checked whether I was doing any unnecessary computation that might require interprocessor comms. Thermo is not outputting anything to the screen except particles, cpu and spcpu. I even tried turning off logging ('log none'), backgrounding the log, etc., but the result is always the same. Results are the same on different machines, and increasing the number of timesteps from 10 to 100 changed nothing.
2) Why is my scalability so bad even without "Outpt time"? Stagnation at around 16 cores is something I wasn't expecting at all. Please note that I tried different decompositions of the domain (2x4x4 vs 4x4x2, for example). The results shown are the best ones. I was expecting it to scale at least up to 32 cores. The official benchmarks scale up to 100, I think.
Thank you very much for the help.
A.B.

Attachments:
totaltime.png (50.4 KB)
nooutpt.png (50.55 KB)