Dear CFDEMers,
My test case ran into trouble: the solver got stuck. I set the case up to simulate a bubbling bed with 200 thousand particles. The solver ran quite well until I stopped it, refined the grid, and increased the number of parallel processors. After that, no matter how I changed these settings, the solver ran only the first step and then got stuck after this output:
" Time = 0.0001
Courant Number mean: 0.000166667 max: 0.05
- evolve()
Starting up LIGGGHTS
Executing command: 'run 100 '
run 100 WARNING: Dump mesh/gran/VTK contains a mesh with no velocity allocated, will not dump velocity
Setting up run ...
Memory usage per processor = 151.178 Mbytes
Step Atoms KinEng 1 ts[1] ts[2] heattran Volume
100002 202600 1.3721859e-07 2.0860851e-08 0 0 1.6141043 0.000256
100102 202600 1.375165e-07 2.085135e-08 0 0 1.6140989 0.000256
Loop time of 24.8778 on 24 procs for 100 steps with 202600 atoms
Pair time (%) = 4.18553 (16.8244)
Neigh time (%) = 0.0976603 (0.39256)
Comm time (%) = 2.17667 (8.74945)
Outpt time (%) = 0.00425414 (0.0171002)
Other time (%) = 18.4137 (74.0165)
Nlocal: 8441.67 ave 29146 max 0 min
Histogram: 12 6 0 0 0 0 0 0 0 6
Nghost: 15834 ave 51183 max 902 min
Histogram: 12 2 4 0 0 0 0 0 2 4
Neighs: 996260 ave 3.43806e+06 max 0 min
Histogram: 12 6 0 0 0 0 0 0 0 6
Total # of neighbors = 23910234
Ave neighs/atom = 118.017
Neighbor list builds = 1
Dangerous builds = 0
LIGGGHTS finished
timeStepFraction() = 1
Total particle volume neglected: 0
"
I have run into the same trouble before and simply solved it by restarting the calculation, but this time nothing has worked. I wonder whether I changed something incorrectly. Have you, fellow forum members, ever run into the same trouble?
I am looking forward to your suggestions. Thank you.
cgoniva | Tue, 04/24/2012 - 18:16
Hi,
did the simulation simply stop?
Which kind of locateModel did you use?
engineSearch can get stuck with treeSearch on; (you might try treeSearch off;)
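For reference, the locate model is selected in constant/couplingProperties. A minimal sketch of what the relevant entries might look like is below; the sub-dictionary name used here (engineSearchProps) is an assumption and may differ between CFDEM versions, but the treeSearch switch is the one meant above:

locateModel engineSearch;      // locate model selection (keyword spelling may vary by version)

engineSearchProps              // assumed name of the engineSearch settings sub-dictionary
{
    treeSearch off;            // try "off" if the solver hangs with "on"
}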
Cheers,
Chris
ngcw1986 | Wed, 04/25/2012 - 03:16
Thank you for your reply, cgoniva!
My case does not stop; the CPU stays busy and the memory usage keeps growing until the system crashes. As you pointed out, my "engineSearch" has "treeSearch on". I will turn it off and try again. Thank you!
ngcw1986 | Thu, 04/26/2012 - 04:38
bad news, trial and failure.
Dear Chris and all,
I turned the "treeSearch" option off in "engineSearch", but nothing changed. The case is still stuck at the same point
" timeStepFraction() = 1
Total particle volume neglected: 1.12752e-11"
and the next "evolve done" has been in progress for a whole day. I found that the memory is used up: more than 30% of the virtual memory has been allocated and the CPUs are temporarily idle, waiting for data to be written to the hard disk. I am sure our memory is large enough to hold this number of particles. To avoid another system crash I stopped the solver with "Ctrl+C", but I still had to kill the solver processes on some nodes. In my opinion the memory allocation is abnormal, but I don't know whether it is a bug in the solver or whether something is wrong with my case setup.
Because of the abnormal memory usage I changed "atom_modify" from "map array" to "map hash", expecting this to reduce memory use and improve performance. But the case just crashed when I used more than one node for the parallel computation. I wonder whether this option is compatible with distributed-memory clusters, or whether something else needs to be changed when it is chosen.
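For anyone trying the same thing: atom_modify is a LIGGGHTS input-script command and must appear before the atoms/box are defined. A minimal sketch (the file name is only an example, the rest of the script is not shown):

# near the top of the DEM input script (e.g. in.liggghts_run), before read_data/create_box
atom_modify    map hash     # hash-based atom map; "map array" allocates an array sized by the largest atom ID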
Furthermore, I set up another case based on the previously calculated one. After the case setup was complete, the solver simply crashed with this output:
timeStepFraction() = 1
Total particle volume neglected: 4.14156e-14
[1] #0 [2] Foam::error::printStack(Foam::Ostream&)[3] #0 Foam::error::printStack(Foam::Ostream&)#0 Foam::error::printStack(Foam::Ostream&)--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: node21 (PID 28456)
MPI_COMM_WORLD rank: 3
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
in "/home/eric/OpenFOAM/OpenFOAM-2.0.1/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[2] #1 Foam::sigFpe::sigHandler(int) in "/home[15] #0 Foam::error::printStack(Foam::Ostream&)/eric/OpenFOAM/OpenFOAM-2.0.1/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[1] #1 Foam::sigFpe::sigHandler(int) in "/home/eric/OpenFOAM/OpenFOAM-2.0.1/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[3] #1 Foam::sigFpe::sigHandler(int) in "/home/eric/OpenFOAM/OpenFOAM-2.0.1/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[2] #2
Well, misfortunes never come alone; it seems something is wrong with the parallel computation. This time I am completely stuck for ideas. I am looking forward to your suggestions! Thank you!
cgoniva | Mon, 04/30/2012 - 09:21
Hi,
could you please try model type B (with Archimedes and without gradPForce and viscForce) and see whether the error remains.
There was once a problem with OF's "interpolationCellPoint ...Interpolator_(...Field) ;" command in either gradPForce.C or viscForce.C
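For reference, the model type and force models are set in constant/couplingProperties. A rough sketch of a type-B setup is shown below; the drag model name (DiFeliceDrag) is only an example, and the exact entries depend on the CFDEM version and the case:

modelType "B";          // type B: buoyancy via Archimedes, no explicit gradPForce/viscForce

forceModels
(
    DiFeliceDrag        // example drag correlation; keep whatever drag model the case already uses
    Archimedes
);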
Cheers,
Chris
ngcw1986 | Tue, 05/01/2012 - 04:54
Dear Chris,
Thank you for your reply! You are right, that is exactly the key point! When I changed the model type to B, it works "fine".
Well, I have another question, about how to improve the simulation efficiency. As the test case with 202 thousand particles got stuck, I set up a new one with 162 thousand particles. I ran this case on two and then three nodes of my lab cluster (each node has two Intel E5335 CPUs, 8 cores at 2.0 GHz, with 12 GB of shared memory, connected with InfiniBand). To my surprise, the real time needed per simulation step is almost the same, about 40 s! As far as I could observe, most of the time in each step was spent on the DEM calculation. I know that LIGGGHTS is quite efficient, and I thought I had tuned the CFD side reasonably well. I will do more tests on parallel efficiency, and I hope you can give me some suggestions!
About efficiency, I am in a dilemma. For the subject I study, the dimensions of the simulation box cannot be too large, because the particle number would easily exceed our computational capacity. Meanwhile, the cells cannot be too small, because each cell has to hold a certain number of particles. So the overall cell count is relatively small, which causes very poor CFD parallel performance, especially under strong phase coupling, where the pressure equation is hard to converge. The GAMG solver is a good choice, but unfortunately its efficiency scales badly when the number of cells per processor is small. On the DEM side, on the contrary, more processors means less time. I think this balance point is very important in CFDEM simulations; could you give me some hints? Thank you!
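For context, the pressure-solver settings in question sit in system/fvSolution. The sketch below uses standard OpenFOAM 2.x GAMG keywords with purely illustrative values; the tolerances and nCellsInCoarsestLevel in particular would need tuning per case:

p
{
    solver                  GAMG;
    tolerance               1e-07;
    relTol                  0.01;
    smoother                GaussSeidel;
    nPreSweeps              0;
    nPostSweeps             2;
    cacheAgglomeration      on;
    nCellsInCoarsestLevel   100;      // illustrative; often chosen near the square root of the total cell count
    agglomerator            faceAreaPair;
    mergeLevels             1;
}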
cgoniva | Wed, 05/02/2012 - 08:21
Hi,
glad to hear that it works fine now! However, I do not understand why this interpolation is causing these problems...
It will be important to get that fixed, as interpolation can have an enormous effect!
Concerning the performance: I know your problem. Using too many processors will give no further speed-up, as the CFD domain per processor becomes too small to be efficient.
For the next release (hopefully soon) there will be a tool to examine the time spent on the different calculation steps; this should help us improve the performance.
Cheers,
Chris
ngcw1986 | Wed, 05/02/2012 - 10:51
What good news!
Dear Chris,
Thank you for your quick reply and the good news! I think it is time for me to gain more experience with CFDEM and get ready to give the next release a warm welcome!
Best wishes and good luck!