Hi!
Is there an upper limit in LIGGGHTS for the one and page values? If I increase the page value too much, I get a memory error when run is executed. If this is not LIGGGHTS-related, do you know whether it is a Linux or hardware issue? I'm using 64-bit Linux on a distributed-memory cluster with 8 GB RAM on each node, and LIGGGHTS is run in parallel across the nodes.
The reason for using such high values for these parameters is that I quite often get "malloc(): memory corruption" errors, and the docs say that "LAMMPS can crash without an error message if the number of neighbors for a single particle is larger than the page setting". Below is the output of the latest error (for which I used one=5000 and page=1000*one=5000000):
*** glibc detected *** /opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora: malloc(): memory corruption: 0x00000000143a8160 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3451a72fae]
/lib64/libc.so.6(__libc_malloc+0x6e)[0x3451a74cde]
/opt/share/apps/openmpi-1.5.3/lib/openmpi/mca_coll_tuned.so[0x2af33c1df589]
/opt/share/apps/openmpi-1.5.3/lib/libmpi.so.1(PMPI_Allreduce+0x1f9)[0x2af3381746a9]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS15FixTriNeighlist9pre_forceEi+0x6f7)[0x59c409]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS6Modify9pre_forceEi+0x43)[0x5d305f]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS6Verlet3runEi+0x233)[0x69425f]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS3Run7commandEiPPc+0x668)[0x6743cc]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS5Input15execute_commandEv+0xfb1)[0x5c077d]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(_ZN9LAMMPS_NS5Input4fileEv+0x2b6)[0x5c12c6]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(main+0x5f)[0x5c8cef]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3451a1d994]
/opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora(__gxx_personality_v0+0x341)[0x4713e9]
======= Memory map: ========
00400000-0071f000 r-xp 00000000 00:1b 94147262 /opt/share/apps/LIGGGHTS/liggghtsdev-1.4.4/src/lmp_fedora
.
.
.
- Timo
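As a rough back-of-the-envelope check of the page value quoted above, the sketch below estimates the size of a single neighbor page. It assumes each neighbor entry is stored as a 4-byte integer, which is an assumption and not something stated in this thread. It does not model how many pages or neighbor lists each process actually allocates, so it only gives a lower bound per page.

```python
# Rough memory estimate for one neighbor-list page.
# Assumption (not from the thread): each entry is a 4-byte integer.
BYTES_PER_ENTRY = 4


def page_size_mib(page: int, bytes_per_entry: int = BYTES_PER_ENTRY) -> float:
    """Return the size of a single neighbor page in MiB."""
    return page * bytes_per_entry / 2**20


one = 5000
page = 1000 * one  # 5,000,000 entries, as in the post above
print(f"one={one}, page={page}: {page_size_mib(page):.1f} MiB per page")
```

Under that assumption a single page of this size is on the order of tens of MiB; the total footprint depends on how many such pages are allocated per process.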
ckloss | Thu, 10/06/2011 - 09:52
Hi Timo,
>> I get quite often these "malloc(): memory corruption"-errors
Can you post these, please? I do not think that your issues are related to the one and page settings.
Thanks, Christoph
tkulju | Thu, 10/06/2011 - 10:31
Hi Christoph!
I sent you the error output and script by e-mail. One thing has been puzzling me: are the
INFO: Maxmimum number of particle-tri neighbors >380 at step 108568, growing array...done!
INFO: Maxmimum number of particle-tri contacts >12 at step 108581, growing array
lines a cause for concern? I get them quite a lot, especially the "particle-tri neighbors" ones.
- Timo
msbentley | Thu, 10/06/2011 - 11:47
I also wondered about this...
I was also wondering about this. I have a long, narrow cylinder in which the triangles making up the mesh (a) are rather elongated (running the whole length of the cylinder) and (b) give the above messages due to many particle-triangle contacts.
Is it better to re-mesh with smaller triangles? And are the issues related to computational efficiency, or are the results in fact then wrong?
Thanks!
Mark
ckloss | Thu, 10/06/2011 - 12:00
Hi Mark and Timo,
Elongated triangles are numerically bad (as are, e.g., elongated cells in CFD). I would advise you to use a mesher like gmsh (which can generate such a mesh from IGES or STEP) instead of using CAD data directly.
Christoph
msbentley | Mon, 10/10/2011 - 13:05
Thanks for the suggestion!
Great, thanks - I've re-created my (very simple) geometry directly in gmsh and exported it to STL, and it's working better now :)
tkulju | Thu, 10/20/2011 - 07:36
Hi Christoph & Mark!
Thanks!! I remeshed the surface of my geometry with gmsh and Salome. After importing the better mesh and modifying the simulation parameters (particle Young's modulus and cohesion) a bit, my case is working much better now.
- Timo
tkulju | Mon, 10/24/2011 - 10:12
Dangerous builds
Hi!
Now I'm getting "Dangerous builds" messages on a regular basis after the simulation has been running for a while. Any idea how to localize this? Or is it an issue to be concerned about?
My Rayleigh and Hertz times are 0.036936609 and 0.021932516, so I don't think it's a time step issue.
- Timo
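For reference, a minimal sketch of the commonly used Rayleigh time estimate, T_R = pi * r * sqrt(rho / G) / (0.1631 * nu + 0.8766), with shear modulus G = E / (2 * (1 + nu)). All material values and the time step below are hypothetical and are not taken from this thread; the sketch only shows how a time-step fraction of the Rayleigh time could be checked by hand.

```python
import math


def rayleigh_time(radius, density, youngs_modulus, poisson_ratio):
    """Rayleigh time T_R = pi * r * sqrt(rho / G) / (0.1631 * nu + 0.8766)."""
    shear_modulus = youngs_modulus / (2.0 * (1.0 + poisson_ratio))
    return (math.pi * radius * math.sqrt(density / shear_modulus)
            / (0.1631 * poisson_ratio + 0.8766))


# Hypothetical values for illustration only (not from the thread).
radius = 1.0e-3          # particle radius, m
density = 2500.0         # particle density, kg/m^3
youngs_modulus = 5.0e6   # Young's modulus, Pa
poisson_ratio = 0.45
timestep = 1.0e-6        # simulation time step, s

t_r = rayleigh_time(radius, density, youngs_modulus, poisson_ratio)
print(f"Rayleigh time: {t_r:.3e} s, dt / T_R = {timestep / t_r:.4f}")
```

A small dt / T_R fraction suggests the time step itself is unlikely to be the problem, which matches the reasoning in the post above.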
ckloss | Mon, 10/24/2011 - 10:18
Two possible reasons are:
+ the time-step is too large (or the skin too small), so that particles travel further than the skin distance in one time-step (you should get a warning in this case)
+ particles that are inserted instantaneously overlap a CAD wall
Christoph
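The first reason above can be checked by hand with a sketch like the following. It simply compares the distance the fastest particle covers in one time-step against the skin distance, which is the criterion as worded in this post; the exact test LIGGGHTS applies internally is not confirmed here. The function names and all numbers are hypothetical.

```python
def max_travel_per_step(v_max: float, timestep: float) -> float:
    """Distance the fastest particle covers in a single time-step."""
    return v_max * timestep


def exceeds_skin(v_max: float, timestep: float, skin: float) -> bool:
    """True if a particle could travel further than the skin in one step."""
    return max_travel_per_step(v_max, timestep) > skin


# Hypothetical values for illustration only (not from the thread).
v_max = 2.0        # highest particle speed seen in the run, m/s
timestep = 1.0e-5  # s
skin = 1.0e-3      # neighbor skin distance, m

travel = max_travel_per_step(v_max, timestep)
print(f"travel per step: {travel:.2e} m, skin: {skin:.2e} m, "
      f"too large: {exceeds_skin(v_max, timestep, skin)}")
```

If the check fails, either reducing the time-step or enlarging the skin would bring the travel distance back under the skin.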
tkulju | Tue, 10/25/2011 - 14:46
Hi!
The message usually appears at times other than when particles are inserted, and I don't get any warning about the skin size. Could a wall mesh that is too dense or too coarse be the reason?
For debugging purposes, is there a way to get the coordinates where these "dangerous builds" happen?
- Timo
ckloss | Tue, 10/25/2011 - 23:30
Hi Timo,
>> For debugging purposes, is there a way to get the coordinates where these "dangerous builds" happen?
I have a debugging option in my version, but as I am out of the office I unfortunately can't send it to you right now.
If you could narrow down the issue a bit (small test case) and send it to me, I can have a look when I return.
For now, you could change the skin size (make it larger) and see if that resolves the issue.
Christoph
tkulju | Fri, 10/28/2011 - 07:10
Hi Christoph!
It seems that the problem is in my rotating .stl geometry, so it is somehow related to the particle-wall interaction with the rotating mesh. Or my mesh is just too bad... If you could send me the debug version, that would be nice, or advise me how I could get the coordinates. I've noticed that these warnings are issued at lines 340-341 of fix_tri_neighlist.cpp, so appending the coordinate information there would help to locate the problem.
I'll try to create a simpler test case, which I could send to you. But it may take some time..
- Timo
ckloss | Fri, 10/28/2011 - 12:10
OK, once you have a test case, feel free to send it to me.
Cheers, Christoph
tkulju | Fri, 10/28/2011 - 14:56
Hi Christoph!
OK, now I have a small test case. Although I wasn't able to reproduce the "Dangerous build" warning, I have another question. My geometry consists of two .stl surfaces, one moving and the other static. Is it bad if they overlap? I'll send you an example of this issue soon.
- Timo
ckloss | Fri, 10/28/2011 - 20:10
No, it should not be a problem if they overlap.
Christoph
tkulju | Tue, 11/01/2011 - 09:37
Hi Christoph!
I think this has something to do with parallelization. I ran the same case (rotating mesh with Hertzian cohesion) with the serial version and didn't get the "WARNING: Dangerous build in triangle neighbor list." message. So my intuition is that it has something to do with inter-processor communication. Should I increase/set the cutoff value?
- Timo
ckloss | Tue, 11/01/2011 - 10:32
Well, increasing the cutoff value could help. Anyway, this behavior is not expected, so I would like to have a look at it. Is this the case you sent me a few days ago?
Christoph
tkulju | Tue, 11/01/2011 - 11:38
Hi!
The case I sent you was a bad imitation of my real case. I'll try to create a better one and send it to you.
- Timo
tkulju | Fri, 11/04/2011 - 14:07
Hi!
I haven't been able to reproduce it in any of the simpler geometries, but the problem seems to be mesh-related: when I modify the mesh, it seems to go away. My geometry contains lots of sharp edges and high curvature. The "failed" meshes contained some long and narrow (i.e. high-aspect-ratio) faces, which in my understanding is bad.
Would it be possible for the gran/mesh command to report some information about the aspect ratio, or do you know how I can get that information, e.g. in gmsh?
- Timo
ckloss | Sun, 11/06/2011 - 14:02
The aspect ratio is checked in fix mesh/gran, and a warning is issued if the aspect ratio is >10 I think. The code is a bit of a mess (we are currently re-writing it), but if you search for error->warning in fix_mesh_gran.cpp, you should be able to find the place in the code.
Cheers, Christoph
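For checking a mesh outside LIGGGHTS, a minimal sketch that reads an ASCII STL and flags elongated triangles might look like this. The aspect ratio is taken here as longest edge divided by shortest edge, which is an assumption: the definition used in fix mesh/gran is not stated in this thread. The limit of 10 only echoes the tentative figure mentioned above, and the function names and sample geometry are hypothetical.

```python
import math
import re

# Matches the "vertex x y z" lines of an ASCII STL file.
VERTEX_RE = re.compile(r"^\s*vertex\s+(\S+)\s+(\S+)\s+(\S+)", re.MULTILINE)


def read_ascii_stl_triangles(text):
    """Return a list of triangles, each a tuple of three (x, y, z) vertices."""
    verts = [tuple(float(c) for c in m.groups()) for m in VERTEX_RE.finditer(text)]
    usable = len(verts) - len(verts) % 3  # ignore an incomplete trailing facet
    return [tuple(verts[i:i + 3]) for i in range(0, usable, 3)]


def aspect_ratio(tri):
    """Longest edge / shortest edge (assumed definition, see text above)."""
    a, b, c = tri
    edges = [math.dist(a, b), math.dist(b, c), math.dist(c, a)]
    shortest = min(edges)
    return math.inf if shortest == 0.0 else max(edges) / shortest


def flag_elongated(triangles, limit=10.0):
    """Return (index, ratio) for every triangle whose ratio exceeds limit."""
    ratios = ((i, aspect_ratio(t)) for i, t in enumerate(triangles))
    return [(i, r) for i, r in ratios if r > limit]


# Tiny hypothetical mesh: one well-shaped and one elongated triangle.
SAMPLE = """solid demo
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 1 0 0
    vertex 0 1 0
  endloop
endfacet
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 20 0 0
    vertex 0 1 0
  endloop
endfacet
endsolid demo
"""

triangles = read_ascii_stl_triangles(SAMPLE)
for index, ratio in flag_elongated(triangles):
    print(f"triangle {index}: aspect ratio {ratio:.1f}")
```

To use it on a real mesh, read the file contents into a string and pass it to read_ascii_stl_triangles; binary STL files would need a different reader.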