Hi,
HAPPY NEW YEAR ;)
I have successfully compiled balance.cpp and balance.h, and then recompiled CFDEM to make the balance command available to it. The static load balancing command is now active in LIGGGHTS-PUBLIC and it works very well. I also checked it on the cluster and everything seems fine.
But when I tried to run CFDEM I got the errors below. I then checked LIGGGHTS separately in two cases: 1. without a restart file, 2. with a restart file. It seems that LIGGGHTS crashes when the information is read from the restart file.
Does anyone have any idea?
Thanks,
Ebrahim
Error :
# granular walls
#================================================================================================
#fix forwardwall all mesh/surface file ../DEM/geometery/forwardwall.stl type 1 heal auto_remove_duplicates surface_vel 0. 0. 0. curvature 1e-5
fix splineroof all mesh/surface file ../DEM/geometery/splineroof.stl type 1 heal auto_remove_duplicates surface_vel 0. 0. 0. curvature 1e-5
[n06-98:23087] *** Process received signal ***
[n06-98:23087] Signal: Segmentation fault (11)
[n06-98:23087] Signal code: Address not mapped (1)
[n06-98:23087] Failing at address: 0x8
[n06-98:23089] *** Process received signal ***
[n06-98:23089] Signal: Segmentation fault (11)
[n06-98:23089] Signal code: Address not mapped (1)
[n06-98:23089] Failing at address: 0x8
[n06-98:23083] *** Process received signal ***
[n06-98:23083] Signal: Segmentation fault (11)
[n06-98:23083] Signal code: Address not mapped (1)
[n06-98:23083] Failing at address: 0x8
Resetting global state of Fix splineroof Style mesh/surface from restart file info
fix bottomwall all mesh/surface file ../DEM/geometery/bottomwall.stl type 2 heal auto_remove_duplicates surface_vel 0. 0. 0. curvature 1e-5
Resetting global state of Fix bottomwall Style mesh/surface from restart file info
fix wall all wall/gran/hertz/history mesh n_meshes 2 meshes splineroof bottomwall
Resetting global state of Fix tracker_splineroof Style contacthistory/mesh from restart file info
[n06-98:23089] [ 0] /lib64/libpthread.so.0(+0xf500) [0x2b4d3de2f500]
[n06-98:23089] [ 1] liggghts(_ZN9LAMMPS_NS17FixContactHistory14unpack_restartEii+0x1bc) [0x5b5bfc]
[n06-98:23089] [ 2] liggghts(_ZN9LAMMPS_NS6Modify7add_fixEiPPcS1_+0xf21) [0x6ecad1]
[n06-98:23089] [ 3] liggghts(_ZN9LAMMPS_NS14FixMeshSurface20createContactHistoryEi+0x109) [0x5e1f69]
[n06-98:23089] [ 4] liggghts(_ZN9LAMMPS_NS11FixWallGran11post_createEv+0x98) [0x67bec8]
[n06-98:23089] [ 5] liggghts(_ZN9LAMMPS_NS6Modify7add_fixEiPPcS1_+0xfe6) [0x6ecb96]
[n06-98:23089] [ 6] liggghts(_ZN9LAMMPS_NS5Input15execute_commandEv+0xe13) [0x6ae393]
[n06-98:23089] [ 7] liggghts(_ZN9LAMMPS_NS5Input4fileEv+0x520) [0x6aea30]
[n06-98:23089] [ 8] liggghts(main+0x4b) [0x6bdd0b]
[n06-98:23089] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x2b4d3e05ecdd]
[n06-98:23089] [10] liggghts() [0x487bf9]
[n06-98:23089] *** End of error message ***
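The mangled symbols in the trace above can be made readable, which helps to see which function crashed. Below is a minimal sketch, assuming a GCC or Clang toolchain (it relies on `abi::__cxa_demangle` from `<cxxabi.h>`). The helper name `demangle` is made up for illustration and is not part of LIGGGHTS.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper (not part of LIGGGHTS): turns an Itanium-ABI mangled
// symbol, as printed in the backtrace, into a readable C++ name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled)
{
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so it must be freed.
    char *readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        std::free(readable);
        return std::string(mangled);
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For frame [ 1], `demangle("_ZN9LAMMPS_NS17FixContactHistory14unpack_restartEii")` gives `LAMMPS_NS::FixContactHistory::unpack_restart(int, int)`, so the crash happens while the contact history fix unpacks restart data. The command-line tool `c++filt` does the same job.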
e.derakhshani | Mon, 12/31/2012 - 19:50
Hi,
I made the following changes in LIGGGHTS, and for the time being static load balancing works with the coupling.
A quick fix is to add:
1. in the multi_node_mesh_parallel.h (thx Christoph):
virtual bool addElement(double **nodeToAdd,int lineNumb) = 0;
2. in the multi_node_mesh_parallel_buffer_I.h (thx Christoph):
- this->addElement(nodeTmp.begin()[0]);
+ this->addElement(nodeTmp.begin()[0],-1);
3. In the tracking_mesh_I.h (thx Ajith ):
Line 65: bool TrackingMesh::addElement(double **nodeToAdd,int lineNumb)
Line 69: if(MultiNodeMeshParallel::addElement(nodeToAdd,lineNumb))
4. in the multi_node_mesh_parallel_I.h (thx Ajith):
Line 104: bool MultiNodeMeshParallel::addElement(double **nodeToAdd,int lineNumb)
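The four changes above all serve one purpose: the extra `lineNumb` argument has to appear consistently in the pure virtual declaration, in every override, and at every call site. Here is a minimal sketch of that pattern. The class names `MeshBase`, `ParallelMesh`, `TrackedMesh` and the function `addFromBuffer` are hypothetical stand-ins, not the real LIGGGHTS classes, and `override` assumes a C++11 compiler.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LIGGGHTS mesh classes; only the
// signature change is modelled here, not the real mesh logic.
class MeshBase {
public:
    virtual ~MeshBase() {}
    // Step 1: the pure virtual declaration carries the extra argument.
    virtual bool addElement(double **nodeToAdd, int lineNumb) = 0;
};

class ParallelMesh : public MeshBase {
public:
    ParallelMesh() : nElements(0), lastLine(0) {}
    // Step 4: the definition must match the declaration exactly;
    // `override` makes the compiler reject a mismatch.
    bool addElement(double **nodeToAdd, int lineNumb) override {
        assert(nodeToAdd != nullptr);
        lastLine = lineNumb;
        ++nElements;
        return true;
    }
    int nElements;
    int lastLine;
};

class TrackedMesh : public ParallelMesh {
public:
    // Step 3: the derived class forwards both arguments to its parent.
    bool addElement(double **nodeToAdd, int lineNumb) override {
        return ParallelMesh::addElement(nodeToAdd, lineNumb);
    }
};

// Step 2: a caller with no input-file line number (for example when
// unpacking a buffer) passes -1 as a placeholder.
bool addFromBuffer(MeshBase &mesh, double **node) {
    return mesh.addElement(node, -1);
}
```

If one of the overrides keeps the old one-argument signature, it no longer overrides the pure virtual function, which is why all four places have to change together.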
But in my case, when I try to run on the cluster with more than 8 processors, the same errors as mentioned above are produced!
Any idea?
Regards,
Ebrahim
e.derakhshani | Wed, 01/02/2013 - 03:49
Upgrading static load balancing to Dynamic
Hi,
In "liggghtsCommands" I have added the following command; with this trick, static load balancing is upgraded to dynamic load balancing.
executeProps1
{
    command
    (
        balance
        dynamic
        xz
        $couplingInterval
        $couplingInterval
        out
        tmp.balance
    );
    runFirst off;
    runLast off;
    runEveryCouplingStep on;
    runEveryWriteStep off;
}
But I don't know how to put numbers in the command part!
In the meantime, to test the command, I am using "$couplingInterval" in place of Niter and thresh in the balance command :)
Can anyone help me?
Regards,
Ebrahim
PS: My earlier questions are still open. I recommend reading them if you would like to decrease your computational time.
e.derakhshani | Thu, 01/03/2013 - 01:25
Hi,
I made some changes in execute.C and the problem is solved.
But that method is only a temporary solution.
I would still like to know how symbols and numbers can be added to the command part.
Regards,
Ebrahim
ckloss | Wed, 01/02/2013 - 23:54
Hi Ebrahim,
Which version of LIGGGHTS are you using, and what modifications did you make to the code?
Christoph
e.derakhshani | Thu, 01/03/2013 - 01:17
Hi Christoph,
I am using the following versions:
LIGGGHTS-PUBLIC 2.2.3
CFDEMcoupling version: cfdem-2.4.4
I should mention that I know load balancing is not active in LIGGGHTS, but I activated it myself: I copy-pasted balance.cpp and balance.h into the src directory and then recompiled LIGGGHTS.
For the moment it is working very well. I also ran some benchmarks, and the gain in computational time is significant.
But in my case it does not work with more than 8 processors, and when I decrease the mesh size the following error appears:
ERROR: Mesh elements have been lost (multi_node_mesh_parallel_I.h:610)
Best regards,
Ebrahim
ckloss | Thu, 01/03/2013 - 13:51
>>I should mention I know load balancing is not active on LIGGGHTS but I have activated my self.
>>I did copy -past balance.cpp & balance.h in the src directory and then recompiled LIGGGHTS.
The balance command has been intentionally removed as it doesn't work with some LIGGGHTS features.
You can try to use it, but at your own risk.
Cheers,
Christoph
e.derakhshani | Thu, 01/03/2013 - 14:18
>> it doesn't work with some LIGGGHTS features.
Do you know why it does not work with more than 8 processors?
Is it related to those LIGGGHTS features?
Regards,
Ebrahim
e.derakhshani | Fri, 01/04/2013 - 23:34
Running liggghts with mpi
Hi,
When I try to run LIGGGHTS with MPI (more than 8 processors) I get this error:
fix wall all wall/gran/hertz/history mesh n_meshes 1 meshes bottomwall
Resetting global state of Fix tracker_bottomwall Style contacthistory/mesh from restart file info
[n06-98:21213] *** Process received signal ***
[n06-98:21213] Signal: Segmentation fault (11)
[n06-98:21213] Signal code: Address not mapped (1)
[n06-98:21213] Failing at address: 0x8
[n06-98:21213] [ 0] /lib64/libpthread.so.0(+0xf500) [0x2aed80707500]
[n06-98:21213] [ 1] liggghts(_ZN9LAMMPS_NS17FixContactHistory14unpack_restartEii+0x1bc) [0x5b5bfc]
[n06-98:21213] [ 2] liggghts(_ZN9LAMMPS_NS6Modify7add_fixEiPPcS1_+0xf21) [0x6ecad1]
[n06-98:21213] [ 3] liggghts(_ZN9LAMMPS_NS14FixMeshSurface20createContactHistoryEi+0x109) [0x5e1f69]
[n06-98:21213] [ 4] liggghts(_ZN9LAMMPS_NS11FixWallGran11post_createEv+0x98) [0x67bec8]
[n06-98:21213] [ 5] liggghts(_ZN9LAMMPS_NS6Modify7add_fixEiPPcS1_+0xfe6) [0x6ecb96]
[n06-98:21213] [ 6] liggghts(_ZN9LAMMPS_NS5Input15execute_commandEv+0xe13) [0x6ae393]
[n06-98:21213] [ 7] liggghts(_ZN9LAMMPS_NS5Input4fileEv+0x520) [0x6aea30]
[n06-98:21213] [ 8] liggghts(main+0x4b) [0x6bdd0b]
[n06-98:21213] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x2aed80936cdd]
[n06-98:21213] [10] liggghts() [0x487bf9]
[n06-98:21213] *** End of error message ***
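The trace fails at address 0x8 inside unpack_restart. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a member is accessed through a null object pointer: the fault address is then simply the member's offset inside the object. The sketch below illustrates the arithmetic with a hypothetical struct `Example`; it is not the real FixContactHistory layout and it assumes a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout, NOT the real FixContactHistory class.
struct Example {
    double *first;   // occupies bytes 0..7 on a 64-bit build
    int     second;  // starts at byte 8 on a 64-bit build
};

// Address touched when reading `second` through an object located at
// objectAddress. For a null object pointer, objectAddress is 0.
std::uintptr_t touchedAddress(std::uintptr_t objectAddress)
{
    assert(sizeof(void *) == 8);  // the sketch assumes a 64-bit build
    return objectAddress + offsetof(Example, second);
}
```

With these assumptions `touchedAddress(0)` is 0x8, the same small address as in the log. Running the case under a debugger would be needed to tell which pointer, if any, is actually null.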
I think it is related to the fix wall/gran/hertz/history!
Does anyone know the reason for this error?
Thanks,
Ebrahim
ckloss | Fri, 01/11/2013 - 09:31
>>I did copy -past balance.cpp & balance.h in the src directory and then recompiled LIGGGHTS.
Static load balancing is not part of the LIGGGHTS-PUBLIC release (for good reasons). If you add files to the release and compile it, you do so at your own risk.
Cheers,
Christoph