[critical bug] Illegal situation in MultiNodeMesh

Submitted by richti83 on Thu, 11/06/2014 - 13:21

LIGGGHTS 3.0.5 introduces a bug in multi_node_mesh_I.h that shows up when using the rotate mesh mover in multicore mode.
To reproduce the issue, simply run examples/LIGGGHTS/Tutorials_public/movingMeshGran with -np 12 or more.

ERROR on proc 0: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)
ERROR on proc 2: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)
ERROR on proc 1: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)
ERROR on proc 3: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)

Hope this will be fixed soon,
Best,
Christian.

aaigner | Thu, 11/06/2014 - 16:53

Hi Christian,

Thanks for the hint. I opened an issue and will have a look at it.

Bests,
Andreas

ckloss | Fri, 11/07/2014 - 12:38

Hi Christian,

Thanks from my side as well. We'll follow up with a solution shortly (next week).

Best wishes
Christoph

ckloss | Tue, 11/11/2014 - 11:33

Hi Christian,

This should be fixed in 3.0.6, released just now!

best wishes
Christoph

RobertG | Mon, 11/17/2014 - 17:08

Dear Christoph,
I have LIGGGHTS 3.0.6 installed and still have the same problem, as you can see:

>>--------------------------------------------------------------------------
>>MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>with errorcode 1.

>>NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>You may or may not see output from other processes, depending on
>>exactly when Open MPI kills them.
>>--------------------------------------------------------------------------
>>LIGGGHTS (Version LIGGGHTS-PUBLIC 3.0.6, compiled 2014-11-17-15:11:00 by robert based on LAMMPS 23 Nov 2013)
>>Created orthogonal box = (-1.556 -0.1 -0.1) to (0.086 0.1 0.165)
>> 1 by 1 by 1 MPI processor grid
>>0 atoms in group grp1
>>0 atoms in group grp2
>>Fix particledistribution/discrete (id pdd): distribution based on mass%:
>> pts: d=1.000000e-03 (max. bounding sphere) mass%=100.000000%
>>Fix particledistribution/discrete (id pdd): distribution based on number%:
>> pts: d=1.000000e-03 (max. bounding sphere) number%=100.000000%
>>Fix particledistribution/discrete (id pdd2): distribution based on mass%:
>> pts2: d=1.400000e-03 (max. bounding sphere) mass%=100.000000%
>>Fix particledistribution/discrete (id pdd2): distribution based on number%:
>> pts2: d=1.400000e-03 (max. bounding sphere) number%=100.000000%

>>Reading STL file 'CAD/Wand2.stl'

>>Reading STL file 'CAD/Welle_l.stl'

>>Reading STL file 'CAD/Welle_r.stl'
>>ERROR on proc 0: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)

When I run it without the move/mesh command, it works and prints this:

>>LIGGGHTS (Version LIGGGHTS-PUBLIC 3.0.6, compiled 2014-11-17-15:11:00 by robert based on LAMMPS 23 Nov 2013)
>>Created orthogonal box = (-1.556 -0.1 -0.1) to (0.086 0.1 0.165)
>> 1 by 1 by 1 MPI processor grid
>>0 atoms in group grp1
>>0 atoms in group grp2
>>Fix particledistribution/discrete (id pdd): distribution based on mass%:
>> pts: d=1.000000e-03 (max. bounding sphere) mass%=100.000000%
>>Fix particledistribution/discrete (id pdd): distribution based on number%:
>> pts: d=1.000000e-03 (max. bounding sphere) number%=100.000000%
>>Fix particledistribution/discrete (id pdd2): distribution based on mass%:
>> pts2: d=1.400000e-03 (max. bounding sphere) mass%=100.000000%
>>Fix particledistribution/discrete (id pdd2): distribution based on number%:
>> pts2: d=1.400000e-03 (max. bounding sphere) number%=100.000000%

>>Reading STL file './CAD/Wand2.stl'

>>Reading STL file './CAD/Welle_l.stl'

>>Reading STL file './CAD/Welle_r.stl'
>>Setting up run ...
>>Mesh cad1: 153 elements have high aspect ratio (angle < 0.500000 °)
>>WARNING: Fix mesh: Mesh contains highly skewed element, moving mesh (if used) will not parallelize well (../surface_mesh_I.h:481)
>>Import and parallelization of mesh cad1 containing 1032 triangle(s) successful
>>Import and parallelization of mesh cad2 containing 38888 triangle(s) successful
>>Import and parallelization of mesh cad3 containing 40074 triangle(s) successful
>>INFO: Resetting random generator for region factory
>>INFO: Particle insertion ins1: 811.142798 particles every 1000 steps - particle rate 81114.279786 (mass rate 0.333400)
>>INFO: Resetting random generator for region factory2
>>INFO: Particle insertion ins2: 55.243536 particles every 1000 steps - particle rate 5524.353585 (mass rate 0.002778)
>>Memory usage per processor = 84.7243 Mbytes
>>Step Atoms KinEng CPU
>> 0 0 -0 0

Everything worked fine with version 3.0.3.
I hope you can help me.

Best regards
RobertG

RobertG | Thu, 11/20/2014 - 15:20

I have an example for you.
Run it with the command
mpirun.openmpi -np 4 LIGGGHTS < in.error

Use the continuous mixer .stl files from the tutorial.

Hope that helps reproduce the error.

Best regards
RobertG

### Continuous Blending Mixer Simulation
### This simulation involves inserting a stream of particles at one end of a continuous
### blending mixer, conveying and mixing the material along the bed of the mixer, and then
### discharging the material through a chute at the other end of the unit.

### Initialization

# Preliminaries
units si
atom_style sphere
boundary m m m
newton off
communicate single vel yes

neighbor 0.002 bin
neigh_modify delay 0

# Declare domain
# x1 x2 y1 y2 z1 z2
region reg block -0.320 0.320 -1.960 1.460 -0.290 0.335 units box
create_box 2 reg

### Setup

# Material and interaction properties required
fix m1 all property/global youngsModulus peratomtype 2.1e7 1e7
fix m2 all property/global poissonsRatio peratomtype 0.27 0.035
fix m3 all property/global coefficientRestitution peratomtypepair 2 0.5 0.051 0.051 0.051
fix m4 all property/global coefficientFriction peratomtypepair 2 0.12 0.5 0.5 0.34
fix m5 all property/global coefficientRollingFriction peratomtypepair 2 0.02 0.06 0.06 0.08

# Particle insertion
region factory block -0.225 0.225 -1.650 -1.450 0.3 0.33 units box
region factory2 block -0.225 0.225 -1.650 -1.450 0.26 0.29 units box

fix pts all particletemplate/sphere 1 atom_type 1 density constant 7850 &
radius constant 0.0005
fix pdd all particledistribution/discrete 14127 1 pts 1.0

fix pts2 all particletemplate/sphere 1 atom_type 2 density constant 350 &
radius constant 0.0007
fix pdd2 all particledistribution/discrete 14131 1 pts2 1.0

fix ins all insert/rate/region seed 132412 distributiontemplate pdd &
nparticles 10000000 massrate 1.66666667 insert_every 1000 &
overlapcheck yes vel constant 0. 0. -0.5 region factory ntry_mc 10000

fix ins2 all insert/rate/region seed 132413 distributiontemplate pdd2 &
nparticles 10000000 massrate 1.66666667 insert_every 1000 &
overlapcheck yes vel constant 0. 0. -0.5 region factory2 ntry_mc 10000

# Import mesh from cad:
fix cad1 all mesh/surface file trough2.stl type 1 scale 0.001 curvature 1e-5
fix cad2 all mesh/surface file left_shaft2.stl type 1 scale 0.001 curvature 1e-5
fix cad3 all mesh/surface file right_shaft2.stl type 1 scale 0.001 curvature 1e-5

# Use the imported mesh as granular wall
fix mixer all wall/gran model hertz tangential history mesh n_meshes 3 meshes cad1 cad2 cad3 #rolling_friction cdt

# Define the physics
pair_style gran model hertz tangential history rolling_friction cdt
pair_coeff * *

### Detailed settings

# Integrator
fix integrate all nve/sphere

# Gravity
fix grav all gravity 9.81 vector 0.0 0.0 -1.0

# Timestep (keep < 20% T_Rayleigh)
timestep 0.00001

# Thermodynamic output settings
thermo_style custom step atoms ke cpu
thermo 1600
thermo_modify lost ignore norm no

# Rotate the shafts
fix movecad1 all move/mesh mesh cad2 rotate origin 0. -0.0175 0 &
axis 1. 0. 0. period 0.5 # 120 RPM
fix movecad2 all move/mesh mesh cad3 rotate origin 0. 0.0175 0 &
axis 1. 0. 0. period 0.5 # 120 RPM

# Check time step and initialize dump file
fix ctg all check/timestep/gran 1 0.01 0.01
run 1
unfix ctg

# Create imaging information
#dump dumpstl1 all mesh/stl 100 */res/trough*.stl cad1
#dump dumpstl2 all mesh/stl 100 */res/shaft*.stl cad2 cad3
#dump dmp all custom 100 */res/part*.mesh id type type x y z ix iy iz vx vy vz fx fy fz omegax omegay omegaz radius

### Execution and further settings

# Run to 6 sec to equilibrate system
run 10000000 upto

pfalkingham | Thu, 11/27/2014 - 16:09

I've been getting this bug for a while, but had originally thought it was my own input scripts that were broken.

I'm now running 3.0.6 and I'm still getting the same error (Illegal situation in MultiNodeMesh::registerMove), but I think it only happens when I'm starting up from a restart file (the example mesh mover seems to work fine). If the move/mesh commands are removed, the simulation runs fine (albeit with meshes not moving!).

RobertG | Mon, 12/01/2014 - 10:49

Hello pfalkingham,
I'm getting it the whole time.

Best Regards
RobertG

pfalkingham | Fri, 11/28/2014 - 13:21

I have two input files (plus STLs): an initiation file and a post-setup file. There are no errors with the initiation file, but the post-setup file errors out with 'Illegal situation in MultiNodeMesh' (both with and without the meshes present in the initial file).

Hope these are useful (even if it's just pointing out if I'm being dim about something):

initiation file: http://1drv.ms/1txLECa
post-setup run file: http://1drv.ms/1txLHy3
mesh 1: http://1drv.ms/1txLNpq
mesh2: http://1drv.ms/1txLMBE

aaigner | Fri, 12/19/2014 - 11:04

Hi guys,

Unfortunately, we introduced another bug while fixing the original one. As a result, the fix mesh/surface/stress/servo is completely broken, and to use the move/mesh command you need a short workaround.
Just add a run 0 before your first move/mesh and it should work. I tested it with your example, pfalkingham. Thanks for that.
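
As an illustration only, here is one reading of that placement, sketched as a fragment using the fix IDs and file names from RobertG's script earlier in this thread. It is not a complete input: it assumes the domain, material properties, pair_style and integrator are already defined above it, and that "before your first move/mesh" means directly before the first fix move/mesh line, after the mesh and wall fixes. Whether this placement resolves the error in every setup is not confirmed.

```
# Mesh import and wall definition, taken from RobertG's script
fix cad2 all mesh/surface file left_shaft2.stl type 1 scale 0.001 curvature 1e-5
fix mixer all wall/gran model hertz tangential history mesh n_meshes 1 meshes cad2

# Assumed placement of the workaround: a zero-step run, issued once
# the mesh fixes exist but before any mesh mover is defined
run 0

# The first move/mesh command follows only after the run 0
fix movecad1 all move/mesh mesh cad2 rotate origin 0. -0.0175 0 &
axis 1. 0. 0. period 0.5
```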

We will fix this bug in the next release (as soon as possible).

Best regards
Andreas

m.fatahi | Sat, 12/27/2014 - 15:06

Hi Andreas,
I compiled LIGGGHTS 3.0.6, and when I ran my simulation case with a move/mesh command the same error appeared. Without the move/mesh command it worked.
I tried adding a run 0 before move/mesh, but it did not work.
Where exactly did you mean to add the run 0?
I think I should compile an earlier version of LIGGGHTS, for example 3.0.3.
Regards
Mohammad

litomec | Mon, 12/29/2014 - 19:36

Hi, I have the same problem. I have version 3.0.6 of LIGGGHTS, and it throws the error "Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:327)".

I tried adding a run 0 and got the same problem.

Has anyone found a solution?

Regards

Franklin Bettencourt | Wed, 11/16/2016 - 22:13

I have the same issue with LIGGGHTS version 3.3.1. I am trying to run it in parallel on a cluster, but it keeps failing with the following error:

ERROR on proc 1: Illegal situation in MultiNodeMesh::registerMove (../multi_node_mesh_I.h:352)