noob vs lpp, huge file

Submitted by tdl on Tue, 02/28/2012 - 14:53

Dear all,
I have run a test simulation that produced an enormous dump file (8 GB).
The problem is that lpp hangs and Python goes to "sleep" after a few minutes of
intense calculation. No file gets written at all.
I was wondering if it is my own fault for writing just one large dump file, or
something? Can somebody give me some advice?
Thank you,
AS

Philippe | Tue, 02/28/2012 - 16:03

Yep, it is your fault ;) lpp is not designed for processing one large dump file but for converting a set of files, and an 8 GB dump file is very unlikely to fit into the RAM of your computer, so it is a no-go anyway.

tdl | Tue, 02/28/2012 - 16:32

ah bugger :-p
I already had the feeling I was doing it wrong.
Will rerun with multiple dumps... nice things
to learn :-)
PS: can the big hulk be salvaged?

tdl | Wed, 02/29/2012 - 11:38

Thank you, actually I could salvage the test file following the instructions on the page you linked.

In my data set the dump is written every N timesteps, so each new block starts with the heading "ITEM: TIMESTEP".
I used this keyword with awk to write a one-liner that splits the input file at each occurrence of the keyword.

Suppose your dump file is called "test.dat"; then this command splits it into smaller, sequentially numbered files temp1, temp2, and so on:

awk -v RS="ITEM: TIMESTEP" 'NR > 1 { f = "temp" (NR-1); printf "%s%s", RS, $0 > f; close(f) }' test.dat
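
For anyone whose awk does not accept a multi-character RS (gawk and mawk do, a strictly POSIX awk only uses the first character), here is a rough Python sketch of the same idea. It reads the dump line by line, so the whole file never has to fit into RAM. The function name split_dump and its parameters are made up for this example, and it assumes every block starts with a line beginning with "ITEM: TIMESTEP"; anything before the first such line is skipped.

```python
import os


def split_dump(path, out_dir=".", prefix="temp", marker="ITEM: TIMESTEP"):
    """Split one big dump file into one file per timestep block.

    Hypothetical helper: writes prefix1, prefix2, ... into out_dir and
    returns the list of file names written.
    """
    written = []
    out = None
    try:
        with open(path, "r") as src:
            for line in src:  # streams the file, one line at a time
                if line.startswith(marker):
                    # A new block begins: close the previous output file
                    if out is not None:
                        out.close()
                    name = os.path.join(out_dir, f"{prefix}{len(written) + 1}")
                    out = open(name, "w")
                    written.append(name)
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_dump("test.dat") should give the same temp1, temp2, ... files as the one-liner above.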

Besides, I am running other tests and I have noticed that writing smaller dump files makes the simulation run faster than writing one single large dump (I am speaking of really large dump files here).

Hope this can help others :-)
Thanks
AS

sdalbign | Sun, 06/17/2012 - 01:36

I am missing something here!
I ended up with a big dump file (26 GB), and I agree with you guys, let's split it into several files!
But how do I get LIGGGHTS to write multiple dump files directly while it is running?

Philippe | Wed, 06/27/2012 - 12:22

Use a dump command like

dump 4a all custom 100 dump.myforce.* id type x y vx fx

The "*" will be replaced by the time step number - for details, please check the documentation.
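
Since each file name then ends in its time step number, a plain alphabetical listing puts dump.myforce.1000 before dump.myforce.200. A small Python sketch for collecting the files in time step order follows; the function name dumps_by_timestep is made up for this example, and it assumes the "*" sits at the very end of the file name as in the command above.

```python
import glob
import re


def dumps_by_timestep(pattern="dump.myforce.*"):
    """Return (timestep, filename) pairs sorted numerically by timestep.

    Hypothetical helper: files whose name does not end in ".<digits>"
    are ignored.
    """
    found = []
    for name in glob.glob(pattern):
        match = re.search(r"\.(\d+)$", name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)
```

Looping over the result gives the files in the order they were written, for example: for step, name in dumps_by_timestep(): print(step, name)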