Error: compute_multiplcities

General discussion of the Cambridge quantum Monte Carlo code CASINO; how to install and setup; how to use it; what it does; applications.
varelse
Posts: 44
Joined: Mon Jun 10, 2013 10:17 pm

Error: compute_multiplcities

Post by varelse »

I encountered the following error after about 5000 DMC steps.
ERROR : COMPUTE_MULTIPLICITIES
A computed multiplicity is bigger than the largest representable integer on
this machine. Likely you have a really bad trial function.

Is it really that bad? For smaller timesteps I don't get such problems (although those calculations are still in progress). If not, what could be the reason for such an error?
You may need to take a look at the input and out files.
input
out
correlation.data
gwfn.data
Mike Towler
Posts: 240
Joined: Thu May 30, 2013 11:03 pm
Location: Florence

Re: Error: compute_multiplcities

Post by Mike Towler »

Hi Blazej,

Your links don't point to the right files (input --> out, out --> correlation.data, etc.), so we end up missing the input file.

Mike
varelse
Posts: 44
Joined: Mon Jun 10, 2013 10:17 pm

Re: Error: compute_multiplcities

Post by varelse »

Ok, here is the input. Btw I tried to attach it to the post, but the forum does not like the extensions.
Mike Towler
Posts: 240
Joined: Thu May 30, 2013 11:03 pm
Location: Florence

Re: Error: compute_multiplcities

Post by Mike Towler »

Hi Blazej,
varelse wrote: ..the forum does not like the extensions.
It does, but you have to gzip them as it believes that plain text files could be malware (see the 'How to use these forums' post in 'General Announcements').
varelse wrote: Is it really that bad? For smaller timesteps I don't get such problems (although those calculations are still in progress). If not, what could be the reason for such an error?
Your run is far too long for me to run on 16 cores, but it looks like a perfectly ordinary population catastrophe to me (ordinary catastrophes! you don't hear that very often..).

Can you post the graphs obtained from running 'graphdmc' on the dmc.hist file so we can verify?

Although population catastrophes shouldn't occur under normal circumstances, they can be made more likely by various things, such as certain non-local pseudopotentials, Gaussian basis sets without cusp corrections, or too-severely truncated localized orbitals. None of these apply to you.

Another possibility is an inadequate trial wave function resulting from an optimization that wasn't done properly. But even if it was done properly, note that the likelihood of population explosions can be reduced if you use unreweighted variance minimization rather than energy minimization to optimize the wave function. The reason is that energy minimization doesn't care much about what happens near nodes, since those regions do not contribute much to the energy expectation value. However, the divergent local energies there make a big difference to the stability of the DMC algorithm.

Also, even if none of the above were true, and you had a very low probability (let's say one move in five million) of encountering a catastrophe, the fact that you're running a ten million move simulation (!) means it's very likely to happen at some point in the run (cf. Infinite Improbability Drive..). Also, catastrophes are much more likely to happen for large timesteps (and you say it doesn't happen for small timesteps). Ergo.. use smaller ones. That said, your current value of 0.003 doesn't seem excessive.
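To put a rough number on that argument, here is a minimal sketch. It assumes every move independently carries the same small catastrophe probability, which is an idealisation, and the function name is made up for illustration.

```python
import math

def prob_at_least_one(p_per_move: float, n_moves: int) -> float:
    """Probability of at least one catastrophe in n_moves independent moves."""
    # 1 - (1 - p)^n, written with log1p/expm1 so it stays accurate for tiny p.
    return -math.expm1(n_moves * math.log1p(-p_per_move))

p = 1.0 / 5_000_000   # "one move in five million" (illustrative figure from the post)
n = 10_000_000        # "ten million move simulation"
print(f"P(at least one catastrophe) = {prob_at_least_one(p, n):.3f}")
```

Under these assumptions the result is close to 1 - exp(-2), i.e. somewhere around 86%, so a catastrophe during the run is more likely than not.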

This ought to be manageable if you turn on automatic block resetting (which detects when a catastrophe occurs and rewinds to an earlier point in order to try again with a different random number sequence), but you don't appear to have done this. See the section in the manual about 'Automatic block resetting' or type 'casinohelp dmc_trip_weight' and have a go.
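The idea of block resetting can be sketched with a toy model. This is not CASINO code: the population dynamics, the jump probability p_jump, and both function names are hypothetical, chosen only to show the "detect, rewind to the start of the block, retry with a different random sequence" pattern.

```python
import random

def run_block(seed, start_weight, n_steps, target_weight, trip_weight, p_jump=0.002):
    """Toy stand-in for one DMC block. Returns (final_weight, tripped)."""
    rng = random.Random(seed)
    weight = start_weight
    for _ in range(n_steps):
        weight *= rng.uniform(0.95, 1.05)         # ordinary fluctuation (made up)
        if rng.random() < p_jump:
            weight *= 50.0                        # rare "catastrophe" (made up)
        if weight > trip_weight:
            return weight, True                   # total weight passed the trip limit
        weight += 0.2 * (target_weight - weight)  # crude population control
    return weight, False

def run_with_block_resetting(n_blocks, n_steps, target_weight, trip_weight, max_retries=20):
    """Retry any block that trips, starting again from its saved state."""
    weight, seed, n_resets = target_weight, 0, 0
    for block in range(n_blocks):
        checkpoint = weight                       # state saved at the start of the block
        for _ in range(max_retries):
            seed += 1                             # new random number sequence per attempt
            new_weight, tripped = run_block(seed, checkpoint, n_steps,
                                            target_weight, trip_weight)
            if not tripped:
                weight = new_weight
                break
            n_resets += 1                         # rewind and try the block again
        else:
            raise RuntimeError(f"block {block} tripped {max_retries} times in a row")
    return weight, n_resets

final_weight, n_resets = run_with_block_resetting(
    n_blocks=10, n_steps=100, target_weight=1000.0, trip_weight=3000.0)
print(f"final weight {final_weight:.1f} after {n_resets} block reset(s)")
```

The sketch also shows why very long blocks are costly with resetting turned on: every trip throws away all the work done since the last checkpoint.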

The compute_multiplicities error message could be a bit clearer. I'll try to tidy it up.

Some minor notes:

(1) can you really only afford 16 cores? That's not normally considered enough to do a 144 electron system... No wonder you want to do 10000000 moves!

(2) Blocks are more for your convenience than the computer's (except, of course, for block_resetting), and having too many of them will slow down the code considerably, as it has to write to disk at the end of every block. You might in fact like to switch over to my new 'BLOCK TIME' system that I implemented last week, where you specify the approximate time interval that should separate blocks rather than how many moves constitute a block, as is done currently. I may even remove vmc_nblock, dmc_equil_nblock, and dmc_stats_nblock from the example input files in favour of block_time.. Year Zero, or what?

M.
varelse
Posts: 44
Joined: Mon Jun 10, 2013 10:17 pm

Re: Error: compute_multiplcities

Post by varelse »

Ok, I attach the graphs.
Mike Towler wrote: but it looks like a perfectly ordinary population catastrophe to me (ordinary catastrophes! you don't hear that very often..)
Why didn't it throw a standard "population explosion" error, as it usually does? (I am just curious: did something else happen because of the population explosion?)
Mike Towler wrote: (1) can you really only afford 16 cores? That's not normally considered enough to do a 144 electron system... No wonder you want to do 10000000 moves!
On this machine, yes, but usually I run on the Supernova cluster with many more. I just decided to run a trial calculation for this timestep while Supernova was busy. In fact I didn't want to do 10 000 000 moves; that was left over from another input which I copied and then changed the parameters of. I'll just kill it when necessary, 10 000 000 would last for years.
Mike Towler wrote: (2) Blocks are more for your convenience than the computer's (except, of course, for block_resetting), and having too many of them will slow down the code considerably, as it has to write to disk at the end of every block.
Ok, I should use fewer of them. I made that many to get frequent updates on the calculation, and also so that, if I got a population explosion, I could restart from the latest possible point and not have to repeat much of it.
Attachments
graphdmc.png
Mike Towler
Posts: 240
Joined: Thu May 30, 2013 11:03 pm
Location: Florence

Re: Error: compute_multiplcities

Post by Mike Towler »

Hi Blazej,
varelse wrote: Why didn't it throw a standard "population explosion" error, as it usually does?
OK - for a given configuration, it moves the electrons, then works out the branching factor from the local energies, reference energy, and effective time step. It then computes the multiplicity (the number of copies of this config that will continue to the next iteration) as INT(random number + branching factor). Before it does the conversion from real to integer type represented by INT, it checks that the result will not be bigger than the largest representable integer on the machine (which for regular 32-bit integers is 2147483647). In your case, it is bigger, and that's why it's moaning.
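As a rough illustration of that check (a sketch in Python, not the actual CASINO source): the function names are made up, and the branching-factor formula is the common textbook form, assumed here because the post does not spell it out.

```python
import math
import random

INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit integer

def branching_factor(e_local_old, e_local_new, e_ref, tau_eff):
    """Assumed textbook form: exp(-tau_eff * (mean local energy - e_ref))."""
    return math.exp(-tau_eff * (0.5 * (e_local_old + e_local_new) - e_ref))

def compute_multiplicity(branch: float, rng: random.Random) -> int:
    """Number of copies of a config that continue to the next iteration."""
    value = rng.random() + branch
    # Check the real number *before* converting it to an integer.
    if not math.isfinite(value) or value > INT32_MAX:
        raise OverflowError(
            "computed multiplicity is bigger than the largest representable integer")
    return int(value)  # truncation, like INT for non-negative reals

rng = random.Random(0)
# Local energy equal to the reference energy gives a branching factor of 1.
print(compute_multiplicity(branching_factor(-10.0, -10.0, -10.0, 0.003), rng))
# A local energy far below the reference energy inflates the branching factor.
try:
    compute_multiplicity(branching_factor(-1.0e4, -1.0e4, 0.0, 0.003), rng)
except OverflowError as err:
    print("caught:", err)
```

With a branching factor near 1 the multiplicity is 0, 1 or 2 on any given move and averages to the branching factor, which is the point of adding the random number before truncating.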

So in some sense, it's catching the population explosion before it starts, because obviously something must have gone wrong if a config wants to make 2 billion copies of itself, and you don't want to let it even begin to try doing that (this is why you can't see it in the graphdmc plot). Normal population explosions are detected by imposing a hard limit on the iteration weight (total population), 5 times the target weight for CASINO if I remember correctly, but it does need to be able to represent the number in memory before it can do that check! You can set this limit yourself with dmc_trip_weight in input (we usually recommend 2-3 times the target population dmc_target_weight), and if you do this the automatic block resetting is turned on so that it can attempt recovery from the explosion.

I have to say I'm not quite sure why it's allowing the multiplicity to get that big, given there are automatic limiters on the local energy and so on. It might be worth repeating exactly the same calculation with some if(block==45) write statements at the appropriate point..
varelse wrote: Ok, I should use fewer of them. I made that many to get frequent updates on the calculation, and also so that, if I got a population explosion, I could restart from the latest possible point and not have to repeat much of it.
OK - so why are you not setting dmc_trip_weight > 0 in input to turn on the automatic protection?

Use block_time! I need someone to test it for me anyway..
varelse wrote: 10 000 000 would last for years.
In CASINO's defence, at the rate it was going, 10 million moves would have completed in two weeks or thereabouts.

M.