[Beowulf] Barcelona vs. Woodcrest, computational chemistry research

Guilherme Menegon Arantes garantes at iq.usp.br
Thu Sep 27 14:19:31 PDT 2007

On Sep 27, 2007 at 12:01:32PM -0700, richard.walsh at comcast.net wrote:
> Gaussian and GAMESS algs. typically do not store integrals unless they fit into memory.  It has been more efficient to just recompute them than to store them.  

For some reason my previous email has not made it to the list yet,
but I will keep answering this thread, hoping my posts will eventually
get through.


You are right only for SCF calculations. Recomputing the electron
repulsion integrals (ERIs) at each SCF iteration is what the quantum
chemistry community calls Direct SCF.

However, Andrew (IIRC the original poster) told us he will be running
multi-reference jobs. That is a different business, because you will be
carrying out some sort of Configuration Interaction (CI) and, even though
there is also a direct CI method, you still have to calculate the ERIs,
transform them and store them.
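
To make the "transform" step concrete, here is a toy sketch of a
four-index transformation done as four successive one-index (quarter)
transformations. Everything in it is an assumption for illustration: the
names quarter_transform and ao_to_mo, the dict-of-tuples storage and the
coefficient layout c[ao][mo] are hypothetical, and production codes use
blocked, disk-based algorithms instead.

```python
from itertools import product


def quarter_transform(tensor, c, axis, n):
    """Transform one of the four indices with coefficients c[ao][mo].

    tensor is a dict keyed by (i, j, k, l); this layout is assumed here
    only to keep the sketch short.
    """
    out = {}
    for idx in product(range(n), repeat=4):
        total = 0.0
        for mu in range(n):
            src = list(idx)
            src[axis] = mu  # sum over the untransformed index
            total += c[mu][idx[axis]] * tensor[tuple(src)]
        out[idx] = total
    return out


def ao_to_mo(ao_eri, c, n):
    """Four quarter transformations, O(n**5) work each,
    instead of one naive O(n**8) sum."""
    result = ao_eri
    for axis in range(4):
        result = quarter_transform(result, c, axis, n)
    return result
```

Note that each quarter transformation needs a full four-index array as
input and another as output, which is where the storage pressure comes
from.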

The transformation fits in core only for very small problems, and you
usually need decent disk I/O already at this step. Depending on the size
and details of the calculation, the matrices become so large that you
need hundreds of GB, so fast disk I/O is a must.
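
A back-of-the-envelope estimate shows where such sizes come from. The
sketch below assumes real orbitals (8-fold permutational symmetry of the
ERIs), double precision values, no integral screening, no point-group
symmetry and no storage for index labels; the function names are
hypothetical.

```python
def unique_eri_count(n_basis):
    """Number of symmetry-unique ERIs (ij|kl) for real orbitals."""
    pairs = n_basis * (n_basis + 1) // 2   # unique (ij) pairs
    return pairs * (pairs + 1) // 2        # unique pairs of pairs


def eri_storage_gb(n_basis, bytes_per_value=8):
    """Storage for all unique ERIs in GB (10**9 bytes)."""
    return unique_eri_count(n_basis) * bytes_per_value / 1e9


for n in (100, 500, 1000):
    print(f"{n:5d} basis functions: {eri_storage_gb(n):10.2f} GB")
```

Under these assumptions, 500 basis functions already need roughly 63 GB
and 1000 basis functions about 1 TB.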

> So if you were working on smaller systems at modest levels of theory you could put all the integrals in core and bandwidth would be very important. 

The smaller systems/problems that fit in core (and finish in a matter
of seconds) would probably take only minutes with a direct (or
semi-direct) method, i.e. recomputing the integrals, so no one needs a
fancy workstation for them nowadays.

> If you were working with larger systems and higher levels of theory you probably would have to resort to recomputing the integrals and therefore would be less concerned about bandwidth and more focused on clock and pure floating point.   Your probable job mix, and benchmarking it, will be key in making the right choice.

Remember that most QC methods scale very steeply with system size (from
O(N**3) to O(N**7))! So, depending on the size and level of theory, even
direct computations will not suffice and disk I/O will be very important
as well.
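
To put numbers on that scaling, a minimal sketch that ignores prefactors
and lower-order terms (so it only gives ratios, not timings); cost_ratio
is a hypothetical helper name.

```python
def cost_ratio(size_factor, exponent):
    """Relative cost increase when the system size grows by size_factor
    for a method whose cost scales as O(N**exponent)."""
    return size_factor ** exponent


for exponent in (3, 4, 5, 6, 7):
    ratio = cost_ratio(2, exponent)
    print(f"O(N**{exponent}): doubling N multiplies the cost by {ratio}")
```

Doubling the system size multiplies the cost by 8 for an O(N**3) method
and by 128 for an O(N**7) method.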



Guilherme Menegon Arantes, PhD       São Paulo, Brasil
