charmm scalability on 2.4 kernels
bogdan.costescu at iwr.uni-heidelberg.de
Tue Jan 8 10:32:18 PST 2002
On Mon, 7 Jan 2002, Greg Lindahl wrote:
> On Mon, Jan 07, 2002 at 07:22:39PM +0100, Tru wrote:
> > I think the bad speedup comes from dual VS single cpu nodes
> > regarding parallel behaviour of CHARMM.
> If so, that's easy enough to check: You can run only one process on a
> dual cpu node, for benchmarking purposes.
That is actually what I have observed over the last 3 years of running
different versions of kernels, MPI libraries and CHARMM. Running over
only one transport (TCP or shared memory) is always better than mixing
them, e.g. (using LAM-6.5.6):
CPUs   nodes   real time (min)   transports
 4       4          5.95         TCP
 4       2          7.08         TCP+USYSV
As you can see, the difference is quite significant.
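For illustration, the cost of mixing transports can be computed directly from the two timings in the table (a minimal Python sketch; only the 5.95 and 7.08 minute figures come from the measurements above):

```python
# Real times from the table above, in minutes, both on 4 CPUs.
t_tcp_only = 5.95   # 4 single-CPU nodes, TCP between all processes
t_mixed = 7.08      # 2 dual-CPU nodes, TCP between nodes + USYSV shared memory within

# Relative slowdown of the mixed-transport run.
slowdown = t_mixed / t_tcp_only
print(f"mixed transports: {slowdown:.2f}x slower "
      f"({(slowdown - 1) * 100:.0f}% overhead)")
```

That is roughly a 19% penalty for the same CPU count, which is what makes the difference "quite significant".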
With 8 single-CPU nodes over Fast Ethernet, the parallel efficiency of PME
drops to 50%; using Myrinet (with SCore), it's around 75% - so the
algorithm is not quite Beowulf-friendly. However, I haven't noticed any
significant change in scalability between runs with 2.2.x and 2.4.x kernels.
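Those percentages follow the usual definition of parallel efficiency (speedup over N nodes divided by N). A small sketch of the arithmetic, using a hypothetical single-node runtime since the post does not give one:

```python
def efficiency(t1, tn, n):
    """Parallel efficiency: speedup (t1 / tn) divided by node count n."""
    return (t1 / tn) / n

# Hypothetical single-node runtime in minutes (NOT from the post).
t1 = 40.0

# 50% efficiency on 8 Fast Ethernet nodes implies an 8-node time of:
t8_fe = t1 / (8 * 0.50)      # 10.0 min
# 75% efficiency with Myrinet implies:
t8_myri = t1 / (8 * 0.75)    # about 6.7 min

print(round(efficiency(t1, t8_fe, 8), 3))    # 0.5
print(round(efficiency(t1, t8_myri, 8), 3))  # 0.75
```

In other words, at 50% efficiency the 8-node run is only 4x faster than one node, not 8x - most of the loss being communication in the PME algorithm.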
I obtained behaviour similar to that shown in the graphs when I used TCP
instead of shared memory as IPC on the same node. Apropos, could the
zero-copy kernel work be used to improve this situation?
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De