[Beowulf] Parallel application performance tests

Tony Ladd ladd at che.ufl.edu
Tue Nov 28 10:27:10 PST 2006

I have recently completed a number of performance tests on a Beowulf
cluster, using up to 48 dual-core P4D nodes connected by an Extreme
Networks Gigabit edge switch. The tests consist of single- and multi-node
application benchmarks, including DLPOLY, GROMACS, and VASP, as well as
specific tests of network cards and switches. I used TCP sockets with
OpenMPI v1.2 and MPI/GAMMA over Gigabit Ethernet. MPI/GAMMA leads to
significantly better scaling than OpenMPI/TCP, both in network tests and in
application benchmarks. The overall performance of the MPI/GAMMA cluster on
a per-CPU basis was found to be comparable to that of a dual-core Opteron
cluster with an Infiniband interconnect. The DLPOLY benchmark showed scaling
similar to that reported for an IBM p690. The performance using TCP was
typically a factor of 2 lower in these same tests. Here are a couple of
examples from DLPOLY benchmark 1 (27,000 NaCl ions):

CPUS   OpenMPI/TCP (P4D)   MPI/GAMMA (P4D)   OpenMPI/Infiniband (Opteron 275)

   1         1255                1276
   2          614                 635
   4          337                 328
   8          184                 173
  16          125                  95
  32           82                  56
  64           84                  34
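
For readers who want to turn the figures above into speedup and parallel
efficiency, here is a minimal sketch. It assumes (the post does not say so
explicitly) that the numbers are run times where smaller is better, and that
the two data columns belong to the first two headers, OpenMPI/TCP and
MPI/GAMMA. The function name scaling() is only illustrative.

```python
# Figures copied from the table above. Assumptions, not stated in the post:
# the values are run times (smaller is better), and the two data columns
# correspond to the OpenMPI/TCP and MPI/GAMMA headers, in that order.
CPUS = [1, 2, 4, 8, 16, 32, 64]
TIMINGS = {
    "OpenMPI/TCP": [1255, 614, 337, 184, 125, 82, 84],
    "MPI/GAMMA": [1276, 635, 328, 173, 95, 56, 34],
}


def scaling(cpus, times):
    """Return (cpus, speedup, efficiency) tuples relative to the 1-CPU run.

    speedup    = T(1) / T(n)
    efficiency = speedup / n
    """
    base = times[0]
    return [(n, base / t, base / (t * n)) for n, t in zip(cpus, times)]


if __name__ == "__main__":
    for name, times in TIMINGS.items():
        print(name)
        for n, speedup, eff in scaling(CPUS, times):
            print(f"  {n:3d} CPUs  speedup {speedup:6.2f}  efficiency {eff:5.2f}")
```

Under those assumptions, the 64-CPU speedup works out to roughly 37.5 for
MPI/GAMMA against roughly 14.9 for OpenMPI/TCP.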

A detailed write up can be found at:

Tony Ladd
Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: tladd at che.ufl.edu
Web: http://ladd.che.ufl.edu 
