[Beowulf] EM64T Clusters
Bill Broadley
bill at cse.ucdavis.edu
Wed Jul 28 16:07:18 PDT 2004
> We've just brought up a test stand with both PCI-X and PCI-E Infiniband
> host channel adapters. Some very preliminary (and sketchy, sorry) test
> results which will be updated occasionally are available at:
>
> http://lqcd.fnal.gov/benchmarks/newib/
Interesting. The listed latencies:
* PCI Express: 4.5 microsec
* PCI-X, "HPC Gold": 7.4 microsec
* PCI-X, Topspin v2.0.0_531: 7.3 microsec
seem kind of slow to me; I suspect it's mostly the nodes (not PCI-X).
I'm using dual Opterons, PCI-X, and "HPC Gold" and getting 4.7 us/hop
(0.62 sec for 131072 hops):
compute-0-0.local compute-0-1.local
size= 1, 131072 hops, 2 nodes in 0.62 sec ( 4.7 us/hop) 826 KB/sec
My benchmark just does an MPI_Send<->MPI_Recv of a single integer: each
node increments the integer and passes it along around a ring of nodes
(rough sketch below). What exact command-line arguments did you use with
NetPipe? I'd like to compare results.
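In case it helps with comparing apples to apples, here's roughly what my
benchmark's inner loop does. This is a simplified sketch, not the exact
code I ran (the real thing handles the node list, warmup, etc.); run it
with at least 2 ranks:

/* Ring latency sketch: pass an integer token around a ring of ranks,
 * one MPI_Send/MPI_Recv pair per hop; total time / hops = us/hop. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token = 0;
    const int hops = 131072;              /* total hops around the ring */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int next = (rank + 1) % size;
    int prev = (rank + size - 1) % size;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int lap = 0; lap < hops / size; lap++) {
        if (rank == 0) {
            token++;                      /* rank 0 starts each lap */
            MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            token++;                      /* increment and pass it on */
            MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0) {
        double sec = t1 - t0;
        int done = (hops / size) * size;  /* hops actually performed */
        printf("%d hops, %d nodes in %.2f sec (%.1f us/hop)\n",
               done, size, sec, 1e6 * sec / done);
    }

    MPI_Finalize();
    return 0;
}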
> The PCI Express nodes are based on Abit AA8 motherboards, which have x16
> slots. We used the OpenIB drivers, as supplied by Mellanox in their
> "HPC Gold" package, with Mellanox Infinihost III Ex HCA's.
>
> The PCI-X nodes are a bit dated, but still capable. They are based on
> SuperMicro P4DPE motherboards, which use the E7500 chipset. We used
> Topspin HCA's on these systems, with either the supplied drivers or the
> OpenIB drivers.
>
> I've posted NetPipe graphs (MPI, rdma, and IPoIB) and Pallas MPI
> benchmark results. MPI latencies for the PCI Express systems were about
Are the raw results for your NetPipe runs available?
> 4.5 microseconds; for the PCI-X systems, the figure was 7.3
> microseconds. With Pallas, sendrecv() bandwidths peaked at
> approximately 1120 MB/sec on the PCI Express nodes, and about 620 MB/sec
My PCI-X nodes land about midway between those numbers:
# Benchmarking Sendrecv
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
524288 80 1249.87 1374.87 1312.37 727.34
1048576 40 2499.78 2499.78 2499.78 800.07
2097152 20 4999.55 5499.45 5249.50 727.35
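(As I understand it, Pallas counts the traffic in both directions for
Sendrecv, i.e. Mbytes/sec = 2 * #bytes / t_avg, so for example
2 * 1048576 bytes in 2499.78 usec works out to the ~800 MB/sec above.)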
> I don't have benchmarks for our application posted yet but will do so
> once we add another pair of PCI-E nodes.
I have 10 PCI-X dual Opterons and should have 16 real soon, if you want
to compare InfiniBand + PCI-X on nodes that are closer to your PCI Express
nodes.
--
Bill Broadley
Computational Science and Engineering
UC Davis