[Beowulf] posting bonnie++ stats from our cluster: any comments about my I/O performance stats?
bmcnally at u.washington.edu
Fri Sep 25 17:02:56 PDT 2009
> One note about bonding/trunking: check it closely to see that it is
> working the way you expect. We have a cluster with 14 racks of 20 nodes,
> each rack with a 24-port switch at the top. Each of these switches has
> four ports trunked together back to the core switch. All nodes have two
> GbE ports but only eth0 was being used. It turns out that all eth0 MAC
> addresses in this cluster are even. The hashing algorithm on these
> switches (HP) only uses the last two bits of the MAC address for a total
> of four paths. Since all MACs were even, it went from four choices to
> two, so we were only getting half the bandwidth.
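To make the arithmetic in that example concrete, here is a minimal sketch of
a hash that looks only at the last two bits of the MAC address. This is a
simplification for illustration, not the actual HP algorithm (a real switch
may combine several address fields), and the MAC prefix and the function
names are made up.

```python
def trunk_link(mac: str) -> int:
    """Pick one of four trunk links from the low two bits of the MAC.

    Assumed, simplified hash: only the last octet's two low bits are used.
    """
    last_octet = int(mac.split(":")[-1], 16)
    return last_octet & 0b11


def links_used(macs):
    """Return the sorted set of trunk links that a group of MACs maps to."""
    return sorted({trunk_link(m) for m in macs})


# Hypothetical addresses: 20 nodes whose eth0 MACs are all even,
# and 20 nodes whose MACs are consecutive (even and odd mixed).
even_macs = [f"00:1e:0b:00:00:{i:02x}" for i in range(0, 40, 2)]
mixed_macs = [f"00:1e:0b:00:00:{i:02x}" for i in range(0, 20)]

print("even MACs use links: ", links_used(even_macs))
print("mixed MACs use links:", links_used(mixed_macs))
```

With only even addresses the low bit is always 0, so just two of the four
links ever carry traffic, which matches the halved bandwidth described above.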
I'd second testing to make sure bonding/trunking is working before you
base other performance numbers on it. You may also want to consider
different bonding modes if you have problems with balancing the traffic.
Just getting bonding working in an optimal way can take some time. Use
the port counters on your switches in conjunction with counters on your
hosts to make sure traffic is going where you'd expect it to.
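On the host side, one way to check the counters is to take two snapshots of
/proc/net/dev and compare how many bytes each slave interface transmitted in
between. The following is a rough sketch under some assumptions: a Linux
host, slave interfaces named eth0 and eth1, and embedded sample snapshots
with invented numbers so the example runs anywhere.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev content into {iface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def tx_share(before, after, ifaces):
    """Fraction of transmitted bytes carried by each interface between snapshots."""
    deltas = {i: after[i][1] - before[i][1] for i in ifaces}
    total = sum(deltas.values())
    return {i: (d / total if total else 0.0) for i, d in deltas.items()}


HEADER = (
    "Inter-|   Receive                            |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
# Invented sample snapshots; on a real host, read /proc/net/dev twice instead.
SNAP_BEFORE = HEADER + (
    "  eth0: 5000 50 0 0 0 0 0 0 1000 10 0 0 0 0 0 0\n"
    "  eth1: 5000 50 0 0 0 0 0 0 1000 10 0 0 0 0 0 0\n"
)
SNAP_AFTER = HEADER + (
    "  eth0: 6000 60 0 0 0 0 0 0 4000 40 0 0 0 0 0 0\n"
    "  eth1: 6000 60 0 0 0 0 0 0 2000 20 0 0 0 0 0 0\n"
)

shares = tx_share(parse_net_dev(SNAP_BEFORE), parse_net_dev(SNAP_AFTER),
                  ["eth0", "eth1"])
for iface, share in shares.items():
    print(f"{iface}: {share:.0%} of transmitted bytes")
```

A badly skewed split between the slaves while a test is running suggests the
bonding mode is not balancing the way you intended. The switch port counters
should tell the same story from the other end.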
> Once the server has the performance you want, I'd use Netcat from a
> number of clients at the same time to see if your network is doing what
> you want. Use netcat and bypass any disks (writing to /dev/null on the
> server and reading from /dev/zero on the client and vice versa) in order
> to test that bonding is working. You should be able to fill up the
> network pipes with aggregate tests from multiple nodes using netcat.
You may also consider using iperf for network testing. I used to do raw
network tests like this but discovered that iperf is often easier to set
up and use.
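For reference, the kind of raw, disk-free test described above can be
sketched in a few lines of Python. This is only an illustration of the idea
(send zeros from memory, discard them on the other end), not a replacement
for netcat or iperf. The function names and chunk size are arbitrary choices.
As written, it runs over loopback, so it exercises the local TCP stack rather
than the bonded links.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # arbitrary buffer size for sends and receives


def sink(server_sock, result):
    """Accept one connection and discard everything, like writing to /dev/null."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["bytes"] = total


def run_test(total_bytes, host="127.0.0.1"):
    """Send total_bytes of zeros to a local sink; return (bytes received, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))            # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=sink, args=(server, result))
    receiver.start()

    payload = bytes(CHUNK)            # zeros from memory, like reading /dev/zero
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                   # wait until the sink has seen end-of-stream
    elapsed = time.perf_counter() - start
    server.close()
    return result["bytes"], elapsed


received, seconds = run_test(8 * 1024 * 1024)
print(f"{received} bytes in {seconds:.3f} s "
      f"({received * 8 / seconds / 1e6:.1f} Mbit/s)")
```

To test the real network you would run the sink on the server and a sender on
each of several clients at the same time, then add up the rates. That
aggregate is what should approach the capacity of the bonded links.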