[Beowulf] clustering using xen virtualized machines

Tim Cutts tjrc at sanger.ac.uk
Tue Jan 26 05:24:33 PST 2010

On 26 Jan 2010, at 12:00 pm, Jonathan Aquilina wrote:

> does anyone have any benchmarks for I/O in a virtualized cluster?

I don't have formal benchmarks, but I can tell you what I see on my  
VMware virtual machines in general:

Network I/O is reasonably fast - there's some additional latency, but  
nothing particularly severe.  VMware can special-case communication  
between VMs on the same physical host, if required, but that reduces  
flexibility in moving the VMs around.

Disk I/O is fairly poor, especially once the number of virtual  
machines becomes large.  This is hardly surprising - the VMs are  
contending for shared resources, and there's bound to be more  
contention in a virtualised setup than in physical machines.

In our case we have ~170 virtual machines running on 9 physical  
servers, each of which has dual GigE for VM traffic and dual-port  
fibrechannel.
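
To put rough numbers on that contention, here is a back-of-envelope
sketch using only the figures quoted above. It assumes the VMs are
spread evenly across the hosts and that both GigE ports carry VM
traffic at nominal line rate; neither assumption is confirmed here, so
treat the result as an order-of-magnitude guide only.

```python
# Back-of-envelope contention estimate from the figures quoted above.
VMS = 170
HOSTS = 9
GIGE_PORTS_PER_HOST = 2   # "dual GigE for VM traffic"
GBIT_PER_PORT = 1.0       # assumption: nominal line rate, both ports active

# Assumption: VMs are spread evenly over the physical servers.
vms_per_host = VMS / HOSTS

# Network share per VM if every VM on a host transmits at the same time.
mbit_per_vm = GIGE_PORTS_PER_HOST * GBIT_PER_PORT * 1000 / vms_per_host

print(f"average VMs per host: {vms_per_host:.1f}")
print(f"network share per VM under full contention: {mbit_per_vm:.0f} Mbit/s")
```

That is about 19 VMs per host on average. The same division applies to
the fibrechannel ports, which is where the disk contention shows up.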

Forgive me for using VMware parlance rather than Xen, but hopefully  
the ideas will be the same.  Here are a few things I've noted:

1)  Applications with I/O patterns of large numbers of small disk  
operations are particularly painful (such as our ganglia server with  
all its thousands of tiny updates to RRD files).  We've mitigated this  
by configuring Linux on this guest to allow a much larger proportion  
of dirty pages than usual, and to not flush to disk quite so often.   
OK, so I risk losing more data if the VM goes pop, but this is just  
ganglia graphing, so I don't really care too much in that particular  
case.

2)  Raw device maps (where you pass a LUN straight through to a single  
virtual machine, rather than carving the disk out of a datastore)  
reduce contention and increase performance somewhat, at the cost of  
using up device minor numbers on ESX quite quickly; because ESX is  
basically Linux, you're limited to 256 (I think - it might be 128)  
LUNs presented to each host, and probably to each cluster, since VMs  
need to be able to migrate.  I basically use RDMs for database  
applications where the storage requirements are greater than about 500  
GB.  For less than that I use datastores.

3)  Keep the number of virtual machines per datastore quite low,  
especially if the applications are I/O heavy, to reduce contention.

4)  In an ideal world I'd spread the datastores over a larger number  
of RAID units than I currently have, but my budget can't stand that.
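
For point 1, the dirty-page tuning can be sketched roughly as below.
The four sysctl keys are the standard Linux writeback knobs, but the
values are purely illustrative assumptions, not the ones on the
ganglia guest, and the helper names are made up for this example. The
script only prints the current and proposed settings; it changes
nothing.

```python
# Sketch: render sysctl settings that let a Linux guest hold more dirty
# pages in memory and write them back less often.
# The numbers are illustrative assumptions, not values from a real guest.

DIRTY_PAGE_SETTINGS = {
    # Percentage of memory that may be dirty before writers must flush.
    "vm.dirty_ratio": 60,
    # Percentage of memory that may be dirty before background writeback.
    "vm.dirty_background_ratio": 40,
    # Age (centiseconds) after which dirty data is eligible for writeback.
    "vm.dirty_expire_centisecs": 30000,
    # Interval (centiseconds) between wake-ups of the kernel flushers.
    "vm.dirty_writeback_centisecs": 6000,
}


def render_sysctl_conf(settings):
    """Return the settings as lines suitable for /etc/sysctl.conf."""
    return "\n".join(f"{key} = {value}" for key, value in sorted(settings.items()))


def proc_path(key):
    """Map a sysctl key such as vm.dirty_ratio to its /proc/sys path."""
    return "/proc/sys/" + key.replace(".", "/")


def read_current(key):
    """Read the live value of a sysctl key, or None if it is unavailable."""
    try:
        with open(proc_path(key)) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in sorted(DIRTY_PAGE_SETTINGS):
        print(f"# {key} is currently {read_current(key)}")
    print(render_sysctl_conf(DIRTY_PAGE_SETTINGS))
```

The printed lines could go into /etc/sysctl.conf on the guest and be
loaded with `sysctl -p`. The larger the values, the more unwritten
data is lost if the VM dies, so this only makes sense for disposable
data like the RRD files.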

All this is rather dependent of course on what technology you're using  
to provide storage to your virtual machines.  We're using  
fibrechannel, but of course mileage may vary considerably if you use  
NAS or iSCSI, and depending on how many NICs you're bonding together  
to get bandwidth.

