VM and performance (was Re: [Beowulf] best Linux distribution)

Robert G. Brown rgb at phy.duke.edu
Tue Oct 9 08:27:10 PDT 2007


On Tue, 9 Oct 2007, Jeffrey B. Layton wrote:

> The recent emails from rgb and Doug lead me to a question. Has anyone
> tested codes running under a VM versus running them "natively" on
> the hardware (native isn't a good word and I hope everyone gets my
> meaning)?  The last word I heard is that performance takes a substantial
> hit if you are running a code in a VM. Some of the reasons are that the
> code has only virtualized access to the hardware (particularly the NICs)
> and memory management is a bit more difficult (although Barcelona with
> nested page tables should help there). I do remember VirtualIron saying
> that they had IB drivers so that the VM had direct access to the hardware.
>
> Thanks!

I'd actually expect that as always YMMV.  For CPU bound code, I don't
expect that it WOULD take a hit on modern CPUs with hardware support for
virtualization, although yes you'll pay a penalty for adding what
amounts to an additional superscheduler on top of the kernel scheduler
per se.  That is, the toplevel host OS has to get enough cycles that its
scheduler doesn't starve because it may be multitasking the VM(s) with
pretty much whatever you like.  So there is the baseline overhead of a
single instance of Linux as a host OS running the VM manager plus
whatever tasks you choose to run there (which may be none at all).

To give you a very precise idea of the average load associated with the
VM manager itself, VMware has been running on a server I help control
since April 18th (without a system reboot, although it is due to come
down tomorrow for a memory upgrade so you're lucky you asked this
today:-).  For most of the time from May on, it has been running a VM
instance of Linux pretty much full time (the system itself is a dual
CPU, dual core IBM server).  In all that time, VMware has accumulated
about 2100 minutes of CPU, or about a day and a half.  Lessee, May,
June, July, August, September -- call it 150 days, give or take -- and
we see that it has cost around 1% in net overhead.  Not actually a huge
burden.  However, this system has idle cores at all times and hence does
not thrash.
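The arithmetic above, spelled out (the numbers come straight from the
paragraph, nothing new is measured here):

```python
# Back-of-envelope check of the ~1% VM-manager overhead estimate.
cpu_minutes = 2100              # CPU time accumulated by VMware since April
days = 150                      # May through September, give or take
wall_minutes = days * 24 * 60   # wall-clock minutes in that span

overhead = cpu_minutes / wall_minutes
print(f"VM manager overhead: {overhead:.1%}")   # prints roughly 1%
```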

CPU bound code, especially on a multicore system with a core to devote
to the host OS, probably will not slow down "at all" in a VM partition,
although I can probably test that if you give me a day or three (I'm
working on a multicore laptop, but I don't have a working linux VM on
the system at this particular moment).  My other laptop has VM
Workstation on it but only a single core.  I can probably borrow cycles
on a failover server to run some straight numerical benchmark tests but
not before Thursday or Friday.
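For anyone who wants to try this before I get to a failover server: any
tight arithmetic loop will do as a stand-in for a real numerical
benchmark.  A minimal sketch (my own toy loop, not anybody's standard
benchmark suite) -- run it natively, then inside the VM, and compare:

```python
import time

def cpu_burn(n=2_000_000):
    """Tight arithmetic loop: purely CPU bound, no I/O or syscalls,
    so it should see little or no slowdown inside a VM partition."""
    acc = 0.0
    for i in range(1, n):
        acc += 1.0 / (i * i)   # partial sum of 1/i^2, converges to pi^2/6
    return acc

t0 = time.perf_counter()
result = cpu_burn()
elapsed = time.perf_counter() - t0
print(f"sum = {result:.6f}, elapsed = {elapsed:.3f} s")
```

Run the same script in both environments; the ratio of the two elapsed
times is your virtualization penalty for CPU-bound work.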

Network bound code is a different matter altogether.  VMs present the
running guest OS with a "virtual" hardware layer that is wrapped up to
look like a "standard" widely supported network adapter, one likely to
have readily available drivers in any OS.  In addition, it can insert
itself between the VM's "network" -- an entirely artificial private
internal network -- and the real world, acting as a NAT/gateway for the
VM guest, or it can actually give over the guest interface to e.g.  DHCP
or static addressing on the host system's network.  Clearly there are
differing amounts of work being done by the intermediary host OS in
these cases, but pretty much all of them will add LATENCY to any sort of
connection, and may eat a bit of bandwidth as well.  Again, I'm happy to
run e.g. netpipe to test this, in a couple of days, but I'd predict
1.5-3x the native latency of ethernet and 95% of the bandwidth
accessible to the (unloaded) host OS, lessened by contention and
thrashing caused by multiple VMs sharing an interface.
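Pending real netpipe numbers, here is a crude round-trip probe one could
use to eyeball the added latency -- a toy stand-in for netpipe, not a
replacement for it.  Run it over loopback natively, then host-to-guest,
and compare (the socketpair here only demonstrates the measurement
technique on one machine):

```python
import socket
import time

def avg_round_trip(rounds=1000):
    """Average one-byte ping-pong time over a local socket pair.
    For a real VM test, replace the socketpair with a TCP connection
    between host and guest."""
    a, b = socket.socketpair()
    t0 = time.perf_counter()
    for _ in range(rounds):
        a.sendall(b"x")
        b.recv(1)
        b.sendall(b"x")
        a.recv(1)
    elapsed = time.perf_counter() - t0
    a.close()
    b.close()
    return elapsed / rounds

print(f"avg round trip: {avg_round_trip() * 1e6:.1f} us")
```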

Disk performance can vary quite a bit, as VMs can mount native
partitions or e.g. NFS partitions with performance that should be at
least comparable to native performance or NFS performance (modified by
network efficiency in the latter case), OR all VM disk can itself be
virtual.  In the case of VMware you have two distinct categories of the
latter -- preallocated disk (the "disk" for the VM is basically a big,
fixed size file) and "growable" disk (the "disk" is still a file, but it
starts out at modest size and then dynamically grows as you fill it up
to some maximum).  In that order -- native partition, preallocated, and
growable -- this is fast (almost as fast as native), slow, and slower,
for again fairly obvious reasons.  Nevertheless, there
are plenty of times one might want to run the VM in a relatively slow
but growable file to minimize impact on your available disk resources
while still in PRINCIPLE being able to create a big file without
completely reprovisioning the entire VM.
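The space savings of a growable disk come from sparse allocation, which
is easy to see on any POSIX filesystem (a small illustration of the
mechanism, not of VMware's actual disk format):

```python
import os
import tempfile

# A sparse file reports a large apparent size but consumes almost no
# disk blocks until data is actually written -- the same trick a
# "growable" virtual disk plays.  POSIX-specific: st_blocks is not
# available on all platforms.
with tempfile.TemporaryDirectory() as d:
    img = os.path.join(d, "growable.img")
    with open(img, "wb") as f:
        f.seek(100 * 1024 * 1024 - 1)   # pretend to be a 100 MB disk
        f.write(b"\0")
    st = os.stat(img)
    print("apparent size:", st.st_size)
    print("blocks used  :", st.st_blocks)  # far fewer than size/512 on most filesystems
```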

VMware (through its better toplevel interface) lets you reprovision
things fairly freely with the exception of the fixed-resizable
transition.  You can add processors (or cores), memory (within sane
bounds) and so on to any quiescent VM.  It is truly a pretty awesome
tool, although IMO it is too expensive by a factor of 3-4.  It is the
usual thing -- if VMware Workstation were $50 a seat full retail in
single-seat quantities, $25 a seat academic, I think they'd sell a
zillion seats because it is SO damn useful.  Who wouldn't pay $50 to be able to
never dual boot their box again?  Pop Windows in right where it belongs
as a Linux task, add a few other Linux VMs for e.g. code prototyping or
node prototyping or putting up a webserver or FTP server that CANNOT
compromise your host OS and your valuable data even if an exploit goes
unpatched.  But for $189 list and well over $100 either academic or
bulk, lots and lots of people that might otherwise buy it don't, and
VMware fails to take over the world.

As I like to say, if Sun Microsystems had sold Unix for the Intel
architecture for $50 full list, $25 academic back in 1988 (at which time
it should be noted that they HAD a Unix that would run on the 386 and
where these prices would have undercut Microsoft's prices for DOS), then
Sun would be Microsoft and Microsoft would either be selling unix or out
of business.  Products that never would have been developed include
Windows, OS/2, NT, and -- Linux.  Maybe, just maybe, BSD would have
still survived.  VMware has a similar opportunity now, but the window
(so to speak) is rapidly closing as Microsoft is due to co-opt the
entire VM market any day now using their standard strategy.

VMware has created and proven the market, chip manufacturers have moved
to support it, so MS will now implement their own version of VM, ensure
that it only works well for Windows VM guests (and may not work at all
for other guests), change their licensing to make it more or less
illegal to run Windows guests under other VMs (already underway), use
their sales channels to insert their product cheaply everywhere, and sow
some judicious FUD about how their product is secure, reliable, and
supported and its competitors are not (backed up with ominous rumblings
about license violations, DRM, and legal action).  In six months, a year
tops, they own 70% of the market, and its open and closed competitors
spiral down to gradual extinction as virtualization is a key component
of modern server provisioning and failover and not even Linux can hold
its own in the server room if Microsoft (say) makes it a license
violation to run its Server products in a non-Microsoft VM manager.

You heard it here first, folks -- pure crystal ball stuff determined by
my awesome psychic powers.  VMware's window to beat this absolutely
standard operating procedure (used time and time again by Microsoft
under precisely these conditions) is probably measured in months.  If it
weren't for the Vista debacle, it would probably be underway already, as
Vista introduced the first of the necessary licensing changes, but until
MS's clients are comfortable paying through the nose for Vista business
class licenses or better in order to virtualize at all, and until the
stink from Vista's amazingly poor performance goes away, they're laying
low.  VMware needs volume -- massive volume.  Volume at the consumer
level, to have a CHANCE of creating a bit of a consumer revolution,
volume at the server level. Ubiquity.  They need low margin high volume
sales, not high margin low volume sales.  It might already be too late,
but then -- it might not!

But alas, nobody listens to an Oracle...

    rgb

(...and yes, fully open source Xen and KVM are safe enough topically as
open source products never really go away even if they DO stop being
overtly profitable in 80-90% of all instances, but by the time the
billion dollar antitrust lawsuit winds down and Microsoft loses (some
2-4 years from now) the issue really will be moot. And VMware doesn't
have that luxury -- the dark side of closed licensing of a product that
Microsoft "wants" is that they are trying to beat Microsoft at its own
game, which is basically impossible.  After all, it owns the playing
field and the umpires and the balls and the bats, and the world series
is over long before an appeal to the commissioners has a chance of being
heard.)

>
> Jeff
>

-- 
Robert G. Brown
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone(cell): 1-919-280-8443
Web: http://www.phy.duke.edu/~rgb
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977


