[Beowulf] clustering using xen virtualized machines

Fri Jan 29 23:38:17 PST 2010

Then why not just run VMs on the host? Also, in that case, would it be
possible to use PXE and tell it, when booting the nodes, which image to use?

On Fri, Jan 29, 2010 at 10:55 PM, Ashley Pittman <ashley at pittman.co.uk> wrote:

>
> On 26 Jan 2010, at 19:37, Paul Van Allsburg wrote:
> > Ashley Pittman wrote:
> >> On 25 Jan 2010, at 15:28, Jonathan Aquilina wrote:
> >>> Has anyone tried clustering using Xen-based VMs? What is everyone's
> >>> take on that? It's something that popped into my head while in my
> >>> lectures today.
> >>>
> >>
> >> I've been using Amazon EC2 for clustering for months now. From a
> >> software perspective it's very similar to running real hardware.  For my
> >> needs (development) it's perfectly adequate, though I've not benchmarked
> >> it against running the same code on the raw hardware.
> >
> > I'd love to try clustering on Amazon.
>
> It's really easy.
>
> > Is there a good writeup somewhere on how to configure & use MPI in the
> > cloud?
>
>
> I'm not sure one is needed.  As a bit of background, I develop and support
> an open source debugging tool for parallel applications (see my sig for
> details).  As such I run a lot of parallel apps, but I run them purely to
> have something to test padb against, so I'm not bothered about performance;
> I just need a running job to interrogate.  What is important for me (or
> rather my tool) is that it works in different environments, so I run with a
> variety of clustering software.
>
> With Amazon I can boot any number of machine "instances" and pay $0.085/h
> for each one; typically I run four at a time but I've run with up to twenty.
> Once the instances are booted there is no difference between using them and
> using real machines.  I regularly use Slurm, OpenMPI (ORTE and under Slurm),
> MPICH2 (mpd, hydra and under Slurm) and I've yet to find any way in which
> the setup differs from running on real metal.  For persistent storage I pay
> for an 'EBS' volume, which I attach to one VM and NFS-export to the others,
> which use it as a shared /home.  Each instance also comes with a large
> scratch partition but I typically don't use this at all.  I have a bunch of
> scripts for populating the hosts files and adding user accounts and that's
> all there is to it.  For the EBS volume you simply pick the size you need,
> create the volume, attach it to a VM and then mkfs.ext3 as normal.  This
> volume is persistent and is charged for per GB per calendar month rather
> than per instance hour.
>
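
As a rough illustration of the "scripts for populating the hosts files"
mentioned above, here is a minimal sketch. It is not Ashley's actual script:
the node naming scheme, the example IP addresses and both function names are
assumptions made up for illustration. It builds /etc/hosts style lines and an
Open MPI style hostfile ("<host> slots=<n>") from a list of instance addresses.

```python
def hosts_entries(private_ips, prefix="node"):
    """Return /etc/hosts style lines mapping each IP to a generated hostname."""
    return [f"{ip}\t{prefix}{i:02d}" for i, ip in enumerate(private_ips)]


def mpi_hostfile(private_ips, slots, prefix="node"):
    """Return Open MPI style hostfile lines, one per instance."""
    return [f"{prefix}{i:02d} slots={slots}" for i in range(len(private_ips))]


if __name__ == "__main__":
    # Hypothetical private addresses of four booted instances.
    ips = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]
    print("\n".join(hosts_entries(ips)))
    print("\n".join(mpi_hostfile(ips, slots=32)))
```

The output of the first function would be appended to /etc/hosts on every
instance, and the second saved as the hostfile passed to mpirun.
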
> I can also choose what distro and indeed OS to run.  The default is FC8 but
> it's easy enough to pick something else; I tend to flip between FC8, Debian
> and Solaris every few weeks, mostly to ensure my code is well tested on
> different systems.  It does mean re-compiling everything each time I switch,
> which can take a while.
>
> I also noticed that over-committing the CPUs on virtual machines doesn't
> have the same negative impact as over-committing the CPUs on real machines.
> Sure, the application performance plummets in either case, but the virtual
> machine is still usable whereas a real machine can stop responding almost
> completely.  This means I can over-commit my VMs by running 32 procs per
> node and run 512 process jobs at a cost of only $1.36 an hour.  Cheap enough
> to be able to try something, see if it works and not have to worry about the
> cost.
>
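
The arithmetic behind that figure can be checked with a short sketch. The
per-instance rate of $0.085/hour is an assumption inferred from the numbers
in the message (512 processes at 32 per node is 16 instances, and 16 x $0.085
gives the quoted $1.36); the function name is made up for illustration.

```python
import math


def hourly_cost(total_procs, procs_per_node, price_per_instance_hour):
    """Return (instances needed, total cost per hour) for a job."""
    nodes = math.ceil(total_procs / procs_per_node)
    return nodes, nodes * price_per_instance_hour


if __name__ == "__main__":
    # Assumed rate: $0.085 per instance hour (inferred, see above).
    nodes, cost = hourly_cost(512, 32, 0.085)
    print(f"{nodes} instances, ${cost:.2f}/hour")
```
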
> In short, Amazon makes a really good development or test system for small
> scale clusters; it's good for testing code correctness and experimenting
> with different distros.  I'm not convinced about the performance and I'm not
> convinced about the cost effectiveness of larger or longer running
> applications, but as a place to start it's ideal.
>
> Ashley,
>
> --
>
> Ashley Pittman, Bath, UK.
>
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk
>

-- 
Jonathan Aquilina