[Beowulf] Difference between a Grid and a Cluster???

Robert G. Brown rgb at phy.duke.edu
Wed Sep 21 15:31:51 PDT 2005


Mark Hahn writes:

>> A nice use case is to use grid stuff to get a uniform way to access
>> preinstalled applications, locally tuned according to the
>> idiosyncrasies of the local systems.
> 
> oh, absolutely.  in this case, a "grid" is just an application farm.
> and I'm sure there's lots of demand for that.  but it assumes that your
> apps change very slowly or you have a _horde_ of very effective admins
> who can make all the grid nodes look basically identical to the app.

To be really fair, one should note that tools have existed to manage
moderate cluster heterogeneity for single applications since the very
earliest days of PVM.  The very first presentation I ever saw on PVM in
1992 showed slides of computations parallelized over a cluster that
included a Cray, a pile of DEC workstations, and a pile of Sun
workstations.  PVM's aimk and arch-specific binary path layout were
designed so that a user could relatively easily build a set of
executables for several distinct hardware architectures and still join
them in a single message passing cluster application.
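
To make the layout idea concrete, here is a minimal sketch of resolving
a per-architecture executable path.  It assumes the usual PVM-style
convention of a bin/<ARCH>/ subdirectory under a root directory; the
function name, the root directory, and the arch strings shown are
illustrative only, not taken from PVM itself.

```python
import posixpath

def arch_binary_path(root, arch, program):
    """Return the path of `program` built for `arch` under `root`.

    Assumes a bin/<ARCH>/<program> layout; `root`, `arch` and `program`
    are whatever the site chooses (hypothetical values in the example).
    """
    return posixpath.join(root, "bin", arch, program)

# The same program name resolves to a different binary per architecture.
paths = [arch_binary_path("/home/user/pvm3", arch, "worker")
         for arch in ("SUN4", "LINUX")]
```

Each host then launches the file under its own arch directory, so one
source tree can feed several architectures in the same computation.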

Some of the gridware packages do exactly this -- you don't distribute
binaries, you distribute tarball (or other) packages and a set of rules
to build and THEN run your application.  I don't think that any of these
use rpms, although they should -- a well designed src rpm is a nearly
ideal package for this.  Similarly, it would be quite trivial to set up a
private yum repository to maintain binary rpms built for at least the
major hardware architectures and simply start a job with a suitable yum
install and end it with a suitable yum erase.
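
The install/run/erase cycle described above could be wrapped roughly as
follows.  This is only a sketch under stated assumptions: the package
name and job command are hypothetical, the node is assumed to already
point at the private repository, and the job is assumed to run with
enough privilege to call yum.  The `run` parameter exists so the
sequence can be exercised without touching a real system.

```python
import subprocess

def run_with_package(package, job_cmd, run=subprocess.check_call):
    """Install `package` with yum, run `job_cmd`, then erase the package.

    `package` and `job_cmd` are supplied by the caller (hypothetical
    names in the example below).  The erase step runs even if the job
    fails, so the node is left as it was found.
    """
    run(["yum", "-y", "install", package])
    try:
        run(job_cmd)
    finally:
        run(["yum", "-y", "erase", package])

# Dry run: record the commands instead of executing them.
recorded = []
run_with_package("myapp", ["myapp", "--input", "data.in"], run=recorded.append)
```

In the dry run, `recorded` holds the three commands in the order install,
job, erase; dropping the `run=` argument would execute them for real.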

Most grids are likely not THAT heterogeneous in hardware, so only a
handful of binaries (e.g. i386, x86_64) need to be maintained.  Because
of binary compatibility, these grid applications give up at most certain
optimizations when run on imperfectly matched platforms, e.g. i386 on an
Opteron.  That leaves plenty of room for very beneficial scaling as far
as the cycle consumer is concerned, even if it is less than hardware
optimal.  It also permits the grid organization to trade off the human
costs of managing multiple binary images against the efficiency costs of
running a generic version even where it isn't optimal.
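
The tradeoff can be sketched as a simple lookup: prefer a binary built
for the host's own architecture, and fall back to a generic one the host
can still run.  The compatibility table and function name below are
assumptions made up for illustration, covering only the two
architectures mentioned here.

```python
# Hypothetical compatibility table: for each host architecture, the
# binary architectures it can run, best match first.
COMPATIBLE = {
    "x86_64": ["x86_64", "i386"],
    "i386": ["i386"],
}

def pick_binary(host_arch, available):
    """Return the best available binary arch for `host_arch`, or None.

    `available` is the set of architectures the grid maintains builds
    for.  Unknown hosts are only matched against their own arch.
    """
    for arch in COMPATIBLE.get(host_arch, [host_arch]):
        if arch in available:
            return arch
    return None
```

A grid that maintains only the i386 build gets `pick_binary("x86_64",
{"i386"})` returning "i386": the job still runs on the Opteron, just
without the 64-bit optimizations.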

Basically, it isn't that hard to manage binaries for x86_64 and i386 --
I have to do this in our own cluster, let alone a grid.  Nor is it that
bad (performance-wise) if you have to run i386 on x86_64.  For most of
the (embarrassingly parallel) jobs that use a grid in the first place,
the point is the massive numbers of CPUs with near perfect scaling, not
how much you eke out of each CPU.

> in that way of thinking, grids make a lot of sense as a shrink-wrap-app farm.

Sure.  Or farms for any application where building a binary for the 2-3
distinct architectures takes five minutes per and you plan to run them
for months on hundreds of CPUs.  Retuning and optimizing per
architecture is strictly optional -- do it if the return for doing so
outweighs the cost.  Or if you have slave -- I mean "graduate student"
-- labor with nothing better to do:-)

    rgb

> 
> regards, mark hahn.
> 