[Beowulf] Differenz between a Grid and a Cluster???

Robert G. Brown rgb at phy.duke.edu
Tue Sep 20 06:11:28 PDT 2005

Mark Hahn writes:

>> Setting up a Grid can be more political than technical.
> I'd even go so far as to say that setting up a grid is basically
> a PR measure, whereas setting up a cluster is usually done for a 
> specific set of practical reasons ;)
> that said, we will probably link or unify the queueing systems, since if
> there are idle cpus on a cluster, some kinds of jobs can profitably run
> anywhere.  I'd tend to call that a single cluster, though, not a grid.

I would have to strongly disagree with both of these statements.

The thing that differentiates a "grid" from a more generic heterogeneous
cluster that runs primarily embarrassingly parallel applications is that
it is a cluster that spans administrative domains (and often the globe).
If all the users already have LAN accounts in the domain that "owns" the
resource, it is a cluster.  When you have a cluster that has users from
all over a larger entity (that span several administrative domains) it
is still a cluster, but now the term "compute farm" or "central cluster"
comes to mind as INFORMAL descriptors (no point in fighting over
technical definitions as there aren't any, really).  When you have
several clusters, each already in its own administrative domain,
supporting users from the union of the domains or more, then you have
some unique challenges that are facilitated by specific software tools
and are moving towards a "grid" model.
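
To make the distinction above concrete, here is a minimal sketch in Python.  It
is only an illustration of the informal labels used in this paragraph; the
function name classify and the sets of domain names are hypothetical, and
real deployments will not sort this cleanly.

```python
def classify(resource_domains, user_domains):
    """Return an INFORMAL label for a compute setup.

    resource_domains: set of administrative domains that own clusters
    user_domains:     set of administrative domains the users come from
    Both arguments are hypothetical inputs used only for illustration.
    """
    if len(resource_domains) > 1:
        # Several clusters, each in its own administrative domain.
        return "grid"
    if user_domains <= resource_domains:
        # Every user already has an account in the owning domain.
        return "cluster"
    # One owning domain, users drawn from all over a larger entity.
    return "compute farm / central cluster"
```

For example, classify({"phy"}, {"phy", "chem"}) gives the "compute farm /
central cluster" label, while two owning domains give "grid".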

In fact, it is this SOFTWARE (typically quite different and more complex
than "cluster" software) that makes a union of clusters into a "grid" if
anything.  A few of the challenges faced in the grid world are:

   a) Authenticating users from far far away.  Sometimes literally the
other side of the world.
   b) Monitoring resource pools that are ALSO not necessarily spatially
or networkily contiguous.  You can build a grid out of computers at
several universities or other institutions, for example.
   c) Managing these same resource pools, where "managing" includes but
is not limited to setting usage policies of potentially great complexity
(as in the "grid" community can use this resource but not when my jobs
are running and never on Sundays when me'n'the boyz play MUD games on
it), installing new users, removing old users, connecting nodes with
storage resources, managing policy on the STORAGE resources (on many
"real" grids the storage aspect of the problem dominates the grid
engineering -- the "cluster" aspect per se is almost trivial), dealing
with queues and users of great ignorance whom you cannot talk to and
will likely never meet in person...
   d) Load/usage balancing over multihop networks in the face of
policies, networks, heterogeneous compute resources, heterogeneous
storage resources, heterogeneous application bases, and aforementioned
foolish/ignorant users.
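
The usage policy joked about in (c) can be sketched as a small predicate.
This is a hypothetical illustration in Python, not how Globus or Condor
express policy; the function name and its arguments are assumptions made
up for the example.

```python
from datetime import datetime

def grid_job_allowed(when, owner_jobs_running):
    """Hypothetical per-resource policy from item (c): grid users may run
    here unless the owner's jobs are running, and never on Sundays."""
    if when.weekday() == 6:  # datetime.weekday(): Monday is 0, Sunday is 6
        return False
    return not owner_jobs_running
```

A real grid has to evaluate rules like this for every resource pool, each
written by a different owner, which is where the complexity comes from.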

As Doug said, Globus is a popular toolset, with things like Condor used
to control policy at the system level, and with all kinds of things
forming bits and pieces of intermediate glue.  I don't really think that
gridware is anywhere near cut and dried yet.

Real Grids do exist, and they look nothing like clusters per se -- they
really are a union of clusters, sometimes very differently configured
clusters.  They are often funded as Grids, usually by the big agencies
(e.g. DOE), to perform specific tasks undertaken by a very large and
diverse community that is spread all over the globe.
Basically, the goal of these Grids is to ensure that the duty cycle of
(e.g.) DOE purchased cluster resources remains as close to 100% as
possible, which is generally NOT possible for strictly local usage,
which tends to follow research cycles with as much as 50% idle time
overall (if enough capacity is purchased to keep up with experimental
data inflow).  Intrauniversity grids often have the same purpose but can
usually get away with less than the full toolkit above, since it is
often possible to handle e.g. authentication across the entire University
domain with Kerberos, which is something less than Globus.  There the
dividing line between grid and some other cluster descriptors is
somewhat finer, but the notion of a grid as a union of "independently"
managed clusters is still a valid one.
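
The duty cycle argument can be illustrated with a toy calculation.  The
numbers below are invented for the example (two clusters whose research
cycles happen to be perfectly out of phase), and the sketch ignores data
movement, policy, and scheduling overhead, so treat it as an idealized
upper bound rather than a prediction.

```python
def utilization(demand, capacity):
    """Fraction of capacity actually used, given demand per time slot.
    Demand above capacity in a slot is simply not served in that slot."""
    used = sum(min(d, capacity) for d in demand)
    return used / (capacity * len(demand))

# Hypothetical demand (in nodes) per time slot for two clusters of 10 nodes.
demand_a = [20, 0, 20, 0]
demand_b = [0, 20, 0, 20]
cap = 10

# Strictly local usage: each cluster sits idle half the time.
local = (utilization(demand_a, cap) + utilization(demand_b, cap)) / 2

# Pooled usage: both clusters serve the combined demand.
pooled = utilization([a + b for a, b in zip(demand_a, demand_b)], 2 * cap)
```

With these made-up numbers the local duty cycle is 50% and the pooled one
is 100%; real demand is never this conveniently complementary.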

  rgb (who helped fairly extensively to put together an ATLAS grid
proposal for HEP at Duke and learned far more about grids than he cared
to in the process:-)