[Beowulf] about concept of beowulf clusters

Donald Becker becker at scyld.com
Thu Feb 24 16:41:22 PST 2005


On Thu, 24 Feb 2005 rmiguel at usmp.edu.pe wrote:

> Hi, I have a doubt about the strict concept of a Beowulf cluster. Is it a cluster
> built with commodity hardware only? What happens when I build a cluster using
> some tools such as OSCAR or ROCKS, on servers or using some kind of high speed
> network?
> If I have two Alpha servers with Linux and open source software connected by a
> high speed network, is this a Beowulf cluster?

My definition of a cluster
   independent machines
   combined into a unified system
   through software and networking

The Beowulf definition is
   commodity machines
   connected by a private cluster network
   running an open source software infrastructure
   for scalable performance computing

Traditionally the term "Beowulf Cluster" has included non-PC
architectures such as the Alpha and somewhat specialized networks such
as Myrinet, but excluded the purpose-built tightly coupled machines such
as the Cray T3E and Digital SC.

We can go back to the "cluster" definition.  We are starting with general
purpose machines capable of independent operation, generally those with
a broad market appeal.  The goal is to make them appear to be a single
machine.  We start by networking them together, then we add a software
layer to smooth over the ugliness caused because we couldn't custom
design the hardware.

To distinguish independent machines from the aggregate machine we call the
former "nodes" and the latter the "cluster".

The Beowulf definition sets a category by excluding other important
classes:
    commodity machines
      We are excluding custom-built hardware, e.g. a single Altix is not a
      Beowulf cluster (or even a cluster by the strict definition).
    connected by a cluster network
      These machines are dedicated to being a cluster, at least
      temporarily.  This excludes cycle scavenging from NOWs and wide
      area grids.
    running an open source infrastructure
      The core elements of the system are open source and verifiable
    for scalable performance computing
      The goal is to scale up performance over many dimensions, rather
      than to simulate a single more reliable machine, e.g. through fail-over.
      Ideally a cluster incrementally scales both up and down, rather
      than being a fixed size.

The original challenges for building clusters were very basic:
    can we build them at all?
    how can we get the nodes to communicate?
    do they do anything useful?
In the early days the answers were
    you have to build them yourself
    writing and improving the basic networking
    for a few applications you can use basic message passing

There were many intermediate steps, but those problems were solved a
half decade ago
    You can buy stock cluster configurations from many vendors
    Good OS networking and libraries such as MPI are established
    Most HPTC applications run well on small scale clusters
The real challenges were obvious
    Can we remove compute density as an obstacle to adoption?
    The nodes can talk to each other, so now how do we provision and manage
      clusters that scale in production deployments?
    How can we support essentially all applications, and solve the
      programming problem?

Donald Becker				becker at scyld.com
Scyld Software	 			Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220		www.scyld.com
Annapolis MD 21403			410-990-9993


