[Beowulf] Cluster Diagram of 500 PC
Mark Hahn
hahn at mcmaster.ca
Tue Jul 10 15:29:26 PDT 2007
> We want to set up a cluster of 500 PCs in the following configuration:
> Intel Dual Core 1.88 GHz, 2 MB cache
> 4 GB DDR2 RAM
> 2 x 80 GB
I'm not saying this configuration is bad, but how did you arrive at it?
there are tradeoffs in each of these hardware choices, and those decisions
are the ones you can't fix later. (in particular, I have not often seen
2x80G be a useful cluster node config. it's potentially too much for
a diskful install, even if you insist on diskfulness. and yet clusters
are normally quite automated, so raid1-ing an OS install doesn't make much
sense unless you have a specific reason. finally, if you have
disk-intensive applications that can utilize local disks, it would make
more sense to use larger disks, since these days 250-320G is pretty much
entry-level (one-platter).)
> How do we connect these computers, and how many will be defined as masters?
you don't necessarily even need a master, but to answer this, you must
quantify your workload fairly precisely. with 500 nodes, you might well
expect a significant number of users logged in at once, which might
incur a significant "support" load (compiling, etc). or perhaps you
want to run many, many serial jobs, in which case you'll need to split
up your "admin" load across multiple machines (queueing, monitoring,
logging).
> How do we connect them, and using how many switches?
again, depends entirely on your workload. it _could_ be quite reasonable
to have a rack of nodes going to one switch, and just one uplink from
each rack to a top-level switch. that would clearly optimize the cabling
at the expense of serious MPI programs. unless the workload consisted
solely of rack-sized MPI programs! large switches of the size you're
looking for tend to be expensive; if you compromise (say, single 10G
uplink per rack), modular switches can still be used.
otoh, maybe a spindly ethernet fabric alongside a fast and flat
512-way Myri-10G network?
all depends on the workload.
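to put rough numbers on that uplink compromise, here's a quick
back-of-the-envelope sketch in python (the 36 nodes/rack, gigabit to each
node, and single 10G uplink are just illustrative assumptions):

# rough oversubscription estimate for a rack-level leaf switch
# (assumed numbers; adjust for your actual rack density and link speeds)
nodes_per_rack = 36      # assumption: ~36 1U nodes per rack
node_link_gbps = 1.0     # gigabit ethernet to each node
uplink_gbps    = 10.0    # single 10G uplink from the rack's leaf switch

demand = nodes_per_rack * node_link_gbps
print(f"worst-case oversubscription: {demand / uplink_gbps:.1f}:1")   # 3.6:1

at that sort of ratio, loosely-coupled or rack-contained jobs are fine,
but as noted above, serious cross-rack MPI traffic will suffer.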
> How will power connections be provided?
you want some sort of PDU in each rack, fed by high-current, high-voltage
circuits. dual 30 A, 220 V 3-phase feeds are not an unusual design point.
obviously, if you can make nodes more efficient, you save money on the
power infrastructure, as well as operating costs. for instance, the
cluster I sit next to has dual-95W sockets per node, with each node
pulling around 300W. higher-efficiency power supplies might save 30W/node,
which would be only about 1.1 kW/rack; 65W sockets would save 60W/node -
that's starting to be significant. providing consistently cool air
saves power too (nodes here have 12 fans that consume up to 10W each!).
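to make the arithmetic explicit (assuming roughly 36 nodes per rack, which
is what the 30W/node -> ~1.1 kW/rack figure above implies), a quick sketch:

# back-of-the-envelope rack power budget
nodes_per_rack = 36      # assumption, consistent with ~1.1 kW/rack above
watts_per_node = 300     # measured draw cited above
psu_saving_w   = 30      # higher-efficiency power supplies, per node
cpu_saving_w   = 60      # 65W vs 95W sockets (two per node), per node

print(f"baseline:      {nodes_per_rack * watts_per_node / 1000:.1f} kW/rack")  # 10.8
print(f"better PSUs:  -{nodes_per_rack * psu_saving_w  / 1000:.1f} kW/rack")   #  1.1
print(f"65W sockets:  -{nodes_per_rack * cpu_saving_w  / 1000:.1f} kW/rack")   #  2.2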
> How do we start and stop all nodes using a remote computer?
IPMI is an excellent, portable, well-scriptable interface for control and
monitoring. there are some vendor-specific alternatives, as well as
cruder mechanisms (controllable PDUs).
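as a minimal sketch of what scripting that looks like (it assumes ipmitool
talking lanplus to each node's BMC; the BMC hostnames and credentials below
are placeholders, not anything standard):

# mass power control over IPMI using ipmitool
import subprocess

NODES = [f"node{n:03d}-bmc" for n in range(1, 501)]   # hypothetical BMC hostnames
USER, PASS = "admin", "secret"                        # placeholder credentials

def ipmi(host, *args):
    # run one ipmitool command against a node's BMC over the LAN
    return subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", USER, "-P", PASS, *args],
        capture_output=True, text=True)

def power(action="status"):
    # action: status | on | off | soft | cycle
    for host in NODES:
        r = ipmi(host, "chassis", "power", action)
        print(host, (r.stdout or r.stderr).strip())

if __name__ == "__main__":
    power("status")   # power("on") / power("off") to start or stop everything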
> How do we ensure fault-tolerant network connectivity?
something like LVS (Linux Virtual Server). it's a software thing, thus easy ;)
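purely for illustration, a minimal sketch of driving LVS via ipvsadm from
python (the VIP and real-server addresses are made up, and a real setup
would add health checks, e.g. with keepalived, rather than a static table):

# configure an LVS virtual service with ipvsadm
import subprocess

VIP  = "10.0.0.100:22"                     # virtual service address (made up)
REAL = ["10.0.1.1:22", "10.0.1.2:22"]      # e.g. two login nodes (made up)

def ipvsadm(*args):
    subprocess.run(["ipvsadm", *args], check=True)

ipvsadm("-A", "-t", VIP, "-s", "rr")           # add TCP virtual service, round-robin
for rs in REAL:
    ipvsadm("-a", "-t", VIP, "-r", rs, "-g")   # add real server, direct-routing mode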
> We want to use Windows XP or Windows 2003 as the OS. For better performance, CentOS or Red Hat Linux may be selected.
don't bother with windows unless you really are a windows guru
and also incredibly linux-averse.
> please advise us and help me by providing a network diagram of the system
nodes in racks, leaf switch(es) in the racks, uplinking to top-level switch(es).
knowing nothing about your workload, that's reasonable.