[Beowulf] Building a new cluster - seeking some advice

Fri Dec 21 22:12:10 PST 2007

Speaking of nodes w/ disks vs. nodes without - I was thinking of equipping a
small cluster (microwulf style) with each node having a single USB
thumbdrive instead of a disk.  I thought it might be easier than trying to
get nodes to boot PXE style over the network.  And it seemed to me that
thumbdrives might be easier than disk-per-node to keep in sync: I'd just
unplug them from the nodes, plug them into a USB hub on another computer
where I build my distribution, copy files to them, then plug them back
into their nodes.  Also the USB drives would serve for any local filesystem
needs, e.g., for logging or whatever.  With a 1GB key available for about
$12 it seemed a pretty easy, cheap, and low-power solution.  And no moving
parts means the "disks" won't die for mechanical reasons (and they won't be
written to enough to worry about flash wear).
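
To make the copy step concrete, here is a rough sketch in Python.  It is
only an illustration: sync_image is a made-up helper and the paths are
hypothetical.  It assumes every thumbdrive is already partitioned, made
bootable, and mounted on the build machine, and it only copies files; it
does not install a bootloader or remove stale files from the drives.

```python
import shutil
from pathlib import Path


def sync_image(image_dir, mount_points):
    """Copy the node image tree onto every mounted thumbdrive.

    Returns the list of mount points that were actually updated.
    """
    image = Path(image_dir)
    updated = []
    for mount in map(Path, mount_points):
        if not mount.is_dir():
            # Drive not plugged in or not mounted; skip it rather than fail.
            print(f"skipping {mount}: not a directory")
            continue
        # dirs_exist_ok=True (Python 3.8+) overwrites files in place but
        # does not delete files that are no longer in the image.
        shutil.copytree(image, mount, dirs_exist_ok=True)
        updated.append(str(mount))
    return updated
```

With four drives on the hub this might be called as
sync_image("/srv/cluster-image", ["/mnt/node01", "/mnt/node02",
"/mnt/node03", "/mnt/node04"]), where both the image directory and the
mount points are placeholders.  The drives would still need a clean
unmount before going back into the nodes.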

Does anyone have any thoughts on this?  Tried it?  Know why it won't work?

Thanks!  -- David Bakin

On 12/21/07, Mark Hahn <hahn at mcmaster.ca> wrote:
>
> > 1. I'd like to go diskless.  I've never done this before (the other two
> > clusters are...diskful?).  I've used Fedora on both of the previous
> > clusters.  Is this a good choice for diskless?  Any advice on where to
> > start with diskless or operating system choices?
>
> I prefer diskless installs:
>        - NFS root: fast, can be RO, no significant server load.
>        - node-specific files on tmpfs: hardly any - pidfiles mostly.
>        - local disk for swap, /tmp: disks are cheap and fast, why not?
>
> such an approach is really nicely scalable and very pleasant to
> maintain.  a diskful cluster, by comparison, is often annoying:
> disk failures actually matter, and it's not that hard for nodes
> to get out of sync.  systemimager does a good job of reimaging nodes,
> but it's still not quite as "liberating" as just resetting a node,
> knowing it's ephemeral...
>
> > 2. Given my budget (about 20K), I plan on going with GigE on about 24
> > nodes.  Am I right in thinking that faster network interconnects are
> > just too expensive for this budget?
>
> Greg's right: buy the right interconnect, not just the cheapest.
>
> > 3. I'll be spending most of my cluster's time diagonalizing large
> > matrices.  I plan on using ScaLAPACK eventually; currently I just use
> > LAPACK/ATLAS and do individual matrices on each node.  The only thing
>
> my experience with scalapack and diagonalization is with monster-sized
> sparse matrices, which seem to be fairly latency-sensitive.  if your
> workload is anything like that, gigabit isn't going to scale well,
> at least with a conventional mpi+tcp stack.  (I'm looking forward to
> the OpenMX stack for this reason.)
>
> >       * Intel Core 2 Duo E6850 Conroe 3.0GHz ($280)
> >       * 8 GB (4 X 2 GB) DDR2 800 (~$200)
>
> did you consider AMD?  "large matrices" makes me think of memory balance
> (bandwidth per flop), where AMD normally leads Intel.
>
> >  The motherboard does NOT have integrated video.  Will I need video
> > output?  Can you even build a node without it?
>
> this is a bios issue: will the board boot without a video card?
> I guess you can try configuring it with the card, then remove the card
> and see if it still boots.  I would make sure you can't get integrated
> video - these days, such boards are often cheaper.
>
> > motherboards with adequate support for 8GB memory and 1333 FSB don't
> > have video.
>
> I would also consider AMD, which has lots of integrated-video options.
>
> > seems like a waste.  From reading around, it seems like there is no
> > advantage really to DDR3 memory...is that right?  Any advice on the
>
> power savings, probably some headroom in clock, but it's really at
> the early-adopter stage, I think.
>
> regards, mark hahn.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>