[Beowulf] motherboards for diskless nodes

Craig Tierney ctierney at HPTI.com
Fri Feb 25 11:38:44 PST 2005

On Fri, 2005-02-25 at 11:21, Mike Davis wrote:
> Craig,
> Reasons to run disks for physics work.
> 1. Large tmp files and checkpoints.

Good reason, except when a node fails you lose your checkpoints.

> 2. Ability for distributed jobs to continue if master node fails.

Jobs will continue to run once libraries are loaded.  They just hang
at the end.  

It all ends up being a risk assessment.  We have been up for close
to 6 months now.  We have not had a failure of the NFS server.  The
load is all at boot time; the server does very little the rest of the
time.

I suspect that by making that statement I will be up at 2am tomorrow
morning replacing hardware.....

> 3. saving network io for jobs rather than admin
> I actually seldom update compute nodes (unless an update is required for 
> software required for research). I mount a /usr/global that does 
> contain software. I also mount /home on each node.

I guess I wasn't clear, and someone else pointed out why
disks are good.  I actually have disks in some of my compute
nodes for exactly these reasons.  However, they are only for
/tmp and swap.  

You do want to consider how you design your network and the rest
of your system to boot diskless.  Is the cost justified?  For us,
either the systems are booting and all of the IO is image IO, or
the nodes are running and reading/writing files, so the two kinds
of IO don't interfere.  We are exporting our IO over the HSN
(Myrinet in this case), so the really fast IO isn't interfering anyway.

Your /usr/global does seem to be a good solution that is halfway
between having everything local and pure diskless.

> An example of item 1 above is the Gaussian jobs that we are now running 
> that require >40GB of tmp space. For these jobs I have both a 20GB OS 
> disk and a 100GB tmp disk in each node. Due to a problematic SCSI-to-IDE 
> converter, I have experienced item 2 too many times with one cluster, 
> but even on the others I like knowing that work can continue even if the 
> host is down (facilitated by a separate NFS server).

If you know your job load needs /tmp, disk is great.  I have never had
users that needed to use space in this way, so moving away from disk-full
nodes wasn't an issue.

> Of course, I am definitely old school. I use static IPs, individual 
> passwd files, and simple scripts to handle administration.

I still would probably run the system this way if it was disk-full.  I have
run both ways, and diskless has made my life much easier.  Faster to
get the system up, faster to make changes, easier to deal with hardware


> Mike
> Craig Tierney wrote:
> >On Fri, 2005-02-25 at 01:16, John Hearns wrote:
> >  
> >
> >>On Thu, 2005-02-24 at 18:20 -0500, Jamie Rollins wrote:
> >>    
> >>
> >>>Hello.  I am new to this list, and to beowulfery in general.  I am working
> >>>at a physics lab and we have decided to put together a relatively small
> >>>beowulf cluster for doing data analysis.  I was wondering if people on
> >>>this list could answer a couple of my newbie questions.
> >>>
> >>>The basic idea of the system is that it would be a collection of 16 to 32
> >>>off-the-shelf motherboards, all booting off the network and operating
> >>>completely disklessly.  We're looking at amd64 architecture running
> >>>Debian, although we're flexible (at least with the architecture ;).  Most
> >>>of my questions have to do with diskless operation.
> >>>      
> >>>
> >>Jamie, 
> >>  why are you going diskless?
> >>IDE hard drives cost very little, and you can still do your network
> >>install.
> >>Pick your favourite toolkit, Rocks, Oscar, Warewulf and away you go.
> >>
> >>    
> >>
> >
> >IDE drives fail, they use power, you waste time cloning, and
> >depending on the toolkit you use you will run into problems
> >with image consistency.
> >
> >I have run large systems of both kinds.  The last system was
> >diskless and I don't see myself going back.  I like changing
> >one file in one place and having the changes show up immediately.
> >I like installing a package once, and having it show up immediately,
> >so I don't have to reclone or take the node offline to update
> >the image.
> >
> >Craig
> >
> >
> >  
> >
> >>BTW, have a look at Clusterworld http://www.clusterworld.com
> >>They have a project for a low-cost cluster which is similar to your
> >>thoughts.
> >>
> >>
> >>Also, with the caveat that I work for a clustering company,
> >>why not look at a small turnkey cluster?
> >>I fully acknowledge that building a small cluster from scratch will be
> >>a good learning exercise, and you can get to grips with the motherboard,
> >>PXE etc. 
> >>However if you are spending a research grant, I'd argue that it would be
> >>cost effective to buy a system with support from any one of the
> >>companies that do this.
> >>If you get a prebuilt cluster, the company will have done the research
> >>on PXE booting, chosen gigabit interfaces and switches which perform
> >>well, chosen components which will last. And when your power supplies
> >>fail, or a disk fails someone will come round to replace them.
> >>And you can get on with doing your science.
> >>
> >>    
> >>
