Questions and Sanity Check

Dan Yocum yocum at linuxcare.com
Thu Mar 1 11:47:28 PST 2001


Daniel Ridge wrote:
> 
> On Thu, 1 Mar 2001, Dan Yocum wrote:
> 
> > Daniel Ridge wrote:
> 
> > Since I haven't built/booted a Scyld cluster yet, and have only seen Don
> > talk about it at Fermi, please excuse my potentially naive comments.
> >
> >
> > > For people who are spending a lot of time booting their Scyld slave nodes
> > > -- I would suggest trimming the library list.
> > >
> > > This is the list of shared libraries which the nodes cache for improved
> > > runtime migration performance. These libraries are transferred over to
> > > the nodes at node boot time.
> >
> >
> > Hm.  Wouldn't it be better (i.e., more efficient) to cache these libs on
> > a small, dedicated partition on the worker node (provided you have a
> > disk available, of course) and simply check that they're up-to-date each
> > time you boot and only update them when they change, say, via rsync?
> 
> Possibly. We're working on making available versions of our software that
> simultaneously host multiple pid spaces from different frontends. In this
> situation, you could wind up needing 1 magic partition per frontend -- as
> each master could have its own set of shared libraries.
> 
> Also, I think Amdahl's law kicks in and tells us that the potential
> speedup is small in most cases (with respect to my trimming comment
> above) and that there might be other areas that are worth more attention
> in lowering boot times. On my VMware slave nodes, it costs me .5 seconds


You've got a "cluster" on a single machine running multiple versions of
VMware, right?  So, the transfer of the libs would be understandably
faster on a virtual interface - it's not like you're sending them via a
real NIC.  

Hold it.  How big are the shared libs?  If they're tiny, then yeah,
ferget it.  No big deal transferring them over (I don't know how big your
libs are).  What I'm concerned about is transferring 40MB, or more, to
hundreds of nodes, hundreds of times.  Then there would be a definite
increase in bootup time from pushing big libs out to the individual nodes.
Unless you multicast the libs out to the worker nodes... ;-)
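
To put rough numbers on that worry, here is a back-of-the-envelope
sketch in Python.  The 40MB figure is the one above; the 100 Mbit/s link
speed, the node counts, and the function names are assumptions picked
for illustration only.  The model assumes the master's single link is
the bottleneck and ignores protocol overhead, contention, and
retransmits, so treat the results as an order-of-magnitude guess.

```python
def transfer_seconds(size_mb, link_mbit_per_s):
    """Time to push size_mb megabytes through one link (no overhead modeled)."""
    return size_mb * 8 / link_mbit_per_s


def unicast_seconds(size_mb, nodes, link_mbit_per_s):
    """Master sends one full copy per node over its single uplink."""
    return nodes * transfer_seconds(size_mb, link_mbit_per_s)


def multicast_seconds(size_mb, nodes, link_mbit_per_s):
    """Idealized multicast: one copy on the wire, whatever the node count."""
    return transfer_seconds(size_mb, link_mbit_per_s)


SIZE_MB = 40        # from the discussion above
LINK_MBIT = 100     # assumption: Fast Ethernet on the master

for nodes in (1, 16, 128):   # assumption: example cluster sizes
    uni = unicast_seconds(SIZE_MB, nodes, LINK_MBIT)
    multi = multicast_seconds(SIZE_MB, nodes, LINK_MBIT)
    print(f"{nodes:4d} nodes: unicast {uni:7.1f} s, multicast {multi:5.1f} s")
```

Under these assumptions the unicast cost grows linearly with the node
count while the idealized multicast cost stays flat, which is the whole
appeal of multicasting the libs.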

> to transfer my libraries but still takes me the better part of a minute to
> get the damn BIOS out of the way.
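
The Amdahl's law point can be made concrete with a small sketch.  The
0.5 seconds is the figure quoted above; the 60-second total is only an
assumed stand-in for "the better part of a minute", and the function
name is hypothetical.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the total time runs `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


TOTAL_BOOT_S = 60.0     # assumption: stand-in for "the better part of a minute"
LIB_TRANSFER_S = 0.5    # quoted library transfer time

fraction = LIB_TRANSFER_S / TOTAL_BOOT_S

# Upper bound: the library transfer is eliminated entirely (factor -> infinity).
best_case = 1.0 / (1.0 - fraction)

print(f"fraction of boot spent on libs: {fraction:.4f}")
print(f"speedup if transfer is 10x faster: {amdahl_speedup(fraction, 10):.4f}")
print(f"speedup if transfer is free:       {best_case:.4f}")
```

With these numbers, even a free library transfer improves total boot
time by less than one percent, so the BIOS is the better target.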

Well, yeah, there is that.  Have you tried running Beowulf2 on machines
with LinuxBIOS?  Now that'd be cool to see - a Beowulf cluster come up
in 3 seconds.  :)

Cheers,
Dan

-- 
Dan Yocum, Sr. Linux Consultant
Linuxcare, Inc.  630.697.8066 tel
yocum at linuxcare.com, http://www.linuxcare.com
Linuxcare. Putting open source to work.



