Questions and Sanity Check
Donald Becker becker at scyld.com
Fri Mar 2 06:57:13 PST 2001
On Thu, 1 Mar 2001, Dan Yocum wrote:

> Daniel Ridge wrote:
> > On Thu, 1 Mar 2001, Dan Yocum wrote:
> > > Daniel Ridge wrote:
> > > > For people who are spending a lot of time booting their Scyld slave nodes
> > > > -- I would suggest trimming the library list.
> > > >
> > > > This is the list of shared libraries which the nodes cache for improved
> > > > runtime migration performance. These libraries are transferred over to
> > > > the nodes at node boot time.
> > >
> > > Hm. Wouldn't it be better (i.e., more efficient) to cache these libs on
> > > a small, dedicated partition...
..
> > Also, I think Amdahl's law kicks in and tells us that the potential
> > speedup is small in most cases (with respect to my trimming comment
> > above) and that there might be other areas that are worth more attention
> > in lowering boot times. On my VMware slave nodes, it costs me .5 seconds
>
> Hold it. How big are the shared libs? If they're tiny, then yeah,
> ferget it. No big deal transferring them over...

The cached libraries on the slave nodes are 10-40MB uncompressed. That's
on the order of 1 second of Fast Ethernet time to transfer the compressed
version. The boot time isn't a significant issue.

A project that's on the "to do" list but not yet scheduled(*) is to
dynamically adjust the shared library list. The Scyld Beowulf system
could be booted with just a few cached elements on the slaves, with
frequently referenced libraries slowly added to the cached list.

The existing caching technique isn't limited to libraries. A subtle
aspect of the current ld.so design is that there is very little
difference between a library and an executable. Full programs, say a
frequently-run 10MB simulation engine, could be cached on the slave nodes
without changing the code.

It's a larger step to extend that concept to a persistent disk-based
cache. We want to avoid that for philosophical reasons: unless done
carefully, it reintroduces the risk of version skew, and there is a
slippery slope returning to the old full-node-install model.

(*) Yes, that's a hint to anyone looking for a project.

> > to transfer my libraries but still takes me the better part of a minute to
> > get the damn BIOS out of the way.
>
> Well, yeah, there is that. Have you tried running Beowulf2 on machines
> with Linux BIOS? Now that'd be cool to see - a Beowulf cluster come up
> in 3 seconds. :)

Ron Minnich uses Scyld Beowulf with his LinuxBIOS work. He was demoing
the resulting "instant boot" clusters at SC2000 and the Extreme Linux
Developers Forum last week.

Some tuning must be done to reach a 3 second boot time -- some device
drivers have needless delays and IDE disks might take a long time to
respond after a reset.

Donald Becker                           becker at scyld.com
Scyld Computing Corporation             http://www.scyld.com
410 Severn Ave. Suite 210               Second Generation Beowulf Clusters
Annapolis MD 21403                      410-990-9993
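
The transfer-time and Amdahl's law figures in the message above can be
checked with a little arithmetic. The following is only a rough sketch,
not anything from Scyld: it assumes Fast Ethernet delivers its nominal
100 Mbit/s with no protocol overhead, picks a hypothetical 3x compression
ratio (the message gives none), and reads "the better part of a minute"
of BIOS time as 60 seconds. The function names are made up for this
illustration.

```python
# Back-of-the-envelope check of the boot-time figures quoted in the thread.
# Assumptions (not from the post): Fast Ethernet runs at its nominal
# 100 Mbit/s with no overhead, and the libraries compress by about 3x.

FAST_ETHERNET_MBIT_PER_S = 100.0   # nominal link rate
ASSUMED_COMPRESSION_RATIO = 3.0    # hypothetical; the post gives no ratio


def transfer_seconds(size_mb, compression_ratio=1.0,
                     link_mbit_per_s=FAST_ETHERNET_MBIT_PER_S):
    """Seconds to send size_mb megabytes (after compression) over the link."""
    megabits_on_wire = size_mb * 8.0 / compression_ratio
    return megabits_on_wire / link_mbit_per_s


def speedup_from_removing(step_seconds, total_seconds):
    """Amdahl-style bound: overall speedup if one step took zero time."""
    return total_seconds / (total_seconds - step_seconds)


if __name__ == "__main__":
    for size_mb in (10, 40):
        raw = transfer_seconds(size_mb)
        packed = transfer_seconds(size_mb, ASSUMED_COMPRESSION_RATIO)
        print(f"{size_mb} MB: {raw:.1f} s raw, {packed:.1f} s compressed")

    # From the thread: 0.5 s of library transfer; BIOS time taken as 60 s
    # (an assumption standing in for "the better part of a minute").
    total = 60.0 + 0.5
    print(f"best-case boot speedup: {speedup_from_removing(0.5, total):.3f}x")
```

Under these assumptions the uncompressed 10-40MB set needs roughly 0.8 to
3.2 seconds of wire time, which is consistent with "on the order of 1
second" once compression is applied. Removing a 0.5 second transfer from
a boot dominated by a 60 second BIOS phase improves total boot time by
less than 1%, which is the Amdahl's law point made in the quoted text.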
More information about the Beowulf mailing list
