[Beowulf] Scyld implementation of BProc

Donald Becker becker at scyld.com
Fri Sep 23 09:22:30 PDT 2005


On Fri, 23 Sep 2005, Sean Dilda wrote:

> Tony Stocker wrote:
> > Michael,
> > 
> > Okay I'm guessing that since my documentation doesn't mention any of the 
> > three commands that the Scyld implementation doesn't have them.  I'm 
> > still curious to know if they're part of the default BProc package or if 
> > it's something special that the LANL folks added.  I appreciate the info 
> > on the bpstat -P functionality that will help with one issue.
>
> The developer of BProc used to work at Scyld.  He left many years ago. 
> Last I heard, he was working at LANL.  So, the LANL version is the 
> 'default BProc'.

As many might guess, the story is longer and uglier than this.  The 
developer left with things he wasn't permitted to take, such as internal 
documentation, CVS trees and the build system.  LANL used these to build a 
copy of the Scyld system.  In some cases the only changes were to remove 
the Scyld name from the files.  In other cases (e.g. the configfile 
library) the function names were changed and the Scyld copyright removed.

> Last I checked, Scyld was still using an older version 
> of the BProc code.

A lower version number doesn't always mean older code.

Major version numbers should indicate interface level.  Many times it's 
possible to add new capabilities and features without changing the 
existing interface.  Interface stability matters less in a research 
environment, where you can tell people "don't use last week's version, 
that's three revs back".  But when you are building production systems, 
you don't want the programming interface or core behavior to change.
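
As a rough illustration of "major number means interface level", here is 
a minimal sketch of a compatibility check.  The struct, its field names 
and the function are hypothetical, invented for this example; they are 
not taken from BProc or from any Scyld header.

```c
#include <stdbool.h>

/* Hypothetical version record (an assumption for illustration only). */
struct iface_version {
    unsigned major_level;  /* bumped only on an incompatible interface change */
    unsigned minor_level;  /* bumped when features are added compatibly */
};

/* A client built against `built` can use a library providing `provided`
 * when the interface level matches and the library has at least the
 * features the client was built to expect. */
bool iface_compatible(struct iface_version built,
                      struct iface_version provided)
{
    return built.major_level == provided.major_level
        && provided.minor_level >= built.minor_level;
}
```

Under this scheme a library can gain many features while its major number 
stays put, so a low major number says nothing about the age of the code.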

There are many cases where you look back at an interface and decide that 
something could have been done better.  But changing a functioning 
interface to be slightly nicer is often far uglier than just leaving it 
alone.  (Remember the story about the creat() system call?  It was 
better to leave it as it was than to rename it create().)
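
For readers who have not met creat(): POSIX still provides it, and 
documents it as equivalent to open() with O_WRONLY|O_CREAT|O_TRUNC.  The 
sketch below shows both spellings side by side.  The two wrapper function 
names are hypothetical, made up for this example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create or truncate `path` for writing through
 * open(), using the flag combination POSIX gives as equivalent to
 * creat(path, mode). */
int create_via_open(const char *path, mode_t mode)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
}

/* Hypothetical helper: the same operation through the historical
 * spelling, which was kept rather than renamed. */
int create_via_creat(const char *path, mode_t mode)
{
    return creat(path, mode);
}
```

Both return a write-only file descriptor, or -1 on error.  Old callers of 
creat() keep working, and new code can use open() without anyone having 
been forced through a rename.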

We made many infrastructure changes to BProc, and still have not needed to 
significantly change the interface.  The next set of changes will perhaps 
change the interface, but only to be more compatible with other projects.  
There is a discussion around "Clusterhooks", driven by Bruce Walker.  We
expect that in the next few years people will understand the trade-offs 
and value of implementing cluster process-space features: 
   directed and transparent process migration
   I/O forwarding, in libraries vs. in-kernel
   a single unified process space vs. multiple process ID spaces
      - full POSIX process semantics or only the process tree
      - global signaling vs. only local
   execute permission mechanism and policy, in-kernel or user-level
   lazy, active or synchronous migration

All of these design decisions need to be made understanding how people 
will use clusters, and what they will expect for redundancy and continuing 
operation in the face of failure.  You are unlikely to find good long-term 
system decisions made where established interfaces are considered to be 
ripe for change rather than something to be carefully preserved.


-- 
Donald Becker				becker at scyld.com
Scyld Software	 			Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220		www.scyld.com
Annapolis MD 21403			410-990-9993



