[Beowulf] Scyld implementation of BProc
becker at scyld.com
Fri Sep 23 09:22:30 PDT 2005
On Fri, 23 Sep 2005, Sean Dilda wrote:
> Tony Stocker wrote:
> > Michael,
> > Okay I'm guessing that since my documentation doesn't mention any of the
> > three commands that the Scyld implementation doesn't have them. I'm
> > still curious to know if they're part of the default BProc package or if
> > it's something special that the LANL folks added. I appreciate the info
> > on the bpstat -P functionality that will help with one issue.
> The developer of BProc used to work at Scyld. He left many years ago.
> Last I heard, he was working at LANL. So, the LANL version is the
> 'default BProc'.
As many might guess, the story is longer and uglier than this. The
developer left with things he wasn't permitted to take, such as internal
documentation, CVS trees and the build system. LANL used this to build a
copy of the Scyld system. In some cases the only changes were to remove
the Scyld name from the files. In other cases (e.g. the configfile
library) the function names were changed and the Scyld copyright removed.
> Last I checked, Scyld was still using an older version
> of the BProc code.
A lower version number doesn't always mean older code.
Major version numbers should indicate interface level. Many times it's
possible to add new capabilities and features without changing the
existing interface. This isn't important in a research environment, where
you can tell people "don't use last week's version, that's three revs
back". But when you are building production systems, you don't want the
programming interface or core behavior to change.
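As a rough illustration of "major version indicates interface level", here is a minimal sketch of the kind of check a production tool might make. The names (Version, parse_version, is_compatible) and the "major.minor" string format are assumptions for illustration only; they are not part of BProc or any Scyld tool.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical two-part version: major = interface level, minor = additions."""
    major: int
    minor: int


def parse_version(text: str) -> Version:
    """Parse 'major.minor[.anything]' into a Version; a missing minor counts as 0."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return Version(major, minor)


def is_compatible(built_against: Version, installed: Version) -> bool:
    """An installed library is usable if it keeps the same interface level
    (same major) and has at least the features the program was built
    against (minor greater than or equal)."""
    return (installed.major == built_against.major
            and installed.minor >= built_against.minor)


if __name__ == "__main__":
    built = parse_version("3.1")
    for candidate in ("3.1", "3.7", "3.0", "4.0"):
        ok = is_compatible(built, parse_version(candidate))
        print(f"built against {built.major}.{built.minor}, "
              f"installed {candidate}: {'ok' if ok else 'incompatible'}")
```

Under this convention a large jump in the minor number says nothing about the interface, and a low major number says nothing about how recent the code is.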
There are many cases where you look back at an interface and decide that
something could have been done better. But changing a functioning
interface to be slightly nicer is often far uglier than just leaving it
alone. (Remember the story about the creat() system call? It was
better to leave it as it was than to rename it create().)
We made many infrastructure changes to BProc, and still have not needed to
significantly change the interface. The next set of changes will perhaps
change the interface, but only to be more compatible with other projects.
There is a discussion around "Clusterhooks", driven by Bruce Walker. We
expect that in the next few years people will understand the trade-offs
and value of implementing cluster process space features:
   directed and transparent process migration
   I/O forwarding, in libraries vs. in-kernel
   a single unified process space vs. multiple process ID spaces
     - full POSIX process semantics or only process tree
     - global signaling vs. only local
   execute permission mechanism and policy, in-kernel or user-level
   lazy, active or synchronous migration
All of these design decisions need to be made understanding how people
will use clusters, and what they will expect for redundancy and continuing
operation in the face of failure. You are unlikely to find good long-term
system decisions made where established interfaces are considered to be
ripe for change rather than something to be carefully preserved.
Donald Becker                        becker at scyld.com
Scyld Software                       Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220        www.scyld.com
Annapolis MD 21403                   410-990-9993