both fast and gigabit ethernet on a Scyld cluster

Donald Becker becker at scyld.com
Wed Jan 15 07:40:50 PST 2003


On Wed, 15 Jan 2003, Hans Schwengeler wrote:

> Subject: both fast and gigabit ethernet on a Scyld cluster
..
> In fact I needed to type 'beoserv -f /etc/beowulf/config.giga' (instead of
> 'beoserv -c <config_file>'), but it got me going a bit further.

Ahhh, that's right: the old versions of the programs sometimes had
inconsistent options.  The "-f" option made more sense for the 'beoserv'
program, but we changed to a consistent set of common options with the
'28' series.

> Now it booted into stage 3 but exited bproc because of 'connection refused'.
> After some time I realized I had also to start a bpmaster process tailored
> for the gigabit-ethernet (bpmaster -c /etc/beowulf/config.giga).
> I had to kill the old process because bpmaster refused to start when
> another one was running. After this, the single-processor nodes came up
> just fine.

I should have been clearer with my suggestion.
You created a new configuration file for cluster operation, leaving the
standard configuration file for cluster boot only.
Instead, the new configuration file should be for the stage 1 boot only:
   /etc/beowulf/config.stage1
Since all of the other programs default to using
   /etc/beowulf/config
that configuration file should reflect the current cluster state.
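
A minimal sketch of that split, in Python only for illustration.  The
"-c" option is the one quoted above; the helper name command_line() and
the idea that only the stage 1 boot server gets an explicit file are
assumptions, not a description of the actual start-up scripts.

```python
# Paths taken from the message; everything else here is hypothetical.
DEFAULT_CONFIG = "/etc/beowulf/config"        # current cluster state
STAGE1_CONFIG = "/etc/beowulf/config.stage1"  # stage 1 boot only


def command_line(program, stage1=False):
    """Build an argument list for one of the cluster programs.

    Only the stage 1 boot server is given an explicit configuration
    file.  Every other program is started without "-c" so that it
    falls back to DEFAULT_CONFIG on its own.
    """
    argv = [program]
    if stage1:
        argv += ["-c", STAGE1_CONFIG]
    return argv


if __name__ == "__main__":
    print(" ".join(command_line("beoserv", stage1=True)))
    print(" ".join(command_line("bpmaster")))
```

The point of the arrangement is that nothing except the stage 1 boot
needs to be told about the second file.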

> This "hand"-booting seems to be a bit awkward. Maybe you have a better
> solution. But at least all the nodes are now up for the moment which is great.

This has become a surprisingly common configuration.

It's now pretty much typical to have PXE boot and management on a Fast
Ethernet port, but to want to operate the cluster over Gb Ethernet.
The 100Mbps Ethernet is typically integrated on the motherboard and
supports IPMI or Wake-On-LAN.  The Gb Ethernet rarely has PXE boot, and
I don't know of any Gb chips which support management.

We have ad hoc solutions for this with both the 27 and 28 series
releases.  In the past few months we have changed the underlying
programs to avoid replicating servers for split networks.  Our 29
series will have a GUI configuration checkbox to automate the (now
trivial) configuration file setup.

Using two networks turned out to have significant implications for our
new scalable flow-controlled PXE boot server.  The server has to
explicitly track which interface each initial PXE request arrived on and
handle those requests differently than later information requests.  It's
easier to add the support now, when there are only a few users, than
after the release.
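
The bookkeeping involved can be sketched roughly as follows.  This is
an illustration under stated assumptions, not the boot server's code:
the class RequestTracker, the request kinds "pxe" and "info", and the
returned action names are all hypothetical.

```python
class RequestTracker:
    """Remember which interface each node's initial PXE request used."""

    def __init__(self):
        # MAC address (lower case) -> interface name, e.g. "eth0"
        self._boot_iface = {}

    def handle(self, mac, iface, kind):
        """Return (action, boot_interface) for one incoming request.

        kind is "pxe" for an initial boot request; anything else is
        treated as a later information request.
        """
        mac = mac.lower()
        if kind == "pxe":
            # Initial request: record the interface it arrived on.
            self._boot_iface[mac] = iface
            return ("boot-reply", iface)
        # Later request: it may arrive on a different interface, so
        # look up where this node originally booted from.
        boot = self._boot_iface.get(mac)
        if boot is None:
            return ("unknown-node", None)
        return ("info-reply", boot)


if __name__ == "__main__":
    tracker = RequestTracker()
    print(tracker.handle("00:11:22:33:44:55", "eth0", "pxe"))
    print(tracker.handle("00:11:22:33:44:55", "eth1", "info"))
```

Keying the table on the MAC address lets a later request be matched to
the node's boot interface even when it arrives over the other network.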



-- 
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993



