What? what? oh, no, no again ...

Donald Becker becker at scyld.com
Mon Jul 9 11:51:53 PDT 2001


On Mon, 9 Jul 2001, Andreas Boklund wrote:

> I have also noticed the "neighborhood table overflow" error when booting
> my Scyld cluster (it is not a production cluster; I just built it to compare
> a few features with MOSIX). I have 3c509-something cards, and as I see it
> the error occurs whenever the nodes are sending RARP requests and the
> network is not responding. In my case this means that the switch (Cisco
> Catalyst 2900XL) doesn't show the connection as up.

This error message sometimes indicates a communication problem.
(Grrrr, what an obscure message.)

Do you mean 3c509 (ISA) or 3c905 (PCI)?

> I have no idea why you are getting the "fatal error: resetting machine in
> 30 seconds" error, but I think that the two errors might be unrelated. Plug

By default, the slave nodes are configured to reboot if they lose the
connection with the master for 30 seconds.  This allows automatic cluster
recovery if something happens to the network connection or the master.  You
can change the reboot setting after boot, but you would usually change
the 30 second default only with a multiple-master fail-over
configuration.
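
As a rough illustration of the reboot-on-timeout idea (a sketch only, not
the actual Scyld code; the class name, method names, and injectable clock
are all made up for this example), the logic amounts to tracking the time
since the last contact with the master:

```python
import time


class MasterWatchdog:
    """Hypothetical helper: tracks time since last contact with the master."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        # 30 seconds matches the default described above; the clock is
        # injectable so the logic can be exercised without real waiting.
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_contact = clock()

    def heartbeat(self):
        """Record that the master was heard from just now."""
        self._last_contact = self._clock()

    def expired(self):
        """True once the master has been silent for timeout_s or longer."""
        return self._clock() - self._last_contact >= self.timeout_s
```

A node would call heartbeat() whenever it hears from the master and reset
itself once expired() returns True.  A longer timeout_s would only make
sense where another master can take over, as noted above.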

Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993





More information about the Beowulf mailing list