RARP requests

Donald Becker becker at scyld.com
Fri Jul 19 05:15:10 PDT 2002

On Wed, 17 Jul 2002, Jarda Privoznik wrote:

> Hey, I need a little help. Our research group just finished installing SCYLD 
> on a three node testing system.

What version are you using?

> We tried some MPI programs, and they worked 
> just fine. But there are a couple of problems. One is that after a certain time 
> limit (maybe?) the client nodes automatically reboot, and I can't find where 
> to change this option...

You are likely using the basic edition.
The timeout is not a user-configurable option on that release.

The reboot is part of the action taken when the connection between the
master and compute node fails ("cluster membership").  What is occurring
just before the reboot?  The default cluster membership test is very
relaxed -- you have at least 30 seconds before reboot.
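To illustrate the general idea of a timeout-based membership test, here is
a minimal sketch.  It is not the Scyld implementation: the class name, the
heartbeat mechanism, and the injected clock are all assumptions made for
illustration.  Only the 30 second figure comes from the text above.

```python
import time


class MembershipMonitor:
    """Hypothetical helper: tracks when each compute node was last heard from."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed grace period, in seconds
        self.clock = clock           # injectable so the logic can be tested
        self.last_seen = {}          # node id -> time of last heartbeat

    def heartbeat(self, node):
        """Record that a node has just been heard from."""
        self.last_seen[node] = self.clock()

    def expired(self):
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

In a scheme like this, a node that shows up in expired() is treated as
having left the cluster, which is the point at which a reboot would be
triggered.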

> The second and worse problem is that after it did 
> it the second time, the clients did not load up as usual to the Red Hat 
> 5.2, but they stopped at sending RARP requests and they are waiting
> forever.
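For reference, a node at this stage is broadcasting RARP requests (RFC 903)
that ask "what is the IP address for my MAC address?" and waiting for a
server to answer.  The sketch below builds such a frame by hand so the
fields are visible.  The function name and the example MAC address are
illustrative assumptions; this is not the code the boot image uses.

```python
import struct

ETHERTYPE_RARP = 0x8035      # EtherType assigned to RARP
OP_REQUEST_REVERSE = 3       # RARP "request reverse" opcode (reply is 4)


def build_rarp_request(mac: bytes) -> bytes:
    """Build a broadcast Ethernet frame carrying a RARP request for `mac`."""
    if len(mac) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")

    # Ethernet header: broadcast destination, our MAC as source, RARP type.
    ether = b"\xff" * 6 + mac + struct.pack("!H", ETHERTYPE_RARP)

    # ARP-format body: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware address length 6, protocol address length 4, then the opcode.
    body = struct.pack("!HHBBH", 1, 0x0800, 6, 4, OP_REQUEST_REVERSE)

    # Sender and target hardware addresses are both our own MAC; the IP
    # fields are unknown at this point and left as zeros.
    body += mac + b"\x00" * 4 + mac + b"\x00" * 4
    return ether + body


frame = build_rarp_request(bytes.fromhex("00a0c9123456"))
```

If no RARP server on the network has an entry for that MAC address, nothing
replies and the node keeps repeating the request.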

Was Red Hat 5.2 on these machines previously?
Unless you used beofdisk, or changed /etc/beowulf/fstab to mount the
disks on the slave nodes, the disk contents on the clients should be
Donald Becker				becker at scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993
