RSH scaling problems...
landman at scalableinformatics.com
Mon Dec 16 08:37:41 PST 2002
Look in your log files Luke...
You might find the relevant error message at the tail end of
/var/log/messages. Look for rshd or in.rshd errors.
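A minimal sketch of that log check, assuming syslog writes to /var/log/messages and that the daemon tags its lines as rshd or in.rshd. The helper name tail_rshd_lines and the 20-line limit are hypothetical choices for illustration only.

```python
import re
from collections import deque

# Matches syslog lines tagged by the rsh daemon ("rshd" or "in.rshd").
RSHD_PATTERN = re.compile(r"\b(?:in\.)?rshd\b")


def tail_rshd_lines(lines, limit=20):
    """Return the last `limit` lines that mention rshd or in.rshd."""
    recent = deque(maxlen=limit)  # keeps only the newest matches
    for line in lines:
        if RSHD_PATTERN.search(line):
            recent.append(line.rstrip("\n"))
    return list(recent)


if __name__ == "__main__":
    # The path is an assumption; some distributions log elsewhere.
    try:
        with open("/var/log/messages", errors="replace") as log:
            for entry in tail_rshd_lines(log):
                print(entry)
    except OSError as exc:
        print(f"could not read log: {exc}")
```

Reading the log usually requires root. Run it on the head node right after a failed job so the relevant entries are still near the end of the file.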
Some thoughts that might help if RSH is really the issue:
In later linuxes (linicies?) rsh spawning is done by xinetd. You want
to make sure xinetd can spawn enough processes. Look at the xinetd man
page, and the -limit option. Adjust the /etc/xinetd.conf file to
reflect the limit. On one system I had to bump this pretty high to
allow all the connections to the daemons.
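As a rough illustration of the config-file side, here is a sketch of a defaults section, assuming the instances and cps attributes described in the xinetd.conf man page. The numbers are illustrative assumptions, not recommendations; check them against your own man page before use.

```
# /etc/xinetd.conf (values below are illustrative assumptions)
defaults
{
        # maximum number of simultaneously active servers per service
        instances       = 256
        # accept up to 100 connections per second; if exceeded,
        # disable the service for 10 seconds before re-enabling it
        cps             = 100 10
}
```

Comments are kept on their own lines here, since the man page only describes lines whose first non-blank character is # as comments. A per-service file such as the one for rsh can override these defaults, and xinetd has to be restarted or told to reload its configuration before the change takes effect.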
If you are still using /etc/inetd.conf, you can tell it how many servers
it may spawn by including a .nservers at the appropriate part of the
line (though my memory is unclear as to which part).
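For what it is worth, in the Linux inetd man pages I know of, the suffix goes on the wait/nowait field and caps how many servers may be spawned within a 60-second interval. A sketch, assuming that syntax applies to your inetd; the value 400, the tcpd wrapper, and the server path are all assumptions for illustration:

```
# /etc/inetd.conf: allow up to 400 in.rshd spawns per interval (400 is illustrative)
shell   stream  tcp     nowait.400      root    /usr/sbin/tcpd  in.rshd
```

Other inetd implementations use a different syntax for this limit, so confirm against the local man page. inetd rereads its configuration on SIGHUP.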
You might also be running out of network bandwidth. Try running
and see what your machine is doing network-wise. Try grabbing the atop
program from freshmeat, and using that to summarize the net utilization
(or use ntop, or any of the others).
On Sun, 2002-12-15 at 20:05, Mike S Galicki wrote:
> Can't seem to get rsh to scale past like 63 nodes with mpi jobs. SSH
> scales much higher, but the performance is a lot worse in customer
> benchmark tests. I'm guessing that I'm running out of pty's or tty's
> or something on the headnode. Anyone have some ideas? I believe the
> default pty's in 2.4.20 is 1024, but when I list /dev/pty I only see
> 256 entries. MAKEDEV -m 1024 didn't seem to do anything past 256.
> Mike Galicki
> Technical Consultant
> Linux Services Team
> San Francisco, CA
> Internet ID: mgalicki at us.ibm.com
Joseph Landman, Ph.D.
Scalable Informatics LLC
email: landman at scalableinformatics.com
voice: +1 734 612 4615
fax: +1 734 398 5774