RSH scaling problems...
Joseph Landman
landman at scalableinformatics.com
Mon Dec 16 08:37:41 PST 2002
Hi Mike:
Look in your log files Luke...
You might find the relevant error message at the tail end of
/var/log/messages. Look for rshd or in.rshd errors.
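A quick way to pull those lines out is sketched below. The helper name recent_rshd_errors is made up for illustration, and the default path /var/log/messages is an assumption, since the syslog file location varies by distribution.

```shell
# Print the most recent rshd-related lines from a syslog file.
# recent_rshd_errors is a hypothetical helper name; the default log path
# is an assumption and may differ on your distribution.
recent_rshd_errors() {
    logfile="${1:-/var/log/messages}"
    count="${2:-20}"
    # The pattern "rshd" matches both "rshd" and "in.rshd" entries.
    grep 'rshd' "$logfile" | tail -n "$count"
}
```

Usage: recent_rshd_errors /var/log/messages 50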
Some thoughts that might help if RSH is really the issue:
In later linuxes (linicies?) rsh spawning is done by xinetd. You want
to make sure xinetd can spawn enough processes. Look at the xinetd man
page, the -limit option, and the instances attribute. Adjust the
/etc/xinetd.conf file (and the per-service rsh entry, if there is one)
to reflect the limit. On one system I had to bump this pretty high to
allow all the connections to daemons.
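To see which limits are currently set, something like the sketch below may help. It assumes the attributes of interest are instances, per_source and cps, as documented in xinetd.conf(5). The helper name show_xinetd_limits and the /etc/xinetd.d/rsh path in the usage line are assumptions about a typical layout, not something taken from the system in question.

```shell
# List the connection-limit attributes found in the given xinetd config files.
# show_xinetd_limits is a hypothetical helper name. The attribute names
# (instances, per_source, cps) come from xinetd.conf(5).
show_xinetd_limits() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # A file with none of these attributes is not an error.
        grep -HE '^[[:space:]]*(instances|per_source|cps)[[:space:]]*=' "$f" || true
    done
}
```

Usage: show_xinetd_limits /etc/xinetd.conf /etc/xinetd.d/rsh

If instances comes back lower than the number of nodes the job spans, that is worth raising first. Restart or reload xinetd after editing.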
If you are still using /etc/inetd.conf, you can tell it how many servers
it may spawn by appending a .nservers count to the wait/nowait field of
the line, e.g. nowait.256 (check the inetd man page; my memory is unclear
on the details).
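A small sketch for checking what the rsh entry currently says follows. It assumes the common inetd.conf layout, where rsh is the service named "shell" and the fourth field is wait/nowait with an optional .max suffix. In the inetd man pages that document the suffix, it caps spawns per 60 seconds. The helper name inetd_shell_wait_field is made up for illustration, so verify the field layout against your local inetd man page.

```shell
# Print the wait/nowait field (4th column) of the "shell" (rsh) service.
# inetd_shell_wait_field is a hypothetical helper name; the field layout is
# an assumption based on the common inetd.conf format.
inetd_shell_wait_field() {
    conf="${1:-/etc/inetd.conf}"
    # Commented-out entries start with "#shell", so they do not match here.
    awk '$1 == "shell" { print $4 }' "$conf"
}
```

A bare "nowait" in the output means no explicit count is set and inetd's built-in default applies.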
You might also be running out of network bandwidth. Try running
vmstat 1
netstat -cav
and see what your machine is doing network-wise. Try grabbing the atop
program from freshmeat, and using that to summarize the net utilization
(or use ntop, or any of the others).
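If none of those tools are installed, a rough per-interface byte rate can be read straight from /proc/net/dev. The sketch below assumes the usual Linux layout of that file (receive bytes first, transmit bytes in the ninth counter column). The helper names iface_bytes and iface_rate are made up for illustration. Counter wrap is not handled, and the interface must exist.

```shell
# Print "rx_bytes tx_bytes" for one interface from a /proc/net/dev style file.
# iface_bytes and iface_rate are hypothetical helper names.
iface_bytes() {
    iface="$1"
    src="${2:-/proc/net/dev}"
    awk -v ifc="$iface" '
        { sub(/:/, " ") }              # handles both "eth0:123" and "eth0: 123"
        $1 == ifc { print $2, $10 }    # receive bytes, transmit bytes
    ' "$src"
}

# Sample the counters one second apart and print bytes per second.
iface_rate() {
    before=$(iface_bytes "$1")
    sleep 1
    after=$(iface_bytes "$1")
    rx0=${before% *}; tx0=${before#* }
    rx1=${after% *};  tx1=${after#* }
    echo "rx_bytes_per_s=$((rx1 - rx0)) tx_bytes_per_s=$((tx1 - tx0))"
}
```

Usage: iface_rate eth0

Compare the result against the link speed of the head node while the job is launching.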
Joe
On Sun, 2002-12-15 at 20:05, Mike S Galicki wrote:
> Can't seem to get rsh to scale past like 63 nodes with mpi jobs. SSH
> scales much higher, but the performance is a lot worse in customer
> benchmark tests. I'm guessing that I'm running out of pty's or tty's
> or something on the headnode. Anyone have some ideas? I believe the
> default pty's in 2.4.20 is 1024, but when I list /dev/pty I only see
> 256 entries. MAKEDEV -m 1024 didn't seem to do anything past 256.
>
> Mike Galicki
> Technical Consultant
> Linux Services Team
> San Francisco, CA
> Internet ID: mgalicki at us.ibm.com
--
Joseph Landman, Ph.D.
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
voice: +1 734 612 4615
fax: +1 734 398 5774