RSH scaling problems

David Mathog mathog at mendel.bio.caltech.edu
Wed Dec 18 09:26:19 PST 2002


Greg Lindahl wrote:

> Low ports can't be reused until TIME_WAIT time has passed.

True.  Let's see what kind of a limit that imposes on rsh
command rates on a typical system - RedHat 7.3 with a few
servers and SGE running, over 100baseT.  I put 100 copies
of a target node's name in a file and then did:

 time rsh -zf manycopies.txt hostname

That blew up, but initially that was because of the cps
(connection rate) limit in xinetd: the rsh entry in xinetd.d
picked up the default

  cps = 25 30

So I added 

   cps = 250 10

to /etc/xinetd.d/rsh, restarted xinetd, and tried it again,
whereupon it completed in 2.196 seconds real time.  Running it
3 times in quick succession failed on the third run, and netstat
on the target showed all the ports used up.  On the node running
rsh, netstat showed no TIME_WAIT connections.  Since TIME_WAIT is
held by whichever end closes first, I think that means the target
was closing the connection before the source.  After a while
(TIME_WAIT, presumably) these all dropped out of netstat and rsh
to the target started working again.  Then I changed the target
file so that it listed 50 copies of target1 and 50 copies of
target2.  That variation failed on the 6th iteration, further
supporting the conjecture that the limit is on the target end.
So the rate for outgoing rsh from a given node seems not to be
limited (at least by this effect) but the incoming rate to a
node is limited.
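
The netstat check described above can be automated.  This is a
minimal sketch, assuming output in the six-column layout printed
by "netstat -tn" (Proto, Recv-Q, Send-Q, Local Address, Foreign
Address, State).  The function name and the sample addresses are
made up for illustration and are not taken from the test runs.

```python
from collections import Counter

# Hypothetical sample in "netstat -tn" layout; the addresses are invented.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address      Foreign Address    State
tcp        0      0 10.0.0.2:1021      10.0.0.1:1022      TIME_WAIT
tcp        0      0 10.0.0.2:1020      10.0.0.1:1019      TIME_WAIT
tcp        0      0 10.0.0.2:22        10.0.0.9:40112     ESTABLISHED
"""

def count_states(netstat_text):
    """Count TCP connections per state in netstat-style text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows start with "tcp"; the state is the sixth column.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts

if __name__ == "__main__":
    print(count_states(SAMPLE)["TIME_WAIT"])
```

Running it on both the source and the target after a burst shows
which end is accumulating the TIME_WAIT entries.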

It jams up when about 290 ports are stuck in TIME_WAIT.  TIME_WAIT
on Linux is 60 seconds (I think).  So the average sustainable
rate of incoming rsh (or rlogin, or rcp) commands is about 290/60,
or just less than 5 per second.  A cps of 250 is overly
optimistic as well if all the rsh commands come from one source,
since the fastest that rsh can send them (my modified version,
which basically runs rcmd() in a loop) is only about 50/second.
This was over 100baseT; maybe you can go higher with Myrinet.
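
The arithmetic above can be written out as a short sketch.  The
inputs are the figures quoted in this message (about 290 ports,
a 60 second TIME_WAIT that is itself a guess, and 100 commands in
2.196 seconds); the function names are only illustrative.

```python
def sustainable_rate(ports, time_wait_seconds):
    """Average incoming commands/second once ports start recycling."""
    return ports / time_wait_seconds

def seconds_to_exhaust(ports, burst_rate):
    """How long a burst at burst_rate lasts before the ports run out."""
    return ports / burst_rate

if __name__ == "__main__":
    # Measured burst: 100 commands in 2.196 s of real time.
    burst = 100 / 2.196
    print(f"burst rate:       {burst:.1f}/s")
    print(f"sustainable rate: {sustainable_rate(290, 60):.2f}/s")
    print(f"ports exhausted after {seconds_to_exhaust(290, burst):.1f} s")
```

So a source sending flat out can use up the target's ports within
a few seconds, after which the target is held to the much lower
sustainable rate.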

Which means, I suppose, that if you want to fire a lot of commands
from one machine to another, putting rsh inside a loop is a bad
idea.  Better to start up one rsh, leave it running, and pipe the
commands through it to some target process which runs them on the
other end without dropping the connection between commands.
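
A minimal sketch of that idea, assuming the target process is
just a shell reading commands on stdin.  The run_batch name and
the "rsh target1 sh" invocation are assumptions for illustration,
not something tested here.

```python
import subprocess

def run_batch(remote_shell_argv, commands):
    """Run all commands through one process (one connection).

    remote_shell_argv is the command that starts a shell on the
    other end, e.g. ["rsh", "target1", "sh"] (hypothetical).
    """
    # One newline-separated script, fed to the shell on stdin.
    script = "\n".join(commands) + "\n"
    result = subprocess.run(
        remote_shell_argv,
        input=script,
        capture_output=True,
        text=True,
        check=True,  # raise if the shell exits with an error status
    )
    return result.stdout

if __name__ == "__main__":
    # Local stand-in for the remote shell, so no rsh is needed to try it.
    print(run_batch(["sh"], ["echo one", "echo two"]), end="")
```

With ["rsh", "target1", "sh"] in place of ["sh"], all the commands
would share a single rsh connection, so a batch would cost the
target one connection's worth of ports rather than one per
command.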

ANYWAY, going back to the original post by Mike Galicki, he should
check that the xinetd cps value (or equivalent, if it isn't Linux)
isn't setting the upper limit.  Possibly he can get more
throughput by raising it.  Failing that, perhaps one of the other
MPI devices keeps a connection open all the time and so bypasses
this limit entirely?

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


