[Beowulf] automount on high ports
Joe Landman
landman at scalableinformatics.com
Wed Jul 2 05:31:15 PDT 2008
Carsten Aulbert wrote:
>> The clients are connecting from ports below 1024 because Berkeley set
>> up a hack in the original BSD stack so that only root could open ports
>> below 1024. This way, you could "know" the process on the remote host
>> was a root process, thus you could feel "secure" [sic]. It doesn't add
>> any real security any more, but it is also not the cause of any
>> problem you are experiencing.
>
> We might run out of "secure" ports.
But you can force NFS to connect from ports above 1024, so this
shouldn't be an issue.
[...]
> OK, we have 1342 nodes which act as servers as well as clients. Every
There is a short writeup on this with quotes from Bruce Allen in
HPCwire. Too bad you didn't opt for JackRabbits there :)
> node exports a single local directory and all other nodes can mount this.
Fine, nothing terrible.
>
> What we do now to optimize the available bandwidth and IOs is spread
> millions of files according to a hash algorithm to all nodes (multiple
> copies as well) and then run a few 1000 jobs opening one file from one
> box then one file from the other box and so on. With a short autofs
Hmmm ... So you want to "track" spatial metadata (i.e. where the file
is) according to some hash function that each node can execute, and then,
once the location is known, perform the IO.
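So, for example (as a relatively naive/simple-minded version), some quick
Perl ...
    # .... ($filename and @name, the list of node names, are set up above)
    use Digest::MD5 qw(md5);
    # MD5 of the file name, folded to a 32-bit integer so that % works on it
    my $hash = unpack("N", md5($filename));
    my $Number_of_machines = scalar(@name);
    my $machine = $hash % $Number_of_machines;
    my $machine_name = $name[$machine];
    my $full_path = sprintf("/%s/%s", $machine_name, $filename);
    # 3-argument open; '>' writes the file, use '<' for jobs that only read
    open(my $fh, '>', $full_path)
        or die "FATAL ERROR: unable to open $full_path: $!\n";
    # ....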
Is this about right?
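The quoted text also mentions multiple copies of each file. Here is a minimal Python sketch of one way to pick several nodes from the same hash. It assumes, purely for illustration, that the extra copies go to the next nodes in the list. The function name `replica_nodes`, the node names and the file name are made up and are not taken from the setup described in this thread.

```python
import hashlib


def replica_nodes(filename, node_names, copies=2):
    """Return `copies` distinct node names for `filename`.

    The first node is chosen by the MD5 hash of the file name; further
    copies go to the following nodes, wrapping around the list.
    (The "next nodes" rule is an assumption made for this sketch.)
    """
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    # Fold the first 4 bytes of the digest into an integer index.
    start = int.from_bytes(digest[:4], "big") % len(node_names)
    copies = min(copies, len(node_names))
    return [node_names[(start + i) % len(node_names)] for i in range(copies)]


# Hypothetical node names n0001 .. n1342
nodes = ["n%04d" % i for i in range(1, 1343)]
print(replica_nodes("file-000123.dat", nodes, copies=3))
```

Every node can evaluate this locally, so no central lookup is needed. The cost is that adding or removing nodes changes where most files map.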
> timeout that ought to work. Typically it is possible that a single
> process opens about 10-15 files per second, i.e. making 10-15 mounts per
> second. With 4 parallel process per node that's 40-60 mounts/second.
Hmmm ... the mount latency we have seen is ~0.1 seconds or so, so I can
believe 10-15/second. Note that due to strange latency effects in
larger machines, we have also seen an automount take 0.5 seconds or
more. Some of that delay was due to name resolution. We never fully traced
it, but this was on a 32 node cluster. You are talking a little bigger.
> With a timeout of 5 seconds we should roughly have 200-300 concurrent
> mounts (on average, no idea about the variance).
200-300 mounts across 1342 nodes, sure. 200-300 mounts of one file
system on one server from 200-300 client machines? I have some doubts ...
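For reference, the 200-300 figure in the quoted text is just the mount rate times the hold time on one client. Here is a small Python sketch of that arithmetic. It assumes every open triggers a fresh mount and every mount lives for exactly the autofs timeout; both are simplifications, and the function name is made up here.

```python
def concurrent_mounts(opens_per_sec_per_proc, procs_per_node, timeout_s):
    """Steady-state number of mounts held by one client.

    Arrival rate (mounts/second) multiplied by how long each mount is
    kept (the autofs timeout), treating every open as a new mount.
    """
    return opens_per_sec_per_proc * procs_per_node * timeout_s


low = concurrent_mounts(10, 4, 5)   # 10 opens/s, 4 processes, 5 s timeout
high = concurrent_mounts(15, 4, 5)  # 15 opens/s, 4 processes, 5 s timeout
print(low, high)  # 200 300
```

This is an estimate of the average only. Opens that land on an already mounted server, or mounts that linger past the timeout, will move the real number in either direction.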
> Our tests so far have shown that sometimes a node keeps a few mounts
> open (autofs4 problems AFAIK) and at some point is not able to mount
> more shares. Usually this occurs at about 350 mounts and we are not yet
> 100% sure if we are running out of secure ports.
Older kernels couldn't do more than 256 mounts. Not sure when/if this
limit has been raised. This is a different problem though. If you have
N machines mounting a file system, then you get N requests on port
2049 or similar (the inbound NFS port). You don't run out of secure ports.
If the issue is that you are running 200+ outgoing mount requests from
one machine, you will likely have a delay issue as you cross the 256
mount mark (if your kernel hasn't been patched ... not sure if/when
this has changed or will change).
> All our boxes export now with "insecure" option (NFSv3), but our clients
> all connect from a "secure" port, anyone here who might give us a hint
> how to force this in Linux?
See if you can get fewer than 256 mounts working well. If so, and it
only starts falling off above 256 mounts, that would be important to know.
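One way to see where things start falling off is to count the NFS mounts on a client while the test ramps up. Here is a rough Python sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options, ...). The sample text and host names below are invented.

```python
def count_nfs_mounts(mounts_text):
    """Count entries whose filesystem type (3rd field) is nfs or nfs4."""
    count = 0
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            count += 1
    return count


# Invented sample in /proc/mounts format
sample = """\
/dev/sda1 / ext3 rw 0 0
n0001:/data /n0001 nfs rw,vers=3 0 0
n0002:/data /n0002 nfs rw,vers=3 0 0
"""
print(count_nfs_mounts(sample))  # 2
```

On a real client, passing in the contents of /proc/mounts once a second or so and logging the count next to any mount failures should show whether the trouble begins near 256 or somewhere else.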
Joe
>
> Thanks a lot
>
> Carsten
>
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615