[Beowulf] [External] Re: NFS alternative for 200 core compute (beowulf) cluster
Prentice Bisbal
pbisbal at pppl.gov
Mon Aug 14 15:44:55 UTC 2023
I'm surprised no one here has mentioned tuning kernel/network
parameters. I would look at which of these parameters you can tune
first, because that's the free, quick, and least labor-intensive way
to improve performance. Take a look at the website below and see what
you can tweak:
https://fasterdata.es.net/
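As a rough illustration (a sketch only; the exact values depend on
your link speed and RTT, so benchmark before and after rather than
pasting these in blindly):

    # /etc/sysctl.d/90-net-tuning.conf   (hypothetical file name)
    # allow larger TCP buffers so a single NFS stream can fill the link
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 33554432
    net.ipv4.tcp_wmem = 4096 65536 33554432
    # larger per-device receive backlog for higher line rates
    net.core.netdev_max_backlog = 250000

    # apply without a reboot
    sysctl --system
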
Prentice
On 8/10/23 3:35 PM, Jeff Johnson wrote:
> Leo,
>
> NFS can be a hindrance, but if tuned and configured properly it
> might not be as terrible. Some thoughts...
>
> * What interface are the nodes accessing NFS via? Ethernet or
> Infiniband?
> * Have you tuned the number of NFS server threads above the
> defaults? (rough sketch below)
> * As a test, you could deploy a single Lustre node that would act as
> MGS/MDS and OSS simultaneously to test for performance gains via
> Infiniband.
> * Your scratch volume must really be scratch, because you are
> running with no parity protection (two-disk or SSD stripe)
> * You're probably better off with tuned NFS as opposed to GlusterFS
>
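> For the thread count, a minimal sketch (the value 64 is an arbitrary
> starting point, not a recommendation):
>
>     cat /proc/fs/nfsd/threads   # current number of nfsd threads
>     rpc.nfsd 64                 # raise it at runtime
>
> To make it persistent, set threads=64 in the [nfsd] section of
> /etc/nfs.conf (or RPCNFSDCOUNT on older distros). If the nodes reach
> the server over InfiniBand, NFS over RDMA (mount option rdma, port
> 20049) is also worth testing.
>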
> --Jeff
>
> On Thu, Aug 10, 2023 at 12:19 PM leo camilo <lhcamilo at gmail.com> wrote:
>
> Hi everyone,
>
> I was hoping to get some sage advice from you guys.
>
> At my department we have built this small prototyping cluster with
> 5 compute nodes, 1 name node and 1 file server.
>
> Up until now, the name node contained the scratch partition, which
> consisted of 2x 4 TB HDDs forming an 8 TB striped ZFS pool. The
> pool is shared to all the nodes using NFS. The compute nodes and
> the name node are connected with both Cat6 Ethernet cable and
> InfiniBand. Each compute node has 40 cores.
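>
> (For reference, that layout is roughly the following; device names
> and the subnet are made up:)
>
>     zpool create scratch /dev/sdb /dev/sdc        # two-disk stripe, no redundancy
>     zfs set sharenfs="rw=@10.0.0.0/24" scratch    # export over NFS to the cluster network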
>
> Recently I attempted to launch a computation from each node (40
> tasks per node, i.e. 1 computation per node), and the performance
> was abysmal. I reckon I might have reached the limits of NFS.
>
> I then realised that this was due to very poor performance from
> NFS. I am not using stateless nodes, so each node has about 200 GB
> of SSD storage and running directly from there was a lot faster.
>
> So, to solve the issue, I reckon I should replace NFS with
> something better. I have ordered 2x 4 TB NVMe drives for the new
> scratch and I was thinking of one of the following (rough sketch
> below):
>
> * putting the 2x 4 TB NVMe drives in a striped ZFS pool and using
> single-node GlusterFS to replace NFS
> * using the 2x 4 TB NVMe drives as two GlusterFS bricks in a
> distributed arrangement (still a single node)
>
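> A rough sketch of what I mean (hostnames and brick paths are made
> up):
>
>     # option 1: single brick on top of a striped ZFS pool
>     zpool create scratch /dev/nvme0n1 /dev/nvme1n1
>     gluster volume create scratch fserver:/scratch/brick force
>     gluster volume start scratch
>
>     # option 2: two bricks, distributed across the two NVMe drives
>     gluster volume create scratch fserver:/nvme0/brick \
>         fserver:/nvme1/brick force
>     gluster volume start scratch
>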
> Some people told me to use Lustre, but I reckon that might be
> overkill. And I would only use a single file server machine (1 node).
>
> Could you guys give me some sage advice here?
>
> Thanks in advance
>
>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.johnson at aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001 f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage
>