[Beowulf] NFS alternative for 200 core compute (beowulf) cluster

leo camilo lhcamilo at gmail.com
Thu Aug 10 20:03:58 UTC 2023


Awesome, thanks for the info!

Best,

leo

On Thu, 10 Aug 2023 at 22:01, Jeff Johnson <jeff.johnson at aeoncomputing.com>
wrote:

> Leo,
>
> Both BeeGFS and Lustre require a backend file system on the disks
> themselves. Both Lustre and BeeGFS support a ZFS backend.
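>
> For what it's worth, a rough (untested) sketch of that combination --
> the pool name, device names, numeric IDs and management host below are
> placeholders, not your actual setup:
>
>   # striped ZFS pool over the two NVMe drives (no redundancy), plus a dataset
>   zpool create scratch /dev/nvme0n1 /dev/nvme1n1
>   zfs create scratch/beegfs_storage
>
>   # hand that dataset to BeeGFS as a storage target
>   /opt/beegfs/sbin/beegfs-setup-storage -p /scratch/beegfs_storage \
>       -s 1 -i 101 -m <beegfs-mgmtd-host>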
>
> --Jeff
>
>
> On Thu, Aug 10, 2023 at 1:00 PM leo camilo <lhcamilo at gmail.com> wrote:
>
>> Hi there,
>>
>> thanks for your response.
>>
>> BeeGFS indeed looks like a good option, though realistically I can
>> only afford to use a single node/server for it.
>>
>> Would it be feasible to use ZFS as the volume manager coupled with BeeGFS
>> for the shares, or should I write ZFS off altogether?
>>
>> thanks again,
>>
>> best,
>>
>> leo
>>
>> On Thu, 10 Aug 2023 at 21:29, Bernd Schubert <bernd.schubert at fastmail.fm>
>> wrote:
>>
>>>
>>>
>>> On 8/10/23 21:18, leo camilo wrote:
>>> > Hi everyone,
>>> >
>>> > I was hoping to get some sage advice from you guys.
>>> >
>>> > At my department we have built this small prototyping cluster with 5
>>> > compute nodes, 1 name node and 1 file server.
>>> >
>>> > Up until now, the name node contained the scratch partition, which
>>> > consisted of 2x4TB HDDs forming an 8 TB striped ZFS pool. The pool is
>>> > shared to all the nodes over NFS. The compute nodes and the name node
>>> > are connected with both Cat6 Ethernet cable and InfiniBand. Each
>>> > compute node has 40 cores.
>>> >
>>> > Recently I attempted to launch a computation from each node (40 tasks
>>> > per node, i.e. one computation per node), and the performance was
>>> > abysmal. I reckon I might have reached the limits of NFS.
>>> >
>>> > I then realised that this was due to very poor NFS performance. I am
>>> > not using stateless nodes, so each node has about 200 GB of local SSD
>>> > storage, and running directly from there was a lot faster.
>>> >
>>> > So, to solve the issue, I reckon I should replace NFS with something
>>> > better. I have ordered 2x4TB NVMe drives for the new scratch and I was
>>> > thinking of:
>>> >
>>> >   * using the 2x4TB NVMe drives in a striped ZFS pool, with a single-node
>>> >     GlusterFS server replacing NFS (roughly as sketched below)
>>> >   * using the 2x4TB NVMe drives with GlusterFS in a distributed
>>> >     arrangement (still a single node)
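>>> >
>>> > For the first option, I imagine the setup would look roughly like this
>>> > (untested sketch; pool, volume, host names and mount points are just
>>> > placeholders):
>>> >
>>> >   # on the file server: striped pool over the two NVMe drives, no redundancy
>>> >   zpool create scratch /dev/nvme0n1 /dev/nvme1n1
>>> >   mkdir /scratch/brick                  # brick directory inside the pool
>>> >   gluster volume create gscratch fileserver:/scratch/brick
>>> >   gluster volume start gscratch
>>> >
>>> >   # on each compute node
>>> >   mount -t glusterfs fileserver:/gscratch /mnt/scratch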
>>> >
>>> > Some people told me to use Lustre, but I reckon that might be overkill,
>>> > and I would only use a single file server machine (1 node).
>>> >
>>> > Could you guys give me some sage advice here?
>>> >
>>>
>>> So GlusterFS uses FUSE, which doesn't have the best performance
>>> reputation (although hopefully not for much longer - feel free to search
>>> for "fuse" + "uring").
>>>
>>> If you want to avoid the complexity of Lustre, maybe look into BeeGFS.
>>> Well, I would recommend looking into it anyway (as a former developer I'm
>>> admittedly biased ;) ).
>>>
>>>
>>> Cheers,
>>> Bernd
>>>
>>
>
>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.johnson at aeoncomputing.com
> www.aeoncomputing.com
> t: 858-412-3810 x1001   f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage
>