[Beowulf] NFS alternative for 200 core compute (beowulf) cluster

leo camilo lhcamilo at gmail.com
Thu Aug 10 19:54:13 UTC 2023


Hi there,

I will have a look. Thanks for the tip.

best,

leo

On Thu, 10 Aug 2023 at 21:34, John Hearns <hearnsj at gmail.com> wrote:

> I would look at BeeGFS here
>
> On Thu, 10 Aug 2023, 20:19 leo camilo, <lhcamilo at gmail.com> wrote:
>
>> Hi everyone,
>>
>> I was hoping to get some sage advice from you guys.
>>
>> At my department we have built a small prototyping cluster with 5
>> compute nodes, 1 name node and 1 file server.
>>
>> Up until now, the name node has hosted the scratch partition: 2x4TB
>> HDDs forming an 8 TB striped ZFS pool, shared to all the nodes over
>> NFS. The compute nodes and the name node are connected with both Cat6
>> Ethernet cable and InfiniBand. Each compute node has 40 cores.
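>>
>> For concreteness, the setup is roughly along these lines (a minimal
>> sketch; the device names /dev/sdb and /dev/sdc and the hostname
>> "namenode" are stand-ins, not our actual values):
>>
>>   # striped (RAID0-style) pool across both disks, no redundancy
>>   zpool create scratch /dev/sdb /dev/sdc
>>
>>   # export the pool over NFS via ZFS's built-in share property
>>   zfs set sharenfs=on scratch
>>
>>   # on each compute node
>>   mount -t nfs namenode:/scratch /scratch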
>>
>> Recently I attempted to launch a computation from each node (40 tasks
>> per node, i.e. one computation per node), and the performance was
>> abysmal. I reckon I might have reached the limits of NFS.
>>
>> I then confirmed that this was due to very poor NFS performance: the
>> nodes are not stateless, so each has about 200 GB of local SSD storage,
>> and running the same jobs directly from there was a lot faster.
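>>
>> A quick fio test makes the comparison concrete (sketch only; the mount
>> points /scratch for NFS and /tmp/local for the node-local SSD are
>> assumed names):
>>
>>   # small random writes, the pattern that hurts NFS the most
>>   fio --name=randwrite --directory=/scratch --rw=randwrite \
>>       --bs=4k --size=1G --numjobs=8 --group_reporting
>>
>>   # same test against the node-local SSD for comparison
>>   fio --name=randwrite --directory=/tmp/local --rw=randwrite \
>>       --bs=4k --size=1G --numjobs=8 --group_reporting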
>>
>> So, to solve the issue, I reckon I should replace NFS with something
>> better. I have ordered 2x4TB NVMe drives for the new scratch and I was
>> thinking of:
>>
>>
>>    - putting the 2x4TB NVMe drives in a striped ZFS pool and exporting
>>    it through a single-node GlusterFS volume, replacing NFS
>>    - handing each NVMe drive to GlusterFS directly as a brick in a
>>    distributed volume (still single node; see the sketch below)
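>>
>> For the second option, the GlusterFS side would look roughly like this
>> (a sketch; the hostname "fileserver" and the brick paths are
>> placeholders, not a tested config):
>>
>>   # format each NVMe as its own brick (XFS is the usual choice)
>>   mkfs.xfs /dev/nvme0n1 && mkfs.xfs /dev/nvme1n1
>>   mkdir -p /bricks/nvme0 /bricks/nvme1
>>   mount /dev/nvme0n1 /bricks/nvme0
>>   mount /dev/nvme1n1 /bricks/nvme1
>>
>>   # two bricks on one host form a plain distributed volume;
>>   # gluster warns about same-host bricks, hence "force"
>>   gluster volume create scratch transport tcp \
>>       fileserver:/bricks/nvme0/brick fileserver:/bricks/nvme1/brick force
>>   gluster volume start scratch
>>
>>   # on the compute nodes, mount with the FUSE client
>>   mount -t glusterfs fileserver:/scratch /scratch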
>>
>> Some people told me to use Lustre, but I reckon that might be overkill,
>> since I would only use a single file server machine (1 node).
>>
>> Could you guys give me some sage advice here?
>>
>> Thanks in advance
>>
>>