[Beowulf] Build Recommendations - Private Cluster

Alexander Antoniades sander at columbia.edu
Wed Aug 21 07:27:41 PDT 2019


We have been building out a cluster based on commodity servers (mainly
Gigabyte motherboards) with 8x 1080 Ti/2080 Ti per server.

We are using a combination of OpenHPC compiled tools and Ansible. I would
recommend using the OpenHPC software so you don't have to deal with
figuring out what versions of the tools you need to get and manually
building them, but I would not go down their prescribed way for building a
cluster with base images and all for a small heterogeneous cluster. I would
just build the machines as consistently as you can, then use the
OpenHPC versions of programs where needed and augment the management with
something like Ansible or even pdsh.
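
To make the pdsh idea concrete, here is a minimal sketch of the same
fan-out pattern in Python: run one command on every node in parallel over
ssh and collect the results. It assumes passwordless ssh to each node; the
function names and the node names in the usage comment are hypothetical,
not part of OpenHPC, Ansible or pdsh.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host, command):
    """Build the argv for running `command` on `host` over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_on_host(host, command, runner=subprocess.run):
    """Run one command on one host; return (host, exit code, stdout)."""
    result = runner(ssh_command(host, command),
                    capture_output=True, text=True, timeout=60)
    return host, result.returncode, result.stdout.strip()


def fan_out(hosts, command, runner=subprocess.run, max_workers=8):
    """Run `command` on all hosts in parallel; results keep the order of `hosts`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda h: run_on_host(h, command, runner), hosts))


# Usage (hypothetical node names):
#   for host, code, out in fan_out(["node01", "node02"], "uname -r"):
#       print(f"{host}: rc={code} {out}")
```

Comparing the output of something like "uname -r" or "rpm -q" across
nodes this way is a quick check that the machines really are consistent.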

Also, unless you're really just doing this as an exercise to kill time on
weekends, or you literally have no money and can get free power/cooling, I
would really consider looking at what other more modern hardware is
available, or at least benchmark your system against a sample cloud system
if you really want to learn GPU computing.

Thanks,

Sander

On Wed, Aug 21, 2019 at 1:56 AM Richard Edwards <ejb at fastmail.fm> wrote:

> Hi John
>
> No doom and gloom.
>
> It's in a purpose-built workshop/computer room that I have: 42U rack,
> cross-draft cooling, which is sufficient, and 32 A power into the PDUs. The
> equipment is housed in the 42U Rack along with a variety of other machines
> such as Sun Enterprise 4000 and a 30 CPU Transputer Cluster. None of it
> runs 24/7 and not all of it is on at the same time, mainly because of the
> cost of power :-/
>
> Yeah, the Tesla 1070s scream like a banshee...
>
> I am planning on running it as a power-on, on-demand setup, which I already
> do through some HP iLO and APC PDU scripts that I have for these machines.
> Until recently I have been running some of them as a vSphere cluster and
> others as standalone CUDA machines.
>
> So that’s one vote for OpenHPC.
>
> Cheers
>
> Richard
>
> On 21 Aug 2019, at 3:45 pm, John Hearns via Beowulf <beowulf at beowulf.org>
> wrote:
>
> Add up the power consumption for each of those servers. If you plan on
> installing this in a domestic house or indeed in a normal office
> environment you probably won't have enough amperage in the circuit you
> intend to power it from.
> Sorry to be all doom and gloom.
> Also this setup will make a great deal of noise. If in a domestic setting
> put it in the garage.
> In an office setting the obvious place is a comms room but be careful
> about the ventilation.
> Office comms rooms often have a single wall mounted air conditioning unit.
> Make SURE to run a temperature shutdown script.
> This air con unit WILL fail over a weekend.
>
> Regarding the software stack I would look at OpenHPC. But that's just me.
>
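A temperature shutdown script of the kind mentioned above could be
sketched roughly as follows. Everything specific here is an assumption:
the Linux sysfs sensor path, the 80 C threshold, the three-sample rule and
the function names are placeholders for illustration. Many servers expose
an inlet temperature through the BMC instead, which may be a better signal
for a failed air conditioner.

```python
import subprocess
import time
from pathlib import Path

# Assumed sensor and threshold; adjust both for your hardware and room.
SENSOR = Path("/sys/class/thermal/thermal_zone0/temp")
LIMIT_C = 80.0


def read_celsius(sensor=SENSOR):
    """Read a sysfs thermal zone, which reports millidegrees Celsius."""
    return int(sensor.read_text().strip()) / 1000.0


def too_hot(readings, limit=LIMIT_C):
    """True only if every reading exceeds the limit (ignores single spikes)."""
    return len(readings) > 0 and all(r > limit for r in readings)


def watch(read=read_celsius, shutdown=None, limit=LIMIT_C,
          samples=3, interval=60, sleep=time.sleep):
    """Poll until `samples` consecutive readings exceed `limit`, then shut down."""
    if shutdown is None:
        # Default action: halt this machine (needs root privileges).
        shutdown = lambda: subprocess.run(["shutdown", "-h", "now"], check=False)
    recent = []
    while True:
        recent = (recent + [read()])[-samples:]  # keep the last `samples` readings
        if len(recent) == samples and too_hot(recent, limit):
            shutdown()
            return recent
        sleep(interval)
```

Requiring several consecutive hot readings before acting is a design
choice to avoid halting the machine on one noisy sample; calling watch()
from a systemd service or an init script would keep it running unattended.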
>
>
>
>
> On Wed, 21 Aug 2019 at 06:09, Dmitri Chubarov <dmitri.chubarov at gmail.com>
> wrote:
>
>> Hi,
>> this is very old hardware and you would have to stay with a very
>> outdated software stack, as 1070 cards are not supported by recent
>> versions of the NVIDIA drivers, and old versions of the drivers do not play
>> well with modern kernels and modern system libraries. Unless you are doing
>> this for digital preservation, consider dropping the 1070s from the equation.
>>
>> Dmitri
>>
>>
>> On Wed, 21 Aug 2019 at 06:46, Richard Edwards <ejb at fastmail.fm> wrote:
>>
>>> Hi Folks
>>>
>>> So I am about to build a new personal GPU-enabled cluster and am looking for
>>> people's thoughts on distribution and management tools.
>>>
>>> Hardware that I have available for the build
>>> - HP Proliant DL380/360 - mix of G5/G6
>>> - HP Proliant SL6500 with 8 GPU
>>> - HP Proliant DL580 - G7 + 2x K20x GPU
>>> - 3x Nvidia Tesla 1070 (4 GPU per unit)
>>>
>>> Appreciate people's insights/thoughts
>>>
>>> Regards
>>>
>>> Richard
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>>> To change your subscription (digest mode or unsubscribe) visit
>>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>>>