Hi Dmitri,

I have no specific application. I have done some CUDA-enabled OpenCV work for real-time video stitching, pattern recognition, etc. in the past. I was planning on spending some time learning more about CUDA and getting into MPICH. I think the K20x's might still be OK for TensorFlow. However, this exercise is for me more about infrastructure build, management and learning than about any given application.

Would certainly be interested in insights into the s/w stack.
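
As a rough sketch of the kind of MPICH + CUDA exercise I have in mind (nothing definitive; the file name, the build line and the round-robin device pick are just assumptions on my part), each MPI rank grabs a GPU and reports its compute capability, which is also a quick way to see what a given driver/toolkit combination is actually willing to drive:

    // Sketch only: one MPI rank per GPU, reporting compute capability.
    // Assumes an MPI implementation (e.g. MPICH) and the CUDA toolkit are installed.
    // Build with something like: nvcc -ccbin mpicxx mpi_cuda_probe.cu -o mpi_cuda_probe
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, nranks = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        int ndev = 0;
        cudaGetDeviceCount(&ndev);

        if (ndev > 0) {
            int dev = rank % ndev;  // naive round-robin pick of a local GPU
            cudaSetDevice(dev);
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            std::printf("rank %d of %d: GPU %d = %s, compute capability %d.%d\n",
                        rank, nranks, dev, prop.name, prop.major, prop.minor);
        } else {
            std::printf("rank %d of %d: no CUDA devices visible\n", rank, nranks);
        }

        MPI_Finalize();
        return 0;
    }

Launched with something like mpiexec -n 4 ./mpi_cuda_probe, it should at least confirm that the K20x's show up as compute capability 3.5.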

Cheers

Richard

On 21 Aug 2019, at 4:06 pm, Dmitri Chubarov <dmitri.chubarov@gmail.com> wrote:

> Hi Richard,
>
> I am speaking from the experience of keeping up a small cluster of 4 Supermicro boxes with a total of 16 C2070 cards. We had to freeze NVIDIA driver updates, as Fermi cards are likewise not supported by the latest drivers. This means you can use CUDA, but not the latest versions of the NVIDIA SDK. Machine learning applications are out, since later versions of TensorFlow require later versions of the NVIDIA drivers and SDK. So this cluster runs some computational chemistry codes that are less demanding in terms of CUDA features. I can probably give you details of the software stack off the list.
>
> What would be good to keep in the list thread is information on the type of applications that you intend to use the cluster for.
>
> On Wed, 21 Aug 2019 at 12:47, Richard Edwards <ejb@fastmail.fm> wrote:
>> Hi Dmitri
>>
>> Thanks for the response.
>>
>> Yes, old hardware, but as I said it is for a personal cluster. I have also put M2070s in one of the 1070 chassis, as they are basically 4-slot PCI expansion units. I have various other M2050/M2070/M2090/K20x cards around, so depending on time I can certainly get more bang than the C1060s that are in there now. I am prepared to live with the pain of older drivers, potentially having to use older Linux distributions, and not being able to support much beyond CUDA 2.0...
>>
>> Yes, I could probably go out and purchase a couple of newer cards and get the same performance or better, but this is more about the exercise and the learning.
>>
>> So maybe the hardware list was a distraction. What are people using as the predominant distro and management tools?
>>
>> cheers
>>
>> Richard
>>
>> On 21 Aug 2019, at 3:08 pm, Dmitri Chubarov <dmitri.chubarov@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> This is very old hardware, and you would have to stay with a very outdated software stack: 1070 cards are not supported by recent versions of the NVIDIA drivers, and old versions of the NVIDIA drivers do not play well with modern kernels and modern system libraries. Unless you are doing this for digital preservation, consider dropping the 1070s out of the equation.
>>>
>>> Dmitri
>>>
>>> On Wed, 21 Aug 2019 at 06:46, Richard Edwards <ejb@fastmail.fm> wrote:
>>>
>>>> Hi Folks
>>>>
>>>> So I am about to build a new personal GPU-enabled cluster and am looking for people's thoughts on distribution and management tools.
>>>>
>>>> Hardware that I have available for the build:
>>>> - HP ProLiant DL380/360 - mix of G5/G6
>>>> - HP ProLiant SL6500 with 8 GPUs
>>>> - HP ProLiant DL580 - G7 + 2x K20x GPUs
>>>> - 3x NVIDIA Tesla 1070 (4 GPUs per unit)
>>>>
>>>> I would appreciate people's insights/thoughts.
>>>>
>>>> Regards
>>>>
>>>> Richard
>>>> _______________________________________________
>>>> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
>>>> To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf