[Beowulf] inspur gpu box?

Kilian Cavalotti kilian.cavalotti.work at gmail.com
Thu Apr 27 10:32:08 PDT 2017


On Thu, Apr 27, 2017 at 9:45 AM, Michael Di Domenico
<mdidomenico4 at gmail.com> wrote:
> https://www.hpcwire.com/2017/04/27/inspur-launches-16-gpu-capable-ai-computing-box/
>
> i saw this article today from inspur.  it seems to suggest that one
> could attach up to 64 GPU's to a single compute node using a pci-e
> switch.  i'm not sure if it's just a poorly worded article or i'm
> totally misreading it.

Yup, you can have as many PCIe devices as you like in a single
root-complex, given the appropriate number of PCIe switches.

Will it create contention? Certainly, because your CPU only has so
many (40 typically) PCIe lanes. A single PCIe root-complex is good for
internal device-to-device communication (provided a sane PCIe
switching architecture), but transferring data from the host memory to
the PCIe devices (and vice-versa) will go through those 40-ish lanes,
and that could quickly become a serious bottleneck.
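
To put rough numbers on that bottleneck, here is a back-of-the-envelope
sketch. It assumes PCIe 3.0 signalling (8 GT/s per lane, 128b/130b
encoding), an x16 link per GPU, and every GPU pulling from host memory
at the same time. None of those figures come from the article, and the
function names are made up for illustration. Real throughput will be
lower once protocol overhead is counted.

```python
# Rough estimate of how oversubscribed the CPU's PCIe lanes become.
# Assumptions (illustrative, not from the article): PCIe 3.0, x16 per
# GPU, 40 CPU lanes, all GPUs streaming from host memory concurrently.

GT_PER_S = 8.0          # PCIe 3.0 raw signalling rate per lane (GT/s)
ENCODING = 128 / 130    # 128b/130b line-encoding efficiency


def lane_bandwidth_gbs() -> float:
    """Usable bandwidth of one lane in one direction, in GB/s."""
    return GT_PER_S * ENCODING / 8  # 8 bits per byte


def host_oversubscription(num_gpus: int,
                          lanes_per_gpu: int = 16,
                          cpu_lanes: int = 40) -> float:
    """Ratio of downstream GPU lanes to the lanes the CPU provides."""
    return (num_gpus * lanes_per_gpu) / cpu_lanes


def per_gpu_host_share_gbs(num_gpus: int, cpu_lanes: int = 40) -> float:
    """Host bandwidth each GPU gets if the CPU lanes are shared evenly."""
    return cpu_lanes * lane_bandwidth_gbs() / num_gpus


if __name__ == "__main__":
    for n in (8, 16, 64):
        print(f"{n:2d} GPUs: {host_oversubscription(n):5.1f}x oversubscribed, "
              f"~{per_gpu_host_share_gbs(n):.2f} GB/s of host bandwidth each")
```

Device-to-device traffic that stays below a switch does not touch the
CPU lanes at all, so the ratio only matters for host-memory transfers.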

Will it work reliably? Heh, depends a lot on the specific PCB-level
design of the boards and backplane (devil is in the details, and
retimers).

> given the likelihood that i'm reading this wrong, i'll ask a
> secondary question; is any vendor pitching solutions that support more
> than 8 GPU's attached to a single node?

Yes:
* https://www.supermicro.com/products/system/4U/4028/SYS-4028GR-TR2.cfm
allows 10 GPUs in a single root-complex,
* CocoLink specializes in this kind of product:
http://www.cocolink.co.kr/product.html
and probably quite a few others.

Cheers,
-- 
Kilian

