[Beowulf] What is rdma, ofed, verbs, psm etc?
hearnsj at googlemail.com
Sun Sep 24 13:30:57 PDT 2017
RoCE is commonly used. We run GPFS over RoCE, and plenty of other sites do too.
As for what network RoCE needs, I guess you could run it on
a 1 Gbps network with office-grade switches.
What it really needs is a lossless network. Dare I say the Mellanox word....
I think you would find RoCE is a lot more prevalent than you would think...
I guess we should bring in GPUDirect and NVMe over Fabrics here.
Google finds this website: http://www.roceinitiative.org/
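As a quick check of whether a node has an adapter that could do RoCE at all, here is a minimal sketch. It assumes Linux with the RDMA drivers loaded, and that sysfs exposes each verbs device under /sys/class/infiniband/<device>/ports/<port>/link_layer. The function name rdma_link_layers is made up for illustration. An "Ethernet" link layer only says the verbs device runs over Ethernet (RoCE or iWARP); it says nothing about whether the switches are configured to be lossless.

```python
from pathlib import Path


def rdma_link_layers(sysfs_root="/sys/class/infiniband"):
    """Return {(device, port): link_layer} for each RDMA port under sysfs_root.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real node the default path is assumed.
    """
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        # No RDMA drivers loaded, or no RDMA-capable adapter present.
        return result
    for dev in sorted(root.iterdir()):
        ports = dev / "ports"
        if not ports.is_dir():
            continue
        for port in sorted(ports.iterdir()):
            link_layer = port / "link_layer"
            if link_layer.is_file():
                result[(dev.name, port.name)] = link_layer.read_text().strip()
    return result


if __name__ == "__main__":
    layers = rdma_link_layers()
    if not layers:
        print("no RDMA devices found")
    for (dev, port), layer in layers.items():
        if layer == "Ethernet":
            note = "Ethernet link layer (RoCE or iWARP, not native InfiniBand)"
        else:
            note = layer
        print(f"{dev} port {port}: {note}")
```

On a node with no RDMA-capable adapter (a Raspberry Pi on 100 Mbps Ethernet, say) the directory will normally not exist and the script reports that no devices were found.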
On 21 September 2017 at 07:02, Jon Tegner <tegner at renget.se> wrote:
> What about RoCE? Is this something that is commonly used (I would guess no
> since I have not found much)? Are there other protocols that are worth
> considering (like "gamma" which doesn't seem to be developed anymore)?
> My impression is that with RoCE you have to use specialized hardware
> (unlike gamma - where one could use standard hardware, and still get a
> noticeable improvement in latency)?
> On 09/21/2017 04:09 AM, Christopher Samuel wrote:
>>> Thanks Peter for the high level overview! A few followup questions. What
>>> if I am using a non-InfiniBand cluster, i.e. something with 10GigE? Or
>>> even slower, like at my home where I have a Raspberry Pi cluster with
>>> 100 Mbps Ethernet. Is ofed/psm/verbs all irrelevant?
>> Pretty much, yes, unless you've got fancy switches that can do RoCE.