[Beowulf] 1.2 us IB latency?

Steffen Persvold steffen.persvold at scali.com
Fri Apr 20 08:08:17 PDT 2007


> -----Original Message-----
> From: Patrick Geoffray [mailto:patrick at myri.com]
> Sent: Thursday, April 19, 2007 6:47 PM
> To: Steffen Persvold; Beowulf Mailing List
> Subject: Re: [Beowulf] 1.2 us IB latency?
> 
> greg.lindahl at qlogic.com wrote:
> >> Back then we were struggling with PIO transfers and how they were
> >> treated in the CPU/North bridge (write combining and all that). I
> >> believe this might still be an issue, correct ?
> 
> WC is well implemented on Opteron, it will aggregate consecutive PIO
> writes at 16, 32 and 64 Bytes smoothly. On Intel processors, this is
> more painful: WC is only 64 Bytes. If you flush the WC buffer with less
> than 64 bytes in it, you will see multiple 8-byte PIO writes, and not
> always in order.

Yeah, that rings a bell... :)
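
One way to cope with the Intel behaviour described above is to always
write whole 64-byte lines, padding the tail of a short message with
zeros so the WC buffer is never flushed half full. This is only a rough
sketch of that idea, not anything taken from MX, InfiniPath or the SCI
driver: the names pio_padded_len() and pio_write_padded() are made up,
the 64-byte line size is an assumption, and a real driver would write to
a WC-mapped device window and follow up with a store fence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WC_LINE 64u  /* assumed write-combining buffer size in bytes */

/* Round len up to a whole number of WC lines. */
static size_t pio_padded_len(size_t len)
{
    return (len + WC_LINE - 1) / WC_LINE * WC_LINE;
}

/*
 * Copy len bytes to dst one full WC line at a time, zero-padding the
 * last line. dst stands in for a (hypothetical) WC-mapped PIO window
 * and must be 64-byte aligned. Returns the number of bytes written.
 */
static size_t pio_write_padded(volatile uint64_t *dst, const void *src,
                               size_t len)
{
    const unsigned char *p = src;
    size_t written = 0;

    assert(((uintptr_t)dst % WC_LINE) == 0);

    while (written < len) {
        uint64_t line[WC_LINE / sizeof(uint64_t)] = {0};
        size_t left = len - written;
        size_t chunk = left < WC_LINE ? left : WC_LINE;

        /* Stage the data so that the tail of the line is zero-filled. */
        memcpy(line, p + written, chunk);

        /* Eight consecutive 8-byte stores fill exactly one WC line. */
        for (size_t i = 0; i < WC_LINE / sizeof(uint64_t); i++)
            dst[written / sizeof(uint64_t) + i] = line[i];

        written += WC_LINE;
    }
    /* A real driver would issue a store fence here (assumption). */
    return written;
}
```

With this, a 5-byte message costs one full 64-byte burst instead of a
partial flush. The receiver has to know the real length some other way,
for example from a header inside the first line.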

So I'm guessing that both Myrinet MX and QLogic InfiniPath (confirmed)
use PIO for "small" messages. Are we sure that Mellanox ConnectX
doesn't? It seems they would have to in order to get the 1.2 us numbers.
There's nothing that stops them from doing something like:

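verbs_post_rdma_write() {
...
    if (msg_size < MAX_PIO_THRESHOLD) {
        /* small message: the CPU writes it to the adapter with PIO */
        copybuffertoremotewithpio();
    } else {
        /* large message: hand it over to the DMA engine */
        setupdmaengine();
    }
...
}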

Or something along those lines. However, they claim that it's "fully
offloaded", so I'm not sure.

> 
> > cases we can manipulate the mtrrs after boot to fix this. Getting
> > formal support for PAT in the Linux kernel is the long-term fix for
> > this.
> 
> It's interesting to note that most current OSes have native PAT support,
> except Linux. Even Windows does it well :-)
> 

Hmm, I seem to remember having PAT support working fine with SCI on
Linux a couple of years ago. We started using PAT on x86_64 because of
the nightmare with MTRRs and memory holes/overlapping regions (BIOSes
never seemed to get it right), especially on boxes with >4 GB of memory
(which became more and more common with the introduction of x86_64).
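
To illustrate the kind of overlap problem meant here, below is a small
sketch that scans a table of address ranges and counts the pairs that
overlap while asking for different memory types. It is purely
illustrative and is not the check we used for SCI: struct mem_range and
count_conflicts() are hypothetical names, and the table would have to be
filled in by hand from whatever the BIOS programmed. The type codes in
the comment are the usual x86 memory-type encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One variable-range entry (hypothetical layout for this sketch). */
struct mem_range {
    uint64_t base;  /* start address in bytes */
    uint64_t size;  /* length in bytes */
    int type;       /* 0 = uncacheable, 1 = write-combining, 6 = write-back */
};

/* True if the half-open intervals [base, base + size) intersect. */
static bool ranges_overlap(const struct mem_range *a,
                           const struct mem_range *b)
{
    return a->base < b->base + b->size && b->base < a->base + a->size;
}

/* Count the pairs that overlap while requesting different types. */
static size_t count_conflicts(const struct mem_range *r, size_t n)
{
    size_t conflicts = 0;

    assert(r != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (r[i].type != r[j].type && ranges_overlap(&r[i], &r[j]))
                conflicts++;
    return conflicts;
}
```

A write-back range covering the first 4 GB plus a write-combining window
at 0xd0000000 would be reported as one conflict, which is the sort of
setup that needed manual untangling.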


Cheers,

Steffen Persvold
Technical Director Americas
tel. 508-281-7100 x401
fax. 508-281-7171

http://www.scali.com/
Higher Performance Computing



