[Beowulf] Q: IB message rate & large core counts (per node)?

Patrick Geoffray patrick at myri.com
Mon Mar 15 15:47:51 PDT 2010

On 3/15/2010 5:33 PM, Gilad Shainer wrote:
> To make it more accurate, most PCIe chipsets supports 256B reads, and
> the data bandwidth is 26Gb/s, which makes it 26+26, not 20+20.

I know marketers live in their own universe, but here are a few nuts
for you to crack:

* If most PCIe chipsets effectively did 256B Completions, why is the
max unidirectional bandwidth for QDR/Nehalem 3026 MB/s (24.2 Gb/s),
as reported in the latest MVAPICH announcement?
3026 MB/s is 73.4% efficiency compared to the raw bandwidth of 4 GB/s
for Gen2 8x. With 256B Completions, the PCIe efficiency would be
92.7%, so someone would be losing 19.3%. Would that be your silicon?

* For 64B Completions: 64/84 is 0.7619, and 0.7619 * 32 Gb/s = 24.38
Gb/s. How do you get 26 Gb/s again?
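The per-TLP arithmetic in the two bullets above can be sketched as follows. The 20-byte per-completion overhead is an assumption inferred from the 64/84 and 92.7% figures quoted here, not something stated elsewhere in the thread:

```python
# Sketch of the completion-efficiency arithmetic above.
# Assumption: ~20 bytes of overhead per completion TLP (header,
# sequence number, LCRC, framing), inferred from the 64/84 figure.
TLP_OVERHEAD = 20      # bytes per completion (assumed)
RAW_GEN2_X8 = 32.0     # Gb/s raw data rate for Gen2 x8 after 8b/10b

def efficiency(payload_bytes):
    """Fraction of wire bandwidth carrying payload."""
    return payload_bytes / (payload_bytes + TLP_OVERHEAD)

print(f"{efficiency(64):.4f}")                # -> 0.7619 (64/84)
print(f"{efficiency(64) * RAW_GEN2_X8:.2f}")  # -> 24.38 Gb/s with 64B Completions
print(f"{efficiency(256) * 100:.1f}")         # -> 92.8% with 256B Completions
```

(The 92.7% in the post is the same 256/276 ratio, truncated rather than rounded.)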

* PCIe is a reliable protocol; there are Acks in the other direction.
If you claim that one way is 26 Gb/s and two-way is 26+26 Gb/s, does
that mean you have invented a reliable protocol that does not need
Acks?

* If bidirectional is 26+26, why is the max bidirectional bandwidth
reported by MVAPICH 5858 MB/s, i.e. 46.8 Gb/s or 23.4+23.4 Gb/s?
Granted, it's more than 20+20, but it depends a lot on the
chipset-dependent pipeline depth.
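The conversion behind that bidirectional figure, as a sketch (assuming the usual benchmark convention of MB = 10^6 bytes and Gb = 10^9 bits):

```python
# Convert the MVAPICH bidirectional figure from MB/s to Gb/s.
# Assumption: MB = 10^6 bytes, Gb = 10^9 bits, as is conventional
# for link-bandwidth benchmark numbers.
bidir_MBps = 5858
bidir_Gbps = bidir_MBps * 8 / 1000

print(f"{bidir_Gbps:.1f}")       # -> 46.9 Gb/s total (the post truncates to 46.8)
print(f"{bidir_Gbps / 2:.1f}")   # -> 23.4 Gb/s per direction
```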

BTW, Greg's offer is still pending...

