[Beowulf] Nehalem and Shanghai code performance for our rzf example

richard.walsh at comcast.net
Tue Jan 20 15:24:42 PST 2009




----- Original Message ----- 


From: "Bill Broadley" bill at cse.ucdavis.edu 

>If gallium arsenide or some other material gave us 10x the clock rate per
>watt, but 1/2 the transistors, would it really matter?  Seems like even Intel
>is begrudgingly admitting it's the memory bus, and finally the Nehalem is
>blessed with dramatically more bandwidth.
>
>Seems like increasing core counts are turning latency-limited workloads (for
>the parallel jobs of course) into bandwidth-limited ones.  Without a memory
>bus that allows for 10x the bandwidth it doesn't really seem like 10x the
>clock rate would be of particular use.



Right.  Except for the potential to speed up serial codes or serial pieces
of code (and perhaps badly written code), delivering 10x by clock or by core
would not seem to change the bandwidth problem that both create.  Manycore
promises even greater multiples.  For bandwidth-limited data-parallel codes,
you might as well stay on the path of least economic resistance.
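
To put rough numbers on the argument, here is a minimal sketch of a
roofline-style bound, where attainable throughput is the smaller of the
compute peak and memory bandwidth times arithmetic intensity.  The function
name and every figure below are hypothetical, picked only for illustration;
none of them are measurements of Nehalem, Shanghai, or any other part.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline-style upper bound on sustained GFLOP/s.

    A kernel can go no faster than the compute peak, and no faster than
    the rate at which memory can feed it (bandwidth * flops per byte).
    """
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Hypothetical machine and kernel, for illustration only.
base_peak = 40.0   # GFLOP/s compute peak (assumed)
base_bw = 20.0     # GB/s memory bandwidth (assumed)
intensity = 0.25   # flops per byte moved, typical of a streaming kernel (assumed)

baseline = attainable_gflops(base_peak, base_bw, intensity)

# 10x the compute (by clock or by cores), same memory bus.
ten_x_compute = attainable_gflops(10 * base_peak, base_bw, intensity)

# Same compute, 10x the memory bandwidth.
ten_x_bandwidth = attainable_gflops(base_peak, 10 * base_bw, intensity)

print("baseline        :", baseline)
print("10x compute     :", ten_x_compute)
print("10x bandwidth   :", ten_x_bandwidth)
```

Under these assumed numbers the 10x compute case is stuck at the baseline,
because the bound is set by the memory term, while 10x bandwidth lifts the
kernel until it hits the compute peak instead.  A real code would of course
also depend on cache behavior and latency, which this bound ignores.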



rbw 
