[Beowulf] memory bandwidth scaling

mathog mathog at caltech.edu
Tue Oct 6 12:35:30 PDT 2015

  On 10/01/2015 09:27 AM, Orion Poplawski wrote:
> We may be looking at getting a couple of new compute nodes.  I'm leery 
> though of
> going too high in processor core counts.  Does anyone have any general
> experiences with performance scaling up to 12 cores per processor with 
> general
> models like CM1/WRF/RAMS on the current crop of Xeon processors?

Lately I have been working on a system with >512 GB of RAM and a lot of 
cores.  This wouldn't be at all a cost-effective Beowulf node, but it is a 
godsend when the problems being addressed require huge amounts of memory 
and do not partition easily to run on multiple nodes.  The one I'm using 
is NUMA rather than SMP, and careful placement of the data with respect 
to the nodes is sometimes required for optimal performance. That is 
likely to be the case on some of the machines you may be looking at.  
One must also be careful using multiple processors on this machine 
because some of them share cache and others don't, so adding too many 
processes in the wrong place can reduce throughput: the processes start 
having to go to slower main memory inside tight loops.  Some of this is 
discussed in this longish thread:


This machine is also prone to locking up (to the point it doesn't answer 
terminal keystrokes from a remote X11 terminal) when writing huge files 
back to disk.  I have not tracked this one down yet; it seems to be 
related to unmapping a memory-mapped 10.5 GB file.  It is a bit 
difficult to debug because while it is happening it isn't possible to 
see what the machine is doing.


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
