[Beowulf] ever heard of ScaleMP?

Mark Hahn hahn at mcmaster.ca
Tue Dec 11 09:50:03 PST 2007

>> I've been quite curious to try something like the f1200 as a potential 
>> replacement for our Altixes, which were bought predominantly for running 
>> single-threaded large-memory jobs.

we have an Altix as well, and I always cringe when I see a single-thread,
large-memory job running on it.  ours has 128p, 256G, and I think 6M/core caches.
so a large-mem serial job, assuming its accesses are spread uniformly over 
the whole 256G, would have a cache hit rate of about .00002289 (6M/256G).  
and in any case, there is >800 GB/s of memory bandwidth available, but at 
best 6.4 GB/s in use.  don't forget that the it2 is a fairly strict 
in-order chip as well, so it does little to hide the latency of those misses.
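
for anyone who wants to check the arithmetic, here is a rough sketch of 
those numbers.  the 128p, 256G and 6M/core figures are the ones quoted 
above.  treating 6.4 GB/s as a per-processor figure is my assumption (it 
is what makes the aggregate come out above 800 GB/s), and the variable and 
function names are just for illustration.

```python
# Back-of-envelope numbers for the Altix described above.
MIB = 1024 ** 2
GIB = 1024 ** 3

cores = 128
total_mem = 256 * GIB          # 256G of shared memory
cache_per_core = 6 * MIB       # "I think 6M/core"
bw_per_core_gbs = 6.4          # ASSUMPTION: 6.4 GB/s is a per-processor figure


def hit_rate(cache_bytes, footprint_bytes):
    """Hit rate if accesses are spread uniformly over the footprint."""
    return min(1.0, cache_bytes / footprint_bytes)


rate = hit_rate(cache_per_core, total_mem)
footprint_over_cache = total_mem // cache_per_core
aggregate_bw = cores * bw_per_core_gbs

print(f"cache hit rate:        {rate:.8f}")
print(f"footprint / cache:     {footprint_over_cache}")
print(f"aggregate bandwidth:   {aggregate_bw:.1f} GB/s")
print(f"fraction used by 1 cpu: {1 / cores:.4f}")
```

the footprint/cache ratio comes out a little over 43000, which is where 
the "1/40000" below comes from.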

sure, perhaps a large-memory serial code has a small working set that 
fits in cache.  but doesn't it strike you as strange to have a 
working set that's 1/40000 of the total footprint?  I suspect that you 
could reformulate such a code as a "memory-extension" MPI job and avoid
the need for custom hardware.  (ie, let rank0 do all the work, and just 
operate a software cache of data fed by all the other ranks.  of course,
this raises the question of whether the code _has_ to be serial...)
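
a minimal sketch of the "software cache on rank0" idea, to make it 
concrete.  this is not real MPI code: RemoteStore is a hypothetical 
stand-in for the memory held by the other ranks, and its fetch() is where 
a real job would do a receive or one-sided get from the owning rank.  the 
block size, the LRU eviction policy and the round-robin placement are all 
my assumptions, and writes are left out entirely.

```python
from collections import OrderedDict

BLOCK = 4  # elements per block (tiny, for illustration only)


class RemoteStore:
    """Hypothetical stand-in for data held in the memory of ranks 1..n."""

    def __init__(self, n_elems, n_ranks):
        self.n_ranks = n_ranks
        self.fetches = 0
        self.shards = {r: {} for r in range(1, n_ranks + 1)}
        n_blocks = (n_elems + BLOCK - 1) // BLOCK
        for b in range(n_blocks):
            lo = b * BLOCK
            hi = min(lo + BLOCK, n_elems)
            # Element i simply holds the value i, so results are checkable.
            self.shards[self.owner(b)][b] = list(range(lo, hi))

    def owner(self, b):
        # Round-robin placement of blocks over the "memory" ranks.
        return 1 + b % self.n_ranks

    def fetch(self, b):
        # In a real job this would be a message to, or a one-sided get
        # from, rank owner(b); here it is just a dictionary lookup.
        self.fetches += 1
        return list(self.shards[self.owner(b)][b])


class SoftwareCache:
    """Read-only LRU block cache, as rank0 might keep in local memory."""

    def __init__(self, store, capacity_blocks):
        self.store = store
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, i):
        b, off = divmod(i, BLOCK)
        if b in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(b)           # mark most recently used
        else:
            self.misses += 1
            self.blocks[b] = self.store.fetch(b)
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
        return self.blocks[b][off]


# "rank0" scans 64 elements through a cache that holds only 2 blocks.
store = RemoteStore(n_elems=64, n_ranks=3)
cache = SoftwareCache(store, capacity_blocks=2)
total = sum(cache.read(i) for i in range(64))
print(total, cache.hits, cache.misses, store.fetches)
```

a sequential scan like this one misses once per block and hits on the 
rest, so it behaves well even with a tiny cache.  a uniformly random 
access pattern would miss nearly every time, which is the same problem 
the hardware cache has.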

> It is fairly easy (barring cost issues) to get a single system image machine 
> with 8-16 processor cores and 128 GB ram.  Beyond that, you need something 
> like ScaleMP or a "proprietary" box to get more RAM.

I'm guessing ScaleMP is approximately the same speed as a user-level 
network-shared-memory implementation, but would love to see real numbers.

regards, mark hahn.
