[Beowulf] HPC for community college?
deadline at eadline.org
Mon Mar 2 14:18:48 PST 2020
> Dear Doug,
> Might you be willing to give some indication of benchmarks you find
> useful for customers of Limulus systems?
We are working on updated system benchmarks. In addition, I have a bunch of
"older" generation benchmarks for Intel i5, i7, and Xeon-E processors that I
have not published on ClusterMonkey (yet), plus the
Ryzen 3 processors with ECC.
In order to design these types of systems, we run
some specific benchmarks. For instance, when I get a
new processor the first thing I do is run my Effective Cores (EC)
benchmark. It is a very simple way to check memory bandwidth using
single-core NAS parallel kernels: run a single kernel and measure the
time, then run 4, 8, 16 simultaneous copies (up to the number of cores) and
record the time. (Note that most of the NAS kernels require a power-of-two
number of processes.) If scaling were perfect, the single-copy time would
equal the time for 4, 8, 16, etc. copies, but that never happens. Results
are reported as how many "effective cores" are seen.

I then move on to parallel NAS runs on the same processor (the results are
usually better than Effective Cores for some kernels).
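The measurement loop above can be sketched with a small Python driver. This is a minimal sketch, not the actual EC benchmark script: the kernel command here is `sleep` as a harmless stand-in, and the NAS binary path shown in the comment (`./cg.C.x`) is a hypothetical example.

```python
import subprocess
import time

def effective_cores(n_copies, t_single, t_multi):
    """Effective cores seen when n_copies run simultaneously.

    With perfect scaling t_multi == t_single and the answer is n_copies;
    memory contention inflates t_multi and shrinks the result.
    """
    return n_copies * t_single / t_multi

def time_copies(cmd, n):
    """Launch n simultaneous copies of cmd; return the wall time until
    the slowest copy finishes."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start

if __name__ == "__main__":
    # Stand-in command; replace with a serial NAS kernel binary,
    # e.g. ["./cg.C.x"] (path is hypothetical).
    kernel = ["sleep", "0.1"]
    t1 = time_copies(kernel, 1)
    for n in (4, 8, 16):
        tn = time_copies(kernel, n)
        print(f"{n:2d} copies: {effective_cores(n, t1, tn):.1f} effective cores")
```

For example, if 16 simultaneous copies take twice the single-copy time, you are seeing 8 effective cores.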
There are some articles for older processors (which include the test script).
So the question I ask is: given what the market provides,
what is the best price/performance/heat option for personal clusters?
Thus, for a given power envelope, are more cores per socket and fewer nodes
better than more nodes with fewer cores per socket running at a higher clock?
In addition, what is the memory throughput on a fully loaded
socket? (Generally, more cores means more memory contention.)
For me these kinds of questions are getting more important because
core counts have always increased faster than shared memory bandwidth.
If a multi-core processor is fully loaded, the effective memory bandwidth
and the actual clock speed (relative to the specs) become quite low.
Some recent data on Phoronix for the new 64-core Threadripper
illustrates the point (and yes, I realize the TR is not
an Epyc, but the same reasoning applies):
NAMD (normalized to 16 cores)
16c - 1.0
32c - 1.8
48c - 2.5
64c - 2.9
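The scaling efficiency implied by those numbers is easy to work out (a quick sketch using the Phoronix figures quoted above, normalized to 16 cores as the baseline):

```python
# NAMD speedups from the table above, normalized to 16 cores
base = 16
speedup = {16: 1.0, 32: 1.8, 48: 2.5, 64: 2.9}

for cores, s in speedup.items():
    # Fraction of ideal linear scaling actually achieved
    efficiency = s / (cores / base)
    print(f"{cores:2d}c: speedup {s:.1f}x, scaling efficiency {efficiency:.0%}")
```

At 64 cores the speedup of 2.9x against an ideal of 4x works out to roughly 72.5% of perfect scaling.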
And, I don't think those results are that bad! (Although if you
factor in the 64-core TR's cost, they may not be worth it!) My point
is that multi-core scaling has application-dependent limits, and
I find it worthwhile to get a "feeling" for this behavior
on every new processor and design from there.
Wow, that was longer than I thought it would be.
> On Sat, Feb 22, 2020, at 6:42 AM, Douglas Eadline wrote:
>> That is the idea behind the Limulus systems -- a personal (or group) small
>> turn-key cluster that can deliver local HPC performance.
>> Users can learn HPC software, administration, and run production codes
>> on performance hardware.
>> I have been calling these "No Data Center Needed"
>> computing systems (or as is now the trend "Edge" computing).
>> These systems have a different power/noise/heat envelope
>> than a small pile of data center servers (i.e. you can use
>> them next to your desk, in a lab or classroom, at home etc.)
>> Performance is optimized to fit in an ambient power/noise/heat
>> envelope. Basement Supercomputing recently started shipping
>> updated systems with uATX blades and 65W Ryzen processors
>> (with ECC), more details are on the data sheet (web page not
>> updated to new systems just yet)
>> Full disclosure, I work with Basement Supercomputing.
>> > Is there a role for a modest HPC cluster at the community college?
>> > _______________________________________________
>> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> > To change your subscription (digest mode or unsubscribe) visit
>> > https://beowulf.org/cgi-bin/mailman/listinfo/beowulf