[Beowulf] Deskside clusters

Tue Aug 24 18:33:53 UTC 2021

Jim,

You are describing a lot of the design pathway for Limulus
clusters. The local (non-data center) power, heat, noise are all
minimized while performance is maximized.

A well decked-out system is often less than $10K and
is on par with a fat multi-core workstation.
(And there are reasons a clustered approach performs better.)

Another use case is where no research data center hardware is
available because there are no specialized sysadmins, space, or budget.
(Many smaller colleges and universities fall into this
group.) Plus, dropping something into a data center often
means an additional cost to the researcher's budget.

--
Doug

> I've been looking at "small scale" clusters for a long time (2000?)  and
> talked a lot with the folks from Orion, as well as on this list.
> They fit in a "hard to market to" niche.
>
> My own workflow tends to have use cases that are a bit "off-nominal" - one
> is the rapid iteration of a computational model while experimenting - That
> is, I have a python code that generates input to Numerical
> Electromagnetics Code (NEC), I run the model over a range of parameters,
> then look at the output to see if I'm getting what I want. If not, I
> change the code (which essentially changes the antenna design), rerun the
> models, and see if it worked.  I'd love an iteration time of "a minute or
> two" for the computation, maybe a minute or two to plot the outputs
> (fiddling with the plot ranges, etc.).  For reference, for a radio
> astronomy array on the far side of the Moon, I was running 144 cases, each
> at 380 frequencies: to run 1 case takes 30 seconds, so farming it out to
> 12 processors gave me a 6 minute run time, which is in the right range.
> Another model of interaction of antennas on a spacecraft runs about 15
> seconds/case; and a third is about 120 seconds/case.
>
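
The run-time arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Jim's actual code: `estimate_wall_time`, `run_sweep`, and `run_case` are hypothetical names, and in practice `run_case` would wrap whatever launches one NEC run (for example a subprocess call). Threads are used because the heavy work would happen in the external solver process.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def estimate_wall_time(n_cases, seconds_per_case, n_workers):
    """Wall time in seconds for independent cases run in waves across workers."""
    waves = math.ceil(n_cases / n_workers)
    return waves * seconds_per_case


def run_sweep(run_case, cases, n_workers):
    """Run independent cases concurrently; results come back in input order.

    run_case is a hypothetical callable that executes one model case.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_case, cases))


if __name__ == "__main__":
    # Figures from the post: 144 cases, 30 s per case.
    print(estimate_wall_time(144, 30, 12) / 60, "minutes on 12 processors")
    print(estimate_wall_time(144, 30, 1) / 60, "minutes run sequentially")
```

With the numbers quoted above this gives 6 minutes on 12 processors against 72 minutes sequentially, which is the gap between the two working cadences described.
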
> To get "interactive development", then, I want the "cycle time" to be 10
> minutes - 30 minutes of thinking about how to change the design and
> altering the code to generate the new design, make a couple test runs to
> find the equivalent of "syntax errors", and then turn it loose - get a cup
> of coffee, answer a few emails, come back and see the results.  I could
> iterate maybe a half dozen shots a day, which is pretty productive.
> (Compared to straight up sequential - 144 runs at 30 seconds is more than
> an hour - and that triggers a different working cadence that devolves to
> sort of one shot a day) - The "10 minute" turnaround is also compatible
> with my job, which, unfortunately, has things other than computing -
> meetings, budgets, schedules.  At 10 minute runs, I can carve out a few
> hours and get into that "flow state" on the technical problem, before
> being disrupted by "a person from Porlock."
>
> So this is, I think, a classic example of  "I want local control" - sure,
> you might have access to a 1000 or more node cluster, but you're going to
> have to figure out how to use its batch management system (SLURM and PBS
> are two I've used) - and that's a bit different than "self managed 100%
> access". Or, AWS kinds of solutions for EP problems.   There's something
> very satisfying about getting an idea and not having to "ok, now I have to
> log in to the remote cluster with TFA, set up the tunnel, move my data,
> get the job spun up, get the data back" - especially for iterative
> development.  I did do that using JPL's and TACC's clusters, and "moving
> data" proved to be a barrier - the other thing was the "iterative code
> development" in between runs - Most institutional clusters discourage
> interactive development on the cluster (even if you're only sucking up one
> core).   If the tools were a bit more "transparent" and there were "shared
> disk" capabilities, this might be more attractive, and while everyone is
> exceedingly helpful, there are still barriers to making it "run it on my
> desktop"
>
> Another use case that I wind up designing for is the "HPC in places
> without good communications and limited infrastructure" -  The notional
> use case might be an archaeological expedition wanting to use HPC to
> process ground penetrating radar data or something like that.   (or, given
> that I work at JPL, you have a need for HPC on the surface of Mars) - So
> sending your data to a remote cluster isn't really an option.  And here,
> the "speedup" you need might well be a factor of 10-20 over a single
> computer, something doable in a "portable" configuration (check it as
> luggage, for instance). Just as for my antenna modeling problems, turning
> an "overnight" computation into a "10-20 minute"  computation would change
> the workflow dramatically.
>
>
> Another market is "learn how to cluster" - for which the RPi clusters work
> (or "packs" of Beagleboards) - they're fun, and in a classroom
> environment, I think they are an excellent cost effective solution to
> learning all the facets of "bringing up a cluster from scratch", but I'm
> not convinced they provide a good "MIPS/Watt" or "MIPS/liter" metric - in
> terms of convenience.  That is, rather than a cluster of 10 RPis, you
> might be better off just buying a faster desktop machine.
>
> Let's talk design desirements/constraints
>
> I've had a chance to use some "clusters in a box" over the last decades,
> and I'd suggest that while power is one constraint, another is noise.
> Just the other day, I was in a lab and someone commented that "those
> computers are amazingly fast, but you really need to put them in another
> room". Yes, all those 1U and 2U rack mounted boxes with tiny fans
> screaming are just not "office compatible".  And that kind of brings up
> another interesting constraint for "deskside" computing - heat.  Sure you
> can plug in 1500W of computers (or even 3000W if you have two circuits),
> but can you live in your office with a 1500W space heater?
> Interestingly, for *my* workflow, that's probably ok - *my* computation
> has a 10-30% duty cycle - think for 30 minutes, compute for 5-10.  But
> still, your office mate will appreciate if you keep the sound level down
> to 50dBA.
>
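
The duty-cycle point can be made concrete with a small calculation. This is a rough sketch only: `duty_cycle` and `average_heat_watts` are illustrative names, and the `idle_watts` parameter is an assumption, since the post gives no idle power figure.

```python
def duty_cycle(think_minutes, compute_minutes):
    """Fraction of the work cycle during which the machines are computing."""
    return compute_minutes / (think_minutes + compute_minutes)


def average_heat_watts(peak_watts, duty, idle_watts=0.0):
    """Time-averaged heat output for a given duty cycle.

    idle_watts is an assumed placeholder; real nodes draw some power at idle.
    """
    return duty * peak_watts + (1.0 - duty) * idle_watts


if __name__ == "__main__":
    for compute in (5, 10):
        d = duty_cycle(30, compute)
        print(f"think 30, compute {compute}: duty {d:.0%}, "
              f"average {average_heat_watts(1500, d):.0f} W of 1500 W peak")
```

Thinking for 30 minutes and computing for 5 to 10 works out to a duty cycle of roughly 14% to 25%, so the average heat load is a fraction of the 1500 W peak if idle draw is ignored.
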
> GPUs - some codes can use them, some can't.  They tend, though, to be
> noisy (all that air flow for cooling). I don't know that GPU manufacturers
> spend a lot of time on this.  Sure, I've seen charts and specs that claim
> <50 dBA. But I think they're gaming the measurement, counting on the user
> to be a gamer wearing headphones or with a big sound system.  I will say,
> for instance, that the PS/4 positively roars when spun up unless you've
> got external forced ventilation to keep the inlet air temp low.
>
> Looking at GSA guidelines for office space - if it's "deskside" it's got
> to fit in the 50-80 square foot cubicle or your shared part of a 120
> square foot office.
>
> Then one needs to figure out the "refresh cycle time" for buying hardware
> - This has been a topic on this list forever - you have 2 years of
> computation to do: do you buy N nodes today at speed X, or do you wait a
> year, buy N/2 nodes at speed 4X, and finish your computation at the same
> time.
>
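
The buy-now versus wait trade-off above is easy to check numerically. A minimal sketch, assuming perfectly parallel work and treating speed as a simple per-node multiplier; `finish_time_years` and the node count of 8 are illustrative choices, not figures from the post.

```python
def finish_time_years(work_node_years, nodes, speed, wait_years=0.0):
    """Years until the job finishes, counting any time spent waiting to buy.

    work_node_years is the total work measured in years on one node of speed 1.
    """
    return wait_years + work_node_years / (nodes * speed)


if __name__ == "__main__":
    n, x = 8, 1.0                 # assumed: N nodes at speed X today
    work = 2.0 * n * x            # "2 years of computation" on N nodes at speed X
    buy_now = finish_time_years(work, nodes=n, speed=x)
    wait = finish_time_years(work, nodes=n // 2, speed=4 * x, wait_years=1.0)
    print(buy_now, wait)
```

Both options finish at the two-year mark under these assumptions, so the decision turns on cost (half as many nodes) and on whether intermediate results are needed during the first year.
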
> Fancy desktop PCs with monitors, etc. come in at under $5k, including
> burdens and installation, but not including monthly service charges (in an
> institutional environment).  If you look at "purchase limits" there's some
> thresholds (usually around $10k, then increasing in factors of 10 or 100
> steps) for approvals.  So a $100k deskside box is going to be a tough
> sell.
>
>
>
> On 8/24/21, 6:07 AM, "Beowulf on behalf of Douglas Eadline"
> <beowulf-bounces at beowulf.org on behalf of deadline at eadline.org> wrote:
>
>     Jonathan
>
>     It is a real cluster, available in 4 and 8 node versions.
>     The design is for non-data center use. That is, local
>     office, lab, home where power, cooling, and noise
>     are important. More info here:
>
>     https://www.limulus-computing.com
>     https://www.limulus-computing.com/Limulus-Manual
>
>     --
>     Doug
>
>
>
>     > Hi Doug,
>     >
>     > Not to derail the discussion, but a quick question: you say desk-side
>     > cluster. Is it a single machine that will run a VM cluster?
>     >
>     > Regards,
>     > Jonathan
>     >
>     > -----Original Message-----
>     > From: Beowulf <beowulf-bounces at beowulf.org> On Behalf Of Douglas Eadline
>     > Sent: 23 August 2021 23:12
>     > To: John Hearns <hearnsj at gmail.com>
>     > Cc: Beowulf Mailing List <beowulf at beowulf.org>
>     > Subject: Re: [Beowulf] List archives
>     >
>     > John,
>     >
>     > I think that was on twitter.
>     >
>     > In any case, I'm working with these processors right now.
>     >
>     > On the new Ryzens, the power usage is actually quite tunable.
>     > There are three settings.
>     >
>     > 1) Package Power Tracking: The PPT threshold is the allowed socket
>     > power consumption permitted across the voltage rails supplying the
>     > socket.
>     >
>     > 2) Thermal Design Current: The maximum current (TDC) (amps) that can
>     > be delivered by a specific motherboard's voltage regulator
>     > configuration in thermally-constrained scenarios.
>     >
>     > 3) Electrical Design Current: The maximum current (EDC) (amps) that
>     > can be delivered by a specific motherboard's voltage regulator
>     > configuration in a peak ("spike") condition for a short period of time.
>     >
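
For reference, the three limits can be written down as a small data structure. This is a sketch under assumptions: the numbers below are the stock limits commonly cited for AM4 parts at 105 W and 65 W TDP, they are not taken from this thread, and they should be checked against the actual board and BIOS. `PowerLimits` and `ppt_from_tdp` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerLimits:
    ppt_watts: int  # Package Power Tracking: allowed socket power
    tdc_amps: int   # Thermal Design Current: sustained current limit
    edc_amps: int   # Electrical Design Current: short-peak current limit


# Assumed stock values (commonly cited for AM4); verify on real hardware.
STOCK_105W_TDP = PowerLimits(ppt_watts=142, tdc_amps=95, edc_amps=140)
STOCK_65W_TDP = PowerLimits(ppt_watts=88, tdc_amps=60, edc_amps=90)


def ppt_from_tdp(tdp_watts):
    """Assumed rule of thumb: stock PPT is about 1.35 times the rated TDP."""
    return round(1.35 * tdp_watts)


if __name__ == "__main__":
    print(ppt_from_tdp(105), STOCK_105W_TDP)
    print(ppt_from_tdp(65), STOCK_65W_TDP)
```

Under these assumptions, making a 105W-TDP part behave like a 65W-TDP part amounts to entering the lower set of limits.
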
>     > My goal is to tweak the 105W-TDP R7-5800X so it draws power like the
>     > 65W-TDP R5-5600X.
>     >
>     > This is desk-side cluster low power stuff.
>     > I am using an extension cable-plug for the Limulus blades that has an
>     > in-line current meter (normally used for solar panels).
>     > Now I can load them up and watch exactly how much current is being
>     > pulled across the 12V rails.
>     >
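
Converting an in-line meter reading to watts is just P = V x I. A minimal sketch, assuming a nominal 12.0 V rail; the function names and the sample readings are illustrative, and a reading taken on a blade's supply cable covers everything fed from that rail, not only the CPU socket.

```python
def rail_power_watts(current_amps, rail_volts=12.0):
    """Power drawn through the rail for a given meter reading (P = V * I)."""
    return current_amps * rail_volts


def amps_for_power(target_watts, rail_volts=12.0):
    """Meter reading that corresponds to a target power on the rail."""
    return target_watts / rail_volts


if __name__ == "__main__":
    # Example readings only, not measurements from the thread.
    for amps in (5.0, 7.5, 10.0):
        print(f"{amps:4.1f} A -> {rail_power_watts(amps):5.1f} W")
    print(f"65 W on a 12 V rail is about {amps_for_power(65):.2f} A")
```
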
>     > If you need more info, let me know
>     >
>     > --
>     > Doug
>     >
>     >> The Beowulf list archives seem to end in July 2021.
>     >> I was looking for Doug Eadline's post on limiting AMD power and the
>     >> results on performance.
>     >>
>     >> John H
>     >> _______________________________________________
>     >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>     >> Computing To change your subscription (digest mode or unsubscribe)
>     >> visit
>     >> https://urldefense.us/v3/__https://link.edgepilot.com/s/9c656d83/pBaaRl2iV0OmLHAXqkoDZQ?u=https:*__;Lw!!PvBDto6Hs4WbVuu7!f3kkkCuq3GKO288fxeGGHi3i-bsSY5P83PKu_svOVUISu7dkNygQtSvIpxHkE0XDvUGSdHI$
>     >> /beowulf.org/cgi-bin/mailman/listinfo/beowulf
>     >>
>     >
>     >
>     > --
>     > Doug
>     >
>     >
>
>
>     --
>     Doug
>
>
>

-- 
Doug