[Beowulf] cloud: ho hum?

Wed Feb 1 08:03:25 PST 2012

On 2/1/12 7:08 AM, "Mark Hahn" <hahn at mcmaster.ca> wrote:

>in hopes of leaving the moderation discussion behind,
>here's a more interesting topic: cloud wrt beowulf/hpc.
>
>when I meet cloud-enthused people, normally I just explain how
>HPC clustering has been doing PaaS cloud all along.  there are some
>people who run with it though: bioinformatics people mostly, who
>take personal affront to the concept of their jobs being queued.
>(they don't seem to understand that queueing is a function of how
>efficiently utilized a cluster is, and since a cloud is indeed a
>cluster, you get queueing in a cloud as well.)
>
>part of the issue here seems to be that people buy into a couple
>fallacies that they apply to cloud:
> 	- private sector is inherently more efficient.  this is a bit
> 	of a mystery to me, but I guess this is one of the great rhetorical
> 	successes of the neocon movement.  I've looked at Amazon prices,
> 	and they are remarkably high - depending on purchasing model,
> 	about 20x higher than an academic-run research cluster.  why is there
> 	not more skepticism of outsourcing, since it always means your cost
> 	includes one or more corporate profit margins?

'twas ever thus, I suspect.  We get the same thing at JPL.  Whatever
potential inefficiencies there are with academically oriented or
government toilers, the fact that we are non-profit means instantly that
we have a 10% advantage. But we have an overhead of proving we're not
ripping off the taxpayer, and that probably eats up the advantage.

That said, private industry does have some advantages in some
circumstances: They are probably more nimble when it comes to ramping up
manufacturing.  There are definitely inefficiencies in government work,
because of the increased scrutiny that expenditures of tax dollars get.
We bear a heavier burden of proving that we're getting what we paid for,
that the procurement was free and unbiased, etc.  Those $1000 hammer
stories are a case in point.

There are numerous common business to business practices that are outright
illegal when done in a business to government context.  You can argue
about whether the practices are moral or ethical, but the fact of the
matter is that things like finder's fees, profit as a fixed percentage of
job cost, etc are all perfectly legal and common in business.  There are
probably some aspects of this that allow business to perform some task
cheaper than government can, at least in the short run.  That is, business
can externalize some of the costs, while government cannot.

These days, though, industry is paying more for software talent than the
government is (you won't see JPL or civil service offering fresh-out CS
majors $100k/yr + $50k hiring bonus + $100k in RSUs like Facebook is).

I think that when all is said and done, it's about the same.  After all,
everyone is buying the same sand and the same people to do the work.

Any differences are really small scale arbitrage opportunities.

>
> 	- economies of scale: people seem to think that a datacenter at the
> 	scale of google/amazon/facebook is going to be dramatically cheaper.
> 	while I'm sure they get a good deal from their suppliers, I also
> 	doubt it's game-changing.  power, for instance, is a relatively
> 	modest portion of costs, ~10% per year of a server's purchase price.
> 	machineroom cost is pretty linear with number of nodes (power);
> 	people overhead is very small (say, > 1000 servers per fte.)

I suspect that there's a sort of middle ground where "clouding" or "co-lo
hosting" or "rent a rack" is cheaper: someone who has a need for, say,
10-50 machines.  That's really not enough to justify a built-in
infrastructure, but it's too big to "have the receptionist manage it".

The folks running 1000s of servers, they've got the economy of scale built
in, so they'll be making their choice upon small optimizations (cheaper to
buy Amazon time because our electricity rates happen to be high right now)
or because they have a wildly fluctuating need (we need 10,000 CPUs this
week, but none for the next 3 weeks after that).
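
To put rough numbers on the fluctuating-need case: hardware that is busy
one week in four costs about four times its nominal rate per CPU-hour
actually used, so renting wins only when the rental price is below roughly
4x the owned price.  The Python sketch below is only an illustration; the
function names and unit prices are made up, and the 20x ratio is simply
the estimate quoted earlier in this thread, not a measured figure.

```python
def effective_owned_cost(owned_cost_per_cpu_hour, utilization):
    """Cost per CPU-hour actually used when owned hardware is idle part of the time."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return owned_cost_per_cpu_hour / utilization


def cheaper_option(owned_cost_per_cpu_hour, rented_cost_per_cpu_hour, utilization):
    """Return "rent" or "own", comparing cost per CPU-hour of useful work."""
    owned = effective_owned_cost(owned_cost_per_cpu_hour, utilization)
    return "rent" if rented_cost_per_cpu_hour < owned else "own"


# Busy 1 week in 4, as in the example above.
utilization = 1 / 4

# Hypothetical unit price for owned hardware; only the price ratio matters.
owned_price = 1.0

# Rental at a hypothetical 3x the owned price.
print(cheaper_option(owned_price, 3.0, utilization))

# Rental at 20x the owned price (the estimate quoted earlier in the thread).
print(cheaper_option(owned_price, 20.0, utilization))
```

Under these assumptions the first case favors renting and the second
favors owning; the price ratio and the duty cycle together decide it.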

But there are thousands and thousands of medium sized organizations that
could probably benefit from "someone else" providing the computing
infrastructure.  Think of some manufacturing and design company that makes
widgets, but needs some server horsepower to do whatever it is.  Their
business isn't doing sys admin, backups, etc.   They can usefully
outsource that to "the cloud" and focus their efforts on their core
competencies. They can work a deal where someone else does the off-site
backups, etc. and they don't have to worry about it.

(yes, they could also hire a consulting company to do much of this as
well, but for "commodity computing" maybe a generic provider "the cloud"
is a better solution.)
>
>most of all, I just don't see how cloud changes the HPC picture at all.
>HPC is already based on shared resources handling burstiness of demand -
>if anything, cloud is simply slower.  certainly I can't submit a job to
>EC2 that uses half the Virginia zone and expect it to run immediately.
>it's not clear to me whether cloud-pushers are getting real traction with
>the funding agencies (gov is neocon here in Canada.)  it worries me that
>cloud might be framed as "better computing than HPC".
>
>I'm curious: what kind of cloudiness are you seeing?

We've got a big "use the cloud" thing going on at JPL (and within NASA as
well).
To a certain extent, I think (personal opinion here, not JPL's or NASA's)
it's an "everyone is talking about cloud, so we'd better do something with
it, so at least we can comment intelligently".
But it's also useful for bursty load.  We have a real problem with
physical space for more computers amid our aging infrastructure (most of
our buildings are 40-50 years old) and the need for "I must have my hands
on the physical box" is going away, as the interface mechanisms get
smoother and cleaner.

It's all about control, after all. I, who strongly advocate personal
supercomputers under your desk, because nobody is looking over your
shoulder trying to optimize their utilization, find that the concept of
smoothly divisible and scalable compute power available with a network
connection is pretty close to what you want.

The "external control and optimization" aspects that prompt my desire for
personal computing come about when the cost granularity of the system is
sufficiently coarse that a bureaucracy springs up to manage the system,
which inevitably means that the "transaction cost" to get an increment of
computation goes up, and they impose a minimum transaction size that is
substantially larger than my "incremental need".

Example using test equipment.  I might want to use a $100,000 spectrum
analyzer for half a day.  That's a $5,000/month kind of rental, with a 2
month minimum.  I'd happily pay the $50-100 for a half day's use, but
because the system doesn't accommodate short usage, I'm stuck for $10,000
to do my measurement which is worth $100. And there's no effective way for
me to resell the extra 60 days worth of spectrum analyzer availability.
This is because my need patterns are mismatched to the supply patterns.
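
The size of that mismatch can be worked out directly.  A small Python
sketch, assuming a 30-day month and treating pro-rata pricing as the
"fair" price (both are assumptions for illustration, and the function
names are made up):

```python
def minimum_charge(monthly_rate, minimum_months):
    """What the rental house bills, regardless of how little you use."""
    return monthly_rate * minimum_months


def pro_rata_cost(monthly_rate, days_used, days_per_month=30):
    """What the usage would cost if time were sold in arbitrarily small pieces."""
    return monthly_rate * days_used / days_per_month


monthly_rate = 5_000    # dollars per month, from the example above
minimum_months = 2      # minimum rental period
days_used = 0.5         # half a day of measurements

billed = minimum_charge(monthly_rate, minimum_months)
fair = pro_rata_cost(monthly_rate, days_used)
unused_days = minimum_months * 30 - days_used

print(f"billed:       ${billed:,.2f}")
print(f"pro-rata:     ${fair:,.2f}")
print(f"overpayment:  {billed / fair:.0f}x")
print(f"unused days:  {unused_days}")
```

The pro-rata figure lands inside the $50-100 range mentioned above, while
the bill is about two orders of magnitude larger.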

The cloud concept has definitely worked to reduce the "transaction cost".
You can buy an hour's time on 100 CPUs, pretty easily.  Nobody is coming
after you to help chip in for the capital cost on the machine room, or
asking you to buy a month's worth of time.

And, I think that in the HPC world in general, this sort of model has
already existed (and heck, it goes way back to when IBM didn't sell
computers, they sold CPU seconds and Core seconds and I/O Channel
seconds).  But it is totally unfamiliar to a lot of current IT people, who
have never worked with "timesharing" systems.  Their conceptual models are
built around "buy a PC or three or hundred" and then scaled to "buy racks
of servers and put them in a room"  or, "get a loan to buy 1000 servers
and put them in a room", or, perhaps "lease 1000 servers and hire a
room"... All of those are really based upon "buying" (in some sense) a
physical box and providing for its care and feeding.

The big difference in cloud is that you are buying "service" on a fine
scale.

And the term cloud is just a wonderful sexy marketing description that
no doubt came from someone looking at network diagrams.  That "cloud"
bubble has been around for decades to represent "stuff we don't control
and don't care about; it's just there and outside our domain".

>