[Beowulf] cloud: ho hum?
landman at scalableinformatics.com
Wed Feb 1 07:45:42 PST 2012
On 02/01/2012 10:08 AM, Mark Hahn wrote:
> in hopes of leaving the moderation discussion behind,
> here's a more interesting topic: cloud wrt beowulf/hpc.
> when I meet cloud-enthused people, normally I just explain how
> HPC clustering has been doing PaaS cloud all along. there are some
> people who run with it though: bioinformatics people mostly, who
> take personal affront to the concept of their jobs being queued.
Heh ... to put it mildly, this subset of HPC users tends to be more prone
to fads than a fair number of others. As often as not, we have to work to
solve the real problem in part by helping to unmask the real problem
(and move past the perceptions of what some CS person told them the
> (they don't seem to understand that queueing is a function of how
> efficiently utilized a cluster is, and since a cloud is indeed a
> cluster, you get queueing in a cloud as well.)
Sort of, but the illusion in a cloud is that it's all theirs, regardless
of whether it's emulated, virtualized, or bare metal.
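To put rough numbers on the quoted point that queueing follows from
utilization, here is a minimal sketch using the textbook M/M/1
single-server queue. That model is a deliberate oversimplification of a
real multi-node scheduler, and the function name and the one-hour job
length are assumptions made only for illustration.

```python
def mm1_mean_wait(utilization, service_rate):
    """Mean time a job waits in queue (before starting) in an M/M/1 queue.

    utilization  -- fraction of time the server is busy, in [0, 1)
    service_rate -- jobs completed per unit time when busy (must be > 0)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    # Textbook M/M/1 result: Wq = rho / (mu * (1 - rho))
    return utilization / (service_rate * (1 - utilization))


if __name__ == "__main__":
    # Hypothetical workload: jobs that take one hour each on average.
    for rho in (0.5, 0.8, 0.9, 0.95):
        wait = mm1_mean_wait(rho, 1.0)
        print(f"utilization {rho:.2f}: mean wait {wait:.1f} h")
```

The wait grows without bound as utilization approaches 1, whether the
hardware is called a cluster or a cloud. A provider can hide this only by
keeping spare capacity idle, and that idle capacity is part of what the
customer pays for.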
> part of the issue here seems to be that people buy into a couple
> fallacies that they apply to cloud:
> - private sector is inherently more efficient. this is a bit
> of a mystery to me, but I guess this is one of the great rhetorical
> successes of the neocon movement. I've looked at Amazon prices,
I'll ignore the obvious (and profoundly incorrect) political stance
here, and focus upon the (failed) economic argument. Yes, the
competitive private sector is *always* more efficient at delivering
goods and services than the non-competitive government sector. The only
time the private sector is less efficient is when there is no meaningful
competition. In that case the consumers of a good or service pay
pricing set not by competitive forces, but by the preference of the
dominant vendor, which does not need to compete to win the business.
For example, in desktop software environments, for the better part of 20
years, Microsoft has been the dominant player, and has had complete
freedom to set whatever pricing it wishes. Now that it faces
competitive pressure on several fronts, you are seeing pricing starting
to react accordingly to market forces.
Economics 101 applies: Competitive market forces enable efficient
markets. Non-competitive market forces don't.
> and they are remarkably high - depending on purchasing model,
> about 20x higher than an academic-run research cluster. why is there
Hmmm ... I don't think you are taking everything into account, and more
to the point, you are not comparing apples to apples. Compare Amazon
to CRL to Joyent to Sabalcore to ... . You will find competitive
pricing among these for similar use cases. In all cases, your up front
costs and recurring costs are capped. You want to use 10k nodes for 1
hour, you can. And it won't cost you 10k nodes of capital +
infrastructure, power, cooling, ... to make it happen. You want 10k
nodes for one hour at an academic site? Get in line, and someone has to
have laid out the capex for all of this. Just because you don't see
this direct cost, or because the chargeback to you as an end user doesn't
reflect cost recovery and a profit (the latter being irrelevant for most
academic sites), doesn't mean it "costs 1/20 as much". It means you
haven't accounted for the real costs correctly.
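A minimal sketch of the accounting being argued here, in Python. Every
dollar figure, lifetime, and utilization below is a hypothetical
placeholder, not a real price from Amazon or a real budget from any
academic site, and the function names are made up for this example.

```python
def owned_cost_per_node_hour(capex_per_node, lifetime_years,
                             opex_per_node_year, utilization):
    """Fully loaded cost of one *used* node-hour on owned hardware.

    Capex is amortized over the hardware lifetime, opex (power, cooling,
    people, space) is added per year, and the total is spread over only
    the hours the node is actually busy.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours_per_year = 24 * 365
    yearly_cost = capex_per_node / lifetime_years + opex_per_node_year
    return yearly_cost / (hours_per_year * utilization)


def burst_cost(nodes, hours, price_per_node_hour):
    """Pay-per-use cost of a one-off burst: no capital outlay."""
    return nodes * hours * price_per_node_hour


def capital_to_own(nodes, capex_per_node):
    """Up-front capital needed to own enough nodes for the same burst."""
    return nodes * capex_per_node


if __name__ == "__main__":
    # Hypothetical inputs: $3000 per node, 3-year life, $600/year opex.
    owned = owned_cost_per_node_hour(3000, 3, 600, utilization=0.8)
    print(f"owned, fully loaded: ${owned:.3f} per used node-hour")
    # Hypothetical rental price of $1.00 per node-hour.
    print(f"10k nodes for 1 hour, rented: ${burst_cost(10_000, 1, 1.00):,.0f}")
    print(f"10k nodes, bought outright:   ${capital_to_own(10_000, 3000):,.0f}")
```

Under these made-up inputs the owned node-hour is cheaper per hour, but
only once someone has paid the capital and kept the machine busy. The
one-hour burst shows the other side: the rental bill is a small fraction
of the capital needed to own the same capacity.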
> not more skepticism of outsourcing, since it always means your cost
> includes one or more corporate profit margins?
... and is corporate profit a bad thing? Seriously?
There is a cost associated with you not taking the capital charge for
the systems you use, or for the OPEX of using them. Or for the other
indirect costs surrounding the rest of this. You are paying for the
privilege of keeping your costs down. So, for an academic user that has
to obtain 10k CPU hours on 1000 CPUs in order to solve their problem,
they can a) sign on to and get a grant for SHARCNET and others, which
involve some sort of chargeback mechanism (because SHARCNET and others
have to pay for their power, cooling, data, people), b) build their own
cluster (which makes sense only if you do many runs), or c) buy it from
Amazon/CRL/Sabalcore/..., pay only for what they use, and start
running right away.
So which one makes the most sense? Rhetorical question to a degree as
it depends strongly upon the use case, the grant needs, etc.
> - economies of scale: people seem to think that a datacenter at the
> scale of google/amazon/facebook is going to be dramatically cheaper.
It generally is.
> while I'm sure they get a good deal from their suppliers, I also
> doubt it's game-changing. power, for instance, is a relatively
> modest portion of costs, ~10% per year of a server's purchase price.
Then why do Google et al colocate their data centers near cheap power if
power is only a modest/minute fraction of the total cost? TCO matters,
and if you have to pay for power 24x7 during the life of the system, you
want to minimize this cost. Multiply the cost of power for 1 server by
100k, add in other bits, and this modest fraction starts adding up to
significant amounts (and fractions of the total cost), very quickly. It
can be game changing. Which is why they locate their data centers where
there is an optimum (minimizing total lifetime costs of power, taxes,
etc.) as compared with the nearby data center where you pay a premium
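A rough sketch of the arithmetic behind the power argument. The "10% of
purchase price per year" figure and the 100k server count come from this
thread. The server price, the lifetime, and the ratio between power
prices at two sites are hypothetical values chosen only to show the
scaling, and the function names are made up for this example.

```python
def fleet_power_cost(servers, server_price, power_fraction_per_year, years):
    """Total power spend over the life of a fleet.

    power_fraction_per_year -- yearly power cost as a fraction of the
                               server's purchase price (0.10 = 10%)
    """
    return servers * server_price * power_fraction_per_year * years


def savings_from_cheaper_power(baseline_cost, price_ratio):
    """Money saved by siting where power costs price_ratio times baseline.

    price_ratio -- cheap-site power price divided by baseline price
    """
    if not 0 <= price_ratio <= 1:
        raise ValueError("price_ratio must be in [0, 1]")
    return baseline_cost * (1 - price_ratio)


if __name__ == "__main__":
    # Hypothetical: $3000 servers kept for 4 years.
    one = fleet_power_cost(1, 3000, 0.10, 4)
    fleet = fleet_power_cost(100_000, 3000, 0.10, 4)
    print(f"power, 1 server over 4 years:     ${one:,.0f}")
    print(f"power, 100k servers over 4 years: ${fleet:,.0f}")
    # Hypothetical: the cheap site pays half the baseline power price.
    saved = savings_from_cheaper_power(fleet, 0.5)
    print(f"saved by halving the power price: ${saved:,.0f}")
```

A cost that looks modest per server becomes a very large absolute number
at fleet scale, so even a partial reduction in the power price can be
worth more than the cost of building in a less convenient location.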
> machineroom cost is pretty linear with number of nodes (power);
> people overhead is very small (say, > 1000 servers per FTE.)
> most of all, I just don't see how cloud changes the HPC picture at all.
> HPC is already based on shared resources handling burstiness of demand -
Not all HPC is this way. Actually most isn't.
> if anything, cloud is simply slower. certainly I can't submit a job to
> EC2 that uses half the Virginia zone and expect it to run immediately.
> it's not clear to me whether cloud-pushers are getting real traction with
> the funding agencies (gov is neocon here in Canada.) it worries me that
> cloud might be framed as "better computing than HPC".
> I'm curious: what kind of cloudiness are you seeing?
Quite a bit. People are looking at clouds for private use with trivial
extension to public usage for computing. We are seeing huge amounts of
private storage cloud builds.
Cloud is ASP v3 (or v4 if you count clusters). In ASPs, large external
high cost gear was centralized. Economics simply didn't work for it and
this model died. Clusters started around then. Grid/Utility Computing
started around then, and Amazon launched their offering at the notional
end of this market. Grid was largely a bust from a commercial view, as
it again had bad economics. Clusters were in full blossom then.
Economics favored them. If you like to look at Clusters as ASP v3, you
can, though they've been running alongside the fads. Cloud is ASP
v3 or v4 (if you say clusters were v3). Natural evolution of taking a
cluster, putting a VM on demand on it, or running something bare metal
on it. Where it's located matters to a degree, and data motion is still
the hardest problem, and it's getting harder. This is why private data
clouds (and computing clouds) are getting more popular.
This said, like all other fads/trends, Cloud is (massively over-)hyped.
It has value, it has staying power (unlike grid, ASP, ...). It solves
a specific set of problems, and does so well, and you pay a premium for
solving that set of problems in that manner. We see more folks
building private clouds (e.g. clusters with more intelligent
allocation/provisioning) than we see people running exclusively on the cloud.
In financial services, we've had customers tell us how wonderful it was
(from a convenience view) and how awful it was (from a performance
view). It matters more to people who care about getting cycles than for
people who care about getting really good single CPU performance. Cloud
is a throughput engine, and this mode of operation is becoming more
important over time. Even in HPC. Especially with BigData (hey, wanna
talk about a massively over-hyped term? There's one for ya ... the
hype masks the real issues, and this is a shame, but such is life).
And for what it's worth, VCs are positively throwing money at cloud/big
data companies. This doesn't make it better. Probably worse. But
that's a whole other discussion.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615