[Beowulf] HPC in the cloud question

Joe Landman landman at scalableinformatics.com
Fri May 8 07:17:00 PDT 2015

On 05/08/2015 10:04 AM, Jason Ingram wrote:
> Azure does offer InfiniBand-based VMs, and CentOS is one of their
> six primary distributions.
> http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-endorsed-distributions/
> I wish I had more to offer on the subject. I joined this community
> as a personal choice to try to learn more about HPC and Beowulf-type
> clusters (I am very new to that technology area).  I am an Azure
> architect, though, so I am happy to answer questions regarding Azure.

The big issue for performance systems will be how thin/performant the 
link to the bare metal/silicon resources is.

Clouds are fantastic capacity machines, and if you have workloads that 
match that, great.  General clouds are not good on the capability side.  
You need a very specific architecture/implementation for them to make 
sense in this regard.

Generally (though there are a few special cases) hypervisor-based virtualization 
isn't as performant as "bare metal".  This is one of several reasons why 
containers are so interesting to so many people.  Paravirtualization is 
simply not performant, and is largely an infrastructure density 
play, where fundamental app performance isn't the major issue.

For very high performance cloud/on-demand architectures, you need 
something very close to the metal.  There you have a more limited set of 
choices, including our host's Penguin on Demand system (or to toot our 
own horn in financial services, Lucera).  There the virtualization (if 
it exists) is very closely coupled to the bare metal or containerized so 
you don't get the performance degradation common in many cloud designs.

The danger with cloud (and pretty much every other technology) is 
believing the hype and assuming it's a silver bullet to solve every 
problem.  In HPC it's more along the lines of "it depends", usually on 
the use case.  For non-performance sensitive workloads, it can be 
fantastic.  For performance sensitive workloads, you need to be careful 
where you apply it.

FWIW:  there's a strong argument to be made that workloads are generally 
becoming more performance sensitive given the volume of data people are 
manipulating, so there will be pressure on cloud builders to adopt 
architectures/implementations more along the lines of what we at 
Lucera, and others, have built.

My $0.02USD, and note that my biases should be quite obvious.

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615
