[Beowulf] HPC cloud bursting providers?

Jeff Friedman jeff.friedman at siliconmechanics.com
Thu Feb 23 13:18:57 PST 2017


Looks interesting, thank you.

Jeff Friedman
Sales Engineer
o: 425.420.1291
c: 206.819.2824
www.siliconmechanics.com


On Feb 23, 2017, at 1:14 PM, Chris Dagdigian <dag at sonsorol.org> wrote:


On Amazon you should be looking at CfnCluster:

https://aws.amazon.com/hpc/cfncluster/

The entire HPC stack is:

- Directly supported by AWS
- Written, used and deployed as a CloudFormation template
- Very easy to extend and customize
- Under more active development than MIT StarCluster, which is what we used to use for similar purposes

Normally we use CfnCluster for AWS-only HPC, but it should be amenable to a cloud-bursting scenario, especially if your VPN/VPC links are solid.
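
Day to day you drive it with the cfncluster command line tool, but since it is all CloudFormation underneath you can also launch a stack programmatically. A minimal boto3 sketch; the template URL, key pair, and parameter names below are placeholders, not CfnCluster's real parameter list:

    import boto3

    cfn = boto3.client("cloudformation", region_name="us-east-1")

    # Launch the cluster stack; TemplateURL and Parameters are placeholders.
    stack = cfn.create_stack(
        StackName="hpc-burst-demo",
        TemplateURL="https://s3.amazonaws.com/my-bucket/cluster-template.json",
        Capabilities=["CAPABILITY_IAM"],  # the template creates IAM roles
        Parameters=[
            {"ParameterKey": "KeyName", "ParameterValue": "my-keypair"},
        ],
    )

    # Block until the stack is up; teardown later is
    # cfn.delete_stack(StackName="hpc-burst-demo").
    cfn.get_waiter("stack_create_complete").wait(StackName="hpc-burst-demo")
    print("Stack ID:", stack["StackId"])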

Chris
> Jeff Friedman <jeff.friedman at siliconmechanics.com>
> February 23, 2017 at 3:56 PM
> Thank you all for the info, it is very useful. It seems most of the cloud orchestrator software includes a bit more functionality than we need. We want to use the standard HPC provisioning, scheduling, and monitoring software, and just automate the setup and presentation of the cloud nodes. We are looking into establishing a VPN to AWS, and then continuing to see what software would do the best job of the automated setup/teardown of cloud resources. We are looking at just using AWS CloudFormation as an option. There are also Bright Computing Cluster Manager, Cycle Computing, RightScale, and a couple of others, but again, I think these are a bit too robust for what we need. I’ll keep y’all posted if interested.
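> 
> For the setup/teardown piece itself, the raw calls are small once the VPN-connected VPC exists. A rough boto3 sketch, where every ID is a placeholder:
> 
>     import boto3
> 
>     ec2 = boto3.client("ec2", region_name="us-east-1")
> 
>     # Boot four compute nodes into the VPC subnet that is reachable over
>     # the site-to-site VPN (AMI, type, and subnet are placeholders).
>     resp = ec2.run_instances(
>         ImageId="ami-12345678",
>         InstanceType="c4.8xlarge",
>         MinCount=4,
>         MaxCount=4,
>         SubnetId="subnet-12345678",
>     )
>     ids = [i["InstanceId"] for i in resp["Instances"]]
> 
>     # ... hand the nodes to the scheduler, run the jobs ...
> 
>     # Teardown is a single call.
>     ec2.terminate_instances(InstanceIds=ids)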
> 
> Thanks again!
> 
> Jeff Friedman
> Sales Engineer
> o: 425.420.1291
> c: 206.819.2824
> www.siliconmechanics.com
> 
> 
> On Feb 23, 2017, at 10:49 AM, Lev Lafayette <lev.lafayette at unimelb.edu.au> wrote:
> 
> On Wed, 2017-02-22 at 10:02 +1100, Christopher Samuel wrote:
>> On 21/02/17 12:40, Lachlan Musicman wrote:
>> 
>>> I know that it's been done successfully here by the University of
>>> Melbourne's Research Platforms team, but they are bursting into the
>>> non-commercial Australian government OpenStack installation, Nectar.
> 
> In context, that was after (a) small test cases of cloud bursting worked,
> and (b) cloud bursting was used to replace our existing cloud partition.
> 
>> So now they just provision extra VMs when they need more and add them
>> to Slurm, and given demand doesn't seem to go down there hasn't been a
>> need to take any away yet. :-)
> 
> Watch this space ;)
> 
>> So this doesn't really reflect what Jeff was asking about, as it's all
>> the same infrastructure; it's not hitting remote clouds, where you have
>> to figure out how you are going to see your filesystem there or how to
>> stage data.
>> 
> 
> Very much so. The ability to set up an additional partition on external
> providers (e.g., Amazon, Azure, any OpenStack provider) is much less of a
> problem than the interconnect issues, which are quite significant.
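> 
> The data side is usually handled by staging through object storage rather
> than trying to present the home filesystem to the remote nodes. A rough
> boto3 sketch, with the bucket name and paths as placeholders:
> 
>     from pathlib import Path
> 
>     import boto3
> 
>     s3 = boto3.client("s3")
>     BUCKET = "my-staging-bucket"  # placeholder
> 
>     # Push the job's input tree to S3 before the burst nodes start; the
>     # nodes pull it down at boot instead of mounting home over the WAN.
>     for path in Path("input_data").rglob("*"):
>         if path.is_file():
>             s3.upload_file(str(path), BUCKET, str(path))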
> 
> 
> All the best,
> 
> 
> Christopher Samuel <samuel at unimelb.edu.au>
> February 21, 2017 at 6:02 PM
> 
> We (I help them with this) did try cloud bursting into the Melbourne Uni
> OpenStack instance (the same place that provided the VMs for most of
> Spartan), but had to give up on it because of a bug in Slurm (since
> fixed) and the unreliability of bringing up VMs; from memory we had
> one particular case where it tried to boot about 50 nodes and about 20
> of them failed to start.
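> 
> If you do go down this path, it's worth counting how many of the requested
> instances actually come up rather than assuming the whole batch booted.
> Something like this boto3 sketch (region and instance IDs are placeholders):
> 
>     import time
> 
>     import boto3
> 
>     ec2 = boto3.client("ec2", region_name="us-east-1")
> 
>     def count_running(instance_ids, timeout=600, poll=15):
>         """Wait for instances to leave 'pending', then tally what booted."""
>         deadline = time.time() + timeout
>         states = []
>         while time.time() < deadline:
>             resp = ec2.describe_instances(InstanceIds=instance_ids)
>             states = [i["State"]["Name"]
>                       for r in resp["Reservations"]
>                       for i in r["Instances"]]
>             if "pending" not in states:
>                 break
>             time.sleep(poll)
>         return states.count("running"), len(instance_ids)
> 
>     # up, total = count_running(ids); retry or alert when up < total.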
> 
> So now they just provision extra VMs when they need more and add them
> to Slurm, and given demand doesn't seem to go down there hasn't been a
> need to take any away yet. :-)
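> 
> Slurm's elastic computing support is meant to automate that step: nodes
> are marked State=CLOUD in slurm.conf and a ResumeProgram boots the VM and
> reports its address back via scontrol. A skeletal Python sketch, with the
> AMI and instance type as placeholders; a real ResumeProgram would first
> expand Slurm's hostlist argument (e.g. via "scontrol show hostnames"):
> 
>     #!/usr/bin/env python
>     # Sketch of a Slurm ResumeProgram: the node list arrives as argv[1].
>     import subprocess
>     import sys
> 
>     import boto3
> 
>     ec2 = boto3.resource("ec2", region_name="us-east-1")
> 
>     # Assumes plain comma-separated node names; real hostlist expressions
>     # (e.g. "cloud[01-20]") need expanding with "scontrol show hostnames".
>     for node in sys.argv[1].split(","):
>         inst = ec2.create_instances(ImageId="ami-12345678",     # placeholder
>                                     InstanceType="c4.8xlarge",  # placeholder
>                                     MinCount=1, MaxCount=1)[0]
>         inst.wait_until_running()
>         inst.reload()
>         # Tell Slurm where the new node lives so it can be scheduled.
>         subprocess.check_call(["scontrol", "update",
>                                "NodeName=" + node,
>                                "NodeAddr=" + inst.private_ip_address])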
> 
> So this doesn't really reflect what Jeff was asking about, as it's all
> the same infrastructure; it's not hitting remote clouds, where you have
> to figure out how you are going to see your filesystem there or how to
> stage data.
> 
> cheers!
> Chris
> Lachlan Musicman <datakid at gmail.com>
> February 20, 2017 at 8:40 PM
> I know that it's been done successfully here by the University of Melbourne's Research Platforms team, but they are bursting into the non-commercial Australian government OpenStack installation, Nectar: http://nectar.org.au
> 
> cheers
> L.
> 
> 
> 
> ------
> The most dangerous phrase in the language is, "We've always done it this way."
> 
> - Grace Hopper
> 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
> Jeff Friedman <jeff.friedman at siliconmechanics.com>
> February 20, 2017 at 7:03 PM
> Hi all,
> 
> Has anyone dealt with bursting from an on-prem HPC cluster to the cloud before? Are there any providers that stand out in this category? Quick searches reveal the usual suspects (AWS, Google, etc.). Just wondered what real-world experience has to say… :)
> 
> Thanks!
> 
> Jeff Friedman
> Sales Engineer
> c: 206.819.2824
> www.siliconmechanics.com
> 
> 