[Beowulf] Lustre on google cloud

Thu Jul 25 17:26:47 PDT 2019

Dear all, dear Chris,

Thanks for the detailed explanation. We are currently looking into
cloud-bursting, so your email was very timely for me, as I am supposed to
look into it.

One of the issues I can see with our workload is simply getting data into the 
cloud and back out again. We are not talking about a few GB here; we are 
talking about 1 TB or more. For reference: we have 9 PB of storage (GPFS), 
of which we are currently using 7 PB, and there are more than 1000 users 
connected to the system. So cloud bursting would only be possible in some 
cases. 
Do you happen to have a feeling for how to handle the issue with the file sizes 
sensibly? 
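
To give a rough sense of the scale I mean, here is a back-of-envelope sketch
of how long it takes just to move the data. The link speeds and the 70%
efficiency factor are my own assumptions for illustration, not measurements
from our site, and the function name is made up:

```python
def transfer_hours(size_tb: float, link_gbit_s: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move size_tb terabytes over a network link.

    efficiency is an assumed fraction of the nominal line rate that is
    actually achieved end to end (protocol overhead, contention, etc.).
    """
    size_bits = size_tb * 1e12 * 8                      # decimal TB -> bits
    usable_bits_per_s = link_gbit_s * 1e9 * efficiency  # assumed usable rate
    return size_bits / usable_bits_per_s / 3600         # seconds -> hours


if __name__ == "__main__":
    # Hypothetical link speeds in Gbit/s; replace with your real uplink.
    for link in (1, 10, 25, 100):
        print(f"1 TB over {link:>3} Gbit/s: {transfer_hours(1, link):.2f} h")
```

The same cost applies again in the other direction when the results come
back, and any egress charges come on top of that.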

Sorry for hijacking the thread here a bit.

All the best from a hot London

Jörg

On Monday, 22 July 2019, 14:14:13 BST, Chris Dagdigian wrote:
> A lot of production HPC runs on cloud systems.
> 
> AWS is big for this via their AWS ParallelCluster stack, which does
> include Lustre support via the FSx for Lustre service, although they are
> careful to caveat it as staging/scratch space not suitable for
> persistent storage.  AWS has some cool node types now with 25-gigabit,
> 50-gigabit and 100-gigabit network support.
> 
> Microsoft Azure is doing amazing things now that they have the
> Cycle Computing folks on board, integrated and able to call the shots within
> the product space. They actually offer bare-metal HPC and InfiniBand
> SKUs now and have some interesting parallel filesystem offerings as well.
> 
> Can't comment on Google as I've not touched or used it professionally,
> but AWS and Azure for sure are real players now to consider if you have
> an HPC requirement.
> 
> 
> That said, however, a sober cost accounting still shows on-prem or
> "owned" HPC is best from a financial perspective if your workload is
> constant 24x7x365.  Cloud-based HPC is best for capability, bursty
> workloads, temporary workloads, auto-scaling, computing against
> cloud-resident data sets, or the neat new model where, instead of on-prem
> multi-user shared HPC, you go out and deliver individual
> bespoke HPC clusters to each user or team on the cloud.
> 
> The big paradigm shift for cloud HPC is that it does not make a lot of
> sense to make a monolithic stack shared by multiple competing users and
> groups. The automated provisioning and elasticity of the cloud make it
> more sensible to build many clusters so that you can tune each cluster
> specifically for the user or workload and then blow it up when the
> work is done.
> 
> My $.02 of course!
> 
> Chris
> 
> > Jonathan Aquilina <mailto:jaquilina at eagleeyet.net>
> > July 22, 2019 at 1:48 PM
> > 
> > Hi Guys,
> > 
> > I am looking at
> > https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
> > 
> > This basically allows you to deploy a Lustre cluster on Google Cloud.
> > In your HPC setups, have you considered moving towards cloud-based
> > clusters?
> > 
> > Regards,
> > 
> > Jonathan
> > 
> > 
> > 
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing