[Beowulf] Lustre on google cloud
Jörg Saßmannshausen
sassy-work at sassy.formativ.net
Thu Jul 25 17:26:47 PDT 2019
Dear all, dear Chris,
thanks for the detailed explanation. We are currently looking into
cloud bursting, so your email was very timely for me, as I am supposed to
look into it.
One of the issues I can see with our workload is simply getting data into the
cloud and back out again. We are not talking about a few gigabytes here; we are
talking about 1 TB or more. For reference: we have 9 PB of storage (GPFS),
of which we are currently using 7 PB, and there are more than 1000 users
connected to the system. So cloud bursting would only be possible in some
cases.
Do you happen to have a feeling for how to handle the issue of the file sizes
sensibly?
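To put the data volumes above in perspective, here is a rough back-of-the-envelope sketch of how long it takes to move 1 TB over links of different speeds. The 25 and 100 Gbit/s figures are the AWS node speeds mentioned in the quoted message; the 1 and 10 Gbit/s uplinks and the 70% link efficiency are assumptions for illustration only, not measurements from our site, and the function name is hypothetical.

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move size_tb (decimal TB) over a link.

    efficiency is an assumed fraction of the nominal line rate that is
    actually achieved end to end (protocol overhead, shared uplink, etc.).
    """
    if size_tb < 0 or link_gbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("size must be >= 0, link speed > 0, 0 < efficiency <= 1")
    size_bits = size_tb * 1e12 * 8                 # decimal terabytes -> bits
    effective_bps = link_gbps * 1e9 * efficiency   # usable bits per second
    return size_bits / effective_bps / 3600        # seconds -> hours


if __name__ == "__main__":
    # Assumed link speeds in Gbit/s; adjust to the real site uplink.
    for gbps in (1, 10, 25, 100):
        print(f"{gbps:>3} Gbit/s: {transfer_hours(1.0, gbps):6.2f} h per TB")
```

The estimate only covers the wire time in one direction; results have to come back out again, and any egress charges from the cloud provider are not modelled here.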
Sorry for hijacking the thread here a bit.
All the best from a hot London
Jörg
On Monday, 22 July 2019, 14:14:13 BST, Chris Dagdigian wrote:
> A lot of production HPC runs on cloud systems.
>
> AWS is big for this via their AWS ParallelCluster stack, which does
> include Lustre support via the FSx for Lustre service, although they are
> careful to caveat it as staging/scratch space not suitable for
> persistent storage. AWS has some cool node types now with 25-gigabit,
> 50-gigabit and 100-gigabit network support.
>
> Microsoft Azure is doing amazing things now that they have the
> Cycle Computing folks on board, integrated and able to call shots within
> the product space. They actually offer bare-metal HPC and InfiniBand
> SKUs now and have some interesting parallel filesystem offerings as well.
>
> Can't comment on Google as I've not touched or used it professionally,
> but AWS and Azure for sure are real players now to consider if you have
> an HPC requirement.
>
>
> That said, however, a sober cost accounting still shows on-prem or
> "owned" HPC is best from a financial perspective if your workload is
> 24x7x365 constant. Cloud-based HPC is best for capability, bursty
> workloads, temporary workloads, auto-scaling, computing against
> cloud-resident data sets or the neat new model where instead of on-prem
> multi-user shared HPC you go out and decide to deliver individual
> bespoke HPC clusters to each user or team on the cloud.
>
> The big paradigm shift for cloud HPC is that it does not make a lot of
> sense to make a monolithic stack shared by multiple competing users and
> groups. The automated provisioning and elasticity of the cloud make it
> more sensible to build many clusters so that you can tune each cluster
> specifically for the user or workload and then blow it up when the
> work is done.
>
> My $.02 of course!
>
> Chris
>
> > Jonathan Aquilina <mailto:jaquilina at eagleeyet.net>
> > July 22, 2019 at 1:48 PM
> >
> > Hi Guys,
> >
> > I am looking at
> > https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts
> >
> > This basically allows you to deploy a Lustre cluster on Google Cloud.
> > In your HPC setups, have you considered moving towards cloud-based
> > clusters?
> >
> > Regards,
> >
> > Jonathan
> >
> >
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > https://beowulf.org/cgi-bin/mailman/listinfo/beowulf