[Beowulf] Lustre on google cloud

Thu Jul 25 21:00:45 PDT 2019

On 7/25/19 8:26 PM, Jörg Saßmannshausen wrote:
> Dear all, dear Chris,
>
> thanks for the detailed explanation. We are currently looking into
> cloud-bursting, so your email was very timely for me as I am supposed to
> look into it.
>
> One of the issues I can see with our workload is simply getting data into the
> cloud and back out again. We are not talking about a few gigabytes here; we
> are talking about 1 TB or more. For reference: we have 9 PB of storage (GPFS),
> of which we are currently using 7 PB, and there are 1000+ users
> connected to the system. So cloud bursting would only be possible in some
> cases.
> Do you happen to have a feeling for how to handle the issue with the file sizes
> sensibly?

The hard part is bursting with large data sets. You might be able to
pre-stage some portion of the data set in a public cloud, and then burst
jobs from there. Data motion between sites is going to be the hard
problem in the mix: not technically hard, but hard from a cost/time
perspective.
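
To put rough numbers on the cost/time point, here is a back-of-the-envelope
sketch. Everything in it is an assumption for illustration: the function
names, the link speeds, the 70% link efficiency, and especially the
per-GB egress price, which is a placeholder and not any provider's actual
rate. Plug in your own measured bandwidth and your provider's published
pricing.

```python
def transfer_time_hours(size_tb, link_gbps, efficiency=0.7):
    """Estimate hours needed to move size_tb (decimal TB) over a link.

    efficiency is an assumed fraction of nominal bandwidth actually
    achieved (protocol overhead, contention); measure it, don't trust it.
    """
    size_bits = size_tb * 1e12 * 8             # decimal terabytes -> bits
    effective_bps = link_gbps * 1e9 * efficiency
    return size_bits / effective_bps / 3600    # seconds -> hours


def egress_cost_usd(size_tb, usd_per_gb):
    """Estimate the cost of pulling size_tb (decimal TB) back out of a cloud.

    usd_per_gb is a hypothetical price; look up the real one.
    """
    return size_tb * 1000 * usd_per_gb


if __name__ == "__main__":
    # Hypothetical scenarios: 1 TB and 10 TB over 1 Gb/s and 10 Gb/s links.
    for size_tb in (1, 10):
        for link_gbps in (1, 10):
            hours = transfer_time_hours(size_tb, link_gbps)
            print(f"{size_tb:>3} TB over {link_gbps:>2} Gb/s: "
                  f"~{hours:.1f} h each way")
        # 0.10 USD/GB is a placeholder price, not a quote.
        print(f"{size_tb:>3} TB egress at 0.10 USD/GB: "
              f"~{egress_cost_usd(size_tb, 0.10):.0f} USD")
```

If the transfer time is comparable to or longer than the job's run time,
bursting that job probably does not pay off unless the data has been
pre-staged ahead of time.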

-- 
Joe Landman
e: joe.landman at gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman