[Beowulf] GPFS question

Jörg Saßmannshausen sassy-work at sassy.formativ.net
Mon Apr 29 14:33:57 PDT 2019


Dear all,

just a quick question regarding GPFS:
we are running a 9 PB GPFS storage system at work, of which around 6-7 PB are 
used. It is a single file system with several filesets defined on it.
During our routine checks we found that:
$ mmhealth node show -N all
reports this problem:

fserrinvalid(FOO)

(where FOO is the file system).

Our vendor suggested running an online check:

$ mmfsck FOO -o -y

which is still running. 
Today the vendor suggested taking the GPFS file system offline and running the 
above command without the -o option, which would mean an outage. 

So my simple question is: has anybody ever done that on such a large file 
system, and roughly how long would it take? Every time I ask this question I 
get told: a long time! 
Our vendor told us we could use, for example, 
--threads 128
as opposed to the normally used 16 threads, so I am aware my mileage will vary 
here a bit, but I would just like a guesstimate of the time. 

Many thanks for your help here!

All the best from London

Jörg


