[Beowulf] Putting /home on Lustre or GPFS

Joe Landman landman at scalableinformatics.com
Tue Dec 23 09:33:53 PST 2014

On 12/23/2014 12:12 PM, Prentice Bisbal wrote:
> Beowulfers,
> I have limited experience managing parallel filesystems like GPFS or 
> Lustre. I was discussing putting /home and /usr/local for my cluster 
> on a GPFS or Lustre filesystem, in addition to using it just for 
> /scratch. I've never done this before, but it doesn't seem like all 
> that bad an idea. My logic for this is the following:

This is not a great idea ...

> 1. Users often try to run programs from /home, which leads to 
> errors, no matter how many times I tell them not to do that. This 
> would make the system more user-friendly. I could use quotas/policies 
> to 'steer' them to other filesystems if needed.

This is an educational problem more than anything else.  You could 
easily set up their bashrc (or other login files) to cd to $SCRATCH on login.
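A minimal sketch of that login steering, assuming a site-local layout of 
/scratch/$USER (the path, variable name, and function name here are 
assumptions; adjust to your environment):

```shell
# Hypothetical snippet for /etc/profile.d/scratch.sh (path is an
# assumption): drop users into their scratch space at login, but only
# when it exists and is writable, so a broken mount can't break logins.
goto_scratch() {
    local dir="${SCRATCH:-/scratch/$USER}"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        cd "$dir"
    fi
}
goto_scratch
```

Because the cd is conditional, a user whose scratch directory is missing 
simply lands in $HOME as usual.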

> 2. Having one storage system to manage is much better than 3.

True, though having one system increases the need for stability and 
performance of that one file system.

> 3. Profit?
> Anyway, another person in the conversation felt that this would be 
> bad, because if someone was running a job that would hammer the 
> filesystem, it would make the filesystem unresponsive, and keep other 
> people from logging in and doing work. I'm not buying this concern for 
> the following 

This happens.  I've seen it happen.  Many people have seen this happen.  
It does happen.

> reasons:
> If a job can hammer your parallel filesystem so that the login nodes 
> become unresponsive, you've got bigger problems, because that means 
> other jobs can't run on the cluster, and the job hitting the 
> filesystem hard has probably slowed down to a crawl, too.

Note that "hitting the file system hard" could be

a) an IOP storm (millions of small files, think 
bioinformatics/proteomics/*omics codes), which starves the rest of the 
system from an IOP standpoint.  It's always fun to watch these, for a 
schadenfreude definition of the word "fun".  Just try doing a 'df -h .' 
on a directory on a file system being hammered in an IOP storm.  This is 
pretty much the definition of Denial of Service.  It's very annoying 
when your users are denied service.

b) a sudden massive bolus of I/O from a cluster job *really* makes 
people's vi and other sessions ... surprising (again with that 
schadenfreude definition of the word "surprising").  IOPs and stats may 
still work, but so much bandwidth and bulk data is flowing that your 
storage systems can't keep up.  This happens less frequently, and with a 
good design/implementation you can largely mitigate it (c.f. 
https://scalability.org/images/dash-3.png )

c) scratch downtime for any reason now prevents users from using the 
system.  That is, the failure radius is now much larger and less 
localized, impacting people who might otherwise be able to work.

/home should generally be on a reasonably fast and very stable 
platform.  Apply quotas, and active LARTing, with daily/weekly/monthly 
metrics so that it doesn't get abused.  /usr/local has a similar issue 
(though you can simply NFS-mount it read-only in most cases).
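As a starting point for those daily/weekly/monthly metrics, here is a 
hedged sketch of a per-user /home usage report (the layout, function 
name, and du-based approach are assumptions, not a site standard; a real 
deployment would likely use the filesystem's quota reporting instead):

```shell
# Hypothetical daily report for the LART routine: print
# "<kilobytes> <dir>" for each top-level home directory, largest first.
# The /home default is an assumption -- pass your actual base path.
home_report() {
    local base="${1:-/home}"
    du -sk "$base"/*/ 2>/dev/null | sort -rn
}
```

Run it from cron and mail the top of the list to yourself (or to the 
offenders) so abuse is visible before the disk fills.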

You can do it, but beware the issues.

> I know there are some concerns  with the stability of parallel 
> filesystems, so if someone wants to comment on the dangers of that, 
> too, I'm all ears. I think that the relative instability of parallel 
> filesystems compared to NFS would be the biggest concern, not 
> performance.

Performance is always a concern (see point "a" above).  Most of these 
things can be handled with education, and some automation (login and 
batch automatically generate a temp directory and chdir the user into 
it, with a failure test built into the login, so that if the PFS is 
down, it reverts to $HOME).
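A rough sketch of that login automation with the failure test built in 
(PFS_ROOT, the per-user directory layout, and the function name are all 
assumptions; wire it into your login and batch prologs as appropriate):

```shell
# Hypothetical login/batch-prolog logic: create a per-user work dir on
# the parallel FS and cd into it; if the PFS is down or unwritable,
# fall back to $HOME so logins keep working.
enter_workdir() {
    local pfs="${PFS_ROOT:-/scratch}"   # PFS_ROOT is an assumed knob
    local work="$pfs/$USER"
    if mkdir -p "$work" 2>/dev/null && cd "$work" 2>/dev/null; then
        return 0
    fi
    cd "${HOME:-/}"   # failure test tripped: revert to home
}
enter_workdir
```

The key design point is that every PFS operation is allowed to fail 
quietly; the user always ends up in a working directory somewhere.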

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
twtr : @scalableinfo
phone: +1 734 786 8423 x121
cell : +1 734 612 4615
