[Beowulf] Putting /home on Lustre or GPFS

Jeff Johnson jeff.johnson at aeoncomputing.com
Tue Dec 23 09:22:04 PST 2014


1. A little administrative 'tough love' isn't a bad thing. That holds 
even if you unify everything under Lustre or GPFS. That same user could 
use up all of the inodes in your Lustre MDT just as easily as they can 
implode your NFS with reckless usage.
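
To make the inode point concrete, here is a minimal Python sketch of an inode headroom check. It assumes the filesystem is mounted on the node where it runs and reports inode counts through statvfs. The function names, the /home path, and the 90% threshold are all hypothetical, picked for illustration only. On a Lustre client I would expect these counts to track MDT capacity, but treat that as an assumption and compare against `lfs df -i` before trusting it.

```python
import os


def inode_fraction(total, free):
    """Fraction of inodes in use, given total and free counts."""
    if total <= 0:
        # Some filesystems report 0 total inodes; treat as "unknown/empty".
        return 0.0
    return (total - free) / total


def inode_usage(path):
    """Return (used, total, fraction_used) for the filesystem holding path."""
    st = os.statvfs(path)          # POSIX statvfs on the mount containing path
    total = st.f_files             # total inodes reported by the filesystem
    free = st.f_ffree              # free inodes reported by the filesystem
    return total - free, total, inode_fraction(total, free)


def inode_warning(path, threshold=0.90):
    """True if inode usage is at or above threshold (hypothetical 90% default)."""
    _, _, frac = inode_usage(path)
    return frac >= threshold


if __name__ == "__main__":
    # "/home" is an assumed mount point; substitute your own.
    used, total, frac = inode_usage("/home")
    print(f"/home inodes: {used}/{total} ({frac:.1%} used)")
```

Something like this run from cron will tell you the inodes are going before the filesystem does, which is a better time for the 'tough love' conversation.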

I have seen several instances of /home running on Lustre. Just know the 
tradeoffs up front and, if you are comfortable with them, do it. Given 
Lustre's difficulty with small-block random I/O, it can be more robust 
to keep separate filesystems (NFS and Lustre) for different kinds of 
I/O. That all depends on your NFS infrastructure being able to endure 
normal /home usage plus small-block/random jobs.

Just my $.02 worth.


On 12/23/14 9:12 AM, Prentice Bisbal wrote:
> Beowulfers,
>
> I have limited experience managing parallel filesystems like GPFS or 
> Lustre. I was discussing putting /home and /usr/local for my cluster 
> on a GPFS or Lustre filesystem, in addition to using it just for 
> /scratch. I've never done this before, but it doesn't seem like all 
> that bad an idea. My logic for this is the following:
>
> 1. Users often try to run programs from within /home, which leads to 
> errors, no matter how many times I tell them not to do that. This 
> would make the system more user-friendly. I could use quotas/policies 
> to 'steer' them to other filesystems if needed.
>
> 2. Having one storage system to manage is much better than 3.
>
> 3. Profit?
>
> Anyway, another person in the conversation felt that this would be 
> bad, because if someone was running a job that would hammer the 
> filesystem, it would make the filesystem unresponsive, and keep other 
> people from logging in and doing work. I'm not buying this concern for 
> the following reasons:
>
> If a job can hammer your parallel filesystem so that the login nodes 
> become unresponsive, you've got bigger problems, because that means 
> other jobs can't run on the cluster, and the job hitting the 
> filesystem hard has probably slowed down to a crawl, too.
>
> I know there are some concerns with the stability of parallel 
> filesystems, so if someone wants to comment on the dangers of that, 
> too, I'm all ears. I think that the relative instability of parallel 
> filesystems compared to NFS would be the biggest concern, not 
> performance.
>


-- 
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-performance Computing / Lustre Filesystems / Scale-out Storage


