[Beowulf] Suggestions to what DFS to use

Ellis H. Wilson III ellis at ellisv3.com
Mon Feb 13 11:45:05 PST 2017


On 02/13/17 14:00, Greg Lindahl wrote:
> On Mon, Feb 13, 2017 at 07:55:43AM +0000, Tony Brian Albers wrote:
>> Hi guys,
>>
>> So, we're running a small (as in a small number of nodes (10), not
>> storage (170TB)) Hadoop cluster here. Right now we're on IBM Spectrum
>> Scale (GPFS), which works fine and has POSIX support. On top of GPFS we
>> have a GPFS transparency connector so that HDFS uses GPFS.
>
> I don't understand the question. Hadoop comes with HDFS, and HDFS runs
> happily on top of shared-nothing, direct-attach storage. Is there
> something about your hardware or usage that makes this a non-starter?
> If so, that might help folks make better suggestions.

I'm guessing the "POSIX support" is the piece that's missing with a 
native HDFS installation.  You can kinda-sorta get a form of it with 
plug-ins, but it's not a first-class citizen the way it is in most 
DFSes, and when I last used it, it was not performant.  Native HDFS 
makes large datasets expensive to work with in anything but Hadoop-ready 
(largely MapReduce) applications.  If you have a mixed workload, having 
a filesystem that can support both POSIX access and HDFS /without/ 
copies is invaluable.  With extremely large datasets (and 170TB is not 
that huge anymore), copies may be a non-starter.  With dated codebases 
or applications that don't fit the MapReduce model, wholesale movement 
to HDFS may also be a non-starter.
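
To make the copy cost concrete: below is a minimal sketch (Java, 
against the stock org.apache.hadoop.fs.FileSystem API) of the staging 
step that a dual POSIX/HDFS filesystem lets you skip entirely.  The 
namenode address and paths are made up.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class StageOut {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Hypothetical namenode address; substitute your own.
            conf.set("fs.defaultFS", "hdfs://namenode.example:8020");
            FileSystem hdfs = FileSystem.get(conf);

            // Stage a dataset out of HDFS so plain POSIX tools can read
            // it.  At 170TB scale, this duplication is exactly the cost
            // that a filesystem serving both POSIX and HDFS avoids.
            hdfs.copyToLocalFile(new Path("/data/run42"),
                                 new Path("file:///scratch/run42"));
        }
    }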

The questions I feel need to be answered here, to get useful 
suggestions rather than a shotgun blast of random DFSes, are:

1. How much time and effort are you willing to commit to setup and 
administration of the DFS?  For many completely open-source solutions 
(Lustre and HDFS come to mind), setup and, more critically, maintenance 
can become quite heavyweight, and performance tuning can grow to 
summer-grad-student-internship proportions.

2. Are you looking to replace the hardware, or just the DFS?  These 
days, 170TB is at the fringes (IMHO) of what can reasonably fit into a 
single (albeit rather large) box.  It wouldn't be completely unthinkable 
to run all of your storage with ZFS/BTRFS, a very beefy server, 
redundant 10, 25, or 40GbE NICs, some SSD acceleration, a UPS, and 
plain-jane NFS (or your protocol of choice out of most Linux distros). 
You could even host the HDFS daemons on that node, pointing at POSIX 
paths rather than raw devices (see the first sketch after point 3). 
But this falls into the category of "host it yourself," so it might be 
too much work.

3. How committed to HDFS are you (i.e., what features of it do your 
applications actually leverage)?  Many MapReduce applications have zero 
attachment to HDFS whatsoever.  You can reasonably re-point them at a 
POSIX-compliant NAS and they'll "just work" (see the second sketch 
below).  Plus you get cross-protocol access to the files without any 
wizardry, copying, etc.  HBase is a notable case where dependence on 
HDFS is built right into the code, but that's the exception rather than 
the norm.
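
On point 2, the single-box idea: the HDFS daemons' storage directories 
are ordinary POSIX paths, so they can sit on a ZFS pool without 
anything exotic.  In practice these keys go in core-site.xml and 
hdfs-site.xml; the Java Configuration form below is just a compact way 
to show them, and the hostname and paths are hypothetical.

    import org.apache.hadoop.conf.Configuration;

    public class SingleBoxHdfs {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Standard HDFS keys; the values are made up.
            conf.set("fs.defaultFS", "hdfs://bigbox.example:8020"); // the one beefy server
            conf.set("dfs.replication", "1");  // no peers; let ZFS/RAID handle redundancy
            conf.set("dfs.namenode.name.dir", "/tank/hdfs/name");  // POSIX path on the pool
            conf.set("dfs.datanode.data.dir", "/tank/hdfs/data");  // a path, not a raw device

            for (String k : new String[] { "fs.defaultFS", "dfs.replication",
                    "dfs.namenode.name.dir", "dfs.datanode.data.dir" }) {
                System.out.println(k + " = " + conf.get(k));
            }
        }
    }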
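
And on point 3, the re-pointing really is this small for applications 
that stick to the FileSystem API.  A sketch, assuming a hypothetical 
NFS mount at /mnt/nas: set fs.defaultFS to file:/// and the same code 
path that read from HDFS now reads from the NAS.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RepointAtNas {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // file:/// routes through LocalFileSystem, so any POSIX
            // mount (NFS from a NAS, GPFS, ...) works with no code change.
            conf.set("fs.defaultFS", "file:///");
            FileSystem fs = FileSystem.get(conf);

            // /mnt/nas/dataset is a hypothetical NFS-mounted directory.
            for (FileStatus st : fs.listStatus(new Path("/mnt/nas/dataset"))) {
                System.out.println(st.getPath() + "\t" + st.getLen());
            }
        }
    }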

Best,

ellis

Disclaimer: I work for Panasas, a storage appliance vendor.  I don't 
think I'm shamelessly plugging anywhere above, as I love it when people 
host themselves, but it's not for everybody.

