[Beowulf] integrating node disks into a cluster filesystem?
jlb17 at duke.edu
Fri Sep 25 15:32:36 PDT 2009
On Fri, 25 Sep 2009 at 6:09pm, Mark Hahn wrote:
> but since 1U nodes are still the most common HPC building block, and most of
> them support 4 LFF SATA disks with very little added cost (esp using the
> chipset's integrated controller), is there a way to integrate them into a
> whole-cluster filesystem?
This is something I've considered/toyed-with/lusted after for a long
while. I haven't pursued it as much as I could have because the clusters
I've run to this point have generally run embarrassingly parallel jobs,
and I train the users to cache data-in-progress to scratch space on the
nodes. But there's a definite draw to a single global scratch space that
scales automatically with the cluster itself.
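
A minimal sketch of that kind of per-node staging, in Python. The names
(local_scratch, run_staged) and the fallback to $TMPDIR are illustrative
assumptions, not what any particular cluster or scheduler actually uses:

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def local_scratch(base=None, prefix="job-"):
    """Yield a private directory on node-local scratch; remove it on exit.

    `base` is a hypothetical parameter: if unset, fall back to $TMPDIR
    or the system temp directory.
    """
    base = base or os.environ.get("TMPDIR") or tempfile.gettempdir()
    workdir = tempfile.mkdtemp(prefix=prefix, dir=base)
    try:
        yield workdir
    finally:
        # Clean up even if the job fails, so scratch does not fill up.
        shutil.rmtree(workdir, ignore_errors=True)


def run_staged(shared_input, shared_output_dir, work, scratch_base=None):
    """Stage input to local scratch, run `work` there, copy the result back.

    `work(local_input, workdir)` must return the path of its output file.
    """
    with local_scratch(scratch_base) as workdir:
        local_in = shutil.copy2(shared_input, workdir)       # stage in
        local_out = work(local_in, workdir)                  # intermediate I/O stays local
        os.makedirs(shared_output_dir, exist_ok=True)
        return shutil.copy2(local_out, shared_output_dir)    # stage out
```

The shared filesystem is touched only twice per job (one read, one
write); everything in between hits the node's own disks. A global
scratch space would make the explicit stage-in/stage-out unnecessary.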
> - obviously want to minimize the interference of remote IO to a node's jobs.
> for serial jobs, this is almost moot. for loosely-coupled parallel jobs
> (whether threaded or cross-node), this is probably non-critical. even for
> tight-coupled jobs, perhaps it would be enough to reserve a core for
> admin/filesystem overhead.
I'd also strongly consider a separate network for filesystem I/O.
> - distributed filesystem (ceph? gluster? please post any experience!) I
> know it's possible to run oss+ost services on a lustre client, but not
> recommended because of the deadlock issue.
I played with PVFS1 a bit back in the day. My impression at the time was
that they were focused on MPI-IO, and the POSIX layer was a bit of an
afterthought -- access with "regular" tools (tar, cp, etc.) was pretty
slow. I don't know what the situation is with PVFS2. Anyone?
> - this is certainly related to more focused systems like google/mapreduce.
> but I'm mainly looking for more general-purpose clusters - the space would
> be used for normal files, and definitely mixed read/write with something
> close to normal POSIX semantics...
It seems we're after the same thing.
QB3 Shared Cluster Sysadmin