[Beowulf] dedupe filesystem

Mark Hahn hahn at mcmaster.ca
Fri Jun 5 07:22:55 PDT 2009


>>> The best of both worlds would certainly be a central, fast storage
>>> filesystem, coupled with a hierarchical storage management system.
>>
>> I'm not sure - is there some clear indication that one level of storage is
>> not good enough?
>
> I guess it strongly depends on your workload and applications. If your users
> tend to keep all their files for long-term purposes, as Bogdan Costescu
> pertinently described earlier, it might make sense to transparently free up
> the fast centralized filesystem and move the unused-at-the-moment-but-still-
> crucially-important files to a slower, farther filesystem (or tapes).

no: my point concerned only whether there is a meaningful distinction
between what you call fast and slow storage.  from a hardware perspective,
there can be differences in latency, though probably not noticed by
anything other than benchmarks or quite specialized apps.  there can also
be aggregate throughput differences: an OST on QDR IB vs Gb ethernet.
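
to put a rough number on that link-level gap, here is a minimal sketch.  it
assumes a 4x QDR IB link (10 Gb/s signaling per lane, 8b/10b encoding) against
plain Gb ethernet, and ignores protocol overhead and whatever the disks behind
the OST can actually sustain.  the function and constant names are
illustrative only.

```python
GIGE_GBPS = 1.0  # nominal Gigabit Ethernet line rate, Gb/s


def qdr_ib_data_rate_gbps(lanes=4, lane_signal_gbps=10.0):
    """Peak data rate of a QDR InfiniBand link in Gb/s.

    Assumes 8b/10b encoding: 8 data bits carried per 10 signaled bits.
    """
    return lanes * lane_signal_gbps * 8 / 10


def link_ratio():
    """How many times faster the IB link is than GigE, at peak link rate."""
    return qdr_ib_data_rate_gbps() / GIGE_GBPS


if __name__ == "__main__":
    print(f"QDR IB 4x: {qdr_ib_data_rate_gbps():.0f} Gb/s data rate")
    print(f"ratio vs GigE: {link_ratio():.0f}x")
```

this is an upper bound on the per-link difference, not a measurement of any
real OST.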

so, an open question: do HPC people still buy high-end disks (small FC/SAS
disks, usually also 10-15k)?  we're putting them in our next MDS, but
certainly not anywhere else.

another question: do HPC people still buy tapes?  I'm personally opposed,
but some apparently reasonable people find them comforting (oddly, because 
they can to some limited extent be taken offline.)

>> this seems like a bad design to me.  I would think (and I'm reasonably
>> familiar with Lustre, though not an internals expert) that if you're going
>> to touch Lustre interfaces at all, you should simply add cheaper,
>> higher-density OSTs, and make more intelligent placement/migration
>> heuristics.
>
> In Lustre, that would be done through OST pools. Eh, isn't this also a feature
> CEA contributed to? :)

is there a documented, programmatic lustre interface to control OST placement
(other than lfs setstripe)?  such an interface would also need a
high-performance way to query the MDS directly (not through the mounted 
FS, which is too slow for any seriously large FS.)
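
for reference, the lfs setstripe baseline can at least be scripted.  the sketch
below only wraps the command-line tool; it is not the programmatic interface
asked about above.  it assumes the short options -c (stripe count), -i
(starting OST index) and -p (OST pool); option letters have changed between
Lustre releases, so check `lfs help setstripe` on your version.  the helper
names, the path and the pool name are hypothetical.

```python
import subprocess


def setstripe_cmd(path, stripe_count=None, start_ost=None, pool=None):
    """Build an `lfs setstripe` argument list without running it."""
    cmd = ["lfs", "setstripe"]
    if stripe_count is not None:
        cmd += ["-c", str(stripe_count)]   # number of OSTs to stripe over
    if start_ost is not None:
        cmd += ["-i", str(start_ost)]      # index of the first OST to use
    if pool is not None:
        cmd += ["-p", pool]                # OST pool name (assumed to exist)
    cmd.append(path)
    return cmd


def set_placement(path, **kwargs):
    """Run `lfs setstripe` on path; raises CalledProcessError on failure."""
    subprocess.run(setstripe_cmd(path, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical path and pool name, printed rather than executed.
    print(" ".join(setstripe_cmd("/mnt/lustre/archive", stripe_count=2, pool="slow")))
```

this sets the layout for new files or directories only; it does not migrate
existing data, and it still goes through the mounted FS.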


