[Beowulf] Mature open source hierarchical storage management

Jon Forrest jlforrest at berkeley.edu
Fri Oct 23 13:56:17 PDT 2009


Carl Thomas wrote:
> HI all,
> 
> We are currently in the midst of planning a major refresh of our 
> existing HPC cluster.
> It is expected that our storage will consist of a combination of fast 
> fibre channel and SATA based disk and we would like to implement a 
> system whereby user files are automatically migrated to and from slow 
> storage depending on frequency of usage. Initial investigations seem to 
> indicate that larger commercial hierarchical storage management systems 
> vastly exceed our budget.

About 15 years ago I did A LOT of work with various HSMs on Unix.
They were all very fragile, but I think this was mostly due to
one prevailing problem: at the time, the OSs didn't have hooks
in the places necessary for an HSM system to do the right thing.
So HSM vendors had to make custom mods to the kernel, or else
perform other heroics, to fake out the file system so that
migrations between slow and fast storage happened transparently.

When you evaluate current HSM implementations, I would suggest
you check how each one handles this issue today.

Depending on the implementation of the HSM system, it might
use a database to keep track of where things on the slower
media are kept. You might want to make sure that the slower media
is organized so that the data is self-describing in case
the database becomes corrupt. Otherwise, you'd have just a pile
of bits if this happens. Ironically, the system I was using used
a University Ingres database for this. When the database
became corrupt, this was very embarrassing, since I was using
this HSM system for work that the Postgres database group
was doing. This was the same group that had originally developed
Ingres.
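
To make "self-describing" concrete, here is a minimal sketch in Python.
It is not how any particular HSM product works. The function names
(migrate, rebuild_catalog), the content-addressed file naming, and the
".meta.json" sidecar format are all assumptions made up for
illustration. The idea is only that each file on slow storage carries
enough metadata next to it that the catalog can be rebuilt by scanning
the media, without the database.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate(src: Path, slow_root: Path) -> Path:
    """Copy src to slow storage and write a sidecar describing it.

    Hypothetical layout: the stored file is named by its SHA-256 digest,
    and "<digest>.meta.json" records where it originally lived.
    """
    slow_root.mkdir(parents=True, exist_ok=True)
    digest = sha256_of(src)
    dest = slow_root / digest
    shutil.copy2(src, dest)
    st = src.stat()
    meta = {
        "original_path": str(src.resolve()),
        "size": st.st_size,
        "mtime": st.st_mtime,
        "sha256": digest,
    }
    sidecar = dest.with_name(dest.name + ".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2))
    return dest


def rebuild_catalog(slow_root: Path) -> dict:
    """Rebuild the original-path -> stored-file map from sidecars alone."""
    catalog = {}
    for sidecar in sorted(slow_root.glob("*.meta.json")):
        meta = json.loads(sidecar.read_text())
        catalog[meta["original_path"]] = str(sidecar.with_name(meta["sha256"]))
    return catalog
```

If the catalog database is lost, calling rebuild_catalog() on the slow
storage directory recovers the mapping. A real system would also have to
deal with stub files on the fast tier, multiple versions of one path,
and tape rather than a directory tree, none of which this sketch
attempts.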

I hope HSMs have become better since then.

Cordially,
-- 
Jon Forrest
Research Computing Support
College of Chemistry
173 Tan Hall
University of California Berkeley
Berkeley, CA
94720-1460
510-643-1032
jlforrest at berkeley.edu


