[Beowulf] Mature open source hierarchical storage management

Nifty Tom Mitchell niftyompi at niftyegg.com
Tue Oct 27 18:02:03 PDT 2009

On Fri, Oct 23, 2009 at 04:12:11PM +1100, Carl Thomas wrote:
> Date: Fri, 23 Oct 2009 16:12:11 +1100
>    We are currently in the midst of planning a major refresh of our existing
>    HPC cluster.


Do add "PowerFile" to your research list.


My back-of-the-email-envelope view is that what you are building should
have quick cluster disks for binary objects, libs, swap, /scratch and /tmp,
plus a largish NFS RAID-based filesystem with an archival back end.  Perhaps
a large, slow spinning-disk staging RAID in the middle or off to the side too.

There are multiple "delta equations" that
you need to evaluate.  I know I missed some:

   - delta file change (GB/day).
   - performance delta at each layer.
   - cost delta at each layer.
   - management cost delta.
   - operational cost delta.
   - cost of compliance -- what the law requires, by method.
   - cost of physical storage on and off site, include handling and shipping.
   - cost of user training delta.
   - cost of expansion delta.
   - cost of necessary bandwidth, by layer.
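
As a rough sketch of how a couple of these deltas might be put into
numbers (Python; every tier name, capacity, price and change rate below
is a made-up placeholder, not a measurement or a vendor quote, and the
two helper functions are only one assumed way to frame the arithmetic):

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capacity_gb: float   # usable capacity (hypothetical figure)
    cost_per_gb: float   # purchase cost per GB (hypothetical figure)


def days_until_full(tier: Tier, used_gb: float, delta_gb_per_day: float) -> float:
    """Days before the tier fills at a constant daily file-change rate."""
    if delta_gb_per_day <= 0:
        return float("inf")
    return max(tier.capacity_gb - used_gb, 0.0) / delta_gb_per_day


def cost_delta(upper: Tier, lower: Tier, gb: float) -> float:
    """Purchase cost saved by holding `gb` on the lower tier instead of the upper."""
    return (upper.cost_per_gb - lower.cost_per_gb) * gb


# Placeholder numbers only.
nfs_raid = Tier("nfs-raid", capacity_gb=100_000, cost_per_gb=0.50)
archive = Tier("archive", capacity_gb=1_000_000, cost_per_gb=0.125)

print(days_until_full(nfs_raid, used_gb=40_000, delta_gb_per_day=200))  # 300.0
print(cost_delta(nfs_raid, archive, gb=10_000))                         # 3750.0
```

The same pattern extends to the other items (management, shipping,
bandwidth) once you have your own per-layer figures to plug in.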

Clusters are unique in that they have the potential to host
their own distributed RAID (Lustre, Gluster, ZFS), and with
a sufficient archival back end life could be good.
So select systems that you can add a second disk to.
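
For a feel of what those second disks might add up to, a small sketch
(Python; the node count, disk size and replica count are placeholders,
and simple N-way replication is an assumption here; real filesystems
lose further space to metadata and differ in how they lay out redundancy):

```python
def pooled_capacity_tb(nodes: int, disk_tb: float, replicas: int = 2) -> float:
    """Usable capacity when every node contributes one disk and each
    block is stored `replicas` times across the cluster."""
    if replicas < 1 or nodes < replicas:
        raise ValueError("need replicas >= 1 and nodes >= replicas")
    return nodes * disk_tb / replicas


# Placeholder cluster: 64 nodes, one extra 1 TB disk each, 2 copies of everything.
print(pooled_capacity_tb(nodes=64, disk_tb=1.0, replicas=2))  # 32.0
```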

Choice of filesystem can help too (see DMAPI and friends).

Have fun.
