[Beowulf] Mature open source hierarchical storage management
Nifty Tom Mitchell
niftyompi at niftyegg.com
Tue Oct 27 18:02:03 PDT 2009
On Fri, Oct 23, 2009 at 04:12:11PM +1100, Carl Thomas wrote:
> Date: Fri, 23 Oct 2009 16:12:11 +1100
> We are currently in the midst of planning a major refresh of our existing
> HPC cluster.
Carl,
Do add "PowerFile" to your research list.
http://www.powerfile.com/
My back-of-the-envelope view of what you are doing: quick local cluster
disks for binaries, libs, swap, /scratch and /tmp, plus a largish NFS
RAID-based filesystem with an archival back end. Perhaps a large, slow
spinning-disk staging RAID in the middle or off to the side too.
There are multiple "delta equations" that
you need to evaluate. I know I have missed some:
- delta file change (GB/day).
- performance delta at each layer.
- cost delta at each layer.
- management cost delta.
- operational cost delta.
- cost of compliance -- what the law requires, by method.
- cost of physical storage on and off site, include handling and shipping.
- cost of user training delta.
- cost of expansion delta.
- cost of necessary bandwidth, by layer.
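
A rough sketch of how a couple of these deltas could be tabulated per
layer. Everything here is an assumption for illustration: the layer names,
the cost-per-GB figures, the retention periods and the 50 GB/day change
rate are made-up placeholders, not measurements or vendor prices. It only
covers the file change and cost-per-layer items; the others would need
their own columns.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    cost_per_gb: float    # hypothetical purchase cost per GB
    retention_days: int   # hypothetical time changed data lives on this layer


def tier_costs(change_gb_per_day, tiers):
    """Return {layer name: (capacity_gb, cost)} for a steady daily change rate."""
    result = {}
    for t in tiers:
        # Capacity needed to hold retention_days worth of changed data.
        capacity = change_gb_per_day * t.retention_days
        result[t.name] = (capacity, capacity * t.cost_per_gb)
    return result


# Placeholder layers and numbers; substitute your own.
tiers = [
    Tier("fast local scratch", 2.0, 7),
    Tier("NFS RAID", 0.5, 90),
    Tier("archive", 0.1, 3650),
]

costs = tier_costs(50.0, tiers)
for name, (capacity, cost) in costs.items():
    print(f"{name}: {capacity:.0f} GB, cost {cost:.0f}")
```

Changing the GB/day figure or a retention period and rerunning shows
quickly which layer dominates the budget.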
Clusters are unique in that they have the potential
to host their own distributed RAID (Lustre, Gluster, ZFS)
on the compute nodes themselves, and with a sufficient
archival back end life could be good.
So select systems that you can add a second disk to.
Choice of filesystem can help too (see DMAPI and friends).
Have fun.
mitch