[Beowulf] Striped file system with RAM disk.

Larry Stewart larry.stewart at sicortex.com
Sun Nov 25 08:30:52 PST 2007


Alan Louis Scheinine wrote:
>
>    There is a particular kind of application, single-client
> and serial process, for which a striped file system using
> RAM disk would be very useful.  Consider reading small
> blocks at random locations on a hard disk.  The latency
> of the HDD could be a few milliseconds.  Adding more HDDs
> does not solve the problem, unlike an application based on
> streaming.  Adding more disks and parallelizing the program
> could be a solution but sometimes there is no time
> to parallelize the program.
>
>    A possible solution is RAM disk.  But if we put, for example,
> 64 GB of RAM on a single computer then that computer becomes
> specialized and expensive, whereas the need for a huge
> amount of RAM may be only temporary.  An alternative is to
> use a cluster of nodes, a typical Beowulf cluster.  For example,
> using a striped file system over 16 nodes where each node has 4 GB
> of RAM.  Each node would have a normal amount of RAM and yet
> could provide the aggregate storage of 64 GB when the need arises.
> While we have not yet created this configuration, I suppose
> that Gbit Ethernet could provide 100 microsecond latency and
> Infiniband or Myrinet could provide 10 microsecond latency.
> Much, much less than the seek time of a HDD.
>
>    The idea is so simple that I imagine it has already been done.
> I would be interested in learning from other sites that have
> used this method with a file system such as Lustre, PVFS2 or
> another.
>
> best regards,
> Alan Scheinine
>
We do this at SiCortex using Lustre as the parallel filesystem, with
tmpfs as the "backing store".  The marketing folks call it "FabriCache".
The folks at Argonne are working on doing the same thing with PVFS2 on
the SiCortex machine they have.  It works well for us, and should work
pretty well on traditional cluster hardware as well.
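
To make the numbers in Alan's example concrete, here is a rough sketch of
round-robin striping over 16 nodes with 4 GB each, plus the read rate a
single serial client could expect at each latency.  This is only an
illustration, not how Lustre or PVFS2 lays out data internally.  The
1 MiB stripe unit, the 5 ms figure for "a few milliseconds", and the
names locate() and serial_reads_per_second() are all assumptions made up
for the example.

```python
# Sketch only: round-robin striping and a rough latency comparison.
# STRIPE_SIZE and the 5 ms HDD figure are assumptions, not measurements.

NODES = 16                 # nodes in the stripe set (from the quoted example)
NODE_RAM = 4 * 2**30       # 4 GB of RAM disk per node (from the quoted example)
STRIPE_SIZE = 2**20        # assumed 1 MiB stripe unit


def locate(offset):
    """Map a file offset to (node index, byte offset in that node's store)."""
    stripe = offset // STRIPE_SIZE
    node = stripe % NODES
    local = (stripe // NODES) * STRIPE_SIZE + offset % STRIPE_SIZE
    return node, local


def serial_reads_per_second(latency_s):
    """One serial client issuing one small read at a time is latency bound."""
    return 1.0 / latency_s


if __name__ == "__main__":
    print("aggregate capacity (GB):", NODES * NODE_RAM // 2**30)
    print("offset 17 MiB + 5 ->", locate(17 * 2**20 + 5))
    for name, latency in [("HDD seek (assumed 5 ms)", 5e-3),
                          ("Gbit Ethernet (100 us)", 100e-6),
                          ("Infiniband/Myrinet (10 us)", 10e-6)]:
        rate = serial_reads_per_second(latency)
        print(f"{name}: about {rate:,.0f} reads/s")
```

The point of the comparison is that a serial reader gains nothing from
extra spindles, since each read still waits for one full seek, while
cutting the per-read latency raises the read rate in direct proportion.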

I think there is room to improve the idea as well, perhaps by using some
kind of journalling to disk, with coherent timestamps, so the striped
journals could be correlated into a consistent view, but we haven't
worked on that.
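
Since we haven't built the journalling, what follows is only a
hypothetical sketch of what "correlating the striped journals" might look
like: each node keeps its own journal sorted by timestamp, and a merge by
timestamp gives one ordered stream to replay.  The Record layout and the
functions merge_journals() and replay() are invented for the example, and
it assumes the node clocks really are coherent.

```python
# Sketch only: merging per-node journals into one ordered view.
# The record layout and function names are hypothetical.
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp: float   # assumed to come from clocks coherent across nodes
    node: int          # which stripe node wrote the record
    offset: int        # byte offset within that node's stripe store
    data: bytes


def merge_journals(journals: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Merge journals that are each already sorted by timestamp."""
    return heapq.merge(*journals, key=lambda r: (r.timestamp, r.node))


def replay(records: Iterable[Record]) -> dict:
    """Apply records in order; a later write to the same location wins."""
    state = {}
    for r in records:
        state[(r.node, r.offset)] = r.data
    return state


if __name__ == "__main__":
    j0 = [Record(1.0, 0, 0, b"a"), Record(3.0, 0, 0, b"c")]
    j1 = [Record(2.0, 1, 0, b"b")]
    merged = list(merge_journals([j0, j1]))
    print([r.timestamp for r in merged])
    print(replay(merged))
```

Stopping the replay at a chosen timestamp would give the state of all
stripes as of that moment, which is the "consistent view" part.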

-Larry



