the need for storage area networks [was: Shared diskspace between nodes]

Matthew O'Keefe okeefe at
Sun Jan 13 11:08:45 PST 2002


Others have provided good suggestions, but I think the ultimate
solution to your problem is a storage area network (SAN) between
your Beowulf nodes and a pool of shared storage devices.  This
approach allows efficient partitioning and sharing of storage
between the Beowulf nodes.

A cluster file system like GFS can be used to map a shared
file system (one that all nodes can mount directly) onto the 
shared storage devices.  This approach completely removes 
your problem, which is trying to spread your data evenly across
many nodes when the data needs on each node can grow or shrink
in unexpected ways.  It also lets you manage one file system
instead of 40.
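
To make the capacity argument concrete, here is a rough sketch in Python.
It uses the figures from the original question (40 nodes, roughly 15GB free
on each) and assumes, purely for illustration, a shared pool with the same
total capacity. The function names are hypothetical, not part of GFS or any
other tool.

```python
# Figures from the original question; the shared pool size is an assumption.
NODES = 40
FREE_GB_PER_NODE = 15


def pooled_capacity_gb(nodes=NODES, free_gb_per_node=FREE_GB_PER_NODE):
    """Total free space if it were presented as one shared file system."""
    return nodes * free_gb_per_node


def fits_on_single_node(dataset_gb, free_gb_per_node=FREE_GB_PER_NODE):
    """With 40 separate file systems, a dataset must fit on one node's disk."""
    return dataset_gb <= free_gb_per_node


def fits_in_shared_pool(dataset_gb):
    """With one shared file system, only the total free space matters."""
    return dataset_gb <= pooled_capacity_gb()


if __name__ == "__main__":
    dataset_gb = 20  # hypothetical dataset that has outgrown a single node
    print("pooled capacity (GB):", pooled_capacity_gb())
    print("fits on a single node:", fits_on_single_node(dataset_gb))
    print("fits in the shared pool:", fits_in_shared_pool(dataset_gb))
```

With separate per-node file systems, a dataset that grows past one node's
free space has to be split or moved by hand, even though the cluster as a
whole has plenty of room.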

Some may object that SANs are expensive, but that is changing.
IP-based SANs are now becoming available, and a cluster of
NFS servers with shared storage and a cluster file system can
also be used to share data across a Beowulf without the full
expense of a SAN.  For details, see the white paper I wrote,
"Accelerating Technical Computing...", at the Sistina web site.

When running complex parallel applications in production
on a Beowulf cluster (for example, Oracle Real Application Clusters), 
a storage area network and cluster file system greatly
simplify your life.

Matt O'Keefe

On Mon, Jan 07, 2002 at 02:36:42PM -0500, Jon E. Mitchiner wrote:
> Greetings!
> I presently run a 40-node cluster, Dual 1GHz with 20GB hard drive on each
> system.  This gives me roughly 15GB (safe estimate) after the OS, installed
> programs, some data, etc on each machine.  This gives me roughly 600GB of
> space that I am not currently utilizing on 40 nodes.
> Right now, we are saving data on various nodes, and moving it around when
> space gets tight on a machine.  This is getting time-consuming as some of us
> have to look on different nodes to find out where our data is currently
> residing.  I am considering saving all directory names in a database and
> then making a web-based GUI so it's easy to find the location of
> data directories, rather than looking for them (especially if someone moved my
> directory to another machine without letting me know).
> I am curious if there is a program out there that might be able to utilize
> the space that we are not utilizing -- such as linking the file space
> between nodes so that I can set up a "large" data partition sharable by
> all nodes.  Some redundancy would be nice.  I'm curious if there is a software
> solution (either GPL licensed, or commercial) to utilize the space better.
> Optimally, it would be nice to see all "shared" drives as one large
> partition mounted on all nodes, with all the data handled by a daemon
> or something like that.
> Does anyone have any ideas, suggestions, or programs that might be able to
> do something similar?
> Thanks!
> Regards,
> Jon E. Mitchiner
> Minotaur Technologies
> AOL IM: MinotaurT
