[Beowulf] Parallel memory

Mark Hahn hahn at physics.mcmaster.ca
Tue Oct 18 12:44:09 PDT 2005

> Are there any drivers, tools, etc. that can make the memory space on a
> cluster look shared, something like pvfs but for memory?  I'm sure there

sure, lots.  this is a semi-sane idea that has been the subject 
of many student projects.  fundamentally, it doesn't make much sense
unless your access patterns are almost completely sequential and
high-locality.  but if you google a bit (the usual term is distributed
shared memory, DSM), you'll find plenty of references to such projects
from the 80's.

> would be a speed hit, but in this instance, speed isn't the problem as much
> as memory.  We have a code that uses a huge amount of memory and the memory
> usage is proportional to the cube of the problem size, but the time for it
> to run isn't too much of an issue.

bear in mind that good local memory is O(60ns) but fetching a page over 
gigabit is going to cost you at least, say, O(80us).

are you really up for a slowdown of 1e3 or more?  admittedly, if you 
invest in real interconnect (say Quadrics), the slowdown factor could drop 
to as "little" as 100x.
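
to put rough numbers on that, here is a back-of-envelope sketch using only
the two figures above (60ns local, 80us per page over gigabit).  it assumes
the worst case where every access turns into a remote page fetch; the
function name is made up for illustration.

```python
# Back-of-envelope slowdown from the figures quoted above.
LOCAL_NS = 60            # good local memory access, ~60 ns
GIGE_PAGE_NS = 80_000    # fetching a page over gigabit, ~80 us


def slowdown(remote_ns, local_ns=LOCAL_NS):
    """Ratio of a remote page fetch to a local memory access."""
    return remote_ns / local_ns


gige = slowdown(GIGE_PAGE_NS)
print(f"gigabit: ~{gige:.0f}x slower than local memory")

# Working backwards: a 100x slowdown corresponds to a remote fetch
# costing about this many microseconds.
implied_fast_us = 100 * LOCAL_NS / 1000
print(f"100x slowdown implies ~{implied_fast_us:.0f} us per remote fetch")
```

any locality helps: once a page has been fetched, later hits on it are
local, so the real factor depends on how many accesses each page serves.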

> I've been asked to parallelize the code
> using mpi which is going to be a major effort.  However, I thought that if

how about global arrays (see nwchem)?

> there was anyway, even if inefficient speed wise, to create a virtual
> parallel memory system it would be better than it using swap space and save
> a bunch of coding time.

swap is vastly under-rated ;)
or rather, you should probably consider how well you can do by putting the 
data onto a really fast raid0 array.  a trivial hack of a machine can manage
at least ~300 MB/s pretty easily, and with a little effort could probably
break 1 GB/s without investing in anything exotic.

but again, that approach depends on your access patterns.  if you're 
massively sequential, life is good.  if you're sparse and random, well,
maybe find a different approach...
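
for the sequential case, a quick sketch of how long one full pass over the
data would take at the two bandwidths mentioned above.  the 100 GB data set
size is purely an assumption for illustration (the original question gives
no size), and the helper name is hypothetical.

```python
def sweep_seconds(dataset_bytes, bandwidth_bytes_per_s):
    """Time for one purely sequential pass over the data set."""
    return dataset_bytes / bandwidth_bytes_per_s


DATASET_BYTES = 100e9  # assumed 100 GB, not from the original post

for label, bandwidth in [("~300 MB/s", 300e6), ("~1 GB/s", 1e9)]:
    seconds = sweep_seconds(DATASET_BYTES, bandwidth)
    print(f"{label}: {seconds:.0f} s per pass")
```

this says nothing about sparse, random access, where the time per request
rather than the bandwidth is what hurts.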

> Also, are there any tools to help implement mpi in an older code?

like what?  I see vi most often used to add mpi to really old code...

do you have to preserve the illusion of actual memory, or can you 
factor out all the accesses into put/get?  it would be pretty easy to 
write an MPI program that consists of one dusty-deck process that uses
put/get (and a tiny bit of mapping intelligence) to suck data from 
a pool of MPI slaves which do nothing but answer memory requests.
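
a minimal sketch of the "mapping intelligence" part, assuming a simple
block distribution of one big flat array across the servers.  this is plain
python with no MPI in it: each "server" is just a local list standing in for
a remote process, and in a real program get/put would become a message to
the owning rank.  the class and method names are hypothetical.

```python
class MemoryPool:
    """Stand-in for a pool of processes that only answer memory requests.

    Each 'server' here is a local list; in an MPI program each one would be
    a separate rank holding its chunk and replying to get/put messages.
    """

    def __init__(self, total_words, nservers):
        self.total_words = total_words
        # Block distribution: each server owns one contiguous chunk.
        self.chunk = -(-total_words // nservers)  # ceiling division
        self.servers = [[0.0] * self.chunk for _ in range(nservers)]

    def locate(self, index):
        """Map a global index to (server number, offset within that server)."""
        if not 0 <= index < self.total_words:
            raise IndexError(index)
        return divmod(index, self.chunk)

    def get(self, index):
        server, offset = self.locate(index)
        return self.servers[server][offset]

    def put(self, index, value):
        server, offset = self.locate(index)
        self.servers[server][offset] = value


# The dusty-deck code replaces a[i] = x and x = a[i] with put/get calls.
pool = MemoryPool(total_words=10, nservers=4)
pool.put(7, 3.5)
print(pool.locate(7), pool.get(7))
```

fetching one word per request would be painfully slow over a network, so a
real version would presumably move whole blocks and cache them locally.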
