[Beowulf] Checkpointing using flash

Alan Louis Scheinine alscheinine at tuffmail.us
Sat Sep 22 12:47:25 PDT 2012


Andrew Holway wrote:
  > I've been playing around with GFS and Gluster a bit recently and this
  > has got me thinking... Given a fast enough, low enough latency network
  > might it be possible to have a Gluster-like or GFS-like memory space?

For random access, hard disk access times are milliseconds when the r/w
head needs to move, whereas InfiniBand switch latency is less than ten
microseconds.  So if an algorithm needs highly random access over more
memory than a single node holds, combining the memory of a cluster might
be the best solution for certain problems.  I do not know of any parallel
filesystem that uses the memory of the nodes rather than parallel disks,
but it seems like a useful utility.  While it is possible to put a huge
amount of memory into a single node, that node would be specialized,
whereas using the memory of a cluster means that the same memory serves
a general-purpose cluster when it is not being used for the specialized
memory-based parallel filesystem.
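
As a rough illustration of the latency argument above, the sketch below
compares the time spent purely on latency for a run of serial random
accesses.  The 5 ms seek time, the 10 microsecond network latency and the
access count are assumed round numbers chosen for the arithmetic, not
measurements, and the model ignores bandwidth, caching and overlap.

```python
def total_latency_seconds(n_accesses, latency_seconds):
    """Latency-only cost of n serial random accesses (no overlap, no caching)."""
    return n_accesses * latency_seconds


# Assumed, illustrative values (not measurements):
DISK_SEEK_S = 5e-3          # a few milliseconds per head movement
NETWORK_LATENCY_S = 10e-6   # upper bound quoted above for the interconnect
N_ACCESSES = 1_000_000      # hypothetical number of random accesses

disk_total = total_latency_seconds(N_ACCESSES, DISK_SEEK_S)
net_total = total_latency_seconds(N_ACCESSES, NETWORK_LATENCY_S)
speedup = disk_total / net_total

print(f"disk:    {disk_total:10.1f} s")
print(f"network: {net_total:10.1f} s")
print(f"ratio:   {speedup:10.1f}x")
```

Under these assumptions the latency cost differs by a factor of about 500,
which is why remote memory can look attractive for access patterns that
defeat disk read-ahead.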

I once (many years ago) looked at PVFS, which has drivers for various
interconnection hardware.  It appeared to me that, in order to modularize
the software, reads and writes went from the user program to ordinary
virtual memory and were then copied into the DMA-able memory of the
specific interconnect driver.  It would be useful to write similar
software with lower latency by making fewer copies.  (I looked at PVFS
because the source code is available.)
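
The difference between a staged copy and a shared buffer can be shown in
a few lines.  This is only an analogy in plain Python, not PVFS code and
not real DMA: the function names are hypothetical, `bytes()` stands in for
the extra copy into a driver buffer, and `memoryview` stands in for handing
the driver the caller's own memory.

```python
def staged_copy(user_buffer):
    """Copy the caller's data into a separate staging buffer (one extra copy)."""
    return bytes(user_buffer)


def zero_copy_view(user_buffer):
    """Expose the caller's memory directly, without copying it."""
    return memoryview(user_buffer)


user_buffer = bytearray(b"x" * 1024)   # hypothetical application data

staging = staged_copy(user_buffer)
view = zero_copy_view(user_buffer)

# Modify the original buffer after both handles were created.
user_buffer[0] = ord("y")

copy_saw_change = staging[0] == ord("y")   # the copy is independent
view_saw_change = view[0] == ord("y")      # the view shares the memory

print("copy sees later write:", copy_saw_change)
print("view sees later write:", view_saw_change)
```

The view reflects the later write because no data was duplicated; the
price is that the buffer must stay valid and unmodified while a transfer
is in flight, which is the kind of coupling a modular design avoids by
copying.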

No doubt some members of this list can provide much more up-to-date
information and a more sophisticated description of the problem and
possible solutions.  I merely want to indicate that due to the low latency
of some interconnects, for some applications it would be useful to combine
distributed memory to look like a filesystem.  I would suppose that for
any given algorithm the highest performance could be obtained by
distributing the calculation, whereas my focus here is speeding up
applications that use conventional file I/O, not by rewriting the
application but by changing the underlying file I/O implementation.
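
The idea of keeping the application unchanged while swapping the storage
underneath can be sketched on a single machine.  In the example below the
"application" is a hypothetical function, `sum_records`, that only uses
seek and read; it runs once against a file on disk and once against an
in-memory `io.BytesIO` object.  This is a single-process stand-in, assumed
for illustration, and not a distributed-memory filesystem.

```python
import io
import os
import tempfile

RECORD_SIZE = 4  # bytes per record in this toy format


def sum_records(f, indices):
    """Application code: random-access reads through the ordinary file API."""
    total = 0
    for i in indices:
        f.seek(i * RECORD_SIZE)
        total += int.from_bytes(f.read(RECORD_SIZE), "little")
    return total


# Four records holding the values 0, 1, 2, 3.
data = b"".join(n.to_bytes(RECORD_SIZE, "little") for n in range(4))
indices = (3, 0, 2)

# Backing store 1: a conventional file on disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name
with open(path, "rb") as disk_file:
    disk_result = sum_records(disk_file, indices)
os.remove(path)

# Backing store 2: memory, presented through the same interface.
memory_result = sum_records(io.BytesIO(data), indices)

print(disk_result, memory_result)
```

Both calls return the same value because `sum_records` never learns where
the bytes live; a cluster-wide version would need the same property at the
filesystem or library level.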

Regards,
Alan

-- 

  Alan Scheinine
  200 Georgann Dr., Apt. E6
  Vicksburg, MS  39180

  Email: alscheinine at tuffmail.us
  Mobile phone: 225 288 4176

  http://www.flickr.com/photos/ascheinine


