[Beowulf] SSD caching for parallel filesystems

Vincent Diepeveen diep at xs4all.nl
Mon Feb 11 15:30:16 PST 2013

Ah you're doing the simulations a lot slower :)
No big deal. You get what you pay for :)

You're talking about a single box here?

What size SSD array is in that box, what bandwidth does it deliver
to your CPUs, and what price did you buy it for, Jim?
Then we can compare it with a hard drive RAID array :)

Kind Regards,

On Feb 11, 2013, at 11:44 PM, Lux, Jim (337C) wrote:

>>> In any event, your original statement used to be wholly correct. It
>>> has changed to a certain degree to "SSDs are about IOPs," which isn't
>>> quite the same thing.  However, more pointedly, with modern HDDs
>>> barely approaching 200MB/s and SSD solutions approaching 2-4GB/s,
>>> this is an increasingly limited viewpoint.  We have to start
>>> considering their use for bandwidth.
>> Find me an application that needs big bandwidth and doesn't need
>> massive storage.
>> Digital waveform recording and playback, e.g. in radar simulators.
>> You need very wide bandwidth, but not a huge amount of storage (e.g.
>> if I'm playing back a synthetic response to a 1 millisecond pulse with
>> 2 GHz BW, I only need 10s of Megasamples at most, but you need 10
>> Gsample/second sorts of bandwidth).
>> One might think, heck, just slap a few GByte of RAM in there and be
>> done with it, but if you're simulating a radar with 10 different pulse
>> types, and you have 10-20 simulated targets each with several
>> different viewing aspects, you pretty quickly need a "library" of
>> several thousand pulses/returns to choose from.
> Yeah well, I remember negotiating about writing CUDA code for
> simulation software for something similar.
> I don't think that this example applies. You want it in RAM for a
> proper simulation :)
> ---
> Nope... you want to store it on disk.
> a) 4 bytes/sample @ 20 Megasamples/pulse is 80 Mbyte/pulse
> b) * 1000 pulses is 80 GB.
> That's a lot of RAM (and a lot of power, if you DID buy that much RAM).
> A few Gbyte/second coming out of an SSD makes it actually feasible to
> "stream from disk array" and keep that 1-2 GSample/second pipeline
> full.
> On the receive side, where you want to capture the transmitted pulses
> (or returns), a similar sort of thing applies. SSDs aren't a ball
> o'fire for write speed, but they ARE faster than spinning magnetic
> media, so it takes fewer drives to get a given throughput.
> Sometimes it's the "number of drives" that is the cost-determining
> aspect.  You don't need a lot of space, but you do need a very fast
> transfer rate, and ganging up drives in parallel is how it's done.
> The instantaneous seek aspect of an SSD is also nice, because you
> don't have to worry about rotational latency in this kind of
> application.
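
For anyone who wants to check the numbers in the quoted message, here is a
rough sketch of the arithmetic in Python. The sample size, pulse length and
library size are the ones quoted above, and the per-drive rates are the
200 MB/s (HDD) and low-end 2 GB/s (SSD solution) figures mentioned earlier in
the thread. Two things are my own assumptions: that the 4 bytes/sample also
holds on the streaming side, and that ganged drives scale linearly with no
controller or bus overhead. The function names are made up for illustration.

```python
# Back-of-envelope sizing for the radar pulse library discussed above.
# Decimal units throughout (1 GB = 10**9 bytes).

BYTES_PER_SAMPLE = 4              # from the thread
SAMPLES_PER_PULSE = 20_000_000    # 20 Megasamples/pulse, from the thread
PULSES_IN_LIBRARY = 1000          # from the thread

HDD_BYTES_PER_S = 200_000_000     # ~200 MB/s per HDD, from the thread
SSD_BYTES_PER_S = 2_000_000_000   # low end of the 2-4 GB/s SSD figure


def library_size_bytes(pulses=PULSES_IN_LIBRARY):
    """Total storage needed for the whole pulse library."""
    return BYTES_PER_SAMPLE * SAMPLES_PER_PULSE * pulses


def stream_rate_bytes_per_s(samples_per_s):
    """Sustained read rate needed to keep the playback pipeline full.

    Assumes the same 4 bytes/sample as the stored data (my assumption).
    """
    return samples_per_s * BYTES_PER_SAMPLE


def drives_needed(required_bytes_per_s, per_drive_bytes_per_s):
    """Drives ganged in parallel, assuming ideal linear scaling."""
    # Integer ceiling division: round up to a whole number of drives.
    return -(-required_bytes_per_s // per_drive_bytes_per_s)


if __name__ == "__main__":
    print("library size: %d GB" % (library_size_bytes() // 10**9))
    for gsamples in (1, 2):
        rate = stream_rate_bytes_per_s(gsamples * 10**9)
        print("%d GSample/s -> %d GB/s: %d HDDs or %d SSD units" % (
            gsamples,
            rate // 10**9,
            drives_needed(rate, HDD_BYTES_PER_S),
            drives_needed(rate, SSD_BYTES_PER_S),
        ))
```

Under those assumptions the library comes to 80 GB, as stated in the quoted
message, and the drive count is set by the stream rate rather than by the
capacity, which is the "number of drives" point made above.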
