[Beowulf] SSD caching for parallel filesystems
diep at xs4all.nl
Sun Feb 10 05:40:09 PST 2013
On Feb 10, 2013, at 2:09 PM, Ellis H. Wilson III wrote:
> On 02/10/13 04:41, Vincent Diepeveen wrote:
>> SSD's are not about bandwidth, they're about latency.
> This is a bit aggressive of a vantage point -- let's tone it back:
> "SSD's aren't always the cheapest way to achieve bandwidth, but they are
> critical for latency-sensitive applications that are too large for
SSD's are never the cheapest way to achieve bandwidth and never will be.
> In any event, your original statement used to be wholly correct. It has
> changed to a certain degree to "SSDs are about IOPS," which isn't
> the same thing. However, more pointedly, with modern HDDs barely
> approaching 200MB/s and SSD solutions approaching 2-4GB/s, this is an
> increasingly limited viewpoint. We have to start considering their
> for bandwidth.
Find me an application that needs big bandwidth and doesn't need
So any SSD solution that's *not* used for latency-sensitive
workloads needs thousands of dollars' worth of SSDs.
In that case plain old hard drive technology is unbeatable in
performance for storage and bandwidth. The buy-in price right now is
$35 for a 2 TB disk, or $17.50 a terabyte (that's the actual price for
big shops buying in volume; neither you nor I get them for that price,
of course).
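The per-terabyte figure above is simple division. A minimal sketch of that arithmetic, using only the $35 and 2 TB numbers from the post (the function name `cost_per_tb` is made up for illustration):

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the price per terabyte for a single drive."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


# Figures from the post: $35 bulk price for a 2 TB disk.
hdd_per_tb = cost_per_tb(35.0, 2.0)
print(f"HDD: ${hdd_per_tb:.2f} per TB")
```

The same function can be fed whatever SSD price and capacity you are actually quoted; no SSD price is assumed here.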
We're talking about a sustained 200 MB/s per dirt-cheap hard drive
here. Put 16 of them in a RAID array and you get more bandwidth than
the file server can deliver over the network, and more than your
motherboard can effectively handle per second.
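A rough sketch of the bandwidth claim above, assuming striped throughput scales perfectly linearly with drive count (real arrays usually fall short of this). The 16 drives and 200 MB/s come from the post; the single 10 Gbit/s network link is purely an illustrative assumption, as are the function names.

```python
def aggregate_mb_s(drives: int, per_drive_mb_s: float) -> float:
    """Ideal striped throughput, assuming perfectly linear scaling."""
    return drives * per_drive_mb_s


def bottleneck_mb_s(array_mb_s: float, link_mb_s: float) -> float:
    """Throughput a client can see: the slower of the array and the link."""
    return min(array_mb_s, link_mb_s)


# Figures from the post: 16 drives at a sustained 200 MB/s each.
array = aggregate_mb_s(16, 200.0)

# ASSUMPTION (not from the post): one 10 Gbit/s link, raw rate in MB/s.
link = 10_000 / 8

print(f"array: {array:.0f} MB/s, link: {link:.0f} MB/s, "
      f"delivered: {bottleneck_mb_s(array, link):.0f} MB/s")
```

With these inputs the array's ideal 3200 MB/s exceeds what the assumed link can carry, which is the point being made: past a certain drive count the network, not the disks, sets the limit.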
We're talking about a buy-in price of peanuts for 16 hard drives
here, while the same storage in SSDs costs a fortune.
So SSDs are just for latency. Anyone not using them for that I
would never hire.
>> With a raid array of cheapo disks we can also get 3GB/s bandwidth,
>> more than most 2 socket nodes effectively can handle.
> 3GB/s divided by 200MB/s gives me something like 15 drives, unless my
> math is wrong, which will be something like $2-$3K, and that's really
> only possible in RAID0, so you're only going to get the capacity of
> drive. If all I'm looking for is bandwidth I'd rather spend that 3k on
> an expensive SSD (or RAID a bunch of cheaper SSDs) and get it for far
> less power, wire complexity, space consumption, and risk of failure.
> Moreover, it'll have better latency. This gap will continue to widen,
> so while we can talk about 15 disks reasonably right now, in a year
> we'll be talking more like 25-30 and then it just becomes absurd.
> buy the SSD(s) at that point.
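The drive-count arithmetic in the quoted paragraph can be checked with a short sketch. It uses only numbers that appear in this thread (3 GB/s target, 200 MB/s per drive, the $2-$3K estimate, and the $35 bulk price from the reply); the function name `drives_needed` is hypothetical, and perfectly linear RAID0 scaling is assumed.

```python
import math


def drives_needed(target_mb_s: float, per_drive_mb_s: float) -> int:
    """Smallest number of drives whose ideal striped rate meets the target."""
    return math.ceil(target_mb_s / per_drive_mb_s)


n = drives_needed(3000, 200)           # 3 GB/s target at 200 MB/s per drive

# Per-drive price implied by the quoted $2K-$3K estimate for the array.
implied_low, implied_high = 2000 / n, 3000 / n

# Total for the same drive count at the $35 bulk price quoted in the reply.
bulk_total = n * 35

print(n, round(implied_low, 2), implied_high, bulk_total)
```

The two sides of the thread are working from very different per-drive prices, which accounts for much of the disagreement about which option is cheaper.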
>> Only theoretically a higher bandwidth will be possible (benchmarks
>> However getting 20 bytes from a SSD is in the few dozens of
>> microseconds, versus several milliseconds for the cheapskate disks.
>> That rough factor of 50-100 difference in latency is the
>> reason SSD's exist.
>> Any bandwidth test of a SSD is total nonsense.
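To see where a factor of 50-100 in latency can come from, here is a small sketch. The post only says "a few dozen microseconds" versus "several milliseconds", so the specific sample values below (60 µs, 3 ms, 6 ms) are illustrative assumptions, not measurements.

```python
def latency_ratio(hdd_us: float, ssd_us: float) -> float:
    """How many times slower a small random read is on the HDD."""
    return hdd_us / ssd_us


# ASSUMPTION: illustrative access times in microseconds, chosen to fall
# inside the ranges described in the post.
ssd_us = 60
for hdd_us in (3000, 6000):
    print(f"HDD {hdd_us} us vs SSD {ssd_us} us: "
          f"{latency_ratio(hdd_us, ssd_us):.0f}x")
```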
> (I wish you'd put [In my personal opinion] in front of all of your
> sentences. It would make them less nails on a chalkboard.)
> So what happened to "perfectly parallel"? Seems to me like a
> parallel device would be well tuned to deliver good bandwidth.