[Beowulf] PetaBytes on a budget, take 2

Greg Lindahl lindahl at pbm.com
Thu Jul 21 23:55:59 PDT 2011


On Fri, Jul 22, 2011 at 01:44:56AM -0400, Mark Hahn wrote:

> to be honest, I don't understand what applications lead to focus on IOPS
> (rationally, not just aesthetic/ideologically).  it also seems like
> battery-backed ram and logging to disks would deliver the same goods...

In HPC, the metadata for your big parallel filesystem is a good example.
SSD gives you capacity at high IOPS much more cheaply than battery-backed
RAM does. (The RAM delivers more IOPS than you need.)
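
A rough way to see the cost argument is to size each medium by whichever
constraint dominates, capacity or IOPS. The sketch below is only an
illustration: the function names and every device figure (capacity, IOPS,
price) are made-up placeholders, not measurements or quotes from this thread.

```python
import math

def devices_needed(capacity_gb, iops, dev_capacity_gb, dev_iops):
    """Devices required to satisfy both the capacity and the IOPS target."""
    by_capacity = math.ceil(capacity_gb / dev_capacity_gb)
    by_iops = math.ceil(iops / dev_iops)
    # Whichever constraint needs more devices decides the purchase.
    return max(by_capacity, by_iops)

def tier_cost(capacity_gb, iops, dev_capacity_gb, dev_iops, dev_price):
    """Total price of enough devices to meet the workload."""
    count = devices_needed(capacity_gb, iops, dev_capacity_gb, dev_iops)
    return count * dev_price

# Hypothetical placeholder devices; substitute real numbers for your hardware.
ssd = dict(dev_capacity_gb=200, dev_iops=20_000, dev_price=400)
ram = dict(dev_capacity_gb=16, dev_iops=1_000_000, dev_price=300)

# Hypothetical metadata workload: 2 TB of capacity at 100k IOPS.
need = dict(capacity_gb=2_000, iops=100_000)

ssd_cost = tier_cost(**need, **ssd)
ram_cost = tier_cost(**need, **ram)
print("SSD:", ssd_cost, "battery-backed RAM:", ram_cost)
```

With these placeholder values both media end up sized by capacity, so the
RAM's extra IOPS headroom is paid for but never used.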

For Big Data, there's often data that's hotter than the rest. An
example from the blekko search engine is our index: when you type a
query on our website, most of the time all of the 'disk' access goes
to SSD.

Big Data systems generally don't have a metadata problem like HPC
does; instead of 200 million files, we have a couple of dozen tables
in our petabyte database.

-- greg





