Ellis H. Wilson III ellis at runnersroll.com
Fri Oct 29 13:18:52 PDT 2010

On 10/29/10 15:48, Greg Lindahl wrote:
> On Fri, Oct 29, 2010 at 03:02:45PM -0400, Ellis H. Wilson III wrote:
>> I think it's making a pretty wild assumption to say search engines and
>> HPC have the same I/O needs (and thus can use the same I/O setups).
> Well, I'm an HPC guy doing infrastructure for a search engine, so I'm
> not assuming much. And I didn't say the setup would be the same --
> just that Lustre/PVFS would probably be more reliable and higher
> performance if they stored copies on multiple servers instead of using
> local or SAN RAID. (Or did they implement this while I wasn't looking?)

Setting up a parallel file system on multiple servers is fine for really 
chunky or really independent workloads (such as independent searches 
where one search running slowly will not degrade the performance of a 
concurrent search for something else).  This is not at all the case in 
most HPC situations, where latency between nodes during computation is 
the limiting factor.  Yes, you might get higher reliability power-wise
and better performance bandwidth-wise (assuming you have some very wide
links over distances between the servers), but you won't get reliability
security-wise or performance latency-wise, both of which are critical
for HPC.

When I said "assumption," I only meant that saying "HPC is slower to
abandon RAID than other kinds of computing" right after mentioning
Blekko draws an invalid comparison between two very different domains.



