[Beowulf] SATA vs SCSI drives

Mark Hahn hahn at physics.mcmaster.ca
Tue Oct 12 08:46:07 PDT 2004

> Though slightly dated, I hope the attachment is helpful....btw....I didn't 
> do an exhaustive search, but found the 10K SATA drives only offered at 
> 72GB's and under. The higher cap drives are 7200RPM.

that's correct.  but remember - RPM is mainly for latency, not bandwidth.
if your workload is not incredibly seeky, then you don't want to pay 
for latency: the higher-density 7200RPM disks give you lower cost, more
capacity and higher bandwidth, at the price of slower seeks.
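
to put rough numbers on the latency side: average rotational latency is
the time for half a revolution.  a minimal sketch (the function name is
mine, and this ignores head seek time, which comes on top):

```python
def avg_rotational_latency_ms(rpm):
    """average rotational latency in ms: time for half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution

# common spindle speeds; seek time is extra and not modelled here
for rpm in (7200, 10000, 15000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

so going from 7200 to 10K RPM buys a bit over a millisecond per random
access, and nothing at all for streaming reads.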

in summary:
	- meet your reliability requirements using raid.  it's insane 
	to think about relying on a single disk in any non-ephemeral
	setting anyway.  raid lets you achieve pretty much any reliability
	you want (as well as offering a broad spectrum of performance.)

	- meet your seek-rate requirements using RPM.  I find very, very
	few applications are really seek-limited - really it's only
	databases with uniform-random distribution of reads of tiny data
	from monumentally large tables.  in particular, if there's any 
	data locality or reuse at all, spend money on RAM not RPM.

	- for anything large, get MTBF specs for prospective disks.
	this lets you calculate how often you'll be replacing hardware,
	physically.  your raid has taken care of data robustness;
	this is purely a maintenance issue.
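
the MTBF arithmetic in the last point can be sketched as below.  this
assumes a constant failure rate and disks spinning 24x7, which is a
simplification; the disk count and MTBF figure are made-up examples,
not vendor specs.

```python
HOURS_PER_YEAR = 24 * 365  # 8760; assumes the disks run 24x7

def expected_replacements_per_year(n_disks, mtbf_hours):
    """expected disk failures per year, assuming a constant failure rate."""
    return n_disks * HOURS_PER_YEAR / mtbf_hours

# hypothetical installation: 100 disks rated at 1.0 Mhours MTBF
per_year = expected_replacements_per_year(100, 1.0e6)
print(f"expect about {per_year:.2f} replacements per year")
```

with those example numbers you'd be swapping a disk a bit less than
once a year.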

there's no dramatic difference in any of the families of disks available
(well, avoid 1yr warranties, of course!).  consider, for instance, that 
you can easily build raids based on 300G SATA disks that have half as 
many moving parts as with 147G SCSI disks, since you need only half as
many disks for the same capacity.  even if the MTBF's differ by 50%
(guess 1.0 Mhours for SATA and 1.5 for SCSI) the SATA raid is more reliable.
it'll probably also be 1/4 the price and sometimes actually faster.
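
a rough sketch of that comparison, using the guessed MTBFs above.  the
3000G target capacity is an arbitrary example of mine, raid redundancy
overhead is ignored, and a constant failure rate with 24x7 operation
is assumed:

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumes 24x7 operation

def disks_needed(capacity_gb, disk_gb):
    """whole disks needed to reach a raw capacity (no raid overhead)."""
    return math.ceil(capacity_gb / disk_gb)

def expected_failures_per_year(capacity_gb, disk_gb, mtbf_hours):
    """expected disk failures per year across the whole array."""
    return disks_needed(capacity_gb, disk_gb) * HOURS_PER_YEAR / mtbf_hours

capacity_gb = 3000  # hypothetical target, raw capacity
sata = expected_failures_per_year(capacity_gb, 300, 1.0e6)  # guessed MTBF
scsi = expected_failures_per_year(capacity_gb, 147, 1.5e6)  # guessed MTBF
print(f"SATA: {disks_needed(capacity_gb, 300)} disks, {sata:.3f} failures/yr")
print(f"SCSI: {disks_needed(capacity_gb, 147)} disks, {scsi:.3f} failures/yr")
```

the better per-disk MTBF doesn't make up for needing twice as many
spindles, so the SATA array should see fewer failures overall.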

regards, mark hahn.
