[Beowulf] File server dual opteron suggestions?

Mark Hahn hahn at physics.mcmaster.ca
Thu Aug 3 21:26:35 PDT 2006


>> We'll probably go with midrange Ultra320 SCSI internal disks again
>> (10K 74Gb drives from Maxtor or Seagate) since the high end SATA
>> drives cost nearly as much for the same capacity.  For this particular

but why are "boutique" SATA drives the appropriate comparison?
compare instead to 5-year warrantied "raid edition" drives.

> Hmmm.... I would argue that 7200 RPM disks make more sense for a number of 
> reasons.

me too.  there are very few places where higher RPM is justified: 
it gives you lower write-commit latency.  it doesn't give a higher 
rate of write commits (more, slower spindles do that), and it doesn't
give higher bandwidth, either.
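
to put rough numbers on the latency point, here is a minimal sketch of
average rotational latency (half a revolution) at a few spindle speeds.
the function name is made up for illustration, and seek time, command
overhead and spindle count are deliberately not modeled:

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in ms: on average the platter must
    turn half a revolution before the target sector arrives."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


for rpm in (7200, 10000, 15000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

so going from 7200 to 10K buys a bit over a millisecond per commit on
this measure, which only matters if that latency is what you're waiting on.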

>> socket 940   (or is that a mistake these days?)
>
> Not really.  Socket F stuff is on pricewatch and other places.

I certainly wouldn't fear buying "old" s940 stuff.  sure, it's becoming
obsolete, but not quickly, and besides, why does that matter?  it's 
become pretty uncommon to upgrade CPUs (and ddr ram will be around for 
at least a year or two longer.  actually, I wouldn't be surprised if ddr2
had a shorter total lifespan...)

>> 512M-1Gb ECC memory per socket (enough for a file server, no
>>    serious computing on this node.)
>
> I would recommend upping the memory.  Computing or not, large buffer caches 
> on file servers are with very rare exception, a preferred config.

unclear.  the FS's memory does act as an excellent cache, but then again,
the client memory does too.  do you have a pattern of file accesses in which
the same files are frequently re-read and would fit in memory?  the servers
I've looked at closely have had mostly write and attribute activity,
since the client's own cache already has a high hit-rate.  for writes, of
course, more FS memory is not important unless you have extremely high 
bandwidth net and disks.  in fact, I've been using the following sysctl.conf
entries:

# dirty blocks become eligible for writeback after 10s, hoping to
# collect further writes in the meantime (default 3000 = 30s)
vm.dirty_expire_centisecs = 1000
# try writing back every 1s (default 500 = 5s)
vm.dirty_writeback_centisecs = 100

in short, don't bother working at write caching much.  with a lot of memory,
an untuned machine will exhibit unpleasant oscillations of delaying writes
then frantically flushing.
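
to see what a given box is currently running with, something like the
following should do.  it's a sketch that assumes a linux /proc/sys/vm
layout; the function name and the vm_dir parameter are just for
illustration (the parameter makes it easy to point at a test directory):

```python
from pathlib import Path

# the two writeback knobs discussed above; values are in centiseconds
KNOBS = ("dirty_expire_centisecs", "dirty_writeback_centisecs")


def read_writeback_knobs(vm_dir="/proc/sys/vm"):
    """Return {knob_name: seconds} for each knob file found under vm_dir."""
    result = {}
    for name in KNOBS:
        path = Path(vm_dir) / name
        if path.exists():
            # convert centiseconds to seconds
            result[name] = int(path.read_text().strip()) / 100.0
    return result


if __name__ == "__main__":
    for name, secs in read_writeback_knobs().items():
        print(f"vm.{name} = {secs:g}s")
```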

> 2Gb/socket minimum.  Nothing serves files faster than having them already 
> sitting in ram.

true, but is that actually your working set size?  it would be rather 
embarrassing if 3 of the 4 GB were files read once a month...
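
one crude way to guess at the working set is to add up the sizes of
files that have been read recently.  the sketch below does that using
atime; the function name and the 7-day window are arbitrary choices for
illustration, and it only means anything if the filesystem is not
mounted noatime (relatime makes it coarse, too).  it also says nothing
about whether the clients' own caches already absorb those reads:

```python
import os
import stat
import time


def recently_read_bytes(root, days=7, now=None):
    """Sum the sizes of regular files under root whose access time
    falls within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_atime >= cutoff:
                total += st.st_size
    return total
```

run it over the exported tree and compare the result with the RAM you
were planning to buy.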

>> 4 x 74 Gb disks Ultra320 (or make an argument for a particular SATA)

SATA disks are SATA disks, of course.  dumb controllers are all pretty
similar as well (cheap, fast, not-cpu-consuming).  if you have your
heart set on HW raid, at least get a 3ware 9550, which is quite fast.
(most other HW raid are surprisingly bad.)

>> dual 10/100/1000 ethernet on the mobo
>
> Careful on this... we and our customers have been badly bitten by tg3 and 
> broadcom NICs.  If the MB doesn't have Intel NICs, get an Intel 1000/MT dual 
> gigabit card.  You won't regret that, and it is money well spent.

that's odd; I have quite a few of both tg3 and bcm nics, and can't say 
I've had any complaints.  what are the problems?

>> case - 2U (big enough for adequate ventilation, right?)
>
> Yeah, just make sure you have good airflow.

2U still requires a custom PS, doesn't it?  it's kind of nice to be able 
to put in an ATX-ish PS.  and is 2U tall enough for stock/standard
heatsink/fans?


