[Beowulf] Register article on Opteron - disagree
Robert G. Brown
rgb at phy.duke.edu
Mon Nov 22 00:04:16 PST 2004
On Sun, 21 Nov 2004 john.hearns at clustervision.com wrote:
> The fact that there are fewer Opteron based systems in the Top 500 is
> irrefutable (I didn't know this) but it makes me uneasy to extrapolate
> this to the impending death of a CPU.
> I DO agree (and let's have some debate here) that Nocona is bound to make
> big inroads.
Sure. We can start with the fact that the Top 500 list is irrelevant.
It is a hardware vendor pissing contest.
I criticized it pretty strongly in a recent CWM column. Let's see:
a) It lists identical hardware configurations as many times as they
are submitted. Thus "Geotrace" lists positions 109 through 114 with
identical hardware. If one ran uniq on the list, it would probably end
up being the top 200. At most. Arguably 100, since some clusters
differ at most by the number of processors. What's the point, if the
site's purpose is to encourage and display alternative engineering?
None at all.
b) It focusses on a single benchmark useful only to a (small) fraction
of all HPC applications. I mean, c'mon. Linpack? Have we learned
>>nothing<< from the last twenty years of developing benchmarks? Don't
get me wrong -- Linpack is doubtless a useful measure to at least some
folks. However, why not actually present a full suite of tests instead
of just one? Vendors would hate this, just like they hate Larry McVoy's
benchmark suite, because it makes it so difficult to cook up a cluster
that does just one thing (runs Linpack) well...
c) It totally neglects the cost of the clusters. If you have to ask,
you can't afford to play, I suppose. Is it any surprise that the list
is dominated by the goldest-plated of the gold plated vendors?
Obviously many of the folks who build and buy the systems that make the
list aren't troubled by the spectre of cost-benefit. So we have
absolutely no idea what BlueGene/L costs for the R_max it produces compared
to SGI's Altix 1.5 GHz cluster that comes in at number 2. Even if we
did know their cost, we would be unlikely to know their true cost -- at
best the cost after the vendor discounted the system heavily for the
advertising benefit of getting a system into the top whatever.
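As a rough illustration of point (a), here is a minimal sketch of what "running uniq on the list" could look like. Everything in it is an assumption made for illustration: the entries are invented placeholders (not real Top 500 data), and the name unique_by_hardware and the (rank, site, hardware) layout are hypothetical. Unlike the uniq command, which only collapses adjacent duplicates, this keeps the best-ranked entry per hardware description wherever the duplicates appear.

```python
# Hypothetical entries (rank, site, hardware); NOT real Top 500 data.
ENTRIES = [
    (1, "site-a", "vendor-x 2.0 GHz, interconnect-p"),
    (2, "site-b", "vendor-y 1.5 GHz, interconnect-q"),
    (3, "site-c", "vendor-z 2.2 GHz, interconnect-r"),
    (4, "site-c", "vendor-z 2.2 GHz, interconnect-r"),
    (5, "site-c", "vendor-z 2.2 GHz, interconnect-r"),
    (6, "site-d", "vendor-y 1.5 GHz, interconnect-q"),
]


def unique_by_hardware(entries):
    """Keep only the best-ranked entry for each hardware description."""
    first_seen = {}
    for rank, site, hardware in sorted(entries):
        # setdefault stores the entry only if this hardware is new,
        # so later (worse-ranked) duplicates are ignored.
        first_seen.setdefault(hardware, (rank, site, hardware))
    return list(first_seen.values())


if __name__ == "__main__":
    for rank, site, hardware in unique_by_hardware(ENTRIES):
        print(rank, site, hardware)
```

How strictly "identical hardware" is defined (exact string match here) is the real design decision; a looser key that ignores processor count would shrink the list further.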
I could go on. I mean, look at the banner ads on the site. Vendors
love this site. If it didn't exist, they'd go and invent it.
If they want me to take the top500 list seriously, they could start by
de-commercializing it, running a pretty stringent unique-ing process on
the submissions and accepting only the first of a given design or
architecture, especially for clusters that are more or less turnkey and
mass produced. Then they could run a SERIOUS suite of benchmarkS (note
plural) on the clusters, one which (like SPEC) attempts to provide
useful information about things like latency, bandwidth, interconnect
saturation for various communications patterns, speed for a variety of
actual applications including several with very different
computation/communication patterns (ranging from embarrassingly parallel
to fine grained synchronous). Scaling CURVES (rather than a single
silly number) would be really useful.
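To make the "scaling curve" idea concrete, here is a small sketch that turns wall-clock timings at several processor counts into speedup and parallel efficiency. The function name scaling_curve and the timings are hypothetical, invented purely for illustration; they are not measurements of any real cluster.

```python
def scaling_curve(timings):
    """Return [(n, speedup, efficiency)] from {processor_count: seconds}.

    Speedup and efficiency are measured relative to the smallest
    processor count present in the input.
    """
    counts = sorted(timings)
    base_n = counts[0]
    base_t = timings[base_n]
    curve = []
    for n in counts:
        speedup = base_t / timings[n]
        # Ideal speedup from base_n to n processors is n / base_n.
        efficiency = speedup * base_n / n
        curve.append((n, speedup, efficiency))
    return curve


# Hypothetical timings in seconds; NOT real benchmark results.
EXAMPLE_TIMINGS = {1: 100.0, 2: 52.0, 4: 28.0, 8: 17.0}

if __name__ == "__main__":
    for n, speedup, efficiency in scaling_curve(EXAMPLE_TIMINGS):
        print(f"{n:3d} procs  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

A table like this, one per application and interconnect, says far more about where a design stops scaling than any single number can.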
I mean, this site is "sponsored" by some presumably serious computer
science and research groups (although you'd never know it to look at all
the little flashy things blinking Myrinet, Atipa, Tyan, IBM from the
sides of the listings compared to the tiny little corner where the
sponsoring institutions are listed). If they want to do us a real
public service, they could do some actual computer science and see if
they couldn't come up with some measure a bit richer than just R_max and
R_peak.
Now, with that said (and it needed to be said, it did it did) the only
thing most real cluster computer buyers care about is price/performance.
To be more specific, price/performance on their particular
application(s). At a guess, some 2/3 to 3/4 of all cluster computer
users are doing something fairly coarse grained for which Linpack is
simply not a relevant measure of performance.
This is probably true even on many of the clusters in the top 500.
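A minimal sketch of the comparison a buyer would actually make: rank candidate nodes by work done per unit of money on their own application. The node names, prices and throughput figures below are hypothetical placeholders, and rank_by_price_performance is an illustrative name, not anything from a real tool.

```python
def rank_by_price_performance(candidates):
    """Sort (name, price, jobs_per_hour) tuples, best value first.

    jobs_per_hour should be measured on YOUR application, not Linpack.
    """
    return sorted(candidates, key=lambda c: c[2] / c[1], reverse=True)


# Hypothetical figures for illustration only; NOT real prices or results.
CANDIDATES = [
    ("node_a", 3000.0, 12.0),
    ("node_b", 4500.0, 15.0),
    ("node_c", 2000.0, 7.0),
]

if __name__ == "__main__":
    for name, price, rate in rank_by_price_performance(CANDIDATES):
        print(f"{name}: {rate / price * 1000:.2f} jobs/hour per 1000 spent")
```

Note that the fastest node in this made-up data (node_b) comes last on value, which is exactly the kind of thing a list sorted by raw performance hides.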
AMD has more or less "owned" the price/performance sweet spot for the
last two years. If you have LIMITED money to spend and want to get the
most work done for your money, you buy Opterons, at least at this
particular moment and for most of the applications I've heard of that
have been compared. Could Nocona change this? Sure, if Intel drops its
margins, but historically they've avoided doing this. It is also WAY
too early to tell if AMD's road map "beats" Intel's or vice versa -- there
are a lot of changes lined up in both architectures where getting to 64
bits was only the first step. So Nocona could easily end up being both
more expensive and slower, at least in the medium run.
I personally am currently very fond of my Opteron-based systems and
would cheerfully buy a lot more if only somebody would give me the money
to do so. In a year, though, or two or four, who knows what I'd buy? I
don't pick Opterons because I'm "fond" of AMD or "hate" Intel. I'm
equally fond of both and hate neither one of them. I will, however, buy
the price/performance winner because it is my work that will suffer if I
don't. The only good reason to deliberately pick a more expensive
architecture is if there are issues with reliability (either software or
hardware). At the moment I'm unaware of any issues at all with Opterons
-- they run "perfectly" with FC2 and later, and I'm still waiting for
our first hardware failure from our Opteron stack after close to a year
under nearly continuous load.
So I'd have to say that I doubt that the authors of the article were
particularly well informed, and that AMD is likely to be around and
kicking for a few years yet. Look, even the Power series hasn't
disappeared and it has almost no top 500 presence at all, if you
discount BG itself as IBM showing its marketing clout and finding a use
for 700 MHz CPUs in Very Large Quantities...
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu