[Beowulf] The Walmart Compute Node?

Peter St. John peter.st.john at gmail.com
Thu Nov 8 11:02:41 PST 2007

I look forward to buying you a beer :-) Hmmm, Doppelbock.
Unfortunately Christmas will be in CA this year, not NC, on account of
my nephew's new baby. (Of course it's a good thing in other ways :-)
I'm thinking it might be fun to have, say, four cheapo 1.5GHz nodes and,
say, one 3GHz node with two cores, and let them compete. My algorithm
wants to referee. (My algorithms talk to me.)
Your points about the L2 and the 32-bit-ness give me pause. I probably
don't want to saddle myself with 32-bit things that won't play well
with near-future 64-bit things, but OTOH it's cheap to take a look.

On Nov 8, 2007 1:39 PM, Robert G. Brown <rgb at phy.duke.edu> wrote:
> On Thu, 8 Nov 2007, Peter St. John wrote:
> > Recently, as you probably noticed, Walmart began selling a $200 Linux PC.
> > (Apparently the OS is just Ubuntu 7.10 with a small X window manager
> > instead of Gnome or KDE). Now Slashdot points to
> > http://www.linuxdevices.com/news/NS5305482907.html, the MB being sold
> > separately for $60 ("development board"). It has 1.5GHz CPU,
> > unpopulated memory (slots for 2GB), one 10/100 connection. Does this
> > look to y'all like fair FLOPS/$ for a kitchen project? I'm thinking 6
> > of them as compute nodes per 8 port router, with a bigger head node
> > for fileserving. (actually I'll use a spare room but you know what I
> > mean). An arrangement like this might give faster RAM access per core,
> > compared to multicore, since each core has no competition for its own
> > memory, right?
> Well by now you surely have heard the YMMV litany enough times not to
> hear it again from me, but YMMV quite a bit here so let me indicate a
> few potential difficulties.
>  a) For this money, I'm guessing the CPU is a 32 bit Celery, which has a
> very small L2.  For some code this won't matter, but if you're worrying
> about multiple cores and a memory bottleneck, let me assure you the L2
> bottleneck on a single 32-bit channel will likely be much worse.
>  b) Amdahl's law rewards higher clock and fewer CPUs over lower clock
> and more CPUs almost (but not quite) without exception.  I doubt that
> you are an exception.
>  c) A 64-bit CPU has some superlinear speedup compared to a 32-bit CPU
> at constant clock, for memory bound code especially.  64-bit CPUs have
> much larger caches as well.  This CAN work against you for very cache
> unfriendly code, but again in 99% of all applications it will work for
> you -- it is what a cache "does".
>  d) A perfectly fair question is to what extent the memory bus is
> oversubscribed on a 64-bit dual core, say, a very cheap AMD-64 at
> roughly twice the clock, with more than twice the total memory
> bandwidth, and with two cores.  This is the question that depends in
> detail on YOUR APPLICATION.  Many applications are de facto CPU bound
> and you get clock speed scaling within a CPU family all the way down to
> small cache Celerons.  Others are vehemently not.  "YMMV", so you have
> to analyze YOUR application to figure out which it is, where the easiest
> way by far to find out is to just try it.
> Sounds like it will cost you somewhere between $100 and $200 to set up a
> minimal system -- cheap case/power, motherboard, memory, a borrowed
> video card.  You can probably beg, borrow, or buy a dual core AMD at
> some middling low (but much higher!) clock for no more than $400.  Run
> your presumably EP application on the one, and on the other two at a
> time.  Buy lots of the winner, use the loser as a desktop or head node
> (even the Celery should be fine for that, especially on a 100 Mbps
> network).
> Now, I'm a gambling man (as you may not know) and I will bet you one
> bottle, can, or glass of ice-cold or cellar cool clean and refreshing or
> thick and chewy beer as the winner prefers, to be delivered at a
> mutually convenient time (such as both of us sitting side by side in
> a venue that purveys said beverages), that the medium-low end AMD-64
> kicks the ass of the maximally cheap Celery in price-performance on your
> application (where I have an unfair advantage in that I know something
> about your application, but I'd make the same bet if I didn't).
> To go into detail, I expect that at constant cost you'll end up with
> somewhere in the ballpark of 2-3x aggregate bogomips/$ from the AMD,
> that memory bottlenecks will eat up no more than a small part of it (I
> actually expect the AMDs to win here TOO because of the probably at
> least doubled total memory bandwidth and larger cache), that when you
> factor in a roughly 4x increase in required system volume and 3x
> increase in total power consumption required to run the same number of
> Celeries that will match the AMD, at a marginal cost of roughly
> $200/year in increased power costs and some increased investment of your
> "free" time to install and manage the extra systems... well, let's just
> say that I think that the Celeries will look ugly.  And I'd expect
> similar savings from the lowball dual core Xeons, honestly -- system
> price around $350-500 stripped to match where you vary in this range to
> find the sweet spot in terms of total memory, processor clock, and other
> configuration details.
> Before you turn me down, note that this is a win-win bet for both of us,
> since the winner gets to buy the next round...;-)
>    rgb
> > Thanks,
> > Peter
> --
> Robert G. Brown
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone(cell): 1-919-280-8443
> Web: http://www.phy.duke.edu/~rgb
> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
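
Point (b) in the quoted reply, applied to the "four 1.5GHz nodes vs. one
dual-core 3GHz box" comparison, can be roughed out with Amdahl's law. The
following is only a sketch under loud assumptions: run time is taken to
scale exactly with clock, the parallel part is assumed to split perfectly
across CPUs, and cache, memory bandwidth, and network costs are ignored
entirely. The function name run_time and the serial fractions tried are
made up for illustration; they are not measurements of any real
application.

```python
def run_time(serial_fraction, clock_ghz, n_cpus):
    """Relative run time under Amdahl's law for n_cpus at a given clock.

    Assumes (hypothetically) that work scales perfectly with clock speed
    and that the parallel part divides evenly over the CPUs.
    """
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / n_cpus) / clock_ghz


if __name__ == "__main__":
    # Both configurations have the same aggregate clock (6 GHz).
    for s in (0.0, 0.01, 0.05, 0.20):
        slow = run_time(s, clock_ghz=1.5, n_cpus=4)   # four cheap nodes
        fast = run_time(s, clock_ghz=3.0, n_cpus=2)   # one dual-core box
        print(f"serial={s:.2f}  4x1.5GHz={slow:.4f}  "
              f"2x3.0GHz={fast:.4f}  ratio={slow / fast:.2f}")
```

In this toy model the two setups tie only when the serial fraction is
exactly zero; any serial work at all favors the faster clock, which is the
"almost (but not quite) without exception" in (b). It says nothing about
the cache and memory-bus questions in (a), (c), and (d), which is why the
advice to just run the real application on both still stands.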
