[Beowulf] The Walmart Compute Node?

Vincent Diepeveen diep at xs4all.nl
Fri Nov 9 09:21:41 PST 2007

Ok, an easy theoretical calculation, and it's still very crude of course:

1 core: 2.4 GHz * 3 instructions a cycle * 2 (sse) = 7.2 * 2 = 14.4 gflop
4 cores of a quad core ==> 4 * 14.4 = 57.6 gflop
3 nodes ==> 3 * 57.6 = 172.8 gflop

Now of course your software won't be able to get that out of the
hardware on core2; on the new K8 cores perhaps it goes a tad better
(though they aren't there yet).

For core2 you can do a more careful calculation using 2 instructions a
cycle:

172.8 * 2 / 3 = 115.2 gflop
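
The estimates above can be redone as a small Python sketch. The function name and parameter names are made up for illustration; the numbers (2.4 GHz, 3 or 2 instructions a cycle, 2 flops per sse instruction, 4 cores, 3 nodes) are the ones from this post, and the result is a theoretical peak, not something real software will reach.

```python
def peak_gflops(clock_ghz, instr_per_cycle, flops_per_instr, cores_per_node, nodes):
    """Theoretical peak in gflop, assuming every core issues
    instr_per_cycle sse instructions each cycle and every such
    instruction performs flops_per_instr floating point operations."""
    return clock_ghz * instr_per_cycle * flops_per_instr * cores_per_node * nodes

# Figures taken from the post; hypothetical variable names.
per_core = peak_gflops(2.4, 3, 2, 1, 1)     # one core, 3 instructions a cycle
per_node = peak_gflops(2.4, 3, 2, 4, 1)     # one quad core node
optimistic = peak_gflops(2.4, 3, 2, 4, 3)   # 3 nodes, 3 instructions a cycle
careful = peak_gflops(2.4, 2, 2, 4, 3)      # 3 nodes, 2 instructions a cycle

for label, value in [("per core", per_core),
                     ("per node", per_node),
                     ("3 nodes, 3 instr/cycle", optimistic),
                     ("3 nodes, 2 instr/cycle", careful)]:
    print(f"{label}: {value:.1f} gflop")
```

The values are printed with one decimal because the floating point products are not exact (2.4 * 3 is not exactly 7.2 in binary floating point).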

If you really want to build this cheapo, check out pricewatch. I'm
sure one of you can manage it for $1000.

On the other hand, at sycortex.com under 'news' i see a 72 cpu solution,
which in their case is 72 gflop (hope i'm wrong),
offered for under $15k.

Let's say roughly a factor 5-10 difference in price compared to pc's  
even if we add power for the coming 3 years?
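
As a rough check on the hardware side only, here is a Python sketch of dollars per theoretical gflop. It assumes $1000 for the 3 node pc setup at 115.2 gflop, and takes "under $15k" as exactly $15000 for the 72 gflop box, so that side is an upper bound. Power over 3 years is not included, since no wattage or electricity price is given here; the function and variable names are hypothetical.

```python
def dollars_per_gflop(price_usd, gflops):
    """Purchase price divided by theoretical peak gflop."""
    return price_usd / gflops

# Figures from the post; "under $15k" is treated as $15000 (assumption).
pc_cluster = dollars_per_gflop(1000, 115.2)
vendor_box = dollars_per_gflop(15000, 72)
ratio = vendor_box / pc_cluster

print(f"pc cluster: {pc_cluster:.2f} $/gflop")
print(f"72 cpu box: {vendor_box:.2f} $/gflop")
print(f"hardware-only price ratio: {ratio:.1f}")
```

This compares peak numbers with purchase price alone; adding power cost for both sides, as the factor 5-10 guess does, would move the ratio.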


On Nov 9, 2007, at 5:58 PM, Peter St. John wrote:

> Vincent,
> I'm missing something in the arithmetic. "3 nodes of quadcore" is 12
> cores? delivering 100 "GFlops" would require something like 8 GHz? So
> perhaps you mean, 3 nodes of dual socket, quadcore CPU  (24 cores) at
> 4GHz? And you can get that for $1500?
> Thanks,
> Peter
> On Nov 9, 2007 11:44 AM, Vincent Diepeveen <diep at xs4all.nl> wrote:
>> Larry, all that you write is very interesting and of course i hope
>> for you your product line gets a big success.
>> Just like IBM's blue gene, the major characteristic of your product
>> line is that it is only interesting to governments who need major
>> amounts of crunching power (the other conditions left aside such as
>> no big RAM requirements as that usually means you need good branch
>> prediction and so on),
>> and who have million dollar budgets, and probably have a program
>> lying around that this hardware can get used for.
>> The price of a box with say 100 "1 gflop" cpu's, delivering in total
>> 100 gflop, isn't gonna be $1500 i guess, whereas for $1500 one can
>> build hands down 3 nodes with a quadcore, delivering not only *more*
>> than 100 gflop, but also capable of running other software than just
>> crunching; it's also possible to put a lot of RAM inside and it's
>> also possible to run software that makes a lot of use of the branch
>> predictor.
>> For sure you're not qualifying for a $2500 setup, and with those
>> freak qualifications you qualify bigtime for this mailing list of
>> course :)
>> On Nov 9, 2007, at 3:42 PM, Larry Stewart wrote:
>>> Robert G. Brown wrote:
>>>> On Thu, 8 Nov 2007, Jim Lux wrote:
>>>>> In general, a N GHz processor will be poorer in a flops/Watt
>>>>> sense than a 2N GHz processor.
>>> Well that just isn't so.  It seems pretty clear from IBM's
>>> BlueGene/L, as well as the SiCortex processors, that the
>>> opposite is true.  The new Green 500 list is brand new, and there's
>>> not much on it yet, but the BG/L is delivering 190MF/Watt
>>> on HPL, whereas the machines made out of Intel and AMD chips are
>>> half that at best.
>>>>> The power draw is a combination of a fixed load plus a frequency
>>>>> dependent load, so for the SAME processor, running it at N/2 GHz
>>>>> consumes more than 50% of the power of running it at N GHz.
>>> This probably IS true, but high performance cores have a lot more
>>> logic in them to try to achieve performance: out of order
>>> execution, complex branch prediction, register renaming, etc. etc.
>>> A slower core can be a lot simpler with the same silicon process,
>>> so a decent lower-clock design will be more power efficient than a
>>> fast clock design.
>>>>> If you go to a faster processor design, the frequency dependent
>>>>> load gets smaller (smaller feature sizes= smaller capacitance to
>>>>> charge and discharge on each transition).  The core voltage is
>>>>> also usually smaller on higher speed processors, which also
>>>>> reduces the power dissipation (smaller number of joules to change
>>>>> the voltage from zero to one or vice versa).  So, in general, a
>>>>> 2N GHz processor consumes less than twice the power of a N GHz
>>>>> processor.
>>> The flaw in this argument is that a slower clock design can use the
>>> same small transistors and the same current state of the art
>>> processes and it will use many fewer transistors to get its work
>>> done, thus using very much less power.  Our 1 GF core is 600
>>> milliwatts, for example.
>>> Even after adding all the non-core stuff - caches, memory
>>> controllers, interconnect, main memory, and all overhead, it is
>>> still around 3 watts per GF.
>>>> In ADDITION to this is the fact that the processor has to live in a
>>>> house of some sort, and the house itself adds per processor overhead.
>>>> This overhead is significant -- typically a minimum of 10-20 W,
>>>> sometimes as much as 30-40 (depending on how many disks you have, how
>>> This factor does not scale this way!  With low power processors,
>>> you can pack them together, without the endless support chips, you
>>> can use low power inter-chip signalling, you can use high
>>> efficiency power supplies with their economies of scale.  If you
>>> look inside a PC there are two blocks doing useful work - memory
>>> and CPUs, and a whole board full of useless crap.  Look inside a
>>> machine designed to be a cluster and there should be nothing there
>>> but cpus and memory.
>>> --
>>> -Larry / Sector IX
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf