[Beowulf] GPU based cluster for machine learning purposes

Sun Apr 13 16:50:16 PDT 2014

On Fri, Apr 11, 2014 at 09:25:05AM -0400, Douglas Eadline wrote:
> 
> > On Thu, Apr 10, 2014 at 02:07:40PM +0000, Lux, Jim (337C) wrote:
> >> On 4/10/14 5:28 AM, "Piotr Król" <pietrushnic at gmail.com> wrote:
> >
> > The question is if they can beat $14.7/DP GFLOPS (13GFLOPS/$192) ?
> 
> Here are some data points (based on actual measurements):
> 
> For about $3,000 (US) you can easily put together a 4 node
> X86 system with $6/DP GFLOPS running HPL. Doubling the
> number of nodes would probably preserve this
> price-to-performance and get you close to 1 DP TFLOP.
> 
> In terms of Power/Performance it would run at
> around .75 Watt/DP GFLOP running HPL.

Thanks Douglas,
these numbers pushed me to do some research. As I wrote in other e-mails, I
care more about small size and power consumption than about performance, but
I would still like to get the best available on the market. I know that
there is no free lunch. I found these two boards very interesting:

- Arndale Octa Board (Exynos 5420) with Mali™-T628 MP6 - theoretical max
  for the Mali GPU is 109 GFLOPS (which gives $179/102 DP GFLOPS =
  $1.75/DP GFLOPS). According to the spec I read, the board is powered by a
  5V/3A adapter, so max power consumption is 15W, and 15W/102 DP GFLOPS =
  0.15 W/DP GFLOPS.

- Odroid-XU with SGX544 MP3 - $169/51.1 DP GFLOPS = $3.3/DP GFLOPS, max
  power 20W, so 20W/51.1 DP GFLOPS = 0.39 W/DP GFLOPS.
  Performance data came from here:
  http://kyokojap.myweb.hinet.net/gpu_gflops/

That was the best I could find.

I also took a look at the Teensy 3.1 suggested by Jim. Unfortunately, even if
I forget about Linux, it can only compute 0.6 MFLOPS, so $19.8/0.6 = $33/MFLOPS
:( and power is 0.032A*3.3V = 0.1056W, so 0.1056/0.6 = 0.176 W/MFLOPS -> 176 W/GFLOPS !?
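
The per-GFLOPS figures above can be recomputed with a short Python script.
This is only a sketch: the prices, peak rates and power limits are the ones
quoted in this message (theoretical or vendor figures, not measurements),
the Arndale entry uses the 102 GFLOPS value from the division above, and
the names Candidate, cost_per_gflops and watts_per_gflops are made up for
illustration.

```python
# Price/performance and power/performance for the boards discussed above.
# All input figures are the ones quoted in this message, taken as given.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    price_usd: float   # board price in US dollars
    gflops: float      # quoted peak rate in GFLOPS (not measured)
    max_watts: float   # upper bound on power draw in watts

    def cost_per_gflops(self) -> float:
        """Dollars per GFLOPS at the quoted peak rate."""
        return self.price_usd / self.gflops

    def watts_per_gflops(self) -> float:
        """Watts per GFLOPS at the quoted peak rate and max power."""
        return self.max_watts / self.gflops


candidates = [
    # 5 V / 3 A adapter -> 15 W upper bound
    Candidate("Arndale Octa", 179.0, 102.0, 5.0 * 3.0),
    Candidate("Odroid-XU", 169.0, 51.1, 20.0),
    # 0.6 MFLOPS expressed in GFLOPS; 32 mA at 3.3 V
    Candidate("Teensy 3.1", 19.8, 0.6e-3, 0.032 * 3.3),
]

for c in candidates:
    print(f"{c.name:13s} {c.cost_per_gflops():10.2f} $/GFLOPS "
          f"{c.watts_per_gflops():8.2f} W/GFLOPS")
```

Keeping every rate in GFLOPS makes the Teensy directly comparable with the
two ARM boards without a separate MFLOPS conversion step.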

Any suggestions about things I should take care of when building this
kind of low-power cluster?

Regards,
Piotr Król