[Beowulf] single machine with 500 GB of RAM

Wed Jan 9 10:31:00 PST 2013

On Jan 9, 2013, at 5:29 PM, Jörg Saßmannshausen wrote:

> Dear all,
>
> many thanks for the quick reply and all the suggestions.
>
> The code we want to use is that one here:
>
> http://www.cpfs.mpg.de/~kohout/dgrid.html
>
> Feel free to download and dig into the code. I am no expert in Fortran, so
> I won't be able to help you much if you have specific questions about the
> code :-( However, my understanding is that it will only run on one
> core/thread.
>
> As for the budget: that is where it gets a bit tricky. The ceiling is
> 10k GBP. I know that machines with less memory, say 256 GB, are cheaper,
> so one solution would be to get two of the beasts so we can do two
> calculations at the same time. If there are enough slots free, we could
> upgrade to 500 GB once we get another pot of money.
>
> I guess I would go for DDR3, simply because it is faster. Waiting 2 weeks
> for a calculation is no fun, so if we can save a bit of time here (faster
> RAM) we actually gain quite a bit.
>
> I am not convinced by the AMD Bulldozer, to be honest. From what I
> understand, the Sandy Bridge has the faster memory access (higher
> bandwidth). Is that correct, or am I missing something here?

You must not confuse 2-socket machines with 4-socket ones.
The latencies you see quoted are just for 2 sockets.
Intel totally dominates in the 2-socket space.
With 4 sockets it's a different song that gets sung once you get outside
of the caches, and I bet you will, won't you?

If you address 1 block of memory of 500 GB and are outside of the caches,
then the AMD box is going to be faster than Intel in latency, of course.

Cache coherency, snooping: Intel has always struggled there in the
4-socket area.
That's why no one hears much about those 4-socket Intel machines.
Latencies to the RAM are UGLY on them.
At AMD the latencies were already ugly, yet at 4 sockets they don't get
much worse.

You will need to fill every box with 4 CPUs anyway, so fill it up with
the amount of RAM you want and look at the price difference.

If you use open source, I'm sure someone will want to test for you.
Just test it at both manufacturers with a 500 GB RAM block and you'll see.
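
A minimal sketch of what such a latency test could look like, assuming a
POSIX system with clock_gettime(). This is not taken from DGrid or from
any vendor tool; the function names (build_cycle, chase, ns_per_load) are
made up for illustration. It builds one random cycle through an array and
follows it, so every load depends on the previous one and the prefetchers
cannot help once the array is larger than the caches.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Small 64-bit xorshift generator (Marsaglia's 13/7/17 variant), used
   because rand() may not cover index ranges needed for very large arrays. */
static uint64_t rng_state = 88172645463325252ULL;
static uint64_t xorshift64(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Sattolo's algorithm: turn next[] into a single random cycle, so that
   following next[i] visits every slot once before returning to the start. */
static void build_cycle(size_t *next, size_t n)
{
    assert(n >= 2);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(xorshift64() % i);   /* j in [0, i-1] */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads and return the final slot. */
static size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    return p;
}

/* Average nanoseconds per dependent load over an array of n slots.
   Returns a negative value if the allocation fails. */
static double ns_per_load(size_t n, size_t steps)
{
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1.0;
    build_cycle(next, n);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    volatile size_t sink = chase(next, 0, steps);  /* keep the loop alive */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    free(next);
    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / (double)steps;
}
```

Calling ns_per_load(n, steps) from a small main() with n * sizeof(size_t)
close to the block size you care about (about 62.5e9 slots for 500 GB on a
64-bit box) and a few hundred million steps gives a rough ns-per-access
figure to compare between the two boxes. Page placement across the sockets
is left to the OS defaults here, which is itself an assumption worth
varying when you test.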

What was it some years ago, an 80-core Intel box with 8 sockets and
512 GB RAM? $200k or so?

>
> I gather that the idea of just using one CPU is not a good one. So we
> need to have a dual CPU machine, which is fine with me.
>
> I am wondering about the vSMP / ScaleMP suggestion from Joe. If I am
> using an InfiniBand network here, would I be able to spread the
> 'bottlenecks' a bit better? What I am getting at is this: when I tested
> the InfiniBand on the new cluster we got, I noticed that if you run a job
> in parallel across nodes, the same number of cores is marginally faster.
> At the time I put that down to slightly faster memory access, as there
> was no bottleneck to the RAM.
> I am not familiar with vSMP (i.e. I have never used it), but is it
> possible to aggregate RAM from a number of nodes (say 40) and use it as a
> large virtual SMP? So one node would be slaving away with the
> calculations and the other nodes would only be doing memory IO. Is that
> possible with vSMP?
> In a related context, how about NUMAScale?
>
> The idea of the aggregated SSDs is nice as well. I know some storage
> vendors are using a mixture of RAM and SSD for their meta-data (fast
> access) and that seems to work quite well. So would that be a large swap
> file / partition, or is there another way to use disc space as RAM? I
> need to read the NVMalloc paper, I suppose. Is that actually used, or is
> it just a good idea for which we have a working example?
>
> I don't think there is much disc IO here. There is most certainly no
> network-bound traffic, as it is a single thread. A fast CPU would be an
> advantage as well; however, I have got the feeling the trade-off would be
> the memory access speed (bandwidth).
>
> I have tried to answer the questions raised. Let me know whether there
> are still some unclear points.
>
> Thanks for all your help and suggestions so far. I will need to digest
> that.
>
> All the best from a sunny London
>
> Jörg
>
> -- 
> *************************************************************
> Jörg Saßmannshausen
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ
>
> email: j.sassmannshausen at ucl.ac.uk
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html