[Beowulf] single machine with 500 GB of RAM
Andrew Holway
andrew.holway at gmail.com
Wed Jan 9 09:00:57 PST 2013
As it's a single thread, I doubt that faster memory is going to help you much. It's going to suck whatever you do.
On 9 Jan 2013, at 17:29, Jörg Saßmannshausen <j.sassmannshausen at ucl.ac.uk> wrote:
> Dear all,
>
> many thanks for the quick reply and all the suggestions.
>
> The code we want to use is that one here:
>
> http://www.cpfs.mpg.de/~kohout/dgrid.html
>
> Feel free to download and dig into the code. I am no expert in Fortran, so I
> won't be able to help you much if you have specific questions about the code :-(
> However, my understanding is that it will only run on one core/thread.
>
> As for the budget: that is where it gets a bit tricky. The ceiling is
> 10k GBP. I know that machines with less memory, say 256 GB, are cheaper, so
> one solution would be to get two of the beasts so we can do two calculations at
> the same time. If there are enough slots free, we could upgrade to 500 GB once
> we get another pot of money.
>
> I guess I would go for DDR3, simply because it is faster. Waiting 2 weeks for a
> calculation is no fun, so if we can save a bit of time here (faster RAM) we
> actually gain quite a bit.
>
> I am not convinced by the AMD Bulldozer, to be honest. From what I understand,
> Sandy Bridge has the faster memory access (higher bandwidth). Is that
> correct, or am I missing something here?
>
> I gather that the idea of just using one CPU is not a good one. So we need to
> have a dual CPU machine, which is fine with me.
>
> I am wondering about the vSMP / ScaleMP suggestion from Joe. If I am using an
> InfiniBand network here, would I be able to spread the 'bottlenecks' a bit
> better? What I mean is, when I tested the InfiniBand on the new cluster
> we got, I noticed that if you run a job in parallel between nodes, the
> same number of cores is marginally faster. At the time I put that down to
> slightly faster memory access, as there was no bottleneck to the RAM.
> I am not familiar with vSMP (i.e. I never used it), but is it possible to
> aggregate RAM from a number of nodes (say 40) and use it as a large virtual
> SMP? So one node would be slaving away with the calculations and the other
> nodes are only doing memory IO. Is that possible with vSMP?
> In a related context, how about NUMAScale?
>
> The idea of aggregated SSDs is nice as well. I know some storage vendors
> are using a mixture of RAM and SSD for their metadata (fast access) and that
> seems to work quite well. So would that be a large swap file / partition, or is
> there another way to use disc space as RAM? I need to read the NVMalloc
> paper, I suppose. Is that actually used, or is that just a good idea with a
> working example?
>
> I don't think there is much disc IO here. There is most certainly no
> network-bound traffic, as it is a single thread. A fast CPU would be an advantage as
> well; however, I have the feeling the trade-off would be the memory access speed
> (bandwidth).
>
> I have tried to answer the questions raised. Let me know whether there are
> still some unclear points.
>
> Thanks for all your help and suggestions so far. I will need to digest that.
>
> All the best from a sunny London
>
> Jörg
>
> --
> *************************************************************
> Jörg Saßmannshausen
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ
>
> email: j.sassmannshausen at ucl.ac.uk
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>