[Beowulf] NASTRAN on cluster
Lombard, David N
david.n.lombard at intel.com
Tue Apr 12 06:57:37 PDT 2005
From: Mark Hahn on Monday, April 11, 2005 9:40 PM
>
>> > We just installed a small cluster and are running NASTRAN 2005 on
>> > it...
>
>(on a cluster of intel duals, and it doesn't seem to scale well
>when both cpus on a node are busy.)
>
>> Nastran doesn't really want to run more than one job (MPI rank) per
>> node.
>
>I bet that isn't true on dual-opterons.
Pretty much true. Think disk I/O.
>> The distro can/will have a significant impact on allocatable memory.
>> Nastran uses brk(2) to allocate memory, so the TASK_UNMAPPED_BASE is
>> significant.
[...]
>
>on ia32, TASK_UNMAPPED_BASE, by default, is at 1GB rather than ~1.3.
>easy to change, though, at least to give 2-2.5 GB on ia32.
It may *not* be easy to change, depending on the distro and glibc. But,
if you do whatever work is needed, you can push the memory allocation up
to about 2.1-2.3 GiB.
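As a quick sanity check (assuming Linux with /proc mounted; this is generic, not Nastran-specific), you can eyeball a process's address-space layout and see how much room the heap has before it runs into the first mmap'd region--that gap is what bounds how far brk(2) can grow:

```shell
# Print the inspecting process's own memory map. The distance from the
# end of the [heap] segment to the lowest mmap'd library/region is the
# ceiling on brk(2)-based allocation for a comparable process.
cat /proc/self/maps
```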
>> I can't comment on SATA, but PATA disks are a really bad choice, as they
>> require too much effort from the CPU to drive them--SCSI is MUCH
>> preferred in that case.
>
>this is one of the longest-lived fallacies I've ever personally
>experienced. it was true 10+ years ago when PIO was the norm for ATA
>disks. busmastering has been the norm for PATA for a long while.
Benchmarks prove my point... PATA disks are *horrible* for Nastran.
And, as I stated above, I don't have info on SATA.
>> As for CPU v. I/O. The factors are (in no order):
>>
>> fp performance
>> memory b/w
>> disk b/w
>> memory size
>>
>> Which of the above dominates the analysis depends on the analysis.
>
>for the reported symptoms (poor scaling when using the second processor),
>the first doesn't fit. memory bw certainly does, and is Intel's main
>weak spot right now. then again, disk bw and memory size could also fit
>the symptoms (since they're also resources shared on a dual), but would be
>diagnosable by other means (namely, both would result in low %CPU
>utilization; the latter (thrashing) would be pretty obvious from swap
>traffic/utilization.)
Correct on possible causes, also noted above, but I don't recall any
information on other symptoms that would identify a specific cause.
>like I said, I bet the problem is memory bandwidth. mainly because I just
>don't see programs waiting on disk that much anymore
Nastran can be one of those. Think terabytes of I/O to files tens of
gigabytes in size.
> it does happen, but large
>writes these days will stream at 50+ MB/s, and reads are often cached.
50 MB/s? That's *very* slow.
For example, if you typically run large modal frequency analyses, you
should be providing >300 MB/s uncached on ia32 and >600 MB/s uncached on
IPF.
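A crude way to check what your disks actually sustain (a sketch, not a Nastran I/O benchmark; the filename and sizes here are arbitrary): write a large file with dd and force a flush before dd reports, so the number approximates sustained rather than cached write bandwidth.

```shell
# conv=fsync makes dd flush the file to disk before printing its
# throughput figure, so the page cache doesn't inflate the result.
# Run this on the same filesystem your scratch files live on.
dd if=/dev/zero of=./iotest.tmp bs=1M count=256 conv=fsync
rm -f ./iotest.tmp
```

Reads are harder to measure honestly because of caching; at minimum, use a file much larger than RAM.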
>I should mention that if HT is enabled on these duals, the problem could be
>poor HT support in your kernels. (HT creates two virtual processors for
>each physical one. if the scheduler treats HT-virtual processors as real,
>you will get very poor speedup. this would also be diagnosable by simply
>running 'top' during a test.)
I'm fairly certain this was fixed in 2.6, but could be wrong.
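Besides watching 'top', you can check whether HT is exposing sibling processors by comparing logical processors against physical packages in /proc/cpuinfo (assuming a kernel that reports the "physical id" field; older kernels and some VMs omit it, in which case the second count comes back 0):

```shell
# Logical processors the scheduler sees:
grep -c ^processor /proc/cpuinfo
# Distinct physical packages; on an HT-enabled dual the first number
# will be 4 while this one is 2.
grep 'physical id' /proc/cpuinfo | sort -u | wc -l
```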
To repeat, "which of [several factors] dominates the analysis depends on
the analysis".
--
David N. Lombard
My comments represent my opinions, not those of Intel Corporation.