[Beowulf] NASTRAN on cluster

Mark Hahn hahn at physics.mcmaster.ca
Mon Apr 11 21:40:29 PDT 2005


> > We just installed a small cluster and are running NASTRAN 2005 on
> it...

(on a cluster of intel duals, and it doesn't seem to scale well
when both cpus on a node are busy.)

> Nastran doesn't really want to run more than one job (MPI rank) per
> node.

I bet that isn't true on dual-opterons.

> The distro can/will have a significant impact on allocatable memory.
> Nastran uses brk(2) to allocate memory, so the TASK_UNMAPPED_BASE is
> significant.

can nastran run on amd64?  it might even run nicely as an ia32 process
on amd64.  just out of curiosity:
#include <stdio.h>
#include <stdlib.h>     /* system() */
#include <unistd.h>     /* getpid(), sleep() */

int main() {
    char command[1000];
    /* dump this process's own address-space layout */
    sprintf(command, "cat /proc/%d/maps", (int)getpid());
    system(command);
    sleep(60);  /* linger so the maps can also be read from outside */
    return 0;
}

on amd64, this gives:
[hahn at node1 hahn]$ ./showmap
08048000-08049000 r-xp 00000000 00:11 5139                               /home/hahn/showmap
08049000-0804a000 rwxp 00000000 00:11 5139                               /home/hahn/showmap
55555000-5556a000 r-xp 00000000 00:0d 441379                             /lib/ld-2.3.3.so
5556a000-5556b000 r-xp 00014000 00:0d 441379                             /lib/ld-2.3.3.so
5556b000-5556c000 rwxp 00015000 00:0d 441379                             /lib/ld-2.3.3.so
55576000-55577000 rwxp 55576000 00:00 0 
55577000-5568c000 r-xp 00000000 00:0d 441661                             /lib/tls/libc-2.3.3.so
5568c000-5568e000 r-xp 00115000 00:0d 441661                             /lib/tls/libc-2.3.3.so
5568e000-55690000 rwxp 00117000 00:0d 441661                             /lib/tls/libc-2.3.3.so
55690000-55692000 rwxp 55690000 00:00 0 
ffffc000-ffffe000 rw-p ffffc000 00:00 0 
ffffe000-fffff000 ---p 00000000 00:00 0 

on native ia32, TASK_UNMAPPED_BASE defaults to 1 GB, rather than the
~1.33 GB (0x55555000) shown above.  easy to change, though, at least
enough to give a process 2-2.5 GB of brk space on ia32.
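
to make that concrete, here's a quick probe (my own sketch, nothing
nastran-specific) that grows the heap with sbrk() until it collides with
the first mapping; modulo ulimit -d and overcommit policy, the total is
the contiguous brk space a brk(2)-based allocator like nastran's can
actually use:

#include <stdio.h>
#include <unistd.h>

/* grow the heap in 16 MB steps until sbrk() fails; the total is
   roughly the contiguous brk space below the first mapped library
   (TASK_UNMAPPED_BASE on ia32).  ulimit -d or overcommit policy
   can make it fail earlier. */
int main() {
    const long chunk = 16L << 20;
    long total = 0;
    while (sbrk(chunk) != (void *)-1)
        total += chunk;
    printf("brk grew by %ld MB before hitting a mapping\n", total >> 20);
    return 0;
}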

> I can't comment on SATA, but PATA disks are a really bad choice, as they
> require too much effort from the CPU to drive them--SCSI is MUCH
> preferred in that case.

this is one of the longest-lived fallacies I've ever personally encountered.
it was true 10+ years ago, when PIO was the norm for ATA disks, but
busmastering DMA has been the norm for PATA for a long while now.
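
for anyone who wants to verify their own disks: 'hdparm -d /dev/hda'
reports the current DMA setting, or equivalently the HDIO_GET_DMA ioctl.
a minimal sketch (the device path is just an assumption; pass your own):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/hdreg.h>

/* report whether the kernel drives an ATA disk with busmastering DMA,
   much like 'hdparm -d'.  /dev/hda is only a default. */
int main(int argc, char **argv) {
    const char *dev = (argc > 1) ? argv[1] : "/dev/hda";
    long dma = 0;
    int fd = open(dev, O_RDONLY | O_NONBLOCK);
    if (fd < 0) { perror(dev); return 1; }
    if (ioctl(fd, HDIO_GET_DMA, &dma) < 0) { perror("HDIO_GET_DMA"); return 1; }
    printf("%s: using_dma = %ld\n", dev, dma);
    close(fd);
    return 0;
}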

> As for CPU v. I/O.  The factors are (in no order):
> 
> fp performance
> memory b/w
> disk b/w
> memory size
> 
> Which of the above dominates the analysis depends on the analysis.

for the reported symptoms (poor scaling when using the second processor),
the first doesn't fit.  memory bandwidth certainly does, and is Intel's main
weak spot right now.  then again, disk bandwidth and memory size could also
fit the symptoms, since they're also resources shared within a dual, but
they'd be diagnosable by other means: both would show up as low %CPU
utilization, and the latter (thrashing) would be obvious from swap traffic.
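
fwiw, those checks don't need a profiler: on a 2.6 kernel the cpu line in
/proc/stat carries an iowait column, so a crude sampler like this (a sketch
of mine, nothing nastran-specific) separates cpu-bound from disk-bound runs:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* read the aggregate cpu counters from /proc/stat (2.6 format:
   user nice system idle iowait ...) twice and print the deltas.
   a cpu-bound job racks up user time; a disk-bound one, iowait. */
static void sample(unsigned long long v[5]) {
    FILE *f = fopen("/proc/stat", "r");
    if (!f || fscanf(f, "cpu %llu %llu %llu %llu %llu",
                     &v[0], &v[1], &v[2], &v[3], &v[4]) != 5) {
        perror("/proc/stat");
        exit(1);
    }
    fclose(f);
}

int main() {
    static const char *name[5] = { "user", "nice", "system", "idle", "iowait" };
    unsigned long long before[5], after[5];
    int i;
    sample(before);
    sleep(10);          /* run the job of interest during this window */
    sample(after);
    for (i = 0; i < 5; i++)
        printf("%-7s %llu jiffies\n", name[i], after[i] - before[i]);
    return 0;
}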

like I said, I bet the problem is memory bandwidth.  mainly because I just don't
see programs waiting on disk that much anymore - it does happen, but large 
writes these days will stream at 50+ MB/s, and reads are often cached.

I should mention that if HT is enabled on these duals, the problem could be
poor HT support in your kernels.  (HT presents two virtual processors for
each physical one.  if the scheduler treats the virtual processors as real,
two busy jobs can land on the same physical CPU while the other sits idle,
and you will get very poor speedup.  this would also be diagnosable by
simply running 'top' during a test.)
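
counting what the kernel actually sees is easy too; something like this
(again just a sketch; the field names assume an HT-capable ia32
/proc/cpuinfo) shows whether your dual appears as 2 or 4 processors:

#include <stdio.h>
#include <string.h>

/* count logical processors and echo the HT-related fields from
   /proc/cpuinfo.  a dual with HT enabled shows 4 processors, each
   with "siblings : 2". */
int main() {
    char line[256];
    int logical = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) { perror("/proc/cpuinfo"); return 1; }
    while (fgets(line, sizeof line, f)) {
        if (!strncmp(line, "processor", 9))
            logical++;
        else if (!strncmp(line, "physical id", 11) ||
                 !strncmp(line, "siblings", 8))
            fputs(line, stdout);
    }
    fclose(f);
    printf("logical processors seen by kernel: %d\n", logical);
    return 0;
}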



