[Beowulf] HPL as a learning experience
Carsten Aulbert
carsten.aulbert at aei.mpg.de
Tue Mar 16 08:27:30 PDT 2010
Hi all,
I wanted to run High Performance Linpack (HPL), mostly for fun (and of course to
learn more about it and stress test a couple of machines). However, so far
I've had very mixed results.
I downloaded version 2.0, released in September 2008, and managed to
compile it with mpich 1.2.7 on Debian Lenny. The resulting xhpl binary is
dynamically linked like this:
linux-vdso.so.1 => (0x00007fffca372000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007fb47bca8000)
librt.so.1 => /lib/librt.so.1 (0x00007fb47ba9f000)
libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fb47b7c4000)
libm.so.6 => /lib/libm.so.6 (0x00007fb47b541000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fb47b32a000)
libc.so.6 => /lib/libc.so.6 (0x00007fb47afd7000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb47bec4000)
Then I wanted to run a couple of tests on a single quad-CPU node (with 12 GB
physical RAM). I used
http://www.advancedclustering.com/faq/how-do-i-tune-my-hpldat-file.html
to generate input files for a single-core and a dual-core test, [1] and [2].
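For reference, the usual sizing rule behind such HPL.dat generators can be sketched as follows. This is a rough sketch, not the generator's actual code: the 80% memory fraction and the helper names hpl_problem_size and matrix_gib are assumptions made for illustration. It only relies on HPL storing an N x N matrix of 8-byte doubles and on N being rounded down to a multiple of NB.

```python
import math

def hpl_problem_size(mem_bytes, nb, fraction=0.8):
    """Largest multiple of nb whose N x N matrix of doubles fits in
    fraction * mem_bytes. The 0.8 default is an assumed rule of thumb."""
    usable_doubles = int(mem_bytes * fraction) // 8   # 8 bytes per double
    n_max = math.isqrt(usable_doubles)                # side of the largest square matrix
    return (n_max // nb) * nb                         # round down to a multiple of NB

def matrix_gib(n):
    """Size of the N x N double-precision matrix in GiB (workspace ignored)."""
    return 8 * n * n / 2**30

n_2g = hpl_problem_size(2 * 2**30, 128)    # hypothetical 2 GiB input
n_12g = hpl_problem_size(12 * 2**30, 128)  # the node's full 12 GiB

print(n_2g, n_12g)
print(f"N=14592 needs about {matrix_gib(14592):.2f} GiB for the matrix")
```

Under these assumptions, N = 14592 is what the rule yields for 2 GiB, and the matrix itself occupies roughly 1.6 GiB, so the runs below use only a small part of the 12 GB on the node.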
Starting the single core run does not pose any problem:
/usr/bin/mpirun.mpich -np 1 -machinefile machines /nfs/xhpl
where machines is just a simple file listing this host's name four times.
So far so good.
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
WR11C2R4 14592 128 1 1 407.94 5.078e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0087653 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0209927 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0045327 ...... PASSED
============================================================================
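As a sanity check on the reported number, the Gflops figure can be reproduced from N and the wall time. The sketch below assumes the operation count (2/3)N^3 + (3/2)N^2 that HPL is generally understood to use for its rate; the function name hpl_gflops is made up for this example.

```python
def hpl_gflops(n, seconds):
    """Gflop rate from problem size and wall time, assuming the
    operation count (2/3)*N^3 + (3/2)*N^2."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9

# Values taken from the WR11C2R4 result line above.
rate = hpl_gflops(14592, 407.94)
print(f"{rate:.3f} Gflops")
```

With N = 14592 and 407.94 s this gives about 5.078 Gflops, which agrees with the 5.078e+00 in the output.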
When starting the two-core run, I receive the following error message after a
couple of seconds (once RSS reaches the VIRT value shown in top):
/usr/bin/mpirun.mpich -np 2 -machinefile machines /nfs/xhpl
p0_20535: p4_error: interrupt SIGSEGV: 11
rm_l_1_20540: (1.804688) net_send: could not write to fd=5, errno = 32
A SIGSEGV with p4_error indicates a segfault within HPL. That's as far as I've
got with Google, but right now I have no idea how to proceed. I somehow doubt
that this venerable program is so buggy that I'd hit a bug on my first day ;)
Any ideas what I might be doing wrong?
Cheers
Carsten
[1]
single core test
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
14592 Ns
1 # of NBs
128 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
1 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
[2]
dual core setup
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
14592 Ns
1 # of NBs
128 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
2 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
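The two inputs differ only in Qs, so the process grid and the per-process share of the matrix are the obvious things to compare. The following is a minimal sketch under stated assumptions: HPL needs at least P x Q MPI processes, the matrix is split roughly evenly across the grid, and panel and communication workspace is ignored. The helper names grid_fits and per_process_gib are hypothetical.

```python
def grid_fits(np_procs, p, q):
    """True if the P x Q grid can be filled by np_procs MPI processes."""
    return p * q <= np_procs

def per_process_gib(n, p, q):
    """Approximate share of the N x N double matrix held by each process,
    in GiB, assuming an even split and ignoring workspace."""
    return 8 * n * n / (p * q) / 2**30

# (label, -np value, P, Q) for setups [1] and [2]
setups = [("single core", 1, 1, 1), ("dual core", 2, 1, 2)]
for label, np_procs, p, q in setups:
    ok = grid_fits(np_procs, p, q)
    share = per_process_gib(14592, p, q)
    print(f"{label}: grid {p}x{q} fits -np {np_procs}: {ok}, "
          f"about {share:.2f} GiB per process")
```

Both grids match their -np values, and the dual-core case should need only about 0.8 GiB per process for the matrix, which is well below the 12 GB available.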