[Beowulf] HPL as a learning experience
Carsten Aulbert carsten.aulbert at aei.mpg.de
Tue Mar 16 08:27:30 PDT 2010
Hi all,
I wanted to run high performance linpack mostly for fun (and of course to
learn more about it and stress test a couple of machines). However, so far
I've had very mixed results.
I downloaded version 2.0, released in September 2008, and managed to compile
it with mpich 1.2.7 on Debian Lenny. The resulting xhpl binary is
dynamically linked like this:
linux-vdso.so.1 => (0x00007fffca372000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007fb47bca8000)
librt.so.1 => /lib/librt.so.1 (0x00007fb47ba9f000)
libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fb47b7c4000)
libm.so.6 => /lib/libm.so.6 (0x00007fb47b541000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fb47b32a000)
libc.so.6 => /lib/libc.so.6 (0x00007fb47afd7000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb47bec4000)
Then I wanted to run a couple of tests on a single quad-CPU node (with 12 GB
physical RAM). I used
http://www.advancedclustering.com/faq/how-do-i-tune-my-hpldat-file.html
to generate input files for a single-core and a dual-core test, see [1] and [2].
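For a sense of scale, the following is a rough sketch of how much memory the problem size in [1] and [2] implies. It assumes the usual rule of thumb that HPL's footprint is dominated by the N x N matrix of 8-byte doubles, that the matrix is split roughly evenly over the P x Q grid, and that "12 GB" means 12 GiB; the function name is made up for illustration.

```python
BYTES_PER_DOUBLE = 8  # HPL solves a double-precision system


def matrix_bytes(n):
    """Approximate bytes needed for the dense N x N coefficient matrix."""
    return BYTES_PER_DOUBLE * n * n


n, nb = 14592, 128   # Ns and NBs from the input files [1] and [2]
ram_gib = 12         # assumption: the node's 12 GB treated as 12 GiB

total = matrix_bytes(n)
print(f"N / NB            = {n / nb:g} blocks per dimension")
print(f"matrix size       = {total / 2**30:.2f} GiB")
print(f"share of node RAM = {total / (ram_gib * 2**30):.0%}")

# Rough per-process share for the two grids used in this post.
for p, q in [(1, 1), (1, 2)]:
    share = total / (p * q)
    print(f"{p} x {q} grid: about {share / 2**30:.2f} GiB per process")
```

This only estimates the matrix itself; workspace, MPI buffers and the BLAS library add to it.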
Starting the single core run does not pose any problem:
/usr/bin/mpirun.mpich -np 1 -machinefile machines /nfs/xhpl
where machines is a simple file that lists this host's name four times.
So far so good.
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR11C2R4       14592   128     1     1             407.94          5.078e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0087653 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0209927 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0045327 ...... PASSED
============================================================================
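As a cross-check on the reported figure, here is a minimal sketch that recomputes the Gflops value from N and the wall time. It assumes HPL's usual operation count of 2/3 N^3 + 3/2 N^2 floating-point operations for the LU solve; the function name is hypothetical.

```python
def hpl_gflops(n, seconds):
    """Gflop/s for an N x N solve, assuming 2/3*N^3 + 3/2*N^2 operations."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1.0e9


# N and Time taken from the single-core result line above.
rate = hpl_gflops(14592, 407.94)
print(f"{rate:.3e} Gflops")
```

Comparing this number with the theoretical peak of one core gives a quick idea of how efficient the BLAS in use is.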
When I start the two-core run, I get the following error message after a
couple of seconds (once RSS reaches the VIRT value shown in top):
/usr/bin/mpirun.mpich -np 2 -machinefile machines /nfs/xhpl
p0_20535: p4_error: interrupt SIGSEGV: 11
rm_l_1_20540: (1.804688) net_send: could not write to fd=5, errno = 32
SIGSEGV with p4_error indicates a segfault within HPL. That is as far as I've
come with Google, and right now I have no idea how to proceed. I somehow doubt
that this venerable program is so buggy that I'd hit a bug on my first day ;)
Any ideas what I might be doing wrong?
Cheers
Carsten
[1]
single core test
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
14592 Ns
1 # of NBs
128 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
1 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
[2]
dual core setup
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
14592 Ns
1 # of NBs
128 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
2 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
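The two input files differ only in the Qs line. A small sketch like the following can read the process grids out of an HPL.dat and report how many MPI processes each grid needs, so the value can be compared with the -np argument given to mpirun. It assumes the positional layout shown in [1] and [2] (grid count on line 10, Ps on line 11, Qs on line 12); the helper name is made up for illustration.

```python
def process_grids(hpl_dat):
    """Return the (P, Q) pairs listed in the text of an HPL.dat file."""
    lines = hpl_dat.strip().splitlines()
    count = int(lines[9].split()[0])                   # "# of process grids"
    ps = [int(t) for t in lines[10].split()[:count]]   # "Ps" line
    qs = [int(t) for t in lines[11].split()[:count]]   # "Qs" line
    return list(zip(ps, qs))


# First twelve lines of the dual-core input file [2].
DUAL_CORE_HEAD = """\
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
14592 Ns
1 # of NBs
128 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
2 Qs
"""

for p, q in process_grids(DUAL_CORE_HEAD):
    print(f"grid {p} x {q} needs {p * q} MPI processes")
```

For the file above this reports a 1 x 2 grid, which matches the -np 2 used in the failing run.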
