[Beowulf] Need guidelines for NASA's NAS Parallel Benchmarks
Sangamesh B
forum.san at gmail.com
Sat Jul 12 04:56:22 PDT 2008
Dear all,
This is the first time I am benchmarking a system: an Intel quad-core,
quad-processor machine running RHEL 5 (64-bit).
After unpacking the NAS NPB package (NPB3.3.tar.gz), I got the following
directories:
Changes.log NPB3.3-HPF.README NPB3.3-JAV.README NPB3.3-MPI NPB3.3-OMP
NPB3.3-SER README
I need to run both the MPI and OpenMP versions of the benchmarks.
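From the two README files, I think the two variants are built and run roughly
like this (only a sketch; the make.def step and the OpenMP binary name are my
assumptions from the templates, so please correct me):

  # MPI version (NPB3.3-MPI); config/make.def must exist first
  cd NPB3.3-MPI
  cp config/make.def.template config/make.def   # set MPIF77, FFLAGS, ...
  make BT NPROCS=4 CLASS=S                      # builds bin/bt.S.4
  mpiexec -np 4 bin/bt.S.4

  # OpenMP version (NPB3.3-OMP); no NPROCS, thread count is set at run time
  cd ../NPB3.3-OMP
  cp config/make.def.template config/make.def   # set F77, FFLAGS, ...
  make BT CLASS=S                               # builds bin/bt.S.x (my guess)
  OMP_NUM_THREADS=16 bin/bt.S.x                 # 16 = 4 sockets x 4 cores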
I did some benchmarks on a test machine using NPB3.3-MPI. It offers:
BENCHMARK NAME (9): BT, CG, DT, EP, FT, IS, LU, MG, SP
CLASS (7): S, W, A, B, C, D, E
TYPE (4): FULL, SIMPLE, FORTRAN, EPIO
Obviously, the number of combinations would be 9 * 7 * 4 = 252. Do I need to
run the benchmarks for all 252? (I suspect not every combination is valid;
for example, the TYPE setting looks like it applies only to the BT I/O runs.)
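If many of them really are needed, I believe the README describes a batch
mechanism: list the desired builds in config/suite.def and run "make suite".
Something like this (my understanding of the syntax, not yet tried):

  # config/suite.def: one build per line, "<benchmark> <class> <nprocs>"
  bt S 4
  cg S 4
  mg S 4
  # build everything listed above with:
  #   make suite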
A sample benchmark:
[root at test NPB3.3-MPI]# make BT NPROCS=4 CLASS=S SUBTYPE=full VERSION=VEC
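(If I read the naming right, this puts the binary in bin/ with the class,
process count, and I/O subtype encoded in the file name:

  bin/bt.S.4.mpi_io_full    # <benchmark>.<class>.<nprocs>.mpi_io_<subtype>

so each class/process-count combination needs its own build.)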
Since this benchmark was done on a test machine (dual-core, dual-processor
AMD64 Opteron), I used MPICH2 and the GNU compilers.
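For reference, my config/make.def was essentially this (abridged; it matches
the compile options printed at the end of the run below; the mpif77 wrapper
supplies the MPI paths, so FMPI_LIB and FMPI_INC stay empty):

  # config/make.def (abridged)
  MPIF77     = /opt/libs/mpi/mpich2/1.0.6p1/bin/mpif77
  FLINK      = $(MPIF77)
  FMPI_LIB   =
  FMPI_INC   =
  FFLAGS     = -O
  FLINKFLAGS = -O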
To run the benchmark, I used the sample input data file provided with NPB:
[root at test btbin]# mpdtrace -l
test_33638 (10.1.1.1)
[root at test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full ./inputbt.data
NAS Parallel Benchmarks 3.3 -- BT Benchmark
Reading from input file inputbt.data
collbuf_nodes 0
collbuf_size 1000000
Size: 64x 64x 64
Iterations: 200 dt: 0.0008000
Number of active processes: 4
BTIO -- FULL MPI-IO write interval: 5
0 1 32 32 32
Problem size too big for compiled array sizes
1 1 32 32 32
Problem size too big for compiled array sizes
2 1 32 32 32
Problem size too big for compiled array sizes
3 1 32 32 32
Problem size too big for compiled array sizes
[2] 48 at [0x00000000006c1088], mpid_vc.c[62]
[0] 48 at [0x00000000006be4b8], mpid_vc.c[62]
[1] 48 at [0x00000000006bfdf8], mpid_vc.c[62]
[3] 48 at [0x00000000006c1088], mpid_vc.c[62]
[root at test btbin]#
It looks like the run was not successful. What is wrong? The input file
contains:
[root at test btbin]# cat inputbt.data
200 number of time steps
0.0008d0 dt for class A = 0.0008d0. class B = 0.0003d0 class C = 0.0001d0
64 64 64
5 0 write interval (optional read interval) for BTIO
0 1000000 number of nodes in collective buffering and buffer size for BTIO
[root at test btbin]#
As I am doing these benchmarks for the first time, I have no idea how to
prepare a new input file. What parameters should be changed, and how will
these values affect the benchmark results?
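Looking at it again, my guess is that the failure is a size mismatch: the
binary was compiled with CLASS=S (12x12x12 arrays), while inputbt.data asks
for a 64x64x64 grid, hence "Problem size too big for compiled array sizes".
If that is right, an input file for this class-S binary would need values
like the compiled defaults (mirroring the dry run below; please confirm):

  60            number of time steps
  0.010d0       dt (the class S default, judging from the dry run)
  12 12 12      grid size; must fit the compiled class
  5 0           write interval (optional read interval) for BTIO
  0 1000000     number of nodes in collective buffering and buffer size for BTIO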
Is it OK if I just run
[root at test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full
without using any input file? The output of the above dry run is:
[root at test btbin]# mpiexec -np 4 ./bt.S.4.mpi_io_full
NAS Parallel Benchmarks 3.3 -- BT Benchmark
No input file inputbt.data. Using compiled defaults
Size: 12x 12x 12
Iterations: 60 dt: 0.0100000
Number of active processes: 4
BTIO -- FULL MPI-IO write interval: 5
Time step 1
Writing data set, time step 5
Writing data set, time step 10
Writing data set, time step 15
Time step 20
Writing data set, time step 20
Writing data set, time step 25
Writing data set, time step 30
Writing data set, time step 35
Time step 40
Writing data set, time step 40
Writing data set, time step 45
Writing data set, time step 50
Writing data set, time step 55
Time step 60
Writing data set, time step 60
Reading data set 1
Reading data set 2
Reading data set 3
Reading data set 4
Reading data set 5
Reading data set 6
Reading data set 7
Reading data set 8
Reading data set 9
Reading data set 10
Reading data set 11
Reading data set 12
Verification being performed for class S
accuracy setting for epsilon = 0.1000000000000E-07
Comparison of RMS-norms of residual
1 0.1703428370954E+00 0.1703428370954E+00 0.6680519237820E-14
2 0.1297525207005E-01 0.1297525207003E-01 0.9351949888112E-12
3 0.3252792698950E-01 0.3252792698949E-01 0.4859455174690E-12
4 0.2643642127515E-01 0.2643642127517E-01 0.7155062549945E-12
5 0.1921178413174E+00 0.1921178413174E+00 0.9101712010679E-14
Comparison of RMS-norms of solution error
1 0.1149036328945E+02 0.1149036328945E+02 0.4854294277047E-13
2 0.9156788904727E+00 0.9156788904727E+00 0.4195107810359E-13
3 0.2857899428614E+01 0.2857899428614E+01 0.9649723729104E-13
4 0.2598273346734E+01 0.2598273346734E+01 0.1391264769245E-12
5 0.2652795397547E+02 0.2652795397547E+02 0.3629324024933E-13
Verification Successful
BTIO -- statistics:
I/O timing in seconds : 0.02
I/O timing percentage : 16.06
Total data written (MB) : 0.83
[interleaved MPICH2 memory-trace lines from ranks 0-3 (dataloop.c,
mpid_datatype_contents.c, mpid_vc.c) omitted]
I/O data rate (MB/sec) : 49.85
BT Benchmark Completed.
Class = S
Size = 12x 12x 12
Iterations = 60
Time in seconds = 0.10
Total processes = 4
Compiled procs = 4
Mop/s total = 2204.23
Mop/s/process = 551.06
Operation type = floating point
Verification = SUCCESSFUL
Version = 3.3
Compile date = 12 Jul 2008
Compile options:
MPIF77 = /opt/libs/mpi/mpich2/1.0.6p1/bin/mpif77
FLINK = $(MPIF77)
FMPI_LIB = (none)
FMPI_INC = (none)
FFLAGS = -O
FLINKFLAGS = -O
RAND = (none)
Please send the results of this run to:
NPB Development Team
Internet: npb at nas.nasa.gov
If email is not available, send this to:
MS T27A-1
NASA Ames Research Center
Moffett Field, CA 94035-1000
Fax: 650-604-3957
[root at test btbin]#
Does anyone on this list have experience with the NAS Parallel Benchmarks?
If so, please give me some guidelines on how to run the benchmarks properly.
I need to produce the benchmark results within three days. Can this be done?
Thanks in advance,
Sangamesh