[Beowulf] HPCC MPIRandomAccess Error
James Evans iamjamesevans at gmail.com
Tue Oct 17 14:19:58 PDT 2006
- Previous message: [Beowulf] Sorry, no webcast for the BayBUG meeting today. BWBUG webcast as usual
- Next message: [Beowulf] Sun Project "blackbox"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I am testing a cluster using HPLinpack:

HPLinpack 1.0a -- High-Performance Linpack benchmark -- January 20, 2004
Written by A. Petitet and R. Clint Whaley, Innovative Computing Labs., UTK

Every so often, the test stops at MPIRandomAccess. Most recently this happened after 70 hours of running, but it has also happened after 10 hours, or even after 10 minutes on rare occasions. The only indication is "Begin of MPIRandomAccess section." as the last line of the log file 'hpccoutf.txt'. If I look at the nodes, HPCC is still shown as running, but no CPU is being used.

I have noticed that MPIRandomAccess runs different tests depending on whether the number of CPUs is a power of two, but this makes no difference as to whether it runs successfully.

Has anyone seen this before? Any ideas on how to debug the problem?

Thanks!

PS. Here is my hpccinf.txt:

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
8 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
10000 Ns
1 # of NBs
184 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
4 Ps
4 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
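
For reference, the power-of-two distinction mentioned in the message can be illustrated with a minimal sketch. It is not HPCC's own code: the function names are hypothetical, and it assumes the job is launched with exactly P x Q processes (4 x 4 = 16 with the input file above), which may not match the actual launch.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0


def process_grid_size(p: int, q: int) -> int:
    """Number of processes in a P x Q grid (assumed equal to the MPI job size)."""
    return p * q


# Ps = 4 and Qs = 4 are taken from the hpccinf.txt in the message.
nprocs = process_grid_size(4, 4)
print(nprocs, is_power_of_two(nprocs))
```

With the 4 x 4 grid this falls into the power-of-two case; a grid such as 3 x 4 would fall into the other one.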
