HPL residual check failure

Yoon Jae Ho yoon at bh.kyungpook.ac.kr
Wed Nov 7 23:11:01 PST 2001


I found your e-mail today. 

Your e-mail does not include any information about your system.

If you are using Myrinet rather than 10/100 Ethernet, please check the cables and the Myrinet MPICH version.

If you are using 10/100 Ethernet, then I suspect the failure at matrix size 23,000 is related to RAM.

Please check the RAM size first, and look for physical problems with the RAM in your workstations.
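As a rough aid for the RAM size check, the sketch below estimates how much memory the HPL matrix needs. It assumes 8-byte double-precision elements and an even split of the N x N matrix over the P x Q = 6 x 6 process grid shown in the quoted output. The function names are only illustrative, HPL's extra workspace is ignored, and the number of processes per node is not stated in the original message, so the per-node figure must be scaled accordingly.

```python
# Rough estimate of HPL matrix memory (assumption: 8-byte doubles,
# matrix spread evenly over the P x Q process grid, workspace ignored).

def hpl_matrix_bytes(n, bytes_per_element=8):
    """Total bytes needed to store the N x N coefficient matrix."""
    return n * n * bytes_per_element

def per_process_bytes(n, p, q, bytes_per_element=8):
    """Approximate share of the matrix held by each of the P*Q processes."""
    return hpl_matrix_bytes(n, bytes_per_element) / (p * q)

if __name__ == "__main__":
    # Problem sizes and grid taken from the quoted HPL output.
    for n in (21000, 23000):
        total_gb = hpl_matrix_bytes(n) / 1e9
        share_mb = per_process_bytes(n, 6, 6) / 1e6
        print(f"N={n}: total {total_gb:.3f} GB, per process {share_mb:.1f} MB")
```

If a node runs several processes, multiply the per-process figure by that count and compare it with the physical RAM left after the operating system takes its share.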

Heat may be a factor, but I think a workstation should run without failure at normal temperatures.

Have a nice day.
  

---------------------------------------------------------------------
Yoon Jae Ho
Economist
POSCO Research Institute
 
yoon at bh.kyungpook.ac.kr
jhyoon at mail.posri.re.kr
http://ie.korea.ac.kr/~supercom/  Korea Beowulf Supercomputer
http://members.ud.com/services/teams/team.htm?id=264C68D5-CB71-429F-923D-8614F419065D     Help the people with your PC 
 
Imagination is more important than knowledge.  A. Einstein
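"Imagination is the starting point of creation." Bernard Shaw; "Creation is the result of learning." Dr. Todd Siler (author of the Think Like a Genius program)
"Imagination can be seen as 'the ability to create images,' and training the imagination is, above all, literary training."   Poet Kim Jeong-ran
"Cultivating mathematical ability helps train the logical imagination."  Yoon Jae Ho 2000.4.22
"Imagination grows through concentration, meditation, and prayer." Yoon Jae Ho 2000.4.29
"Realizing imagination requires faith and constant effort." Yoon Jae Ho 2000.4.24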
http://www.kichun.co.kr   2001.1.6
http://www.c3tv.com    2001.1.10 

------------------------------------------------------------------------

----- Original Message ----- 
From: 연규정 <kjyoun at netstech.com>
To: <beowulf at beowulf.org>
Sent: Monday, September 03, 2001 9:22 PM
Subject: HPL residual check failure


> Hi
> When I ran the HPL benchmark with a large matrix (N bigger than 20,000) on many Linux servers (more than 20), I sometimes got a residual check error, as shown in the attached output.
> When I got the residual check error, I turned off my Linux servers for several hours and then tried again. Usually it then worked, and I don't know why.
> I suspect heat. But is it really a heat problem?
> Has anybody experienced a similar problem, or does anybody know the reason?
> Please help me.
> 
> Thanks in advance! 
> 
> Keaton
> 
> 
> HPL result files------------------------------------------------------------
> 
> ============================================================================
> T/V                N    NB     P     Q               Time             Gflops
> ----------------------------------------------------------------------------
> W11R2C4        21000   200     6     6             702.80          8.786e+00
> ----------------------------------------------------------------------------
> ||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0272768 ...... PASSED
> ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0140749 ...... PASSED
> ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0026585 ...... PASSED
> ============================================================================
> T/V                N    NB     P     Q               Time             Gflops
> ----------------------------------------------------------------------------
> W11R2C4        23000   200     6     6             866.35          9.364e+00
> ----------------------------------------------------------------------------
> ||Ax-b||_oo / ( eps * ||A||_1  * N        ) =     3255.3898794 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =     7833.1904572 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =     1364.3123654 ...... FAILED
> ||Ax-b||_oo  . . . . . . . . . . . . . . . . . =           0.000049
> ||A||_oo . . . . . . . . . . . . . . . . . . . =        5827.145943
> ||A||_1  . . . . . . . . . . . . . . . . . . . =        5836.795619
> ||x||_oo . . . . . . . . . . . . . . . . . . . =           2.390054
> 
> 
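To see how far the failed run is from passing, the first scaled residual in the quoted output can be recomputed from the printed norms. This is only a sketch under stated assumptions: eps is taken to be 2^-53 (the double-precision unit roundoff), which reproduces the printed value to within the rounding of the printed residual, and the pass threshold of 16.0 is the value commonly set in HPL.dat, not something stated in the message.

```python
# Recompute ||Ax-b||_oo / (eps * ||A||_1 * N) from the printed norms.
EPS = 2.0 ** -53     # assumption: double-precision unit roundoff
THRESHOLD = 16.0     # assumption: threshold commonly set in HPL.dat

def scaled_residual(resid_inf, a_norm1, n, eps=EPS):
    """First HPL residual check: ||Ax-b||_oo / (eps * ||A||_1 * N)."""
    return resid_inf / (eps * a_norm1 * n)

if __name__ == "__main__":
    # Values copied from the failed N = 23000 run in the quoted output.
    value = scaled_residual(0.000049, 5836.795619, 23000)
    status = "PASSED" if value < THRESHOLD else "FAILED"
    print(f"scaled residual = {value:.1f} ... {status}")
```

The result is in the low thousands, in line with the printed 3255.39 given that the raw residual is printed to only two significant digits, while the passing N = 21000 run reports 0.027 for the same check.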


