Issues with 2466 based cluster

Abhishek SINHA aby_sinha at
Fri Oct 18 17:58:54 PDT 2002

Hello List members

I am monitoring an 8-node dual AMD Athlon cluster
based on Tyan 2466 boards with 4GB RAM.  I would add
to Prof. Brown's experience that it was hard to make
them work at the first go, but once we got them
stabilised we have seen good performance and a lot of
stability from these systems.  Now the problem!

One of our systems failed.  On the screen it showed
that the kernel had panicked.  I tried rebooting the
system and sometimes it would boot and sometimes it
would not.  One time when the system was up I checked
the message log and could find no errors listed.  When
we put a load on the system it would panic again.  Of 
course putting a load on the system causes the CPU to
run hotter.

Since I could not find any errors in the log and got
no beeps when booting, I decided to pull the
cover off and take a look.  This was when I noticed
the fan was turning slowly and making a lot of
noise.  Well, it was a kind of hummmmmmm sound and not
exactly a loud breaking or weird noise.

At this point I contacted the vendor and asked for a
replacement fan.  While waiting for the new fan I
decided to remove the failing fan and oil it.  Oiling
the fan seemed to fix it, at least temporarily.
In fact the fan I oiled turns better than the other
CPU fan (it spins 3 or 4 seconds longer than the other
fan when the power is turned off)!!

So, I put the cover back and proceeded to power up the
system.  The system booted fine.  When I put a load on
the system (the same one that crashed the application)
the job failed with strange errors.  Sometimes the
system panicked.  I was beginning to think one of the
CPUs was damaged.

When I received the replacement CPU fans from the
vendor, I opened the cabinet to replace the failing
fan.  I noticed that the other CPU fan was barely
turning.  The one I had oiled was spinning away.  I
replaced both fans and they are working correctly.
The system is running fine.  I stressed it today and
it passed with flying colors.

We are using the fans that come with the AMD
processors.  They are really big, with nice blue fans
on top.  They are the same fans the system came with,
and the same kind I replaced.  I am now wondering about
the cause and effect of this.  Does it really make
sense to replace all the fans?  I know AMDs are known
to run hot, and if we don't do anything I know we are
going to burn out the processors.
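
Rather than judging fan speed by eye, one option is to poll
the fan tachometer readings on each node.  What follows is
only a rough sketch, assuming a Linux kernel with a sensor
driver loaded that exposes fan speeds as fan*_input files
under /sys/class/hwmon.  The 2000 RPM threshold and the
function names are made up for illustration; the right
cutoff depends on the fan model.

```python
import glob
import os

HWMON_ROOT = "/sys/class/hwmon"   # standard Linux hwmon sysfs location
MIN_RPM = 2000                    # hypothetical threshold; tune per fan model


def read_fan_speeds(root=HWMON_ROOT):
    """Return {path: rpm} for every fan*_input file found under root."""
    speeds = {}
    pattern = os.path.join(root, "hwmon*", "fan*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                speeds[path] = int(f.read().strip())
        except (OSError, ValueError):
            # Sensor unreadable or not reporting a number; skip it.
            continue
    return speeds


def slow_fans(speeds, min_rpm=MIN_RPM):
    """Return the sorted list of sensor paths reporting fewer than min_rpm."""
    return sorted(p for p, rpm in speeds.items() if rpm < min_rpm)


if __name__ == "__main__":
    speeds = read_fan_speeds()
    if not speeds:
        print("no fan sensors found (is a hwmon driver loaded?)")
    for path in slow_fans(speeds):
        print(f"WARNING: {path} reports {speeds[path]} RPM")
```

Run periodically on every node (from cron, say), something
like this could flag a dying fan before the CPU starts
crashing jobs under load.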

I need to understand if I can do anything about it.  I
looked at some of the other systems and the fans seem
to run slow, at least when I watch them continuously :)
While I wait for all of your expert opinions on the
issue I will go and oil the fans....
They call it system administration :)


