[Beowulf] Clic 2.0 lockup problems

Wed Oct 27 12:16:42 PDT 2004


Thanks for your help.  No, I only tried eth1, because that is the only
external interface.  In other words, I tried to log on from an external
machine.  SSH works fine on the other two interfaces, though.  Thanks again.


> Hi,
>> I just finished installing Clic 2.0 on a cluster of 1 server and 12 nodes.
>> After running the setup_auto_cluster script I got everything installed.
>> I created a "cluster user" and proceeded to test out some of the included
>> MPI sample code.  This ran fine.  I next tried to start this code remotely
>> (through SSH), but when I did this, the server locked up and had to be
>> rebooted.  It actually locked up while connecting via SSH, not when
>> executing the sample MPI code.  Any idea what might cause this?  The
>> server has 3 network interfaces:
>> eth0 - administration
>> eth1 - outside (internet)
>> eth2 - message passing (computing)
>> (The nodes each have 2 interfaces, one for administration and one for
>> message passing)
>> It also seems that when I log on as the "cluster user" or root and try to
>> access an external website (e.g. Google), the server will lock up and need
>> to be rebooted again.
>> Any idea why I'm experiencing these lockups?  Is something configured
>> incorrectly?  Is it a faulty network card?  I was able to access outside
>> websites fine before I ran the setup scripts.
> Did you try to log in to the server on all three interfaces, to check
> whether it's really completely down and not just one interface? - Reuti
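
For anyone repeating the per-interface check suggested above, here is a
minimal sketch in Python.  It only tests whether the SSH port (22) accepts
a TCP connection on each address; it does not perform a real login.  The
function names and the addresses in the usage comment are hypothetical
placeholders, not values from this cluster.

```python
import socket


def port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def check_interfaces(addresses, port=22, timeout=3.0):
    """Map each interface label to True/False for reachability of the port."""
    return {
        label: port_open(addr, port, timeout)
        for label, addr in addresses.items()
    }


# Example usage (addresses are made up; substitute the server's real ones):
#
#   results = check_interfaces({
#       "eth0 (administration)": "10.0.0.1",
#       "eth1 (outside)": "192.0.2.10",
#       "eth2 (message passing)": "10.0.1.1",
#   })
#   for label, ok in results.items():
#       print(label, "reachable" if ok else "NOT reachable")
```

If only one interface is reported unreachable after a "lockup", that would
suggest the problem is limited to that interface or its routing rather than
the whole machine being down.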


