Mysterious kernel hangs
Felix Rauch rauch at inf.ethz.ch
Thu Mar 15 05:34:22 PST 2001
We recently bought a new 16-node cluster with dual 1 GHz Pentium III nodes, but the machines mysteriously freeze :-(

The nodes have STL2 boards (version A28808-301), onboard Adaptec SCSI controllers (7899P), onboard Intel Fast Ethernet adapters (82557 [Ethernet Pro 100]) and additional Packet Engines Hamachi GNIC-II Gigabit Ethernet cards.

We tried kernels 2.2.x, 2.4.1 and now even 2.4.2-ac20, but the problem appears to be the same with all of them: when we run experiments that use the network intensively, any one of the machines will simply freeze after a few hours. The frozen machine does not respond to anything, and so far we have not been able to see any log entries related to the freeze on virtual console 10 :-(

We have now switched on all the "Kernel Hacking" options in the kernel configuration (especially the logging) and will try again; hopefully we will at least see some log output.

The freezes also happen when we run non-network-intensive jobs on the machines (e.g. SETI@home), but clearly less often.

Does anyone have any idea what could be going wrong, or what we could try in order to find the cause of the problem?

Regards,
Felix
--
Felix Rauch | Email: rauch at inf.ethz.ch
Institute for Computer Systems | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H18 | Phone: ++41 1 632 7489
CH - 8092 Zuerich / Switzerland | Fax: ++41 1 632 1307
More information about the Beowulf mailing list
