MP S2460 Problem
Robert G. Brown rgb at phy.duke.edu
Wed Feb 26 09:50:46 PST 2003
On Wed, 26 Feb 2003, John Morelle wrote:

> Hi all,
>
> I would like to notice here to everyone about our complaining problems on
> the TYAN MP based : the Tiger S2460 motherboard.
> We bought this mobo since its official launch and we have integrated almost
> more than a hundred pieces of this kind in our beowulf cluster, but we have
> almost changed more than the half of them by the next one : Tiger S2466,
> because of "hang on" problems.
> I saw here and here many people who tells about their technical problems on
> this Tiger MP board.
> And we are trying to taking back all the informations that users could send
> us about the same problem.
> So, please free to mail us briefly your experience.
> Thanks in advance.

You can probably find plenty of them in the list archives. I've
detailed ours numerous times.

a) They persist in crashing (even to this day) when hammered by a
memory-intensive application. We have some 23 2460's, and probably 2 or
3 times that many 2466's. When the owners run the computation(s) on
them that they got the workstations for, they crash, often within three
days to a week.

b) Even that is maddeningly inconsistent. The same job run with the
same parameters might crash in one day one time, four days another time,
and not crash at all on a different 2460 -- until the time it does.

c) We had incredibly horrible problems initially getting them to work
with off-the-shelf risers. Some cards would work, in some slots,
sometimes. Some of the same cards that failed would work in some of the
slots on the motherboard if plugged directly in (no riser at all).
We're not talking odd cards, either -- things like 3c905's and
off-the-shelf PCI video.

d) When we finally found risers that would work, and cards that would
work in the risers, we started struggling with the BIOS, which needed
reflashing. For that matter, the 2466's have had plenty of BIOS
problems, some of which are just plain stupid design.

e) Such as the fact that if you flash the BIOS, it resets the serial
console (which doesn't work horribly well, as it requires a keyboard to
be plugged directly in if you want to do all sorts of important things,
but which does work). So if you actually bought 2466's WITHOUT a video
card, expecting to use the serial console, then to reflash the BIOS you
have to disassemble the case, insert a video card, reenter the BIOS,
turn the serial console on again, shut it down and take out the video
card, rerack it, power it up and do whatever via the serial console, and
God help you if you made any sort of mistake or anything failed to
"take", because you then get to do it all over again.

f) Then there is their general sensitivity to heat, power supply,
memory, and the phase of the moon. We replaced all the power supplies
and all the cooling fans once just trying to find a combination that
would stabilize them.

Overall, the 2460's are just plain broken, unstable pieces of shit that
suck systems administration time like a black hole and have cost us
something like 1/3 of the productivity of the cluster in question and
infinite annoyance at the management level. We are finally biting the
bullet and trying to gradually replace them with 2466's (reusing all the
rest of the hardware).

If you can get Tyan to replace them for free, please let us know. God
knows that they should -- they should replace ours as well, and those
belonging to any other poor suckers who bought them.

These systems overall drove us to seriously consider e.g. dual Xeons (at
a fairly similar price) just because they are relatively stable. Alas,
the Xeons don't run my particular problem as well as Athlons...

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
More information about the Beowulf mailing list
