SMP robustness
Pierre Brua brua at paralline.com
Wed Jul 26 15:21:03 PDT 2000
Someone wrote:
> This is not the case with the Origin 2000. The O2000 will shutdown
> problem processors and continue running.
> There are a number of Origin Issues (pricing, performance, etc) but
> reliability and failover have not been problems in my three years of
> Origin Admin.

Maybe we are not talking about the same computers; you are talking about
the Origin2000 from SGI, aren't you?

If you have a small one you may not run into many problems like that,
but with big configurations it is another matter...

Let me give 3 examples:

* power supply ("alim") block failure
  There is one of these for each 8-processor block. With the config I
  had, it failed approximately once per year per 8-CPU block. In that
  case the whole supercomputer goes down.

* SCSI controller lock
  Some problems with backup devices can lock up the O2K SCSI controller
  badly. In that case only a reboot can correct it. A reboot of just the
  processor controlling the SCSI bus the device is on? No way: a reboot
  of the whole supercomputer. You can even read that in black and white
  in some O2K docs.

* processor bug
  If a processor is buggy/burned, you have to shut down your entire
  supercomputer to replace it.

Beowulf systems, viewed as a lot of little independent hardware pieces,
are considerably more robust from that point of view. Like a bunch of
ants.

That's not to say O2Ks are not good supercomputers, of course, but their
integrated hardware has some unexpected "features" in that area.

Pierre
--
PARALLINE                        Pierre BRUA
Parallelism & Linux Solutions
71, av. des Vosges               Phone: +33 388 141 740
F-67000 STRASBOURG               Fax:   +33 388 141 741
mailto:brua at paralline.com
http://www.paralline.com
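
As a rough illustration of the power supply figure quoted above (about one failure per year per 8-CPU block, each one taking the whole machine down), the expected number of whole-machine outages grows with the number of blocks. The sketch below is only back-of-the-envelope arithmetic: the 128-CPU size, the function names, and the assumption that failures are independent and occur at a constant rate are illustrative and do not come from the post.

```python
def whole_machine_outages_per_year(cpus, cpus_per_block=8,
                                   failures_per_block_per_year=1.0):
    """Expected outages per year when any block failure halts everything.

    Assumes independent failures at a constant rate (an assumption made
    here for illustration, not a measured property of any machine).
    """
    blocks = cpus // cpus_per_block
    return blocks * failures_per_block_per_year


def capacity_lost_per_failure(total_cpus, cpus_taken_down):
    """Fraction of the system unavailable during a single failure."""
    return cpus_taken_down / total_cpus


# Hypothetical 128-CPU configuration, using the rate quoted in the post.
cpus = 128
outages = whole_machine_outages_per_year(cpus)

# Tightly integrated machine: one block failure takes down all CPUs.
integrated_loss = capacity_lost_per_failure(cpus, cpus)

# Cluster of independent pieces: the same failure removes only 8 CPUs.
cluster_loss = capacity_lost_per_failure(cpus, 8)

print(outages, integrated_loss, cluster_loss)
```

Under these assumptions the failure count is the same in both cases; what differs is how much of the system each failure removes, which is the point of the "bunch of ants" comparison.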
More information about the Beowulf mailing list
