[Beowulf] Odd AMD quad core SuperMicro power off issues
Chris Samuel csamuel at vpac.org
Thu Jul 2 22:17:17 PDT 2009
----- "Chris Samuel" <csamuel at vpac.org> wrote:

In April I wrote:

> Well we've been gradually replacing the Barcelona chips
> with Shanghai (same clockspeed) and we are yet to see a
> power off on a Shanghai node!

Since I wrote that we have seen far fewer power offs with the
2.3GHz Shanghai (2376, a 75W part), *but* we have some nodes
upgraded to the ULP 2.4GHz Shanghai (2379 HE, a 55W part)
which do exhibit this issue very regularly! :-(

Gaussian is still a classic for doing this, but we've also
been able to trigger it with VASP, Amber and (far less
frequently) InterProScan.

The compute nodes are based on SuperMicro H8DM8-2 boards
with 32GB of ECC RAM. The boxes are running CentOS 5.3 with
mainline kernels (currently 2.6.28.9, though we have
demonstrated it with 2.6.30-rc6 and the EDAC patches, which
catch nothing before it dies). We've seen the same behaviour
with the standard CentOS kernels too.

This is driving us up the wall! Is nobody else seeing this?

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
