[Beowulf] Problems with Dell M620 and CPU power throttling
samuel at unimelb.edu.au
Sun Sep 1 20:49:09 PDT 2013
On 30/08/13 23:03, Bill Wichser wrote:
> Since January, when we installed an M620 Sandybridge cluster from
> Dell, we have had issues with power and performance to compute
> nodes. Dell apparently continues to look into the problem but the
> usual responses have provided no solution. Firmware, BIOS, OS
> updates all are fruitless.
One question: have you seen either the kernel or the BMC reporting
thermal throttling? For instance, dmesg should show you something like:
CPU0: Core temperature above threshold, cpu clock throttled (total events = 545939)
CPU0: Core temperature/speed normal
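If you want to check a whole rack of nodes rather than eyeball dmesg on
each one, something along these lines might help. It is only a rough
sketch: the function name count_throttle_events is made up for
illustration, and the pattern assumes the kernel message wording matches
the example above (with "Package" allowed in place of "Core").

```python
import re
from collections import defaultdict

# Assumed message format, taken from the dmesg example in this thread.
THROTTLE_RE = re.compile(
    r"CPU(\d+): (?:Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (\d+)\)"
)


def count_throttle_events(log_text):
    """Return {cpu_number: highest 'total events' value seen} for dmesg-style text."""
    # Collapse all whitespace so a message wrapped across two lines still matches.
    flat = " ".join(log_text.split())
    totals = defaultdict(int)
    for cpu, events in THROTTLE_RE.findall(flat):
        cpu_number = int(cpu)
        totals[cpu_number] = max(totals[cpu_number], int(events))
    return dict(totals)
```

Feed it the captured output of dmesg from each node; an empty result
means no throttling messages were found in that text.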
If you're not, then there is one other possibility that you may like to
test: telling the kernel not to turn on all power-saving modes
automatically, as that introduces a heap of latency (and potentially
other issues).
We pass through:
on our SandyBridge nodes for just that reason.
If you don't then the kernel will say "Oh, this is an Intel CPU, I
know this!" (to paraphrase Jurassic Park) and enable every power
saving feature it can find, regardless of what your BIOS/UEFI is set to.
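To see what the kernel has actually enabled on a node, you can look at
the cpuidle entries in sysfs. The following is a minimal sketch, assuming
the usual Linux layout under /sys/devices/system/cpu (cpuidle/current_driver
and cpuN/cpuidle/stateM/name); the helper name read_idle_setup is
hypothetical.

```python
from pathlib import Path


def read_idle_setup(sysfs_cpu="/sys/devices/system/cpu", cpu=0):
    """Return (cpuidle driver name or None, list of idle state names) for one CPU."""
    root = Path(sysfs_cpu)

    # Name of the active idle driver, if the kernel exposes one.
    driver_file = root / "cpuidle" / "current_driver"
    driver = driver_file.read_text().strip() if driver_file.exists() else None

    # Idle states are numbered directories: state0, state1, ...
    state_dirs = sorted(
        (root / f"cpu{cpu}" / "cpuidle").glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = [(d / "name").read_text().strip() for d in state_dirs]
    return driver, states
```

If the deeper states still show up after you have changed the boot
options, the kernel is most likely still overriding the BIOS/UEFI
settings.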
Best of luck!
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545