[Beowulf] bizarre scaling behavior on a Nehalem
Rahul Nabar
rpnabar at gmail.com
Wed Aug 12 11:00:41 PDT 2009
On Wed, Aug 12, 2009 at 11:32 AM, Craig Tierney<Craig.Tierney at noaa.gov> wrote:
> What do you mean normally? I am running Centos 5.3 with 2.6.18-128.2.1
> right now on a 448 node Nehalem cluster. I am so far happy with how things work.
> The original Centos 5.3 kernel, 2.6.18-128.1.10, had bugs in Nehalem support
> where nodes would randomly start running slowly. Upgrading the kernel
> fixed that. But that performance problem was either all or none; I don't recall
> it exhibiting itself in the way that Rahul described.
>
I was trying another angle: playing with the power profiles. I just
installed cpufreq-utils via yum and tried to see which profile was
loaded:
cpufreq-info
cpufrequtils 005: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to cpufreq at vger.kernel.org, please.
analyzing CPU 0:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 1:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 2:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 3:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 4:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 5:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 6:
no or unknown cpufreq driver is active on this CPU
analyzing CPU 7:
no or unknown cpufreq driver is active on this CPU
Is the lack of a cpufreq driver indicative of a deeper fault, or is
it local to this one issue? This could be a clue or a red
herring; I just thought that I ought to post it.
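
One way to cross-check the cpufreq-info output is to read the driver name straight from sysfs. The following is a minimal sketch, assuming a Linux node that exposes /sys/devices/system/cpu/cpuN/cpufreq/scaling_driver when a cpufreq driver is loaded, and a Python 3 interpreter (newer than what CentOS 5 ships by default). The function name cpufreq_drivers and the sysfs_root parameter are illustrative only and are not part of cpufrequtils.

```python
import glob
import os
import re


def cpufreq_drivers(sysfs_root="/sys/devices/system/cpu"):
    """Map each cpuN directory to its cpufreq driver name, or None if absent."""
    drivers = {}
    # Match cpu0, cpu1, ... and skip siblings such as cpuidle or cpufreq.
    cpu_dirs = [
        d for d in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*"))
        if re.fullmatch(r"cpu\d+", os.path.basename(d))
    ]
    # Sort numerically so cpu10 comes after cpu9.
    cpu_dirs.sort(key=lambda d: int(os.path.basename(d)[3:]))
    for cpu_dir in cpu_dirs:
        name = os.path.basename(cpu_dir)
        driver_file = os.path.join(cpu_dir, "cpufreq", "scaling_driver")
        try:
            with open(driver_file) as fh:
                drivers[name] = fh.read().strip()
        except OSError:
            # No cpufreq directory or unreadable file: no driver is bound.
            drivers[name] = None
    return drivers


if __name__ == "__main__":
    for cpu, driver in cpufreq_drivers().items():
        print(cpu, driver if driver else "no cpufreq driver loaded")
```

If every CPU reports no driver, that agrees with what cpufreq-info printed and only shows that the kernel is not managing frequency scaling on the node; it does not by itself say whether that is related to the scaling behavior.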
--
Rahul