[Beowulf] Execution time measurements

Daniel.Kidger at bull.co.uk Daniel.Kidger at bull.co.uk
Fri Jun 24 08:52:57 PDT 2011


I still think that there could be a NUMA issue here.

With no NUMA binding:
 - In the one-process case, the process can migrate between cores on both 
sockets. If its memory is on the first socket, it will run a little slower 
when scheduled on the second socket.
 - With two processes on a node, the first may be inhibited from moving to 
the other socket because there is already a process there consuming CPU, 
and vice versa. Hence both will always run with local memory.
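
One way to test this idea is to watch which CPU the job last ran on. The 
following is only a minimal sketch and was not part of the original thread. 
It assumes a Linux /proc filesystem, and the helper name last_cpu_from_stat 
and the variable JOB_PID are made up for illustration.

```shell
# Hypothetical helper: print the CPU a process last ran on, given the
# contents of /proc/<pid>/stat on stdin. That value is field 39
# ("processor") as documented in proc(5). The command name (field 2) may
# contain spaces, so everything up to the closing parenthesis is stripped
# first; field 39 is then the 37th remaining field.
last_cpu_from_stat() {
    sed 's/^.*) //' | awk '{ print $37 }'
}

# Example use (not executed here; JOB_PID is a placeholder):
#   while sleep 5; do last_cpu_from_stat < /proc/"$JOB_PID"/stat; done
# Which CPUs belong to which NUMA node can be read from `numactl --hardware`.
```

If the reported CPU number hops between the two sockets during a single-job 
run but stays put when two jobs run, that would be consistent with the 
explanation above.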


From:   "David Mathog" <mathog at caltech.edu>
To:     beowulf at beowulf.org
Date:   24/05/2011 19:27
Subject:        Re: [Beowulf] Execution time measurements
Sent by:        beowulf-bounces at beowulf.org

Another message from Mikhail Kuzminsky, who for some reason or other 
cannot currently post directly to the list:


First of all, I should mention that the effect is observed only for
Opteron 2350/OpenSuSE 10.3.
Execution of the same job with the same binaries on Nehalem E5520/OpenSuSE
11.1 gives the same time for 1
and 2 simultaneously running jobs.

Mon, 23 May 2011 12:32:33 -0700 message from "David Mathog"
<mathog at caltech.edu>:
> Mon, 23 May 2011 09:40:13 -0700 message from "David Mathog"
> <mathog at caltech.edu>:
> > > On Fri, May 20, 2011 at 02:26:31PM -0400, Mark Hahn forwarded a
> > > > When I run 2 identical examples of the same batch job
> > > > simultaneously, execution time of *each* job is
> > > > LOWER than for single job run !

> I also thought about CPU frequency variations, but I think that empty
> output from
> lsmod|grep freq
> is enough to show that the CPU frequency is fixed.

> Regarding the frequencies, better to use
> cat /proc/cpuinfo | grep MHz

I looked at cpuinfo, but only manually, a few times (i.e. I did not run
any script that checks the CPU frequencies periodically).
All the core frequencies were fixed.
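
A periodic check of this kind could be scripted roughly as follows. This is 
only a sketch, not something that was run in the thread. It assumes an x86 
Linux /proc/cpuinfo with "cpu MHz" lines, and the function name mhz_summary 
is made up for illustration.

```shell
# Hypothetical helper: read /proc/cpuinfo-style text on stdin and print
# the number of "cpu MHz" lines found plus the lowest and highest value.
mhz_summary() {
    awk -F: '
        /^cpu MHz/ {
            v = $2 + 0                      # numeric value after the colon
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END {
            if (n) printf "%d cores min=%.0f max=%.0f\n", n, min, max
        }'
}

# Example use (not executed here): sample every 10 seconds while the job runs.
#   while sleep 10; do date +%T; mhz_summary < /proc/cpuinfo; done
```

If min and max stay equal for the whole run, the frequencies really were 
fixed; a manual look a few times could miss short-lived changes.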

> Did you verify that the results for each of the two simultaneous runs
> are both correct? 
Yes, the results are the same. I also looked at the number of iterations, etc.
But I'll check the outputs again.

>Ideally, tweak some parameter so they are slightly
> different from each other.

But I don't understand: if I slightly change some of the input parameters,
what would that tell us?

> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech

Fri, 20 May 2011 20:11:15 -0400 message from Serguei Patchkovskii
<serguei.patchkovskii at gmail.com>:
>    Suse 10.3 is quite old; it uses a kernel which is less than perfect
> at scheduling jobs and allocating resources for NUMA systems. Try
> running your test job using:
>    numactl --cpunodebind=0 --membind=0 g98

numactl with everything bound to node 1 gives a "big" execution time (1 day
4 hours; 2 simultaneous jobs run faster). Forcing different nodes
for CPU and memory makes the execution time even higher (+1 h). Therefore
the observed effect doesn't look like a result of NUMA allocations :-(



My point about the two different parameter sets on the jobs was to
determine if the two were truly independent, or if they might be
interacting with each other through checkpoint files or shared memory,
or the like.
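
One rough way to look for that kind of interaction, offered only as a 
sketch under the assumption of a Linux /proc filesystem, is to compare the 
files the two running processes have open. The function names open_files 
and common_lines are made up for illustration.

```shell
# Hypothetical helper: list the distinct targets of a process's open file
# descriptors, sorted in the C locale so that comm can compare the lists.
# Takes a PID; prints nothing if the process cannot be inspected.
open_files() {
    for fd in /proc/"$1"/fd/*; do
        readlink "$fd"
    done 2>/dev/null | LC_ALL=C sort -u
}

# Hypothetical helper: print the lines common to two sorted list files.
common_lines() {
    LC_ALL=C comm -12 "$1" "$2"
}

# Example use (not executed here; PID1 and PID2 are placeholders):
#   open_files "$PID1" > /tmp/job1.files
#   open_files "$PID2" > /tmp/job2.files
#   common_lines /tmp/job1.files /tmp/job2.files
# System V shared memory segments can be listed separately with `ipcs -m`.
```

Shared libraries, the terminal and /dev/null are expected to show up in both 
lists; a scratch or checkpoint file appearing in both would be the thing to 
look into.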


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
