[Beowulf] Execution time measurements
Daniel.Kidger at bull.co.uk
Fri Jun 24 08:52:57 PDT 2011
Mikhail,
I still think that there could be a NUMA issue here.
With no NUMA binding:
- the one-process case can migrate between cores on both sockets - if
its memory is on the first socket, then it will run a little slower when
scheduled on the second socket.
- with two processes on a node, the first may be inhibited from moving to
the other socket because there is already a process there consuming CPU,
and vice versa.
Hence both will always run with local memory.
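One way to test this hypothesis is to watch where the job actually runs and where its pages live. The following is a minimal sketch, assuming a Linux /proc filesystem with NUMA support; the helper names last_cpu and numa_pages are made up for illustration, and the optional file arguments exist only so the parsing can be tried on saved copies.

```shell
# Print the CPU number a process last ran on.
# This is field 39 ("processor") of /proc/PID/stat. The command name in
# field 2 may contain spaces, so drop everything up to the last ") " first;
# the state (field 3) then becomes $1 and field 39 becomes $37.
# Usage: last_cpu PID [STAT_FILE]
last_cpu() {
    sed 's/^.*) //' "${2:-/proc/$1/stat}" | awk '{ print $37 }'
}

# Sum the pages per NUMA node for a process.
# Each mapping line in /proc/PID/numa_maps carries N<node>=<pages> tokens.
# Usage: numa_pages PID [NUMA_MAPS_FILE]
numa_pages() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^N[0-9]+=[0-9]+$/) {
                split($i, kv, "=")
                pages[kv[1]] += kv[2]
            }
    }
    END { for (node in pages) print node, pages[node] }' \
        "${2:-/proc/$1/numa_maps}" | sort
}
```

Calling last_cpu and numa_pages on the job's PID every few seconds during a single-job run would show whether it drifts to a core on the socket that does not hold most of its pages.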
Daniel
From: "David Mathog" <mathog at caltech.edu>
To: beowulf at beowulf.org
Date: 24/05/2011 19:27
Subject: Re: [Beowulf] Execution time measurements
Sent by: beowulf-bounces at beowulf.org
Another message from Mikhail Kuzminsky, who for some reason or other
cannot currently post directly to the list:
BEGIN FORWARD
1st of all, I should mention that the effect is observed only for
Opteron 2350/OpenSuSE 10.3.
Execution of the same job w/the same binaries on Nehalem E5520/OpenSuSE
11.1 gives the same time for 1
and 2 simultaneously running jobs.
Mon, 23 May 2011 12:32:33 -0700 message from "David Mathog"
<mathog at caltech.edu>:
> Mon, 23 May 2011 09:40:13 -0700 message from "David Mathog"
> <mathog at caltech.edu>:
> > > On Fri, May 20, 2011 at 02:26:31PM -0400, Mark Hahn forwarded a message:
> > > > When I run 2 identical examples of the same batch job
> > > > simultaneously, execution time of *each* job is
> > > > LOWER than for single job run !
> I thought also about cpus frequency variations, but I think that null
> output of
> lsmod|grep freq
> is enough for fixed CPU frequency.
>
> END FORWARD
> Regarding the frequencies, better to use
> cat /proc/cpuinfo | grep MHz
I looked at cpuinfo, but only manually, a few times (i.e. I didn't run
any script that periodically checks the CPU frequencies).
All the core frequencies were fixed.
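A periodic check could look like the sketch below. It assumes an x86 Linux system where /proc/cpuinfo reports a "cpu MHz" line per core; the function names, sample count, and interval are illustrative only.

```shell
# Print the current frequency of every core, one value per line.
# Usage: cpu_mhz [CPUINFO_FILE]
cpu_mhz() {
    awk -F: '/^cpu MHz/ { gsub(/[ \t]/, "", $2); print $2 }' "${1:-/proc/cpuinfo}"
}

# Log a timestamped line with all core frequencies COUNT times,
# sleeping INTERVAL seconds between samples.
# Usage: sample_mhz COUNT INTERVAL [CPUINFO_FILE]
sample_mhz() {
    n=0
    while [ "$n" -lt "$1" ]; do
        echo "$(date +%T) $(cpu_mhz "${3:-/proc/cpuinfo}" | tr '\n' ' ')"
        n=$((n + 1))
        if [ "$n" -lt "$1" ]; then
            sleep "$2"
        fi
    done
}
```

Running "sample_mhz 720 5 > mhz.log &" alongside the job would record one sample every 5 seconds for about an hour. Note that an empty "lsmod|grep freq" does not rule out frequency scaling if the cpufreq driver is built into the kernel; whether /sys/devices/system/cpu/cpu0/cpufreq exists is another thing worth checking.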
> Did you verify that the results for each of the two simultaneous runs
> are both correct?
Yes, the results are the same. I also looked at the number of iterations etc.
But I'll check the outputs again.
>Ideally, tweak some parameter so they are slightly
> different from each other.
But I don't understand: if I slightly change some of the input parameters,
what would that show?
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
Fri, 20 May 2011 20:11:15 -0400 message from Serguei Patchkovskii
<serguei.patchkovskii at gmail.com>:
> Suse 10.3 is quite old; it uses a kernel which is less than perfect
> at scheduling jobs and allocating resources for NUMA systems. Try
> running your test job using:
>
> numactl --cpunodebind=0 --membind=0 g98
numactl w/all things bound to node 1 gives "big" execution time (1 day
4 hours; 2 simultaneous jobs run faster); when forcing different nodes
for cpu and memory, execution time is even higher (+1 h). Therefore the
effect observed doesn't look like a result of NUMA allocations :-(
Mikhail
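The bindings compared above can be scripted so that each case is launched the same way. This is only a sketch: run_bound and the DRY_RUN switch are hypothetical helpers, and g98 stands for however the job is normally started. Only the numactl options --cpunodebind and --membind quoted earlier in the thread are used.

```shell
# Run a command with its CPUs and memory bound to the given NUMA nodes.
# With DRY_RUN set to a non-empty value, only print the command line.
# Usage: run_bound CPU_NODE MEM_NODE COMMAND [ARGS...]
run_bound() {
    cpu_node=$1
    mem_node=$2
    shift 2
    set -- numactl --cpunodebind="$cpu_node" --membind="$mem_node" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, in bash, "time run_bound 0 0 g98" and "time run_bound 0 1 g98" would time the local-memory and remote-memory cases, and "run_bound 0 0 g98 & run_bound 1 1 g98 & wait" would start two jobs pinned to different sockets.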
END FORWARD
My point about the two different parameter sets on the jobs was to
determine if the two were truly independent, or if they might be
interacting with each other through checkpoint files, shared memory,
or the like.
Regards,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech