[Beowulf] precise synchronization of system clocks

Robert G. Brown rgb at phy.duke.edu
Wed Oct 1 07:18:20 PDT 2008


On Tue, 30 Sep 2008, Nifty niftyompi Mitch wrote:

> Also while focusing on network/transport in this discussion none of us
> made a comment on rotational latency as a source of uncertainty for the
> kernel state.  If we had the ability to synchronize the systems exactly,
> starting a process would still lag for want of rotational/seek disk
> latency in beowulfery.

Sure, but starting processes is presumed to occur once, after which (in
an efficient computation) the job proceeds for a long time.  All the
parallelized tasks effectively arrive at the same point in the
computation at the first barrier AFTER the startup is finished.  If the
communications involve giving the kernel a slice at (say) the end, when
the barrier is released for all nodes at "the same time", then the
distributed kernels CAN have roughly the same state IF one suppresses
enough of the "random" sources of state-noise -- asynchronous demands
for the kernel's attention.  To the extent that the kernel can
accumulate all of its regular housekeeping and do it at an elective
time, then if one gets the system clocks together within a usec and the
kernel does its elective work on clock ticks, that work will end up
being done (mostly) synchronously across the nodes.
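
To make that concrete, here's a minimal sketch (purely illustrative, not
anything anybody necessarily runs) of a node deferring its elective work
to the next shared 1 ms wall-clock boundary.  It assumes the clocks are
already synchronized to within a usec by NTP/PTP or the like, and
do_housekeeping() is just a hypothetical stand-in for whatever chores
have been accumulated:

#define _POSIX_C_SOURCE 200809L
#include <time.h>

#define TICK_NS 1000000LL                 /* a 1 ms "tick", in ns */

/* Sleep until the next multiple of TICK_NS on the (synchronized)
 * wall clock, so every node wakes at roughly the same instant. */
static void wait_for_next_tick(void)
{
    struct timespec now, next;
    clock_gettime(CLOCK_REALTIME, &now);
    long long ns = (long long)now.tv_sec * 1000000000LL + now.tv_nsec;
    long long next_ns = (ns / TICK_NS + 1) * TICK_NS;    /* round up */
    next.tv_sec  = next_ns / 1000000000LL;
    next.tv_nsec = next_ns % 1000000000LL;
    clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
}

static void do_housekeeping(void) { /* hypothetical elective work */ }

int main(void)
{
    for (;;) {
        wait_for_next_tick();
        do_housekeeping();
    }
}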

Truthfully, all one is trying to do is to generalize your parallel
process to have a double synchronous barrier, with one phase of the
computation being "kernel housekeeping".

    Compute (in parallel) -> barrier (IPCs) -> Kernel (in parallel) ->
barrier -> Compute -> barrier -> Kernel -> barrier ... ad infinitum.
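
In code the pattern is nothing but barriers around the two phases.  A
sketch, assuming MPI supplies the barriers (compute_chunk() and
kernel_housekeeping() are hypothetical stand-ins for "my share of the
work" and "the deferred elective chores"):

#include <mpi.h>

static void compute_chunk(void)       { /* my share of the numerics */ }
static void kernel_housekeeping(void) { /* deferred elective work   */ }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    for (int step = 0; step < 1000000; step++) {  /* "ad infinitum"  */
        compute_chunk();                /* Compute (in parallel)     */
        MPI_Barrier(MPI_COMM_WORLD);    /* barrier (IPCs)            */
        kernel_housekeeping();          /* Kernel (in parallel)      */
        MPI_Barrier(MPI_COMM_WORLD);    /* barrier                   */
    }
    MPI_Finalize();
    return 0;
}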

If the work done by the kernel is fairly tightly bounded -- it
predictably completes in (say) 100 usec (which is a LOT of time, far
more than one almost ever sees) -- one STILL has 900 usec per tick to
work on the compute task.  If the kernel (more reasonably) completes in
1-10 usec, your cluster should have a 99+% duty cycle while avoiding
the "noise" that desynchronizes everything.
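
(Spelled out, with the 1000 usec tick that the 900 usec figure assumes:

    duty cycle = (T_tick - T_kernel) / T_tick
               = (1000 - 100)/1000 = 90%   for the generous 100 usec bound
               = (1000 -  10)/1000 = 99%   for the 1-10 usec case.)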

>
> Shared memory machines and transports will behave differently.
>
> The very high accuracy and high precision clock synchronization is a very real
> problem for some data gathering systems.  Once the data is gathered the
> computation should be less sensitive.  These are different problems and
> might be addressed by the data sampling devices.
>
> Synchronization brings problems.... for example a well synchronized campus
> can hammer the yp server and file servers when cron triggers the same actions on
> 5000+ systems...  I try never to fetchmail at the hour, half hour...
>
> I suspect that some system cron tasks should no longer run from cron.   Common
> housekeeping tasks necessary for system health should be run via the batch system
> in a way that is fashionably late enough to not hammer site services.

Absolutely.  In fact, you'd want the nodes to be isolated and not
running ANY of this stuff, I'd guess.  You'd want the nodes to have
quiescent, non-demanding hardware (except for devices doing the bidding
of the running parallel process) so that nothing "random" needed to be
done that couldn't be saved for the kernel slices.

> One site service of interest is AC power.   A modern processor sitting
> in an idle state that then starts a well optimized loop will jump from
> a couple of watts to 100 watts in as many clocks as the set of pipelines
> is deep behind the instruction decode and instruction cache fill.  A 1000
> processor (4000 core) cluster might jump from 4000 watts to 100000 watts
> in the blink of an eye (err, did the lights blink?).  Buffer that dI/dt
> through the PS and it is less, but still interesting on the mains, which
> are synchronized.

Interesting.  I have never seen the lights blink, although I don't run
synchronous computations.  One wonders if the power supply capacitors
(which should be quite large, I would think) don't soak up the
transient, though, even on very large clusters.  Also, I think that the
power differential is smaller than you are allowing for -- I don't think
most idle processors draw "no" power...
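
Back of the envelope on your numbers, just for scale (taking a nominal
208 V feed purely as an assumption, and ignoring power factor and how
the load splits across phases): a 4 kW -> 100 kW step is about 96 kW,
i.e. a current step of roughly dI = dP/V ~ 96000/208 ~ 460 A arriving
in microseconds if nothing buffers it.  Whether the supply capacitors
and the upstream transformer soak that up without a visible flicker is
exactly the question.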

    rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977


