[Beowulf] precise synchronization of system clocks

Lux, James P james.p.lux at jpl.nasa.gov
Tue Sep 30 06:54:20 PDT 2008

On 9/30/08 2:53 AM, "Vincent Diepeveen" <diep at xs4all.nl> wrote:

> Hmm,
> 1 uS accuracy whereas the cpu has a hardware counter for all this.
> To be honest I find 1 microsecond very inaccurate now that cards have
> latencies near that.


> Doing that a couple of thousand times, we should get a fairly accurate
> timing in B, far more accurate than 1 microsecond, as the deviation in
> one-way ping-pong latency isn't really big. It's quite constant.

Unfortunately, it doesn't work out that way. The distribution of round-trip
times is not well behaved, so it takes more statistical processing to come
up with the delay (for instance, what you really want is the minimum, not
the mode or median).
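The minimum-filter idea can be sketched as follows. This uses the standard NTP four-timestamp formulas for offset and delay; the sample timestamps are made up for illustration. The key point is that queuing and interrupt jitter only ever add delay, so the exchange with the smallest round trip is the least corrupted one.

```python
# Sketch: estimate clock offset from ping-pong exchanges, NTP style.
# Each sample is (t1, t2, t3, t4): A sends at t1, B receives at t2,
# B replies at t3, A receives the reply at t4 (t1/t4 on A's clock,
# t2/t3 on B's clock).

def offset_delay(t1, t2, t3, t4):
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # B's clock minus A's clock
    delay = (t4 - t1) - (t3 - t2)            # round-trip network delay
    return offset, delay

def best_offset(samples):
    """Use the exchange with the smallest round-trip delay, not the
    average: noise only inflates delay, so the minimum is cleanest."""
    pairs = [offset_delay(*s) for s in samples]
    return min(pairs, key=lambda p: p[1])[0]

# Hypothetical data: true offset is 10 us, clean one-way delay 25 us.
# The first exchange picked up 30 us of extra queuing on the forward
# path, which skews its apparent offset; the second is clean.
samples = [
    (100.0, 165.0, 170.0, 185.0),   # noisy: delay 80, apparent offset 25
    (0.0, 35.0, 40.0, 55.0),        # clean: delay 50, offset 10
]
print(best_offset(samples))          # picks the clean exchange: 10.0
```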

One could use such a scheme, with sufficient processing, to measure the
difference between the clock frequencies.
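One way to sketch that frequency measurement (the numbers here are hypothetical): if you log the measured offset repeatedly, the slope of offset versus time is the fractional frequency difference between the two clocks.

```python
# Sketch: estimate the relative frequency error between two clocks by
# fitting a least-squares line to measured offsets over time.  The
# slope is dimensionless (seconds of drift per second); x 1e6 gives ppm.

def freq_error(times, offsets):
    """Least-squares slope of offset vs. time."""
    n = len(times)
    mt = sum(times) / n
    mo = sum(offsets) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(times, offsets))
    den = sum((t - mt) ** 2 for t in times)
    return num / den

# Hypothetical data: a crystal 20 ppm fast accumulates 20 us of offset
# per second of elapsed time.
times = [0.0, 1.0, 2.0, 3.0, 4.0]        # seconds
offsets = [t * 20e-6 for t in times]      # accumulated offset, seconds
ppm = freq_error(times, offsets) * 1e6
```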

If you can control the actual packets on the wire so they are all identical,
and they're the only packets, that helps too.

> Only the deviation of that latency is a measure of the accuracy to
> which you can synchronize the clock time. Now this is a simple two-node
> example. It is of course possible for a cluster to use the measurements
> of many nodes and synchronize to that, just like the coordinate
> calculation for GPS uses several satellites. Using many nodes will get
> the average error down. Of course, to synchronize many nodes each node
> uses its own clock as a new 'source' of measurement; if for the
> synchronization accuracy we always assume the same clock from node A,
> then getting the error down is a lot tougher.

The GPS synchronization problem is actually substantially easier.  The
propagation delay from satellite to receiver varies in a very predictable
manner (in fact, the nav solution solves for it), and the signal is
specifically designed for accurate timing (a PN code generated from a Cs
clock is a darn good way to transmit timing and frequency information).

The challenge in synch over Ethernet (without added hardware à la IEEE-1588)
is that simple NTP-style ping-ponging rapidly gets you to the point where
the measurement uncertainty is comparable to the uncertainty and variability
of the exceedingly cheap oscillators on the NIC.
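A back-of-envelope illustration of why those cheap oscillators dominate (the 50 ppm figure is a typical spec for a commodity crystal, assumed here for illustration): a free-running clock that is off by 50 ppm drifts 50 microseconds every second, so holding 1 us of sync means correcting it many times a second.

```python
# Sketch: how often must you resynchronize to hold a given accuracy
# against a free-running crystal of a given quality?

def resync_interval(target_accuracy_s, osc_error_ppm):
    """Seconds between corrections before drift exceeds the target."""
    return target_accuracy_s / (osc_error_ppm * 1e-6)

# A typical cheap crystal is ~50 ppm off; to hold 1 us of sync you
# would have to correct it roughly every 0.02 s, i.e. 50 times a second.
interval = resync_interval(1e-6, 50.0)
```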

FWIW, in the lab, most people would not be satisfied with sync to 1
microsecond (after all, how many thousand instructions is that?). You could
probably get to 1 microsecond by running a wire between serial ports (hook
an IRQ off the Ring Indicator line, for instance). You want to think in
terms of nanoseconds, and no straight Ethernet scheme without added hardware
can get there.

Keeping it beowulf'y, if you want fine-grained synchronization so that you
don't lose performance when doing barriers, you're probably going to need
some sort of common clock.  The typical microprocessor crystal just isn't
good enough.  Actually, though, when talking about this sort of sync, aren't
we getting close to SIMD-style processing?  Is a "cluster of commodity
computers" actually a "good" way to be doing this sort of thing?

Jim Lux

> Vincent
> On Sep 29, 2008, at 11:21 PM, Lombard, David N wrote:
>> On Mon, Sep 29, 2008 at 01:10:49PM -0700, Prentice Bisbal wrote:
>>> In the previous thread I instigated about running services in cluster
>>> nodes, there was some mentioning of precisely synchronizing the
>>> system
>>> clocks and this issue is also mentioned in this paper:
>>> "The Case of the Missing Supercomputer Performance: Achieving Optimal
>>> Performance on the 8,192 Processor ASCI Q" (Petrini, Kerbyson, and
>>> Pakin)
>>> http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf
>>> I've also read a few other papers on the topic, and it seems you need
>>> to sync the system clocks to ~1 uS. On top of that, I imagine you also
>>> need to sync the activities of each system so they all stop to do the
>>> same system-level tasks at the same time.
>> The IEEE-1588 "Precision Time Protocol" can provide such levels of
>> global clock synchronization.
>> Shameless plug: See "Hardware Assisted Precision Time Protocol
>> (PTP, IEEE-1588) - Design and Case Study" presented at the recent
>> LCI conference;
>> <http://www.linuxclustersinstitute.org/conferences/archive/2008/technicalpapers.html>
>> --
>> David N. Lombard, Intel, Irvine, CA
>> I do not speak for Intel Corporation; all comments are strictly my
>> own.
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
