[Beowulf] precise synchronization of system clocks
Lawrence Stewart larry.stewart at sicortex.com
Mon Sep 29 15:00:04 PDT 2008
On Sep 29, 2008, at 4:10 PM, Prentice Bisbal wrote:

> In the previous thread I instigated about running services in cluster
> nodes, there was some mentioning of precisely synchronizing the system
> clocks and this issue is also mentioned in this paper:
>
> "The Case of Missing Supercomputer Performance: Achieving Optimal
> Performance on the 8,192 processor ASCI Q" (Petrini, Kerbisin and
> Pakin)
> http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf
>
> I've also read a few other papers on the topic, and it seems you need to
> sync the system clocks to ~1 uS. On top of that, I imagine you also need
> to synch the activities of each system so they all stop to do the same
> system-level tasks at the same time.
>
> The papers I read all mentioned different OSes, or at least specialized
> hardware. Can this level of synchronization be achieved in Linux on
> commodity hardware? I imagine NTP doesn't have the resolution needed
> for this, and Don Becker has some strong feelings against NTP.

The SiCortex systems I work on are not commodity, but they do run Linux.

All the node chips in the machine are frequency locked to the same oscillator, so the core cycle counters (MIPS standard) advance at the same rate, but because the cores are released from reset at different times, they are not initially synchronized.

We recently added a global clock synchronization step to booting the system by timestamping messages sent over an out-of-band channel of the interconnect. After some futzing around, we're able to synchronize all the cycle counters to within about 50 nanoseconds. The timer interrupts then happen at the same counter values system wide, which naturally synchronizes most of the daemons that wake up. I don't think we've gone to the trouble of gang scheduling them as well, which would also be a good idea.

We tried reducing the standard 1000 Hz timer interrupts to 100 Hz, but a bunch of stuff in the IP network stack reacted badly, slowing down IP communications.
We haven't tracked it all down yet.

As one would expect from the papers you cite, the clock synchronization has had a very dramatic effect on large-scale collectives: a 5800-rank, 8-byte allreduce is now down to 36 microseconds, where it was something like 170 microseconds before the clock project.

Since clusters built from commodity servers run on independent oscillators, it is much harder to synchronize them. NTP will do a very good job estimating the relative frequencies, but all those oscillators will drift independently with temperature and aging, so you have to run NTP continually. However, the problem to solve, synchronizing local clocks with each other, is different from the one NTP is intended to solve. You don't really care what the wall clock time is, you only care that all the systems have the same time.

I've seen some other papers on the subject of using LAN timestamps to provide much more accurate local synchronization. Here's one that cites 10 microsecond results:

  Guo-Song Tian, Yu-Chu Tian, and C. Fidge, "High-Precision Relative Clock Synchronization Using Time Stamp Counters," 13th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS 2008), March 31 - April 3, 2008, pp. 69-78.

Incidentally, a good way to measure the effects of OS noise locally is to write a program that reads the core cycle counter in a tight loop, and keeps statistics on the intervals between successive samples. You can find out how often and for how long your OS is going out to lunch.

_larry
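
The tight-loop noise measurement described in the last paragraph can be sketched roughly as follows. This is a minimal illustration, not the program used at SiCortex. It assumes Python's `time.perf_counter_ns()` as a stand-in for the core cycle counter, and the function names (`sample_gaps`, `summarize`) and the "10x the median gap" threshold for counting an interruption are hypothetical choices made for this example. The interpreter adds jitter of its own, so a C version reading the hardware counter directly would give cleaner numbers.

```python
import time


def sample_gaps(n_samples=200_000):
    """Read a monotonic nanosecond clock in a tight loop.

    Returns the list of intervals (in ns) between successive readings.
    perf_counter_ns() is an assumed stand-in for the core cycle counter.
    """
    now = time.perf_counter_ns      # local alias keeps the loop tight
    gaps = [0] * n_samples          # preallocate to avoid list growth in the loop
    prev = now()
    for i in range(n_samples):
        cur = now()
        gaps[i] = cur - prev
        prev = cur
    return gaps


def summarize(gaps, threshold_factor=10):
    """Keep simple statistics on the intervals between samples.

    Any gap larger than threshold_factor times the median gap is counted
    as an interruption (the OS "going out to lunch"). The factor is an
    arbitrary choice for illustration.
    """
    ordered = sorted(gaps)
    median = ordered[len(ordered) // 2]
    threshold = max(1, median) * threshold_factor
    outliers = [g for g in gaps if g > threshold]
    return {
        "samples": len(gaps),
        "min_ns": ordered[0],
        "median_ns": median,
        "max_ns": ordered[-1],
        "threshold_ns": threshold,
        "interruptions": len(outliers),   # how often
        "lost_ns": sum(outliers),         # for how long, in total
    }


if __name__ == "__main__":
    stats = summarize(sample_gaps())
    for key, value in stats.items():
        print(f"{key:>14}: {value}")
```

Running this pinned to one core on an otherwise idle node, and again with the usual daemons active, gives a rough picture of how often and for how long the loop is preempted. The results vary from run to run and from machine to machine.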
