[scyld-users] Re: [Support] Setting up ntpd on compute nodes

Tue Feb 10 12:10:01 PST 2004

Don,

Well, the main reason behind this is the odd date/time stamps that I'm
seeing on processes on the compute nodes.  For instance, here's the output
of 3 commands: an uptime, a date, and a ps -ef.  Note that the compute
node's time (via the date command) is about 18 seconds ahead of the master
node's time (shown in the prompt line after the output).  Also note that
it has a process (the ps -ef) that started running 10 DAYS in the FUTURE!

# bpsh 3 uptime; bpsh 3 date; bpsh 3 ps -ef
  2:48pm  up 14 days, 19:42,  0 users,  load average: 0.00, 0.00, 0.00
Tue Feb 10 14:48:36 UTC 2004
UID        PID  PPID  C STIME TTY          TIME CMD
root     15212 15211  0 Feb06 ?        00:12:20 /usr/bin/sendstats 3
root     15265     1  0 Jan26 ?        00:00:01 syslogd -m 0
root     27389 27388  0 Feb20 ?        00:00:00 ps -ef

root at hrunting.gsfc.nasa.gov (bash)     Tue Feb 10     14:48:18
/root
# 

This behavior varies by node.  For instance, here is the same set of
commands run on node 8.  Notice that in this case the ps command appears
to have started only 8 hours and 17 minutes in the future, even though
the node's clock (via date) is almost a minute ahead of the host node
(49 sec):

# bpsh 8 uptime; bpsh 8 date; bpsh 8 ps -ef
  2:52pm  up 24 days, 16:49,  0 users,  load average: 0.00, 0.00, 0.00
Tue Feb 10 14:52:38 UTC 2004
UID        PID  PPID  C STIME TTY          TIME CMD
root     12075 12073  0 Jan17 ?        00:24:55 /usr/bin/sendstats 8
root     12140     1  0 Jan16 ?        00:00:02 syslogd -m 0
root     27408 27407  0 23:09 ?        00:00:00 ps -ef

root at hrunting.gsfc.nasa.gov (bash)     Tue Feb 10     14:51:49
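
For anyone checking the figures quoted above, the arithmetic can be
sketched in a few lines of shell.  This is only an illustration: the
function names to_seconds and clock_delta are made up here, and it
assumes both timestamps fall on the same calendar day.

```shell
#!/bin/sh
# Convert an HH:MM:SS timestamp to seconds since midnight.
# awk is used so that fields such as "08" are read as decimal, not octal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print (first timestamp - second timestamp) in seconds.
# Hypothetical helper; assumes both times are on the same day.
clock_delta() {
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    echo $((a - b))
}

clock_delta 14:48:36 14:48:18   # node 3 date vs. master prompt: 18
clock_delta 14:52:38 14:51:49   # node 8 date vs. master prompt: 49
clock_delta 23:09:00 14:52:00   # node 8 ps STIME vs. node 8 date: 29820 (8h 17m)
```

Since the master's prompt is printed after the commands return, the true
offsets are, if anything, slightly larger than these numbers.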

I don't care whether we use ntpd or bdate via cron, so long as the deltas
in time are eliminated.  I'm also concerned about the future STIMEs
listed, since they have caused some confusion when diagnosing issues, not
to mention the fact that it's disconcerting to see something so obviously
wrong.
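
One way to watch those deltas, before and after any fix, is to compare
epoch seconds on a node with epoch seconds on the master.  The sketch
below is an assumption-laden illustration, not a Scyld tool: node_offset
is a hypothetical name, bpsh is assumed to be on the PATH, and date on
the node is assumed to accept +%s (as GNU date does).  The result also
includes the bpsh round-trip, so it is only good to about a second.

```shell
#!/bin/sh
# Print (node clock - master clock) in whole seconds for one compute node.
# Epoch seconds are used so the nodes' UTC setting does not matter.
node_offset() {
    # Read the node's clock first; bail out if bpsh cannot reach the node.
    node_now=$(bpsh "$1" date +%s) || return 1
    master_now=$(date +%s)
    echo $((node_now - master_now))
}

# Example (node numbers are placeholders):
#   for n in 3 8; do echo "node $n: $(node_offset "$n") s"; done
```

A positive value means the node's clock is ahead of the master's.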

Tony

+-----------------------------------+
|  Tony Stocker                     |
|  Systems Administrator            |
|  TSDIS/TRMM Code 902              |
|  301-614-5738 (office)            |
|  301-614-5269 (fax)               |
|  Anton.K.Stocker.1 at gsfc.nasa.gov  |
+-----------------------------------+

On Mon, 9 Feb 2004, Donald Becker wrote:

> On Fri, 6 Feb 2004, Tony Stocker wrote:
> 
> > How do we go about setting up ntpd on our compute nodes, with the ntp
> > server being the host node?
> 
> What aspect of NTP do you need?
> 
> This same topic came up during my meeting with Panasas this past Friday:
> if you need time synchronization only for the filesystem, our current
> approach will work.
> 
> We prefer not to run the standard NTP daemon, or any daemon, on compute
> nodes. Running daemons on compute nodes results in unpredictable scheduling.
> This becomes a significant issue with lock-step computation and
> larger node counts, as the slowest node sets the step rate.
> 
> Instead Scyld provides 'bdate', which explicitly sets the time
> (settimeofday(), including microseconds) on compute nodes from the
> master's clock.  This is called at node boot time, and optionally
> periodically with 'cron'.  In both cases it follows the Scyld approach
> of cluster operation being controlled by a master machine, rather than
> compute nodes having independent operations or relying on distributed,
> persistent configuration files.
> 
> If the exact behavior of 'ntp' is required, it's simple to configure
> 'ntpd' to start automatically on node boot.  Create a start-up script
>   /etc/beowulf/init.d/ntp
> that calls
>   bpsh -n $NODE /usr/sbin/ntpd -m -g
> (or the appropriate options for your needs).
> 
> Please let us know what your time sync requirements are -- we can likely
> efficiently provide the functionality needed, but are reluctant to
> include the 'ntpd' approach in our default node configuration.  It is
> more intrusive, complex and configuration-intensive than is needed for a
> tightly coupled cluster.
> 
> -- 
> Donald Becker				becker at scyld.com
> Scyld Computing Corporation		http://www.scyld.com
> 914 Bay Ridge Road, Suite 220		Scyld Beowulf cluster systems
> Annapolis MD 21403			410-990-9993
> 
>