msnitzer at plogic.com
Mon Jun 24 09:09:28 PDT 2002
Josip Loncaric (josip at icase.edu) said:
> "Robert G. Brown" wrote:
> > I personally think that networked systems, nodes or not, should have the
> > time network synchronized if at all possible.
> > We use ntpd to keep everything sync'd.
> I concur -- ntpd is very good, and it can be very easy on your internal
> network (within your cluster, just use broadcast client/server model
> instead of polling).
> Unsynchronized systems can misbehave, particularly with schedulers etc.
Josip et al.,
I've tried to configure a test cluster with the broadcast client/server
model and I've yet to get it working... ntpd yields little output that is
of any help. So if you or anyone else on the list can offer some
assistance based on the following data, I'd really appreciate it.
I have Redhat's ntp-4.1.1-1 running on all nodes.
/etc/ntp.conf on the master has:
server clock.psu.edu # arbitrary choice for external NTP server
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
broadcast 192.168.0.255 ttl 6
/etc/ntp.conf on the slaves has:
broadcastclient # also tried with broadcastclient 192.168.0.255
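As a sanity check on the 192.168.0.255 address used above, here is a minimal Python sketch that computes the broadcast address for an interface. It assumes the cluster network is 192.168.0.0/24; the netmask isn't stated anywhere in this message, so treat that as an assumption, and the helper name broadcast_for is made up for illustration.

```python
import ipaddress

def broadcast_for(interface_cidr: str) -> str:
    """Return the broadcast address of the network that an interface belongs to."""
    iface = ipaddress.ip_interface(interface_cidr)
    return str(iface.network.broadcast_address)

if __name__ == "__main__":
    # ASSUMPTION: the master is 192.168.0.1 with a /24 netmask.
    print(broadcast_for("192.168.0.1/24"))
```

If the real netmask differs from /24, the address given to the broadcast line on the master would need to change to match.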
/etc/ntp/step-tickers on the slaves has:
NOTE: this really shouldn't be needed in a properly configured broadcast
client/server config, correct? I've read that when ntpd starts as a client
it handshakes briefly with an available ntpd server, as in a normal polling
config, and then reverts to using the broadcasts.
starting ntpd on a slave yields:
[root@LC1 ~]# rsh LC2 /sbin/service ntpd restart
Shutting down ntpd: [ OK ]
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]
/var/log/messages on the slave has:
Jun 24 11:23:27 LC2 ntpd: ntpd shutdown succeeded
Jun 24 11:23:27 LC2 ntpdate: step time server 192.168.0.1 offset -0.005725 sec
Jun 24 11:23:27 LC2 ntpd: succeeded
Jun 24 11:23:27 LC2 ntpd: ntpd 4.1.1@1.786 Mon Apr 8 06:30:52 EDT 2002 (1)
Jun 24 11:23:27 LC2 ntpd: ntpd startup succeeded
tcpdump from the master shows:
[root@LC1 ~]# tcpdump -i eth0 "port 123"
tcpdump: listening on eth0
11:35:46.125187 LC1.ntp > 192.168.0.255.ntp: v4 bcast strat 3 poll 6 prec -17 (DF) [tos 0x10]
11:36:49.125308 LC1.ntp > 192.168.0.255.ntp: v4 bcast strat 3 poll 6 prec -17 (DF) [tos 0x10]
tcpdump from a slave shows:
[root@LC2 ~]# tcpdump -i eth0 "port 123"
tcpdump: listening on eth0
11:35:46.108483 LC1.ntp > 192.168.0.255.ntp: v4 bcast strat 3 poll 6 prec -17 (DF) [tos 0x10]
11:36:49.110939 LC1.ntp > 192.168.0.255.ntp: v4 bcast strat 3 poll 6 prec -17 (DF) [tos 0x10]
SO... the master's ntpd appears to be doing what it should be and it
would appear the client (slave) is getting the packet(s) it needs.. I
suppose now it's just a question of whether or not the ntpd on the slave
is actually listening for the broadcast.
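For reference, the fields tcpdump prints above ("v4 bcast strat 3 poll 6 prec -17") come from the first four bytes of the NTP header. The following is a minimal Python sketch that decodes them, not something running on my cluster; the function name parse_ntp_header and the sample bytes are made up for illustration, with the sample constructed to match the tcpdump lines rather than captured from the wire.

```python
import struct

# Association modes defined by the NTP packet header (3-bit field).
NTP_MODES = {
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",
    4: "server",
    5: "broadcast",
}

def parse_ntp_header(packet: bytes) -> dict:
    """Decode leap, version, mode, stratum, poll and precision from an NTP packet."""
    if len(packet) < 48:
        raise ValueError("an NTP packet is at least 48 bytes long")
    # First byte packs LI (2 bits), VN (3 bits), Mode (3 bits);
    # stratum is unsigned, poll and precision are signed log2 seconds.
    flags, stratum, poll, precision = struct.unpack("!BBbb", packet[:4])
    mode = flags & 0x7
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": mode,
        "mode_name": NTP_MODES.get(mode, "other"),
        "stratum": stratum,
        "poll": poll,
        "precision": precision,
    }

# Hypothetical sample: LI=0, VN=4, Mode=5 -> 0x25; stratum 3; poll 6; precision -17 (0xEF).
SAMPLE = bytes([0x25, 3, 6, 0xEF]) + bytes(44)

if __name__ == "__main__":
    print(parse_ntp_header(SAMPLE))
```

A packet decoding to mode 5 is a broadcast, which is consistent with the master sending what it should; it says nothing about whether the slave's ntpd accepts it.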
ntpq -p on a slave:
[root@LC2 ~]# ntpq -p
No association ID's returned
NOTE: shouldn't the client (slave) ntpd see 192.168.0.255 as a source _IF_
its ntpd were doing what it should?
If anyone has any ideas why the clients aren't reporting any association IDs,
or can provide the details on how you got your broadcast client/server
ntp config working, please let me know.