in Index   Prev   Next

RFC 5905

Network Time Protocol Version 4: Protocol and Algorithms Specification

Pages: 110
Proposed Standard
Obsoletes:  13054330
Updated by:  782285739109
Part 3 of 5 – Pages 39 to 60
First   Prev   Next

Top   ToC   RFC5905 - Page 39   prevText

11. System Process

As each new sample (theta, delta, epsilon, jitter, t) is produced by the clock filter algorithm, all peer processes are scanned by the mitigation algorithms consisting of the selection, cluster, combine, and clock discipline algorithms in the system process. The selection algorithm scans all associations and casts off the falsetickers, which have demonstrably incorrect time, leaving the truechimers as result. In a series of rounds, the cluster algorithm discards the association statistically furthest from the centroid until a specified minimum number of survivors remain. The combine algorithm produces the best and final statistics on a weighted average basis. The final offset is passed to the clock discipline algorithm to steer the system clock to the correct time. The cluster algorithm selects one of the survivors as the system peer. The associated statistics (theta, delta, epsilon, jitter, t) are used to construct the system variables inherited by dependent servers and clients and made available to other applications running on the same machine.
Top   ToC   RFC5905 - Page 40

11.1. System Process Variables

Figure 23 summarizes the common names, formula names, and a short description of each system variable. Unless noted otherwise, all variables have assumed prefix s. +-----------+------------+------------------------+ | Name | Formula | Description | +-----------+------------+------------------------+ | t | t | update time | | p | p | system peer identifier | | leap | leap | leap indicator | | stratum | stratum | stratum | | precision | rho | precision | | offset | THETA | combined offset | | jitter | PSI | combined jitter | | rootdelay | DELTA | root delay | | rootdisp | EPSILON | root dispersion | | v | v | survivor list | | refid | refid | reference ID | | reftime | reftime | reference time | | NMIN | 3 | minimum survivors | | CMIN | 1 | minimum candidates | +-----------+------------+------------------------+ Figure 23: System Process Variables Except for the t, p, offset, and jitter variables and the NMIN and CMIN constants, the variables have the same format and interpretation as the peer variables of the same name. The NMIN and CMIN parameters are used by the selection and cluster algorithms described in the next section. The t variable is the seconds counter at the time of the last update. An example is shown by the clock_update() routine in Appendix A.5.5.4. The p variable is the system peer identifier determined by the cluster() routine in Section 11.2.2. The precision variable has the same format as the packet variable of the same name. The precision is defined as the larger of the resolution and time to read the clock, in log2 units. For instance, the precision of a mains-frequency clock incrementing at 60 Hz is 16 ms, even when the system clock hardware representation is to the nanosecond. The offset and jitter variables are determined by the combine algorithm in Section 11.2.3. These values represent the best and final offset and jitter used to discipline the system clock.
Top   ToC   RFC5905 - Page 41
   Initially, all variables are cleared to zero, then the leap is set to
   3 (unsynchronized) and stratum is set to MAXSTRAT (16).  Remember
   that MAXSTRAT is mapped to zero in the transmitted packet.

11.2. System Process Operations

Figure 24 summarizes the system process operations performed by the clock select routine. The selection algorithm described in Section 11.2.1 produces a majority clique of presumed correct candidates (truechimers) based on agreement principles. The cluster algorithm described in Section 11.2.2 discards outliers to produce the most accurate survivors. The combine algorithm described in Section 11.2.3 provides the best and final offset for the clock discipline algorithm. An example is described in Appendix A.5.5.6. If the selection algorithm cannot produce a majority clique, or if it cannot produce at least CMIN survivors, the system process exits without disciplining the system clock. If successful, the cluster algorithm selects the statistically best candidate as the system peer and its variables are inherited as the system variables.
Top   ToC   RFC5905 - Page 42
                          | clock_select()  |
   .                               V          .
   .      yes +---------+ +-----------------+ .
   .       +--| accept? | | scan candidates | .
   .       |  +---------+ |                 | .
   .       V        no |  |                 | .
   .  +---------+      |  |                 | .
   .  | add peer|      |  |                 | .
   .  +----------      |  |                 | .
   .       |           V  |                 | .
   .       +---------->-->|                 | .
   .                      |                 | .
   . Selection Algorithm  +-----------------+ .
                       no +-------------------+
            +-------------|     survivors?    |
            |             +-------------------+
            |                       | yes
            |                       V
            |             +-------------------+
            |             | Cluster Algorithm |
            |             +-------------------+
            |                       |
            |                       V
            V         yes +-------------------+
            |<------------|     n < CMIN?     |
            |             +-------------------+
            V                       |
     +-----------------+            V no
     |   s.p = NULL    |  +-------------------+
     +-----------------+  |   s.p = v_0.p     |
            |             +-------------------+
            V                       |
     +-----------------+            V
     | return (UNSYNC) |  +-------------------+
     +-----------------+  |   return (SYNC)   |

                      Figure 24: Clock Select Routine
Top   ToC   RFC5905 - Page 43

11.2.1. Selection Algorithm

Note that the selection and cluster algorithms are described separately, but combined in the code skeleton. The selection algorithm operates to find an intersection interval containing a majority clique of truechimers using Byzantine agreement principles originally proposed by Marzullo [ref6], but modified to improve accuracy. An overview of the algorithm is given below and described in the first half of the clock_select() routine in Appendix A.5.5.1. First, those servers that are unusable according to the rules of the protocol are detected and discarded as shown by the accept() routine in Appendix A.5.5.3. Next, a set of tuples (p, type, edge) is generated for the remaining candidates. Here, p is the association identifier and type identifies the upper (+1), middle (0), and lower (-1) endpoints of a correctness interval centered on theta for that candidate. This results in three tuples, lowpoint (p, -1, theta - lambda), midpoint (p, 0, theta), and highpoint (p, +1, theta + lambda), where lambda is the root synchronization distance. An example of this calculation is shown by the rootdist() routine in Appendix A.5.1.1. The steps of the algorithm are: 1. For each of m associations, place three tuples as defined above on the candidate list. 2. Sort the tuples on the list by the edge component. Order the lowpoint, midpoint, and highpoint of these intervals from lowest to highest. Set the number of falsetickers f = 0. 3. Set the number of midpoints d = 0. Set c = 0. Scan from lowest endpoint to highest. Add one to c for every lowpoint, subtract one for every highpoint, add one to d for every midpoint. If c >= m - f, stop; set l = current lowpoint. 4. Set c = 0. Scan from highest endpoint to lowest. Add one to c for every highpoint, subtract one for every lowpoint, add one to d for every midpoint. If c >= m - f, stop; set u = current highpoint. 5. Is d = f and l < u? If yes, then follow step 5A; else, follow step 5B. 5A. Success: the intersection interval is [l, u]. 5B. Add one to f. Is f < (m / 2)? If yes, then go to step 3 again. If no, then go to step 6. 6. Failure; a majority clique could not be found. There are no suitable candidates to discipline the system clock.
Top   ToC   RFC5905 - Page 44
   The algorithm is described in detail in Appendix A.5.5.1.  Note that
   it starts with the assumption that there are no falsetickers (f = 0)
   and attempts to find a non-empty intersection interval containing the
   midpoints of all correct servers, i.e., truechimers.  If a non-empty
   interval cannot be found, it increases the number of assumed
   falsetickers by one and tries again.  If a non-empty interval is
   found and the number of falsetickers is less than the number of
   truechimers, a majority clique has been found and the midpoint of
   each truechimer (theta) represents the candidates available to the
   cluster algorithm.

   If a majority clique is not found, or if the number of truechimers is
   less than CMIN, there are insufficient candidates to discipline the
   system clock.  CMIN defines the minimum number of servers consistent
   with the correctness requirements.  Suspicious operators would set
   CMIN to ensure multiple redundant servers are available for the
   algorithms to mitigate properly.  However, for historic reasons the
   default value for CMIN is one.

11.2.2. Cluster Algorithm

The candidates of the majority clique are placed on the survivor list v in the form of tuples (p, theta_p, psi_p, lambda_p), where p is an association identifier, theta_p, psi_p, and stratum_p the current offset, jitter and stratum of association p, respectively, and lambda_p is a merit factor equal to stratum_p * MAXDIST + lambda, where lambda is the root synchronization distance for association p. The list is processed by the cluster algorithm below. An example is shown by the second half of the clock_select() algorithm in Appendix A.5.5.1. 1. Let (p, theta_p, psi_p, lambda_p) represent a survivor candidate. 2. Sort the candidates by increasing lambda_p. Let n be the number of candidates and NMIN the minimum required number of survivors. 3. For each candidate, compute the selection jitter psi_s: +----- -----+^1/2 | n-1 | | --- | | 1 \ 2 | psi_s = | ---- * / (theta_s - theta_j) | | n-1 --- | | j=1 | +----- -----+ 4. Select psi_max as the candidate with maximum psi_s.
Top   ToC   RFC5905 - Page 45
   5.  Select psi_min as the candidate with minimum psi_p.

   6.  Is psi_max < psi_min or n <= NMIN?  If yes, follow step 6A;
   otherwise, follow step 6B.

   6A. Done.  The remaining candidates on the survivor list are ranked
   in the order of preference.  The first entry on the list represents
   the system peer; its variables are used later to update the system

   6B. Delete the outlier candidate with psi_max; reduce n by one and go
   back to step 3.

   The algorithm operates in a series of rounds where each round
   discards the statistical outlier with maximum selection jitter psi_s.
   However, if psi_s is less than the minimum peer jitter psi_p, no
   improvement is possible by discarding outliers.  This and the minimum
   number of survivors represent the terminating conditions of the
   algorithm.  Upon termination, the final value of psi_max is saved as
   the system selection jitter PSI_s for use later.

11.2.3. Combine Algorithm

The clock combine route processes the remaining survivors to produce the best and final data for the clock discipline algorithm. The routine processes peer offset and jitter statistics to produce the combined system offset THETA and system peer jitter PSI_p, where each server statistic is weighted by the reciprocal of the root synchronization distance and the result normalized. An example is shown by the clock_combine() routine in Appendix A.5.5.5 The combined THETA is passed to the clock update routine. The first candidate on the survivor list is nominated as the system peer with identifier p. The system peer jitter PSI_p is a component of the system jitter PSI. It is used along with the selection jitter PSI_s to produce the system jitter: PSI = [(PSI_s)^2 + (PSI_p)^2]^1/2 Each time an update is received from the system peer, the clock update routine is called. By rule, an update is discarded if its time of arrival p.t is not strictly later than the last update used s.t. The labels IGNOR, PANIC, ADJ, and STEP refer to return codes from the local clock routine described in the next section. IGNORE means the update has been ignored as an outlier. PANIC means the offset is greater than the panic threshold PANICT (1000 s) and SHOULD cause the program to exit with a diagnostic message to the
Top   ToC   RFC5905 - Page 46
   system log.  STEP means the offset is less than the panic threshold,
   but greater than the step threshold STEPT (125 ms).  In this case,
   the clock is stepped to the correct offset, but since this means all
   peer data have been invalidated, all associations MUST be reset and
   the client begins as at initial start.

   ADJ means the offset is less than the step threshold and thus a valid
   update.  In this case, the system variables are updated from the peer
   variables as shown in Figure 25.

                  | System Variable <-- System Peer Variable  |        |
                  | s.leap      <-- p.leap                    |
                  | s.stratum   <-- p.stratum + 1             |
                  | s.offset    <-- THETA                     |
                  | s.jitter    <-- PSI                       |
                  | s.rootdelay <-- p.delta_r + delta         |
                  | s.rootdisp  <-- p.epsilon_r + p.epsilon + |
                  |                 p.psi + PHI * (s.t - p.t) |
                  |                 + |THETA|                 |
                  | s.refid     <-- p.refid                   |
                  | s.reftime   <-- p.reftime                 |
                  | s.t         <-- p.t                       |

                    Figure 25: System Variables Update

   There is an important detail not shown.  The dispersion increment
   (p.epsilon + p.psi + PHI * (s.t - p.t) + |THETA|) is bounded from
   below by MINDISP.  In subnets with very fast processors and networks
   and very small delay and dispersion this forces a monotone-definite
   increase in s.rootdisp (EPSILON), which avoids loops between peers
   operating at the same stratum.

   The system variables are available to dependent application programs
   as nominal performance statistics.  The system offset THETA is the
   clock offset relative to the available synchronization sources.  The
   system jitter PSI is an estimate of the error in determining this
   value, elsewhere called the expected error.  The root delay DELTA is
   the total round-trip delay relative to the primary server.  The root
   dispersion EPSILON is the dispersion accumulated over the network
   from the primary server.  Finally, the root synchronization distance
   is defined as:
Top   ToC   RFC5905 - Page 47

   which represents the maximum error due all causes and is designated
   the root synchronization distance.

   An example of the clock update routine is provided in
   Appendix A.5.5.4.

11.3. Clock Discipline Algorithm

The NTPv4 clock discipline algorithm, shortened to discipline in the following, functions as a combination of two quite philosophically different feedback control systems. In a phase-locked loop (PLL) design, periodic phase updates at update intervals mu seconds are used directly to minimize the time error and indirectly the frequency error. In a frequency-locked loop (FLL) design, periodic frequency updates at intervals mu are used directly to minimize the frequency error and indirectly the time error. As shown in [ref7], a PLL usually works better when network jitter dominates, while an FLL works better when oscillator wander dominates. This section contains an outline of how the NTPv4 design works. An in-depth discussion of the design principles is provided in [ref7], which also includes a performance analysis. The discipline is implemented as the feedback control system shown in Figure 26. The variable theta_r represents the combine algorithm offset (reference phase) and theta_c the VFO offset (control phase). Each update produces a signal V_d representing the instantaneous phase difference theta_r - theta_c. The clock filter for each server functions as a tapped delay line, with the output taken at the tap selected by the clock filter algorithm. The selection, cluster, and combine algorithms combine the data from multiple filters to produce the signal V_s. The loop filter, with impulse response F(t), produces the signal V_c, which controls the VFO frequency omega_c and thus the integral of the phase theta_c which closes the loop. The V_c signal is generated by the clock-adjust process in Section 12. The detailed equations that implement these functions are best presented in the routines of Appendices A.5.5.6 and A.5.6.1.
Top   ToC   RFC5905 - Page 48
                theta_r + +---------\        +----------------+
            NTP --------->|  Phase   \  V_d  |                | V_s
                theta_c - | Detector  ------>|  Clock Filter  |----+
                +-------->|          /       |                |    |
                |         +---------/        +----------------+    |
                |                                                  |
              -----                                                |
             /     \                                               |
             | VFO |                                               |
             \     /                                               |
              -----    .......................................     |
                ^      .            Loop Filter              .     |
                |      . +---------+   x  +-------------+    .     |
                | V_c  . |         |<-----|             |    .     |
                +------.-|  Clock  |   y  | Phase/Freq  |<---------+
                       . | Adjust  |<-----| Prediction  |    .
                       . |         |      |             |    .
                       . +---------+      +-------------+    .

                 Figure 26: Clock Discipline Feedback Loop

   Ordinarily, the pseudo-linear feedback loop described above operates
   to discipline the system clock.  However, there are cases where a
   non-linear algorithm offers considerable improvement.  One case is
   when the discipline starts without knowledge of the intrinsic clock
   frequency.  The pseudo-linear loop takes several hours to develop an
   accurate measurement and during most of that time the poll interval
   cannot be increased.  The non-linear loop described below does this
   in 15 minutes.  Another case is when occasional bursts of large
   jitter are present due to congested network links.  The state machine
   described below resists error bursts lasting less than 15 minutes.

   Figure 27 contains a summary of the variables and parameters
   including the variable (lowercase) or parameter (uppercase) name,
   formula name, and short description.  Unless noted otherwise, all
   variables have assumed prefix c.  The variables t, tc, state, hyster,
   and count are integers; the remaining variables are floating doubles.
   The function of each will be explained in the algorithm descriptions
Top   ToC   RFC5905 - Page 49
                | Name   | Formula    | Description              |
                | t      | timer      | seconds counter          |
                | offset | theta      | combined offset          |
                | resid  | theta_r    | residual offset          |
                | freq   | phi        | clock frequency          |
                | jitter | psi        | clock offset jitter      |
                | wander | omega      | clock frequency wander   |
                | tc     | tau        | time constant (log2)     |
                | state  | state      | state                    |
                | adj    | adj        | frequency adjustment     |
                | hyster | hyster     | hysteresis counter       |
                | STEPT  | 125        | step threshold (.125 s)  |
                | WATCH  | 900        | stepout thresh(s)        |
                | PANICT | 1000       | panic threshold (1000 s) |
                | LIMIT  | 30         | hysteresis limit         |
                | PGATE  | 4          | hysteresis gate          |
                | TC     | 16         | time constant scale      |
                | AVG    | 8          | averaging constant       |

           Figure 27: Clock Discipline Variables and Parameters

   The process terminates immediately if the offset is greater than the
   panic threshold PANICT (1000 s).  The state transition function is
   described by the rstclock() function in Appendix A.5.5.7.  Figure 28
   shows the state transition function used by this routine.  It has
   four columns showing, respectively, the state name, predicate and
   action if the offset theta is less than the step threshold, the
   predicate and actions otherwise, and finally some comments.
Top   ToC   RFC5905 - Page 50
      | State | theta < STEP        | theta > STEP      | Comments     |
      | NSET  | ->FREQ              | ->FREQ            | no frequency |
      |       | adjust time         | step time         | file         |
      | FSET  | ->SYNC              | ->SYNC            | frequency    |
      |       | adjust time         | step time         | file         |
      | SPIK  | ->SYNC              | if < 900 s ->SPIK | outlier      |
      |       | adjust freq         | else ->SYNC       | detected     |
      |       | adjust time         | step freq         |              |
      |       |                     | step time         |              |
      | FREQ  | if < 900 s ->FREQ   | if < 900 s ->FREQ | initial      |
      |       | else ->SYNC         | else ->SYNC       | frequency    |
      |       | step freq           | step freq         |              |
      |       | adjust time         | adjust time       |              |
      | SYNC  | ->SYNC              | if < 900 s ->SPIK | normal       |
      |       | adjust freq         | else ->SYNC       | operation    |
      |       | adjust time         | step freq         |              |
      |       |                     | step time         |              |

                   Figure 28: State Transition Function

   In the table entries, the next state is identified by the arrow ->
   with the actions listed below.  Actions such as adjust time and
   adjust frequency are implemented by the PLL/FLL feedback loop in the
   local_clock() routine.  A step clock action is implemented by setting
   the clock directly, but this is done only after the stepout threshold
   WATCH (900 s) when the offset is more than the step threshold STEPT
   (.125 s).  This resists clock steps under conditions of extreme
   network congestion.

   The jitter (psi) and wander (omega) statistics are computed using an
   exponential average with weight factor AVG.  The time constant
   exponent (tau) is determined by comparing psi with the magnitude of
   the current offset theta.  If the offset is greater than PGATE (4)
   times the clock jitter, the hysteresis counter hyster is reduced by
   two; otherwise, it is increased by one.  If hyster increases to the
   upper limit LIMIT (30), tau is increased by one; if it decreases to
   the lower limit -LIMIT (-30), tau is decreased by one.  Normally, tau
   hovers near MAXPOLL, but quickly decreases if a temperature spike
   causes a frequency surge.
Top   ToC   RFC5905 - Page 51

12. Clock-Adjust Process

The actual clock-adjust process runs at one-second intervals to add the frequency correction and a fixed percentage of the residual offset theta_r. The theta_r is, in effect, the exponential decay of the theta value produced by the loop filter at each update. The TC parameter scales the time constant to match the poll interval for convenience. Note that the dispersion EPSILON increases by PHI at each second. The clock-adjust process includes a timer interrupt facility driving the seconds counter c.t. It begins at zero when the service starts and increments once each second. At each interrupt, the clock_adjust() routine is called to incorporate the clock discipline time and frequency adjustments, then the associations are scanned to determine if the seconds counter equals or exceeds the state variable defined in the next section. If so, the poll process is called to send a packet and compute the next value. An example of the clock-adjust process is shown by the clock_adjust() routine in Appendix A.5.6.1.

13. Poll Process

Each association supports a poll process that runs at regular intervals to construct and send packets in symmetric, client, and broadcast server associations. It runs continuously, whether or not servers are reachable in order to manage the clock filter and reach register.

13.1. Poll Process Variables

Figure 29 summarizes the common names, formula names, and a short description of the poll process variables (lowercase) and parameters (uppercase). Unless noted otherwise, all variables have assumed prefix p.
Top   ToC   RFC5905 - Page 52
                   | Name    | Formula | Description        |
                   | hpoll   | hpoll   | host poll exponent |
                   | last    | last    | last poll time     |
                   | next    | next    | next poll time     |
                   | reach   | reach   | reach register     |
                   | unreach | unreach | unreach counter    |
                   | UNREACH | 24      | unreach limit      |
                   | BCOUNT  | 8       | burst count        |
                   | BURST   | flag    | burst enable       |
                   | IBURST  | flag    | iburst enable      |

             Figure 29: Poll Process Variables and Parameters

   The poll process variables are allocated in the association data
   structure along with the peer process variables.  The following is a
   detailed description of the variables.  The parameters will be called
   out in the following text.

   hpoll: signed integer representing the poll exponent, in log2 seconds

   last: integer representing the seconds counter when the most recent
   packet was sent

   next: integer representing the seconds counter when the next packet
   is to be sent

   reach: 8-bit integer shift register shared by the peer and poll

   unreach: integer representing the number of seconds the server has
   been unreachable

13.2. Poll Process Operations

As described previously, once each second the clock-adjust process is called. This routine calls the poll routine for each association in turn. If the time for the next poll message is greater than the seconds counter, the routine returns immediately. Symmetric (modes 1, 2), client (mode 3), and broadcast server (mode 5) associations routinely send packets. A broadcast client (mode 6) association runs the routine to update the reach and unreach variables, but does not send packets. The poll process calls the transmit process to send a packet. If in a burst (burst > 0), nothing further is done except call the poll update routine to set the next poll interval.
Top   ToC   RFC5905 - Page 53
   If not in a burst, the reach variable is shifted left by one bit,
   with zero replacing the rightmost bit.  If the server has not been
   heard for the last three poll intervals, the clock filter routine is
   called to increase the dispersion.  An example is shown in
   Appendix A.5.7.3.

   If the BURST flag is lit and the server is reachable and a valid
   source of synchronization is available, the client sends a burst of
   BCOUNT (8) packets at each poll interval.  The interval between
   packets in the burst is two seconds.  This is useful to accurately
   measure jitter with long poll intervals.  If the IBURST flag is lit
   and this is the first packet sent when the server has been
   unreachable, the client sends a burst.  This is useful to quickly
   reduce the synchronization distance below the distance threshold and
   synchronize the clock.

   If the P_MANY flag is lit in the p.flags word of the association,
   this is a manycast client association.  Manycast client associations
   send client mode packets to designated multicast group addresses at
   MINPOLL intervals.  The association starts out with a TTL of 1.  If
   by the time of the next poll there are fewer than MINCLOCK servers
   have been mobilized, the ttl is increased by one.  If the ttl reaches
   the limit TTLMAX, without finding MINCLOCK servers, the poll interval
   increases until reaching BEACON, when it starts over from the

   The poll() routine includes a feature that backs off the poll
   interval if the server becomes unreachable.  If reach is nonzero, the
   server is reachable and unreach is set to zero; otherwise, unreach is
   incremented by one for each poll to the maximum UNREACH.  Thereafter
   for each poll hpoll is increased by one, which doubles the poll
   interval up to the maximum MAXPOLL determined by the poll_update()
   routine.  When the server again becomes reachable, unreach is set to
   zero, hpoll is reset to the tc system variable, and operation resumes

   A packet is sent by the transmit process.  Some header values are
   copied from the peer variables left by a previous packet and others
   from the system variables.  Figure 30 shows which values are copied
   to each header field.  In those implementations, using floating
   double data types for root delay and root dispersion, these must be
   converted to NTP short format.  All other fields are either copied
   intact from peer and system variables or struck as a timestamp from
   the system clock.
Top   ToC   RFC5905 - Page 54
                   | Packet Variable <--   Variable    |
                   | x.leap        <--     s.leap      |
                   | x.version     <--     s.version   |
                   | x.mode        <--     s.mode      |
                   | x.stratum     <--     s.stratum   |
                   | x.poll        <--     s.poll      |
                   | x.precision   <--     s.precision |
                   | x.rootdelay   <--     s.rootdelay |
                   | x.rootdisp    <--     s.rootdisp  |
                   | x.refid       <--     s.refid     |
                   | x.reftime     <--     s.reftime   |
                   |         <--     p.xmt       |
                   | x.rec         <--     p.dst       |
                   | x.xmt         <--     clock       |
                   | x.keyid       <--     p.keyid     |
                   | x.digest      <--     md5 digest  |

                   Figure 30: xmit_packet Packet Header

   The poll update routine is called when a valid packet is received and
   immediately after a poll message has been sent.  If in a burst, the
   poll interval is fixed at 2 s; otherwise, the host poll exponent
   hpoll is set to the minimum of ppoll from the last packet received
   and hpoll from the poll routine, but not less than MINPOLL or greater
   than MAXPOLL.  Thus, the clock discipline can be oversampled but not
   undersampled.  This is necessary to preserve subnet dynamic behavior
   and protect against protocol errors.

   The poll exponent is converted to an interval, which, when added to
   the last poll time variable, determines the value of the next poll
   time variable.  Finally, the last poll time variable is set to the
   current seconds counter.

14. Simple Network Time Protocol (SNTP)

Primary servers and clients complying with a subset of NTP, called the Simple Network Time Protocol (SNTPv4) [RFC4330], do not need to implement the mitigation algorithms described in Section 9 and following sections. SNTP is intended for primary servers equipped with a single reference clock, as well as for clients with a single upstream server and no dependent clients. The fully developed NTPv4 implementation is intended for secondary servers with multiple upstream servers and multiple downstream servers or clients. Other than these considerations, NTP and SNTP servers and clients are completely interoperable and can be intermixed in NTP subnets.
Top   ToC   RFC5905 - Page 55
   An SNTP primary server implementing the on-wire protocol described in
   Section 8 has no upstream servers except a single reference clock.
   In principle, it is indistinguishable from an NTP primary server that
   has the mitigation algorithms and therefore capable of mitigating
   between multiple reference clocks.

   Upon receiving a client request, an SNTP primary server constructs
   and sends the reply packet as described in Figure 31.  Note that the
   dispersion field in the packet header must be updated as described in
   Section 5.

                   | Packet Variable <--   Variable    |
                   | x.leap        <--     s.leap      |
                   | x.version     <--     r.version   |
                   | x.mode        <--     4           |
                   | x.stratum     <--     s.stratum   |
                   | x.poll        <--     r.poll      |
                   | x.precision   <--     s.precision |
                   | x.rootdelay   <--     s.rootdelay |
                   | x.rootdisp    <--     s.rootdisp  |
                   | x.refid       <--     s.refid     |
                   | x.reftime     <--     s.reftime   |
                   |         <--     r.xmt       |
                   | x.rec         <--     r.dst       |
                   | x.xmt         <--     clock       |
                   | x.keyid       <--     r.keyid     |
                   | x.digest      <--     md5 digest  |

                    Figure 31: fast_xmit Packet Header

   An SNTP client implementing the on-wire protocol has a single server
   and no dependent clients.  It can operate with any subset of the NTP
   on-wire protocol, the simplest approach using only the transmit
   timestamp of the server packet and ignoring all other fields.
   However, the additional complexity to implement the full on-wire
   protocol is minimal so that a full implementation is encouraged.

15. Security Considerations

NTP security requirements are even more stringent than most other distributed services. First, the operation of the authentication mechanism and the time synchronization mechanism are inextricably intertwined. Reliable time synchronization requires cryptographic keys that are valid only over a designated time interval; but, time intervals can be enforced only when participating servers and clients
Top   ToC   RFC5905 - Page 56
   are reliably synchronized to UTC.  In addition, the NTP subnet is
   hierarchical by nature, so time and trust flow from the primary
   servers at the root through secondary servers to the clients at the

   An NTP client can claim to have authentic time to dependent
   applications only if all servers on the path to the primary servers
   are authenticated.  In NTP each server authenticates the next lower
   stratum servers and authenticates by induction the lowest stratum
   (primary) servers.  It is important to note that authentication in
   the context of NTP does not necessarily imply the time is correct.
   An NTP client mobilizes a number of concurrent associations with
   different servers and uses a crafted agreement algorithm to pluck
   truechimers from the population possibly including falsetickers.

   The NTP specification assumes that the goal of the intruder is to
   inject false time values, disrupt the protocol, or clog the network,
   servers, or clients with spurious packets that exhaust resources and
   deny service to legitimate applications.  There are a number of
   defense mechanisms already built in the NTP architecture, protocol,
   and algorithms.  The on-wire timestamp exchange scheme is inherently
   resistant to spoofing, packet-loss, and replay attacks.  The
   engineered clock filter, selection and clustering algorithms are
   designed to defend against evil cliques of Byzantine traitors.  While
   not necessarily designed to defeat determined intruders, these
   algorithms and accompanying sanity checks have functioned well over
   the years to deflect improperly operating but presumably friendly
   scenarios.  However, these mechanisms do not securely identify and
   authenticate servers to clients.  Without specific further
   protection, an intruder can inject any or all of the following

   1.  An intruder can intercept and archive packets forever, as well as
       all the public values ever generated and transmitted over the

   2.  An intruder can generate packets faster than the server, network
       or client can process them, especially if they require expensive
       cryptographic computations.

   3.  In a wiretap attack, the intruder can intercept, modify, and
       replay a packet.  However, it cannot permanently prevent onward
       transmission of the original packet; that is, it cannot break the
       wire, only tell lies and congest it.  Generally, the modified
       packet cannot arrive at the victim before the original packet,
       nor does it have the server private keys or identity parameters.
Top   ToC   RFC5905 - Page 57
   4.  In a middleman or masquerade attack, the intruder is positioned
       between the server and client, so it can intercept, modify and
       replay a packet and prevent onward transmission of the original
       packet.  However, the middleman does not have the server private

   The NTP security model assumes the following possible limitations:

   1.  The running times for public key algorithms are relatively long
       and highly variable.  In general, the performance of the time
       synchronization function is badly degraded if these algorithms
       must be used for every NTP packet.

   2.  In some modes of operation, it is not feasible for a server to
       retain state variables for every client.  It is however feasible
       to regenerated them for a client upon arrival of a packet from
       that client.

   3.  The lifetime of cryptographic values must be enforced, which
       requires a reliable system clock.  However, the sources that
       synchronize the system clock must be trusted.  This circular
       interdependence of the timekeeping and authentication functions
       requires special handling.

   4.  Client security functions must involve only public values
       transmitted over the net.  Private values must never be disclosed
       beyond the machine on which they were created, except in the case
       of a special trusted agent (TA) assigned for this purpose.

   Unlike the Secure Shell (SSH) security model, where the client must
   be securely authenticated to the server, in NTP the server must be
   securely authenticated to the client.  In SSH, each different
   interface address can be bound to a different name, as returned by a
   reverse-DNS query.  In this design, separate public/private key pairs
   may be required for each interface address with a distinct name.  A
   perceived advantage of this design is that the security compartment
   can be different for each interface.  This allows a firewall, for
   instance, to require some interfaces to authenticate the client and
   others not.

   In the case of NTP as specified herein, NTP broadcast clients are
   vulnerable to disruption by misbehaving or hostile SNTP or NTP
   broadcast servers elsewhere in the Internet.  Such disruption can be
   minimized by several approaches.  Filtering can be employed to limit
   the access of NTP clients to known or trusted NTP broadcast servers.
   Such filtering will prevent malicious traffic from reaching the NTP
   clients.  Cryptographic authentication at the client will only allow
Top   ToC   RFC5905 - Page 58
   timing information from properly signed NTP messages to be utilized
   in synchronizing its clock.  Higher levels of authentication may be
   gained by the use of the Autokey mechanism [RFC5906].

   Section 8 describes a potential security concern with the replay of
   client requests.  Following the recommendations in that section
   provides protection against such attacks.

   It should be noted that this specification is describing an existing
   implementation.  While the security shortfalls of the MD5 algorithm
   are well-known, its use in the NTP specification is consistent with
   widescale deployment in the Internet community.

16. IANA Considerations

UDP/TCP Port 123 was previously assigned by IANA for this protocol. The IANA has assigned the IPv4 multicast group address and the IPv6 multicast address ending :101 for NTP. This document introduces NTP extension fields allowing for the development of future extensions to the protocol, where a particular extension is to be identified by the Field Type sub-field within the extension field. IANA has established and will maintain a registry for Extension Field Types associated with this protocol, populating this registry with no initial entries. As future needs arise, new Extension Field Types may be defined. Following the policies outlined in [RFC5226], new values are to be defined by IETF Review. The IANA has created a new registry for NTP Reference Identifier codes. This includes the current codes defined in Section 7.3, and may be extended on a First-Come-First-Serve (FCFS) basis. The format of the registry is: +------+----------------------------------------------------------+ | ID | Clock Source | +------+----------------------------------------------------------+ | GOES | Geosynchronous Orbit Environment Satellite | | GPS | Global Position System | | ... | ... | +------+----------------------------------------------------------+ Figure 32: Reference Identifier Codes The IANA has created a new registry for NTP Kiss-o'-Death codes. This includes the current codes defined in Section 7.4, and may be extended on a FCFS basis. The format of the registry is:
Top   ToC   RFC5905 - Page 59
   | Code |                           Meaning                          |
   | ACST | The association belongs to a unicast server.               |
   | AUTH | Server authentication failed.                              |
   | ...  | ...                                                        |

                           Figure 33: Kiss Codes

   For both Reference Identifiers and Kiss-o'-Death codes, IANA is
   requested to never assign a code beginning with the character "X", as
   this is reserved for experimentation and development.

17. Acknowledgements

The editors would like to thank Karen O'Donoghue, Brian Haberman, Greg Dowd, Mark Elliot, Harlan Stenn, Yaakov Stein, Stewart Bryant, and Danny Mayer for technical reviews and specific text contributions to this document.

18. References

18.1. Normative References

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, August 1980. [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981. [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981. [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April 1992. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

18.2. Informative References

[CGPM] Bureau International des Poids et Mesures, "Comptes Rendus de la 15e CGPM", 1976. [ITU-R_TF.460] International Telecommunications Union, "ITU-R TF.460 Standard-frequency and time-signal emissions", February 2002.
Top   ToC   RFC5905 - Page 60
   [RFC1305]       Mills, D., "Network Time Protocol (Version 3)
                   Specification, Implementation and Analysis",
                   RFC 1305, March 1992.

   [RFC1345]       Simonsen, K., "Character Mnemonics and Character
                   Sets", RFC 1345, June 1992.

   [RFC4330]       Mills, D., "Simple Network Time Protocol (SNTP)
                   Version 4 for IPv4, IPv6 and OSI", RFC 4330,
                   January 2006.

   [RFC5226]       Narten, T. and H. Alvestrand, "Guidelines for Writing
                   an IANA Considerations Section in RFCs", BCP 26,
                   RFC 5226, May 2008.

   [RFC5906]       Haberman, B., Ed. and D. Mills, "Network Time
                   Protocol Version 4: Autokey Specification", RFC 5906,
                   June 2010.

   [ref6]          Marzullo and S. Owicki, "Maintaining the time in a
                   distributed system", ACM Operating Systems Review 19,
                   July 1985.

   [ref7]          Mills, D.L., "Computer Network Time Synchronization -
                   the Network Time Protocol", CRC Press, 304 pp, 2006.

   [ref9]          Mills, D.L., Electrical and Computer Engineering
                   Technical Report 06-6-1, NDSS, June 2006, "Network
                   Time Protocol Version 4 Reference and Implementation
                   Guide", 2006.

(next page on part 4)

Next Section