tech-invite

# Network Time Protocol (Version 3) Specification, Implementation and Analysis

Part 4 of 4, p. 100 to 109

```sub 0~+~tau ) |~+~{0.8~<<~epsilon sub e sup 2 ( tau )~>> } over sqrt {
<<~epsilon sub i sup 2 ( tau )~>> }> ,

where <\$Eepsilon sub i ( tau )> and <\$Eepsilon sub e ( tau )> are the
accumulated error of the ith clock and entire clock ensemble,
respectively. The accumulated error of the entire ensemble is

<\$E<<~epsilon sub e sup 2 ( tau )~>>~=~left [ sum from i=1 to n~1 over {
<<~epsilon sub i sup 2 ( tau )~>> } right ] sup {~-1}>.

Finally, the weight factor for the ith clock is calculated as

<\$Ew sub i ( tau )~=~ { <<~epsilon sub e sup 2 ( tau )~>> } over {
<<~epsilon sub i sup 2 ( tau )~>> }> .

When all estimators and weight factors have been updated, the origin of
the estimation interval is shifted and the new value of t0 becomes the
old value of <\$Et sub 0 ~+~ tau>.

While not entering into the above calculations, it is useful to estimate
the frequency error, since the ensemble clocks can be located some
distance from each other and become isolated for some time due to
network failures. The frequency-offset error in Ri is equivalent to the
fractional frequency yi,

<\$Ey sub i~=~{ nu sub i~-~nu sub I } over nu sub I>

measured between the ith timescale and the standard timescale I.
Temporarily dropping the subscript i for clarity, consider a sequence of
N independent frequency-offset samples <\$Ey(j)~ (j~=~1,~2,~... ,~N)>
where the interval between samples is uniform and equal to T. Let
<\$Etau> be the nominal interval over which these samples are averaged.
The Allan variance <\$Esigma sub y sup 2 ( N,~T,~tau )> [ALL74a] is
defined as
<\$E<< sigma sub y sup 2 ( N,~T,~tau )~>>~=~<< ~ 1 over { N~-~1 }~ left [
sum from j=1 to N~y (j) sup 2~-~1 over N~left ( sum from j=1 to N~y(j)
right ) sup 2 right ]~>>> ,

A particularly useful formulation is <\$EN~=~2> and <\$ET~=~tau>:

<\$E<< sigma sub y sup 2 (N~=~2,~T~=~tau ,~tau )>>~==~sigma sub y sup 2 (
tau )~=~<< {[y(j~+~1)~-~y(j)] sup 2 } over 2 >>> ,

so that

<\$Esigma sub y sup 2 ( tau )~=~1 over {2(N~-~1)}sum from { j = 1 } to
{n-1 }~[y(j~+~1)~-~y(j)] sup 2> .

While the Allan variance has found application when estimating errors in
ensembles of cesium clocks, its application to NTP is limited due to the
```
```computation and storage burden. As described in the next section, it is
possible to estimate errors with some degree of confidence using normal
byproducts of NTP processing algorithms.

Application to NTP

The NTP clock model is somewhat less complex than the general model
described above. For instance, at the present level of development it is
not necessary to separately estimate the time and frequency of all peer
clocks, only the time and frequency of the local clock. If the
timekeeping reference is the local clock itself, then the offsets
available in the peer.offset peer variables can be used directly for the
<\$ET sub ij> quantities above. In addition, the NTP local-clock model
incorporates a type-II phase-locked loop, which itself reliably
estimates frequency errors and corrects accordingly. Thus, the
requirement for estimating frequency is entirely eliminated.

There remains the problem of how to determine a robust and easily
computable error estimate <\$Eepsilon sub i>. The method described above,
although analytically justified, is most difficult to implement.
Happily, as a byproduct of the NTP clock-filter algorithm, a useful
error estimate is available in the form of the dispersion. As described
in the NTP specification, the dispersion includes the absolute value of
the weighted average of the offsets between the chosen offset sample and
the <\$En~-~1> other samples retained for selection. The effectiveness of
this estimator was compared with the above estimator by simulation using
observed timekeeping data and found to give quite acceptable results.

The NTP clock-combining algorithm can be implemented with only minor
modifications to the algorithms as described in the NTP specification.
Although elsewhere in the NTP specification the use of general-purpose
multiply/divide routines has been successfully avoided, there seems to
be no way to avoid them in the clock-combining algorithm. However, for
best performance the local-clock algorithm described elsewhere in this
document should be implemented as well, since the combining algorithms
result in a modest increase in phase noise which the revised local-clock
algorithm is designed to suppress.

Clock-Combining Procedure

The result of the NTP clock-selection procedure is a set of survivors
(there must be at least one) that represent truechimers, or correct
clocks. When clock combining is not implemented, one of these peers,
chosen as the most likely candidate, becomes the synchronization source
and its computed offset becomes the final clock correction.
Subsequently, the system variables are adjusted as described in the NTP
clock-update procedure. When clock combining is implemented, these
actions are unchanged, except that the final clock correction is
computed by the clock-combining procedure.
```
```The clock-combining procedure is called from the clock-select procedure.
It constructs from the variables of all surviving peers the final clock
correction <\$ETHETA>. The estimated error required by the algorithms
previously described is based on the synchronization distance <\$ELAMBDA>
computed by the distance procedure, as defined in the NTP specification.
The reciprocal of <\$ELAMBDA> is the weight of each clock-offset
contribution to the final clock correction. The following pseudo-code
describes the procedure.

begin clock-combining procedure
<\$Etemp1~<<-~0>;
<\$Etemp2~<<-~0>;
for (each peer remaining on the candidate list)         /* scan
all survivors */
<\$ELAMBDA~<<-~roman distance (peer)>;
<\$Etemp~<<-~1 over roman
{peer.stratum~times~NTP.MAXDISPERSE~+~LAMBDA }>;
<\$Etemp1~<<-~temp1~+~temp>;             /* update weight
and offset */
<\$Etemp2~<<-~temp2~+~temp~times~roman peer.offset>;
endif;
<\$ETHETA~<<-~temp2 over temp1>;
/* compute final correction */
end clock-combining procedure;

The value <\$ETHETA> is the final clock correction used by the local-
clock procedure to adjust the clock.

Appendix G. Computer Clock Modelling and Analysis

A computer clock includes some kind of reference oscillator, which is
stabilized by a quartz crystal or some other means, such as the power
grid. Usually, the clock includes a prescaler, which divides the
oscillator frequency to a standard value, such as 1 MHz or 100 Hz, and a
counter, implemented in hardware, software or some combination of the
two, which can be read by the processor. For systems intended to be
synchronized to an external source of standard time, there must be some
means to correct the phase and frequency by occasional vernier
adjustments produced by the timekeeping protocol. Special care is
necessary in all timekeeping system designs to insure that the clock
indications are always monotonically increasing; that is, system time
never <169>runs backwards.<170>

Computer Clock Models

The simplest computer clock consists of a hardware latch which is set by
overflow of a hardware counter or prescaler, and causes a processor
interrupt or tick. The latch is reset when acknowledged by the
processor, which then increments the value of a software clock counter.
The phase of the clock is adjusted by adding periodic corrections to the
```
```counter as necessary. The frequency of the clock can be adjusted by
changing the value of the increment itself, in order to make the clock
run faster or slower. The precision of this simple clock model is
limited to the tick interval, usually in the order of 10 ms; although in
some systems the tick interval can be changed using a kernel variable.

This software clock model requires a processor interrupt on every tick,
which can cause significant overhead if the tick interval is small, say
in the order less 1 ms with the newer RISC processors. Thus, in order to
achieve timekeeping precisions less than 1 ms, some kind of hardware
assist is required. A straightforward design consists of a voltage-
controlled oscillator (VCO), in which the frequency is controlled by a
buffered, digital/analog converter (DAC). Under the assumption that the
VCO tolerance is 10-4 or 100 parts-per-million (ppm) (a reasonable value
for inexpensive crystals) and the precision required is 100 <\$Emu roman
s> (a reasonable goal for a RISC processor), the DAC must include at
least ten bits.

A design sketch of a computer clock constructed entirely of hardware
logic components is shown in Figure 10a<\$&fig10>. The clock is read by
first pulsing the read signal, which latches the current value of the
clock counter, then adding the contents of the clock-counter latch and a
64-bit clock-offset variable, which is maintained in processor memory.
The clock phase is adjusted by adding a correction to the clock-offset
to the DAC latch. In principle, this clock model can be adapted to any
precision by changing the number of bits of the prescaler or clock
counter or changing the VCO frequency. However, it does not seem useful
to reduce precision much below the minimum interrupt latency, which is
in the low microseconds for a modern RISC processor.

If it is not possible to vary the oscillator frequency, which might be
the case if the oscillator is an external frequency standard, a design
such as shown in Figure 10b may be used. It includes a fixed-frequency
oscillator and prescaler which includes a dual-modulus swallow counter
that can be operated in either divide-by-10 or divide-by-11 modes as
controlled by a pulse produced by a programmable divider (PD). The PD is
loaded with a value representing the frequency offset. Each time the
divider overflows a pulse is produced which switches the swallow counter
from the divide-by-10 mode to the divide-by-11 mode and then back again,
which in effect <169>swallows<170> or deletes a single pulse of the
prescaler pulse train.

The pulse train produced by the prescaler is controlled precisely over a
small range by the contents of the PD. If programmed to emit pulses at a
low rate, relatively few pulses are swallowed per second and the
frequency counted is near the upper limit of its range; while, if
programmed to emit pulses at a high rate, relatively many pulses are
swallowed and the frequency counted is near the lower limit. Assuming
some degree of freedom in the choice of oscillator frequency and
```
```prescaler ratios, this design can compensate for a wide range of
oscillator frequency tolerances.

In all of the above designs it is necessary to limit the amount of
adjustment incorporated in any step to insure that the system clock
indications are always monotonically increasing. With the software clock
model this is assured as long as the increment is never negative. When
the magnitude of a phase adjustment exceeds the tick interval (as
adjustments over mulitple tick intervals. This strategy amounts to a
deliberate frequency offset sustained for an interval equal to the total
number of ticks required and, in fact, is a feature of the Unix clock
model discussed below.

In the hardware clock models the same considerations apply; however, in
these designs the tick interval amounts to a single pulse at the
prescaler output, which may be in the order of 1 ms. In order to avoid
decreasing the indicated time when a negative phase correction occurs,
it is necessary to avoid modifying the clock-offset variable in
processor memory and to confine all adjustments to the VCO or prescaler.
Thus, all phase adjustments must be performed by means of programmed
frequency adjustments in much the same way as with the software clock
model described previously.

It is interesting to conjecture on the design of a processor assist that
could provide all of the above functions in a compact, general-purpose
hardware interface. The interface might consist of a multifunction timer
chip such as the AMD 9513A, which includes five 16-bit counters, each
with programmable load and hold registers, plus an onboard crystal
oscillator, prescaler and control circuitry. A 48-bit hardware clock
counter would utilize three of the 16-bit counters, while the fourth
would be used as the swallow counter and the fifth as the programmable
divider. With the addition of a programmable-array logic device and
architecture-specific host interface, this compact design could provide
all the functions necessary for a comprehensive timekeeping system.

The Fuzzball Clock Model

The Fuzzball clock model uses a combination of hardware and software to
provide precision timing with a minimum of software and processor
overhead. The model includes an oscillator, prescaler and hardware
counter; however, the oscillator frequency remains constant and the
hardware counter produces only a fraction of the total number of bits
required by the clock counter. A typical design uses a 64-bit software
clock counter and a 16-bit hardware counter which counts the prescaler
output. A hardware-counter overflow causes the processor to increment
the software counter at the bit corresponding to the frequency <\$E2 sup
N f sub p>, where N is the number of bits of the hardware counter and fp
is the counted frequency at the prescaler output. The processor reads
the clock counter by first generating a read pulse, which latches the
```
```hardware counter, and then adding its contents, suitably aligned, to the
software counter.

The Fuzzball clock can be corrected in phase by adding a (signed)
adjustment to the software clock counter. In practice, this is done only
when the local time is substantially different from the time indicated
by the clock and may violate the monotonicity requirement. Vernier phase
adjustments determined in normal system operation must be limited to no
more than the period of the counted frequency, which is 1 kHz for LSI-11
Fuzzballs. In the Fuzzball model these adjustments are performed at
intervals of 4 s, called the adjustment interval, which provides a
opportunities are created using the interval-timer facility, which is a
feature of most operating systems and independent of the time-of-day
clock. However,  if the counted frequency is increased from 1 kHz to 1
MHz for enhanced precision, the adjustment frequency must be increased
to 250 Hz, which substantially increases processor overhead. A modified
design suitable for high precision clocks is presented in the next
section.

In some applications involving the Fuzzball model, an external pulse-
per-second (pps) signal is available from a reference source such as a
cesium clock or GPS receiver. Such a signal generally provides much
higher accuracy than the serial character string produced by a radio
timecode receiver, typically in the low nanoseconds. In the Fuzzball
model this signal is processed by an interface which produces a hardware
interrupt coincident with the arrival of the pps pulse. The processor
then reads the clock counter and computes the residual modulo 1 s of the
clock counter. This represents the local-clock error relative to the pps
signal.

Assuming the seconds numbering of the clock counter has been determined
by a reliable source, such as a timecode receiver, the offset within the
second is determined by the residual computed above. In the NTP local-
clock model the timecode receiver or NTP establishes the time to within
<F128M>æ<F255D>128 ms, called the aperture, which guarantees the seconds
numbering to within the second. Then, the pps residual can be used
directly to correct the oscillator, since the offset must be less than
the aperture for a correctly operating timecode receiver and pps signal.

The above technique has an inherent error equal to the latency of the
interrupt system, which in modern RISC processors is in the low tens of
microseconds. It is possible to improve accuracy by latching the
hardware time-of-day counter directly by the pps pulse and then reading
the counter in the same way as usual. This requires additional circuitry
to prioritize the pps signal relative to the pulse generated by the
program to latch the counter.
The Unix Clock Model

The Unix 4.3bsd clock model is based on two system calls, settimeofday
```
```and adjtime, together with two kernel variables tick and tickadj. The
settimeofday call unceremoniously resets the kernel clock to the value
given, while the adjtime call slews the kernel clock to a new value
numerically equal to the sum of the present time of day and the (signed)
argument given in the adjtime call. In order to understand the behavior
of the Unix clock as controlled by the Fuzzball clock model described
above, it is helpful to explore the operations of adjtime in more
detail.

The Unix clock model assumes an interrupt produced by an onboard
frequency source, such as the clock counter and prescaler described
previously, to deliver a pulse train in the 100-Hz range. In priniciple,
the power grid frequency can be used, although it is much less stable
than a crystal oscillator. Each interrupt causes an increment called
tick to be added to the clock counter. The value of the increment is
chosen so that the clock counter, plus an initial offset established by
the settimeofday call, is equal to the time of day in microseconds.

The Unix clock can actually run at three different rates, one
corresponding to tick, which is related to the intrinsic frequency of
the particular oscillator used as the clock source, one to
rate corresponding to tick is used; but, if adjtime is called, the
argument <\$Edelta> given is used to calculate an interval <\$EDELTA
t~=~delta~tick over tickadj> during which one or the other of the two
rates are used, depending on the sign of <\$Edelta>. The effect is to
slew the clock to a new value at a small, constant rate, rather than
incorporate the adjustment all at once, which could cause the clock to
be set backward. With common values of <\$Etick~=~10> ms and
<\$E+- tickadj over tick~=~+- {5~roman x~10 sup -6} over {10 sup -2}> or
<F128M>æ<F255D>500 ppm. Even larger ranges may be required in the case
of some workstations (e.g., SPARCstations) with extremely poor component
tolerances.

When precisions not less than about 1 ms are required, the Fuzzball
clock model can be adapted to the Unix model by software simulation, as
described in Section 5 of the NTP specification, and calling adjtime at
each adjustment interval. When precisions substantially better than this
are required, the hardware microsecond clock provided in some
workstations can be used together with certain refinements of the
Fuzzball and Unix clock models. The particular design described below is
appropriate for a maximum oscillator frequency tolerance of 100 ppm
(.01%), which can be obtained using a relatively inexpensive quartz
crystal oscillator, but is readily scalable for other assumed
tolerances.

The clock model requires the capability to slew the clock frequency over
the range <F128M>æ<F255D>100 ppm with an intrinsic oscillator frequency
error as great as <F128M>æ<F255D>100 ppm. Figure 11<\$&fig11> shows the
```
```timing relationships at the extremes of the requirements envelope.
Starting from an assumed offset of nominal zero and an assumed error of
+100 ppm at time 0 s, the line AC shows how the uncorrected offset grows
with time. Let <\$Esigma> represent the adjustment interval and a the
interval AB, in seconds, and let r be the slew, or rate at which
corrections are introduced, in ppm. For an accuracy specification of 100
<\$Emu roman s>, then

<\$Esigma~<<=~{100~mu roman s} over {100~roman ppm}~+~{100~mu roman s}
over {(r~-~100)~roman ppm}~=~r over {r~-~100}> .
The line AE represents the extreme case where the clock is to be steered
<F128M>-<F255D>100 ppm. Since the slew must be complete at the end of

<\$Ea~<<=~{(r~-~200)~sigma} over r>.

These relationships are satisfied only if <\$Er~>>~200~roman ppm> and
<\$Esigma~<<~2~roman s>. Using <\$Er~=~300~roman ppm> for convenience,
<\$Esigma~=~1.5~roman s> and <\$Ea~<<=~0.5~roman s>. For the Unix clock
model with <\$Etick~=~10~roman ms>, this results in the value of

One of the assumptions made in the Unix clock model is that the period
of adjustment computed in the adjtime call must be completed before the
next call is made. If not, this results in an error message to the
system log. However, in order to correct for the intrinsic frequency
offset of the clock oscillator, the NTP clock model requires adjtime to
be called at regular adjustment intervals of <\$Esigma> s. Using the
algorithms described here and the architecture constants in the NTP
specification, these adjustments will always complete.

Mathematical Model of the NTP Logical Clock

The NTP logical clock can be represented by the feedback-control model
shown in Figure 12<\$&fig12>. The model consists of an adaptive-
parameter, phase-lock loop (PLL), which continuously adjusts the phase
and frequency of an oscillator to compensate for its intrinsic jitter,
wander and drift. A mathematical analysis of this model developed along
the lines of [SMI86] is presented in following sections, along with a
design example useful for implementation guidance in operating-systems
environments such as Unix and Fuzzball. Table 9<\$&tab9> summarizes the
quantities ordinarily treated as variables in the model. By convention,
<\$Ev> is used for internal loop variables, <\$Etheta> for phase,
<\$Eomega> for frequency and <\$Etau> for time. Table 10<\$&tab10>
summarizes those quantities ordinarily fixed as constants in the model.
Note that these are all expressed as a power of two in order to simplify
the implementation.

In Figure 12 the variable <\$Etheta sub r> represents the phase of the
reference signal and <\$Etheta sub o> the phase of the voltage-controlled
```
```oscillator (VCO). The phase detector (PD) produces a voltage <\$Ev sub d>
representing the phase difference <\$Etheta sub r~-~theta sub o> . The
clock filter functions as a tapped delay line, with the output <\$Ev sub
s> taken at the tap selected by the clock-filter algorithm described in
the NTP specification. The loop filter, represented by the equations
given below, produces a VCO correction voltage <\$Ev sub c>, which
controls the oscillator frequency and thus the phase <\$Etheta sub o>.

The PLL behavior is completely determined by its open-loop, Laplace
transfer function <\$EG(s)> in the s domain. Since both frequency and
phase corrections are required, an appropriate design consists of a
type-II PLL, which is defined by the function

<\$EG(s)~=~{omega sub c sup 2} over {tau sup 2 s sup 2}~( 1 ~+~{tau s}
over omega sub z )> ,

where <\$Eomega sub c> is the crossover frequency (also called loop
gain), <\$Eomega sub z> is the corner frequency (required for loop
stability) and <\$Etau> determines the PLL time constant and thus the
bandwidth. While this is a first-order function and some improvement in
phase noise might be gained from a higher-order function, in practice
the improvement is lost due to the effects of the clock-filter delay, as
described below.

The open-loop transfer function <\$EG(s)> is constructed by breaking the
loop at point a on Figure 12 and computing the ratio of the output phase
<\$Etheta sub o (s)> to the reference phase <\$Etheta sub r (s)>. This
function is the product of the individual transfer functions for the
phase detector, clock filter, loop filter and VCO. The phase detector
delivers a voltage <\$Ev sub d (t)~=~ theta sub r (t)>, so its transfer
function is simply <\$EF sub d (s)~=~1>, expressed in V/rad. The VCO
delivers a frequency change <\$EDELTA omega ~=~{roman d~theta sub o (t)}
over {roman dt}~=~alpha {v sub c (t)}>, where <\$Ealpha> is the VCO gain
in rad/V-sec and <\$Etheta sub o (t)~=~alpha~int v sub c (t)~dt>. Its
transfer function is the Laplace transform of the integral, <\$EF sub o
(s)~=~alpha over s>, expressed in rad/V. The clock filter contributes a
stochastic delay due to the clock-filter algorithm; but, for present
purposes, this delay will be assumed a constant T, so its transfer
function is the Laplace transform of the delay, <\$EF sub s (s)~=~e sup
{- Ts}>. Let <\$EF(s)> be the transfer function of the loop filter, which
has yet to be determined. The open-loop transfer function <\$EG(s)> is
the product of these four individual transfer functions:

<\$EG(s)~=~{omega sub c sup 2} over {tau sup 2 s sup 2}~( 1 ~+~{tau s}
over omega sub z )~=~F sub d (s) F sub s (s) F(s) F sub o (s)~=~1e sup
{-Ts}~F(s)~alpha over s> .

For the moment, assume that the product <\$ETs> is small, so that <\$Ee
sup {-Ts}~approx ~1>. Making the following substitutions,
```
```<\$Eomega sub c sup 2~=~alpha over { K sub f}~~~~> and <\$E~~~~omega sub
z~=~K sub g over {K sub f}>

and rearranging yields

<\$EF(s)~=~1 over {K sub g~tau}~+~1 over {K sub f~tau sup 2 s }> ,

which corresponds to a constant term plus an integrating term scaled by
the PLL time constant <\$Etau>. This form is convenient for
implementation as a sampled-data system, as described later.

With the parameter values given in Table 10, the Bode plot of the open-
loop transfer function <\$EG(s)> consists of a <196>12 dB/octave line
which intersects the 0-dB baseline at <\$Eomega sub c~=~2 sup -12> rad/s,
together with a +6 dB/octave line at the corner frequency <\$Eomega sub
z~=~2 sup -14> rad/s. The damping factor <\$Ezeta~=~omega sub c over {2
omega sub z}~=~2> suggests the PLL will be stable and have a large phase
margin together with a low overshoot. However, if the clock-filter delay
T is not small compared to the loop delay, which is approximately equal
to <\$E1 over omega sub c>, the above analysis becomes unreliable and the
loop can become unstable. With the values determined as above, T is
ordinarily small enough to be neglected.

Assuming the output is taken at <\$Ev sub s>, the closed-loop transfer
function <\$EH(s)> is

<\$EH(s)~==~{v sub s (s)} over {theta sub r (s)}~=~{F sub d (s) e sup {-
Ts}} over {1~+~G(s)}> .

If only the relative response is needed and the clock-filter delay can
be neglected, <\$EH(s)> can be written

<\$EH(s)~=~1 over {1~+~G(s)}~=~s sup 2 over {s sup 2~+~omega sub c sup 2
over {omega sub z~tau} s~+~omega sub c sup 2 over tau sup 2}> .

For some input function <\$EI(s)> the output function <\$EI(s)H(s)> can be
inverted to find the time response. Using a unit-step input <\$EI(s)~=~1
over s> and the values determined as above, This yields a PLL risetime
1.7 hours and a settling time to within one percent of the initial
Parameter Management

A very important feature of the NTP PLL design is the ability to adapt
its behavior to match the prevailing stability of the local oscillator
and transmission conditions in the network. This is done using the
<\$Ealpha> and <\$Etau> parameters shown in Table 10. Mechanisms for doing
this are described in following sections.

```
```The <\$Ealpha> parameter is determined by the maximum frequency tolerance
of the local oscillator and the maximum jitter requirements of the
timekeeping system. This parameter is usually an architecture constant
and fixed during system operation. In the implementation model described
below, the reciprocal of <\$Ealpha>, called the adjustment interval
<\$Esigma>, determines the time between corrections of the local clock,
and thus the value of <\$Ealpha>. The value of <\$Esigma> can be
determined by the following procedure.

The maximum frequency tolerance for board-mounted, uncompensated quartz-
crystal oscillators is probably in the range of 10-4 (100 ppm). Many if
not most Internet timekeeping systems can tolerate jitter to at least
the order of the intrinsic local-clock resolution, called precision in
the NTP specification, which is commonly in the range from one to 20 ms.
Assuming 10-3 s peak-to-peak as the most demanding case, the interval
between clock corrections must be no more than <\$Esigma~=~10 sup -3 over
{2 roman~x~10 sup -4}~=~5> sec. For the NTP reference model
<\$Esigma~=~4> sec in order to allow for known features of the Unix
operating-system kernel. However, in order to support future anticipated
improvements in accuracy possible with faster workstations, it may be
useful to decrease <\$Esigma> to as little as one-tenth the present
value.

Note that if <\$Esigma> is changed, it is necessary to adjust the
parameters <\$EK sub f> and <\$EK sub g> in order to retain the same loop
bandwidth; in particular, the same <\$Eomega sub c> and <\$Eomega sub z>.
Since <\$Ealpha> varies as the reciprocal of <\$Esigma>, if <\$Esigma> is
changed to something other than 22, as in Table 10, it is necessary to
divide both <\$EK sub f> and <\$EK sub g> by <\$Esigma over 4> to obtain
the new values.

A key feature of the type-II PLL design is its capability to compensate
for the intrinsic frequency errors of the local oscillator. This
requires a initial period of adaptation in order to refine the frequency
estimate (see later sections of this appendix). The <\$Etau> parameter
determines the PLL time constant and thus the loop bandwidth, which is
approximately equal to <\$E{omega sub c} over tau>. When operated with a
relatively large bandwidth (small <\$Etau>), as in the analysis above,
the PLL adapts quickly to changes in the input reference signal, but has
poor long term stability. Thus, it is possible to accumulate substantial
errors if the system is deprived of the reference signal for an extended
period. When operated with a relatively small bandwidth (large <\$Etau>),
the PLL adapts slowly to changes in the input reference signal, and may
even fail to lock onto it. Assuming the frequency estimate has
stabilized, it is possible for the PLL to coast for an extended period
without external corrections and without accumulating significant error.
```
```In order to achieve the best performance without requiring individual
tailoring of the loop bandwidth, it is necessary to compute each value
of <\$Etau> based on the measured values of offset, delay and dispersion,
as produced by the NTP protocol itself. The traditional way of doing
this in precision timekeeping systems based on cesium clocks, is to
relate <\$Etau> to the Allan variance, which is defined as the mean of
the first-order differences of sequential samples measured during a
specified interval <\$Etau>,

<\$Esigma sub y sup 2 ( tau )~=~1 over {2(N~-~1)}sum from { i = 1 } to
{N-1 }~[y(i~+~1)~-~y(i)] sup 2> ,

where y is the fractional frequency measured with respect to the local
timescale and N is the number of samples.

In the NTP local-clock model the Allan variance (called the compliance,
h in Table 11) is approximated on a continuous basis by exponentially
averaging the first-order differences of the offset samples using an
empirically determined averaging constant. Using somewhat ad-hoc mapping
functions determined from simulation and experience, the compliance is
manipulated to produce the loop time constant and update interval.

The NTP Clock Model

The PLL behavior can also be described by a set of recurrence equations,
which depend upon several variables and constants. The variables and
parameters used in these equations are shown in Tables 9, 10 and
11<\$&tab11>. Note the use of powers of two, which facilitates
implementation using arithmetic shifts and avoids the requirement for a
multiply/divide capability.

A capsule overview of the design may be helpful in understanding how it
operates. The logical clock is continuously adjusted in small increments
at fixed intervals of <\$Esigma>. The increments are determined while
updating the variables shown in Tables 9 and 11, which are computed from
computed from these messages occur at discrete times as each is
range up to about 17 minutes. As part of update processing the
compliance h is computed and used to adjust the PLL time constant
<\$Etau>. Finally, the update interval <\$Erho> for transmitted NTP
messages is determined as a fixed multiple of <\$Etau>.

Updates are numbered from zero, with those in the neighborhood of the
ith update shown in Figure 13<\$&fig13>. All variables are initialized at
<\$Ei~=~0> to zero, except the time constant <\$Etau (0)~=~tau>, poll
interval <\$Emu (0)~=~tau> (from Table 10) and compliance <\$Eh (0)~=~K
sub s>. After an interval <\$Emu (i)> (<\$Ei~>>~0>) from the previous
update the ith update arrives at time <\$Et(i)> including the time
offset <\$Ev sub s (i)>. Then, after an interval <\$Emu (i~+~1)> the
```
```<\$Ei+1 roman th> update arrives at time <\$Et(i~+~1)> including the time
offset <\$Ev sub s (i~+~1)>. When the update <\$Ev sub s (i)> is received,
the frequency error <\$Ef(i~+~1)> and phase error <\$Eg(i~+~1)> are
computed:

<\$Ef(i~+~1)~=~f(i)~+~{mu (i) v sub s (i)} over {tau (i) sup 2 }>
,<\$E~~~~~g(i~+~1)~=~{v sub s (i)} over {tau (i)}> .

Note that these computations depend on the value of the time constant
<\$Etau (i)> and poll interval <\$Emu (i)> previously computed from the
<\$Ei-1 roman th> update. Then, the time constant for the next interval
is computed from the current value of the compliance <\$Eh(i)>

<\$Etau (i~+~1)~=~roman max [K sub s~-~|~h(i)|,~1]> .

Next, using the new value of <\$Etau>, called <\$Etau prime> to avoid
confusion, the poll interval is computed

<\$Erho (i~+~1)~=~K sub u~tau prime> .

Finally, the compliance <\$Eh(i~+~1)> is recomputed for use in the <\$Ei+1
roman th> update:
<\$Eh(i~+~1)~=~h(i)~+~{K sub t~tau prime v sub s (i)~-~h(i) }over K sub
h> .

The factor <\$Etau prime> in the above has the effect of adjusting the
bandwidth of the PLL as a function of compliance. When the compliance
has been low over some relatively long period, <\$Etau prime> is
increased and the bandwidth is decreased. In this mode small timing
fluctuations due to jitter in the network are suppressed and the PLL
attains the most accurate frequency estimate. On the other hand, if the
compliance becomes high due to greatly increased jitter or a systematic
frequency offset, <\$Etau prime> is decreased and the bandwidth is
increased. In this mode the PLL is most adaptive to transients which can
occur due to reboot of the system or a major timing error. In order to
maintain optimum stability, the poll interval <\$Erho> is varied directly
with <\$Etau>.

A model suitable for simulation and parameter refinement can be
constructed from the above recurrence relations. It is convenient to set
the temporary variable <\$Ea~=~g(i~+~1)>. At each adjustment interval
<\$Esigma> the quantity <\$Ea over K sub g~+~{f(i~+~1)} over K sub f> is
added to the local-clock phase and the quantity <\$Ea over K sub g> is
subtracted from a. For convenience, let n be the greatest integer in
<\$E{mu (i)} over sigma>; that is, the number of adjustments that occur
in the ith interval. Thus, at the end of the ith interval just before
the <\$Ei+1 roman th> update, the VCO control voltage is:

<\$Ev sub c (i~+~1)~=~v sub c (i)~+~{[1~-~(1~-~1 over K sub g ) sup n
]}~{g(i~+~1)} ~+~n over {K sub f }~{ f(i~+~1)}~.>
```
```Detailed simulation of the NTP PLL with the values specified in Tables
9, 10 and 11 and the clock filter described in the NTP specification
results in the following characteristics: For a 100-ms phase change the
loop reaches zero error in 39 minutes, overshoots 7 ms at 54 minutes and
settles to less than 1 ms in about six hours. For a 50-ppm frequency
change the loop reaches 1 ppm in about 16 hours and 0.1 ppm in about 26
hours. When the magnitude of correction exceeds a few milliseconds or a
few ppm for more than a few updates, the compliance begins to increase,
which causes the loop time constant and update interval to decrease.
When the magnitude of correction falls below about 0.1 ppm for a few
hours, the compliance begins to decrease, which causes the loop time
constant and update interval to increase. The effect is to provide a
broad capture range exceeding 4 s per day, yet the capability to resolve
oscillator skew well below 1 ms per day. These characteristics are
appropriate for typical crystal-controlled oscillators with or without
temperature compensation or oven control.

Appendix H. Analysis of Errors and Correctness Principles

Introduction

This appendix contains an analysis of errors arising in the generation
and processing of NTP timestamps and the determination of delays and
offsets. It establishes error bounds as a function of measured roundtrip
delay and dispersion to the root (primary reference source) of the
synchronization subnet. It also discusses correctness assertions about
these error bounds and the time-transfer, filtering and selection
algorithms used in NTP.

The notation <\$Ew~=~[u,~v]> in the following describes the interval in
which u is the lower limit and v the upper limit, inclusive. Thus,
<\$Eu~=~min (w)~<<=~v~=~max (w)>, and for scalar a,
<\$Ew~+~a~=~[u~+~a,~v~+~a]>. Table 12<\$&tab12> shows a summary of other
notation used in the analysis. The notation <\$E<<~x~>>> designates the
(infinite) average of x, which is usually approximated by an exponential
average, while the notation <\$Ex hat> designates an estimator for x. The
lower-case Greek letters <\$Etheta>, <\$Edelta> and <\$Eepsilon> are used
to designate measurement data for the local clock to a peer clock, while
the upper-case Greek letters <\$ETHETA>, <\$EDELTA> and <\$EEPSILON> are
used to designate measurement data for the local clock relative to the
primary reference source at the root of the synchronization subnet.
Exceptions will be noted as they arise.

Timestamp Errors

The standard second (1 s) is defined as <169>9,192,631,770 periods of
the radiation corresponding to the transition between the two hyperfine
levels of the ground state of the cesium-133 atom<170> [ALL74b], which
implies a granularity of about 1.1x10-10 s. Other intervals can be
```
```determined as rational multiples of 1 s. While NTP time has an inherent
resolution of about 2.3x10-10 s, local clocks ordinarily have
resolutions much worse than this, so the inherent error in resolving NTP
time relative to the 1 s can be neglected.

In this analysis the local clock is represented by a counter/divider
which increments at intervals of s seconds and is driven by an
oscillator which operates at frequency <\$Ef sub c~=~n over s> for some
integer n. A timestamp <\$ET(t)> is determined by reading the clock at an
arbitrary time t (the argument t will be usually omitted for
conciseness). Strictly speaking, s is not known exactly, but can be
assumed bounded from above by the maximum reading error <\$Erho>. The
reading error itself is represented by the random variable r bounded by
the interval <\$E[-~rho ,~0]>, where <\$Erho> depends on the particular
clock implementation. Since the intervals between reading the same clock
are almost always independent of and much larger than s, successive
readings can be considered independent and identically distributed. The
frequency error of the clock oscillator is represented by the random
variable f bounded by the interval <\$E[-~phi ,~phi ]>, where <\$Ephi>
represents the maximum frequency tolerance of the oscillator throughout
its service life. While f for a particular clock is a random variable
with respect to the population of all clocks, for any one clock it
ordinarily changes only slowly with time and can usually be assumed a
constant for that clock. Thus, an NTP timestamp can be represented by
the random variable T:

<\$ET~=~t~+~r~+~f tau> ,

where t represents a clock reading, <\$Etau> represents the time interval
since this reading and minor approximations inherent in the measurement
of <\$Etau> are neglected.

In order to assess the nature and expected magnitude of timestamp errors
and the calculations based on them, it is useful to examine the
characteristics of the probability density functions (pdf) <\$Ep sub r
(x)> and <\$Ep sub f (x)> for r and f respectively. Assuming the clock
reading and counting processes are independent, the pdf for r is uniform
over the interval <\$E[-~rho ,~0]>. With conventional manufacturing
processes and temperature variations the pdf for f can be approximated
by a truncated, zero-mean Gaussian distribution with standard deviation
<\$Esigma>. In conventional manufacturing processes <\$Esigma> is
maneuvered so that the fraction of samples rejected outside the interval
<\$E[-~phi ,~phi ]> is acceptable. The pdf for the total timestamp error
<\$Eepsilon (x)> is thus the sum of the r and f contributions, computed
as

<\$Eepsilon (x)~ =~ int~from {- inf } to inf p sub r (t) p sub f (x~-~t)
d t> ,

which appears as a bell-shaped curve, symmetric about <\$E-~rho over 2>
```
```and bounded by the interval

<\$E[ min (r)~+~min (f tau ),~max (r)~+~max (f tau )]~=~[-~rho ~-~phi tau
,~phi tau ]> .
Since f changes only slowly over time for any single clock,

<\$Eepsilon~==~[ min (r)~+~f tau ,~max (r)~+~f tau ]~=~ [-~ rho ,~0]~+~f
tau> ,

where <\$Eepsilon> without argument designates the interval and
<\$Eepsilon (x)> designates the pdf. In the following development
subscripts will be used on various quantities to indicate to which
entity or timestamp the quantity applies. Occasionally, <\$Eepsilon> will
be used to designate an absolute maximum error, rather than the
interval, but the distinction will be clear from context.

Measurement Errors

In NTP the roundtrip delay and clock offset between two peers A and B
are determined by a procedure in which timestamps are exchanged via the
network paths between them. The procedure involves the four most recent
timestamps numbered as shown in Figure 14<\$&fig14>, where the <\$Etheta
sub 0> represents the true clock offset of peer B relative to peer A.
The <\$ET sub 1> and <\$ET sub 4> timestamps are determined relative to
the A clock, while the <\$ET sub 2> and <\$ET sub 3> timestamps are
determined relative to the B clock. The measured roundtrip delay
<\$Edelta> and clock offset <\$Etheta> of B relative to A are given by

<\$Edelta~=~(T sub 4~-~T sub 1 )~-~(T sub 3~-~T sub 2
)~~~~and~~~~theta~=~{(T sub 2~-~T sub 1 )~+~(T sub 3~-~T sub 4 )} over
2> .

The errors inherent in determining the timestamps T1, T2, T3 and T4 are,
respectively,

<\$Eepsilon sub 1~=~[-~rho sub A ,~0]>, <\$E~epsilon sub 2~=~[-~rho sub B
,~0]>, <\$E~epsilon sub 3~=~[-~rho sub B ,~0]~+~f sub B (T sub 3 ~-~T sub
2 )>, <\$E~epsilon sub 4~=~[-~rho sub A ,~0]~+~f sub A (T sub 4 ~-~T sub
1 )> .

For specific peers A and B, where <\$Ef sub A> and <\$Ef sub B> can be
considered constants, the interval containing the maximum error inherent
in determining <\$Edelta> is given by

<\$E[ min ( epsilon sub 4 )~-~max ( epsilon sub 1 )~-~max ( epsilon sub 3
)~+~min ( epsilon sub 2 ),~ max ( epsilon sub 4 )~-~min ( epsilon sub 1
)~-~min ( epsilon sub 3 )~+~max ( epsilon sub 2 )]>
<\$E=~[-~rho sub A~-~rho sub B ,~rho sub A ~+~rho sub B ]~+~f sub A (T
sub 4~-~T sub 1 )~-~f sub B (T sub 3~-~T sub 2 )> .
```
```In the NTP local clock model the residual frequency errors <\$Ef sub A>
and <\$Ef sub B> are minimized through the use of a type-II phase-lock
loop (PLL). Under most conditions these errors will be small and can be
ignored. The pdf for the remaining errors is symmetric, so that <\$Edelta
hat~=~<< delta >>> is an unbiased maximum-likelihood estimator for the
true roundtrip delay, independent of the particular values of <\$Erho sub
A> and <\$Erho sub B>.

However, in order to reliably bound the errors under all conditions of
component variation and operational regimes, the design of the PLL and
the tolerance of its intrinsic oscillator must be controlled so that it
is not possible under any circumstances for <\$Ef sub A> or <\$Ef sub B>
to exceed the bounds <\$E[-~phi sub A ,~phi sub A ]> or <\$E[-~phi sub B
,~phi sub B ]>, respectively. Setting <\$Erho~=~max ( rho sub A ,~rho sub
B )> for convenience, the absolute maximum error <\$Eepsilon sub delta>
inherent in determining roundtrip delay <\$Edelta> is given by

<\$Eepsilon sub delta~==~rho~+~phi sub A (T sub 4~-~T sub 1 )~+~phi sub B
(T sub 3~-~T sub 2 )> ,
neglecting residuals.

As in the case for <\$Edelta>, where <\$Ef sub A> and <\$Ef sub B> can be
considered constants, the interval containing the maximum error inherent
in determining <\$Etheta> is given by

<\$E{[ min ( epsilon sub 2 )~-~max ( epsilon sub 1 )~+~min ( epsilon sub
3 )~-~max ( epsilon sub 4 ),~ max ( epsilon sub 2 )~-~min ( epsilon sub
1 )~+~max ( epsilon sub 3 )~-~min ( epsilon sub 4 )]} over 2>
<\$E=~[ -~rho sub B ,~rho sub A ]~+~{f sub B (T sub 3~-~T sub 2 )~-~f sub
A (T sub 4 ~-~T sub 1 )} over 2> .

Under most conditions the errors due to <\$Ef sub A> and <\$Ef sub B> will
be small and can be ignored. If <\$Erho sub A~=~rho sub B~=~rho>; that
is, if both the A and B clocks have the same resolution, the pdf for the
remaining errors is symmetric, so that <\$Etheta hat~=~<< theta >>> is an
unbiased maximum-likelihood estimator for the true clock offset <\$Etheta
sub 0>, independent of the particular value of <\$Erho>. If <\$Erho sub
A~!=~rho sub B>, <\$E<< theta >>> is not an unbiased estimator; however,
the bias error is in the order of

<\$E{rho sub A~-~rho sub B } over 2> .

and can usually be neglected.

Again setting <\$Erho~=~max ( rho sub A ,~rho sub B )> for convenience,
the absolute maximum error <\$Eepsilon sub theta> inherent in determining
clock offset <\$Etheta> is given by

<\$Eepsilon sub theta~==~{rho~+~phi sub A (T sub 4~-~T sub 1 )~+~phi sub
B (T sub 3~-~T sub 2 )} over 2 > .
```
```Network Errors

In practice, errors due to stochastic network delays usually dominate.
In general, it is not possible to characterize network delays as a
stationary random process, since network queues can grow and shrink in
chaotic fashion and arriving customer traffic is frequently bursty.
However, it is a simple exercise to calculate bounds on clock offset
errors as a function of measured delay. Let <\$ET sub 2~-~T sub 1~=~a>
and <\$ET sub 3~-~T sub 4~=~b>. Then,

<\$Edelta~=~a~-~b~~~~ and ~~~~theta~=~{a~+~b} over 2> .

The true offset of B relative to A is called <\$Etheta sub 0> in Figure
14. Let x denote the actual delay between the departure of a message
from A and its arrival at B. Therefore, <\$Ex~+~theta sub 0~=~T sub 2~-~T
sub 1~==~a>. Since x must be positive in our universe, <\$Ex~=~a~-~theta
sub 0~>>=~0>, which requires <\$Etheta sub 0~<<=~a>. A similar argument
requires that <\$Eb~<<=~theta sub 0>, so surely <\$Eb~<<=~theta sub
0~<<=~a>. This inequality can also be expressed

<\$Eb~=~{a~+~b} over 2~-~{a~-~b} over 2~<<=~theta sub 0~<<=~{a~+~b} over
2~+~{a~-~b} over 2~=~a> ,

which is equivalent to

<\$Etheta~-~delta over 2~<<=~theta sub 0~<<=~theta~+~delta over 2> .

In the previous section bounds on delay and offset errors were
determined. Thus, the inequality can be written

<\$Etheta~-~epsilon sub theta~-~{delta~+~epsilon sub delta} over
2~<<=~theta sub 0~<<=~theta~+~epsilon sub theta~+~{delta~+~ epsilon sub
delta } over 2> ,
where <\$Eepsilon sub theta> is the maximum offset error and <\$Eepsilon
sub delta> is the maximum delay error derived previously. The quantity

<\$Eepsilon~=~epsilon sub theta~+~epsilon sub delta over 2~=~rho~+~phi
sub A (T sub 4~-~T sub 1 )~+~phi sub B (T sub 3~-~T sub 2 )> ,

called the peer dispersion, defines the maximum error in the inequality.
Thus, the correctness interval I can be defined as the interval

<\$EI~=~[ theta~-~delta over 2~-~epsilon ,~theta~+~delta over 2~+~epsilon
]> ,

in which the clock offset <\$EC~=~theta> is the midpoint. By
construction, the true offset <\$Etheta sub 0> must lie somewhere in this
interval.
```
```Inherited Errors

As described in the NTP specification, the NTP time server maintains the
local clock <\$ETHETA>, together with the root roundtrip delay <\$EDELTA>
and root dispersion <\$EEPSILON> relative to the primary reference source
at the root of the synchronization subnet. The values of these variables
are either included in each update message or can be derived as
described in the NTP specification. In addition, the protocol exchange
and clock-filter algorithm provide the clock offset <\$Etheta> and
roundtrip delay <\$Edelta> of the local clock relative to the peer clock,
as well as various error accumulations as described below. The following
discussion establishes how errors inherent in the time-transfer process
accumulate within the subnet and contribute to the overall error budget
at each server.

An NTP measurement update includes three parts: clock offset <\$Etheta>,
roundtrip delay <\$Edelta> and maximum error or dispersion <\$Eepsilon> of
the local clock relative to a peer clock. In case of a primary clock
update, these values are usually all zero, although <\$Eepsilon> can be
tailored to reflect the specified maximum error of the primary reference
source itself. In other cases <\$Etheta> and <\$Edelta> are calculated
directly from the four most recent timestamps, as described in the NTP
specification. The dispersion <\$Eepsilon> includes the following
contributions:

1.

Each time the local clock is read a reading error is incurred due to the
finite granularity or precision of the implementation. This is called
the measurement dispersion <\$Erho>.

2.

Once an offset is determined, an error due to frequency offset or skew
accumulates with time. This is called the skew dispersion <\$Ephi tau>,
where <\$Ephi> represents the skew-rate constant (<\$Eroman NTP.MAXSKEW
over NTP.MAXAGE> in the NTP specification) and <\$Etau> is the interval
since the dispersion was last updated.

3

When a series of offsets are determined at regular intervals and
accumulated in a window of samples, as in the NTP clock-filter
algorithm, the (estimated) additional error due to offset sample
variance is called the filter dispersion <\$Eepsilon sub sigma>.

4.

When a number of peers are considered for synchronization and two or
more are determined to be correctly synchronized to a primary reference
```
```source, as in the NTP clock-selection algorithm, the (estimated)
additional error due to offset sample variance is called the selection
dispersion <\$Eepsilon sub xi>.

Figure 15<\$&fig15> shows how these errors accumulate in the ordinary
course of NTP processing. Received messages from a single peer are
represented by the packet variables. From the four most recent
timestamps T1, T2, T3 and T4 the clock offset and roundtrip delay sample
for the local clock relative to the peer clock are calculated directly.
Included in the message are the root roundtrip delay <\$EDELTA prime> and
root dispersion <\$EEPSILON prime> of the peer itself; however, before
sending, the peer adds the measurement dispersion <\$Erho> and skew
dispersion <\$Ephi tau>, where these quantities are determined by the
peer and <\$Etau> is the interval according to the peer clock since its
clock was last updated.

The NTP clock-filter procedure saves the most recent samples <\$Etheta
sub i> and <\$Edelta sub i> in the clock filter as described in the NTP
specification. The quantities <\$Erho> and <\$Ephi> characterize the local
clock maximum reading error and frequency error, respectively. Each
sample includes the dispersion <\$Eepsilon sub i~=~rho~+~phi (T sub 4~-~T
sub 1 )>, which is set upon arrival. Each time a new sample arrives all
samples in the filter are updated with the skew dispersion <\$Ephi tau
sub i>, where <\$Etau sub i> is the interval since the last sample
arrived, as recorded in the variable peer.update. The clock-filter
algorithm determines the selected clock offset <\$Etheta> (peer.offset),
together with the associated roundtrip delay <\$Edelta> (peer.delay) and
filter dispersion <\$Eepsilon sub sigma>, which is added to the
associated sample dispersion <\$Eepsilon sub i> to form the peer
dispersion <\$Eepsilon> (peer.dispersion).

The NTP clock-selection procedure selects a single peer to become the
synchronization source as described in the NTP specification. The
operation of the algorithm determines the final clock offset <\$ETHETA>
(local clock), roundtrip delay <\$EDELTA> (sys.rootdelay) and dispersion
<\$EEPSILON> (sys.rootdispersion) relative to the root of the
synchronization subnet, as shown in Figure 15. Note the inclusion of the
selected peer dispersion and skew accumulation since the dispersion was
last updated, as well as the select dispersion <\$Eepsilon sub xi>
computed by the clock-select algorithm itself. Also, note that, in order
to preserve overall synchronization subnet stability, the final clock
offset <\$ETHETA> is in fact determined from the offset of the local
clock relative to the peer clock, rather than the root of the subnet.
Finally, note that the packet variables <\$EDELTA prime> and <\$EEPSILON
prime> are in fact determined from the latest message received, not at
the precise time the offset selected by the clock-filter algorithm was
determined. Minor errors arising due to these simplifications will be
ignored. Thus, the total dispersion accumulation relative to the root of
the synchronization subnet is
```
```<\$EEPSILON~=~epsilon~+~phi tau~+~epsilon sub xi~+~| THETA |~+~EPSILON
prime > ,

where <\$Etau> is the time since the peer variables were last updated and
<\$E| THETA |> is the initial absolute error in setting the local clock.

The three values of clock offset, roundtrip delay and dispersion are all
additive; that is, if <\$ETHETA sub i>, <\$EDELTA sub i> and <\$EEPSILON
sub i> represent the values at peer i relative to the root of the
synchronization subnet, the values

<\$ETHETA sub j (t)~==~THETA sub i~+~theta sub j (t)> ,   <\$EDELTA sub j
(t)~==~DELTA sub i~+~delta sub j> ,   <\$EEPSILON sub j (t)~==~EPSILON
sub i~+~epsilon sub i~+~epsilon sub j (t)> ,
represent the clock offset, roundtrip delay and dispersion of peer j at
time t. The time dependence of <\$Etheta sub j (t)> and <\$Eepsilon sub j
(t)> represents the local-clock correction and dispersion accumulated
since the last update was received from peer i, while the term
<\$Eepsilon sub i> represents the dispersion accumulated by peer i from
the time its clock was last set until the latest update was sent to peer
j. Note that, while the offset of the local clock relative to the peer
clock can be determined directly, the offset relative to the root of the
synchronization subnet is not directly determinable, except on a
probabilistic basis and within the bounds established in this and the
previous section.

The NTP synchronization subnet topology is that of a tree rooted at the
primary server(s). Thus, there is an unbroken path from every time
server to the primary reference source. Accuracy and stability are
proportional to synchronization distance <\$ELAMBDA>, defined as

<\$ELAMBDA~==~EPSILON~+~DELTA over 2> .

The selection algorithm favors the minimum-distance paths and thus
maximizes accuracy and stability. Since <\$ETHETA sub 0>, <\$EDELTA sub 0>
and <\$EEPSILON sub 0> are all zero, the sum of the clock offsets,
roundtrip delays and dispersions of each server along the minimum-
distance path from the root of the synchronization subnet to a given
server i are the clock offset <\$ETHETA sub i>, roundtrip delay <\$EDELTA
sub i> and dispersion <\$EEPSILON sub i> inherited by and characteristic
of that server.

Correctness Principles

In order to minimize the occurrence of errors due to incorrect clocks
and maximize the reliability of the service, NTP relies on multiple
peers and disjoint peer paths whenever possible. In the previous
development it was shown that, if the primary reference source at the
root of the synchronization subnet is in fact a correct clock, then the
true offset <\$Etheta sub 0> relative to that clock must be contained in
```
```the interval

<\$E[ THETA~-~LAMBDA ,~THETA~+~LAMBDA ]~==~[ THETA~-~EPSILON~-~DELTA over
2 ,~THETA~+~EPSILON~+~DELTA over 2 ]> .

When a number of clocks are involved, it is not clear beforehand which
are correct and which are not; however, as cited previously, there are a
number of techniques based on clustering and filtering principles which
yield a high probability of detecting and discarding incorrect clocks.
Marzullo and Owicki [MAR85] devised an algorithm designed to find an
appropriate interval containing the correct time given the confidence
intervals of m clocks, of which no more than f are considered incorrect.
The algorithm finds the smallest single intersection containing all
points in at least <\$Em~-~f> of the given confidence intervals.

Figure 16<\$&fig16> illustrates the operation of this algorithm with a
scenario involving four clocks A, B, C and D, with the calculated time
(shown by the <F128>ì<F255> symbol) and confidence interval shown for
each. These intervals are computed as described in previous sections of
this appendix. For instance, any point in the A interval may possibly
represent the actual time associated with that clock. If all clocks are
correct, there must exist a nonempty intersection including all four
intervals; but, clearly this is not the case in this scenario. However,
if it is assumed that one of the clocks is incorrect (e.g., D), it might
be possible to find a nonempty intersection including all but one of the
intervals. If not, it might be possible to find a nonempty intersection
including all but two of the intervals and so on.

The algorithm proposed by DEC for use in the Digital Time Service
[DEC89] is based on these principles. For the scenario illustrated in
Figure 16, it computes the interval for <\$Em~=~4> clocks, three of which
turn out to be correct and one not. The low endpoint of the intersection
is found as follows. A variable f is initialized with the number of
presumed incorrect clocks, in this case zero, and a counter i is
initialized at zero. Starting from the lowest endpoint, the algorithm
increments i at each low endpoint, decrements i at each high endpoint,
and stops when <\$Ei~>>=~m~-~f>. The counter records the number of
intersections and thus the number of presumed correct clocks. In the
example the counter never reaches four, so f is increased by one and the
procedure is repeated. This time the counter reaches three and stops at
the low endpoint of the intersection marked DTS. The upper endpoint of
this intersection is found using a similar procedure.

This algorithm will always find the smallest single intersection
containing points in at least one of the original <\$Em~-~f> confidence
intervals as long as the number of incorrect clocks is less than half
the total <\$Ef~<<~m over 2>. However, some points in the intersection
may not be contained in all <\$Em~-~f> of the original intervals;
moreover, some or all of the calculated times (such as for C in Figure
16) may lie outside the intersection. In the NTP clock-selection
```
```procedure the above algorithm is modified so as to include at least
<\$Em~-~f> of the calculated times. In the modified algorithm a counter c
is initialized at zero. When starting from either endpoint, c is
incremented at each calculated time; however, neither f nor c are reset
between finding the low and high endpoints of the intersection. If after
both endpoints have been found <\$Ec~>>~f>, f is increased by one and the
entire procedure is repeated. The revised algorithm finds the smallest
intersection of <\$Em~-~f> intervals containing at least <\$Em~-~f>
calculated times. As shown in Figure 16, the modified algorithm produces
the intersection marked NTP and including the calculated time for C.

In the NTP clock-selection procedure the peers represented by the clocks
in the final intersection, called the survivors, are placed on a
candidate list. In the remaining steps of the procedure one or more
survivors may be discarded from the list as outlyers. Finally, the
clock-combining algorithm described in Appendix F provides a weighted
average of the remaining survivors based on synchronization distance.
The resulting estimates represent a synthetic peer with offset between
the maximum and minimum offsets of the remaining survivors. This defines
the clock offset <\$ETHETA>, total roundtrip total delay <\$EDELTA> and
total dispersion <\$EEPSILON> which the local clock inherits. In
principle, these values could be included in the time interface provided
by the operating system to the user, so that the user could evaluate the
quality of indications directly.

Appendix I. Selected C-Language Program Listings

Following are C-language program listings of selected algorithms
described in the NTP specification. While these have been tested as part
of a software simulator using data collected in regular operation, they
do not necessarily represent a standard implementation, since many other
implementations could in principle conform to the NTP specification.

Common Definitions and Variables

The following definitions are common to all procedures and peers.

#define NMAX 40                                 /* max clocks */
#define FMAX 8                                  /* max filter size */
#define HZ 1000                                 /* clock rate */
#define MAXSTRAT 15                             /* max stratum */
#define MAXSKEW 1                               /* max skew error per
MAXAGE */
#define MAXAGE 86400                            /* max clock age */
#define MAXDISP 16                              /* max dispersion */
#define MINCLOCK 3                              /* min survivor clocks
*/
#define MAXCLOCK 10                             /* min candidate clocks
*/
#define FILTER .5                                       /* filter weight
```
```*/
#define SELECT .75                              /* select weight */

The folowing are peer state variables (one set for each peer).

double filtp[NMAX][FMAX];                               /* offset
samples */
double fildp[NMAX][FMAX];                       /* delay samples */
double filep[NMAX][FMAX];                       /* dispersion samples */
double tp[NMAX];                                        /* offset */
double dp[NMAX];                                        /* delay */
double ep[NMAX];                                        /* dispersion */
double rp[NMAX];                                        /* last offset
*/
double utc[NMAX];                                       /* update tstamp
*/
int st[NMAX];                                           /* stratum */

The following are system state variables and constants.

double rho = 1./HZ;                                     /* max reading
error */
double phi = MAXSKEW/MAXAGE;            /* max skew rate */
double bot, top;                                        /* confidence
interval limits */
double theta;                                           /* clock offset
*/
double delta;                                           /* roundtrip
delay */
double epsil;                                           /* dispersion */
double tstamp;                                  /* current time */
int source;                                             /* clock source
*/
int n1, n2;                                             /* min/max clock
ids */

The folowing are temporary lists shared by all peers and procedures.

double list[3*NMAX];                            /* temporary list*/
int index[3*NMAX];                                      /* index list */

Clock<196>Filter Algorithm

/*
clock filter algorithm

n = peer id, offset = sample offset, delay = sample delay, disp =
sample dispersion;
computes tp[n] = peer offset, dp[n] = peer delay, ep[n] = peer
dispersion
```
``` */

void filter(int n, double offset, double delay, double disp) {

int i, j, k, m;                                 /* int temps */
double x;                                       /* double temps
*/

for (i = FMAX<196>1; i >> 0; i<196> <196>) {            /*
update/shift filter */
filtp[n][i] = filtp[n][i<196>1]; fildp[n][i] =
fildp[n][i<196>1];
filep[n][i] = filep[n][i<196>1]+phi*(tstamp<196>utc[n]);
}
utc[n] = tstamp; filtp[n][0] = offset<196>tp[0]; fildp[n][0] =
delay; filep[n][0] = disp;
m = 0;                                          /*
construct/sort temp list */
for (i = 0; i << FMAX; i++) {
if (filep[n][i] >>= MAXDISP) continue;
list[m] = filep[n][i]+fildp[n][i]/2.; index[m] = i;
for (j = 0; j << m; j++) {
if (list[j] >> list[m]) {
x = list[j]; k = index[j]; list[j] =
list[m]; index[j] = index[m];
list[m] = x; index[m] = k;
}
}
m = m+1;
}

if (m <<= 0) ep[n] = MAXDISP;           /* compute filter
dispersion */
else {
ep[n] = 0;
for (i = FMAX<196>1; i >>= 0; i<196> <196>) {
if (i << m) x =
fabs(filtp[n][index[0]]<196>filtp[n][index[i]]);
else x = MAXDISP;
ep[n] = FILTER*(ep[n]+x);
}
i = index[0]; ep[n] = ep[n]+filep[n][i]; tp[n] =
filtp[n][i]; dp[n] = fildp[n][i];
}
return;
}

Interval Intersection Algorithm

/*
```
```   compute interval intersection

computes bot = lowpoint, top = highpoint (bot >> top if no
intersection)
*/

void dts() {

int f;                                          /* intersection
ceiling */
int end;                                        /* endpoint
counter */
int clk;                                                /*
falseticker counter */
int i, j, k, m, n;                              /* int temps */
double x, y;                                    /* double temps
*/

m = 0; i = 0;
for (n = n1; n <<= n2; n++) {   /* construct endpoint list */
if (ep[n] >>= MAXDISP) continue;
m = m+1;
list[i] = tp[n]<196>dist(n); index[i] = <196>1; /*
lowpoint */
for (j = 0; j << i; j++) {
if ((list[j] >> list[i]) || ((list[j] ==
list[i]) && (index[j] >> index[i]))) {
x = list[j]; k = index[j]; list[j] =
list[i]; index[j] = index[i];
list[i] = x; index[i] = k;
}
}
i = i+1;

list[i] = tp[n]; index[i] = 0;          /* midpoint */
for (j = 0; j << i; j++) {
if ((list[j] >> list[i]) || ((list[j] ==
list[i]) && (index[j] >> index[i]))) {
x = list[j]; k = index[j]; list[j] =
list[i]; index[j] = index[i];
list[i] = x; index[i] = k;
}
}
i = i+1;

list[i] = tp[n]+dist(n); index[i] = 1;  /* highpoint */
for (j = 0; j << i; j++) {
if ((list[j] >> list[i]) || ((list[j] ==
list[i]) && (index[j] >> index[i]))) {
x = list[j]; k = index[j]; list[j] =
```
```list[i]; index[j] = index[i];
list[i] = x; index[i] = k;
}
}
i = i+1;
}

if (m <<= 0) return;
for (f = 0; f << m/2; f++) {                    /* find
intersection */
clk = 0; end = 0;                       /* lowpoint */
for (j = 0; j << i; j++) {
end = end<196>index[j]; bot = list[j];
if (end >>= (m<196>f)) break;
if (index[j] == 0) clk = clk+1;
}
end = 0;                                /* highpoint */
for (j = i<196>1; j >>= 0; j<196> <196>) {
end = end+index[j]; top = list[j];
if (end >>= (m<196>f)) break;
if (index[j] == 0) clk = clk+1;
}
if (clk <<= f) break;
}
return;
}

Clock<196>Selection Algorithm

/*
select best subset of clocks in candidate list

bot = lowpoint, top = highpoint; constructs index = candidate index
list,
m = number of candidates, source = clock source,
theta = clock offset, delta = roundtrip delay, epsil = dispersion
*/

void select() {

double xi;                                      /* max select
dispersion */
double eps;                                     /* min peer
dispersion */
int i, j, k, n;                                 /* int temps */
double x, y, z;                         /* double temps */

m = 0;
for (n = n1; n <<= n2; n++) {   /* make/sort candidate list */
if ((st[n] >> 0) && (st[n] << MAXSTRAT) && (tp[n] >>=
```
```bot) && (tp[n] <<= top)) {
list[m] = MAXDISP*st[n]+dist(n); index[m] = n;
for (j = 0; j << m; j++) {
if (list[j] >> list[m]) {
x = list[j]; k = index[j];
list[j] = list[m]; index[j] = index[m];
list[m] = x; index[m] = k;
}
}
m = m+1;
}
}
if (m <<= 0) {
source = 0; return;
}
if (m >> MAXCLOCK) m = MAXCLOCK;

while (1) {                                     /* cast out
falsetickers */
xi = 0.; eps = MAXDISP;
for (j = 0; j << m; j++) {
x = 0.;
for (k = m<196>1; k >>= 0; k<196> <196>)
x =
SELECT*(x+fabs(tp[index[j]]<196>tp[index[k]]));
if (x >> xi) {
xi = x; i = j;          /* max(xi) */
}
x = ep[index[j]]+phi*(tstamp<196>utc[index[j]]);
if (x << eps) eps = x;          /* min(eps) */
}
if ((xi <<= eps) || (m <<= MINCLOCK)) break;
if (index[i] == source) source = 0;
for (j = i; j << m<196>1; j++) index[j] = index[j+1];
m = m<196>1;
}

i = index[0];                                   /* declare
winner */
if (source != i)
if (source == 0) source = i;
else if (st[i] << st[source]) source = i;
theta = combine(); delta = dp[i]; epsil =
ep[i]+phi*(tstamp<196>utc[i])+xi;
return;
}

Clock<196>Combining Procedure

/*
```
```   compute weighted ensemble average

index = candidate index list, m = number of candidates; returns
combined clock offset
*/

double combine() {

int i;                                          /* int temps */
double x, y, z;                         /* double temps */
z = 0. ; y = 0.;
for (i = 0; i << m; i++) {                      /* compute
weighted offset */
j = index[i]; x = dist(j)); z = z+tp[j]/x; y = y+1./x;
}
return z/y;                                     /* normalize */
}

Subroutine to Compute Synchronization Distance

/*
compute synchronization distance

n = peer id; returns synchronization distance
*/

double dist(int n) {

return ep[n]+phi*(tstamp<196>utc[n])+fabs(dp[n])/2.;
}

Security considerations
see Section 3.6 and Appendix C