In this test, the PLL has been implemented, along with variable loop delay and clock skewing. Not much attention has been paid to tuning the algorithm; the values used (a loop gain of 0.1, and a loop delay starting at 1 second and increasing in 1-second steps) were chosen simply because they tend to work.
The two machines tested are the same as in the free-running clock test: an Apple iBook 900MHz G3 running OS X 10.4, and a Pentium 3 800MHz running Fedora Core 2. During the test the daemon and fasttime_plot applications are started and allowed to run for 12 hours. fasttime_plot records the current fasttime time (obtained using calibration from the daemon) and system time every second and notes the offset.
There is an anomaly in the Mac test just after the 2-hour mark:
Inspection of the raw data shows that the system clock actually jumped forward 3 seconds! The PLL coped with the jump reasonably well, restabilising to within a 5 microsecond offset after 5 minutes (note that there is not yet any code specifically for dealing with this sort of gross error). Unfortunately, the jump did knock out the stability measurement (the readings suddenly become nan), which disabled any further loop delay adjustments.
The data before the anomaly also shows some interesting results, but these are clearer in the second Mac test (described further down the page).
The Linux test also shows an anomaly, but of a completely different nature:
After around 6 hours, the offset drifts and looks very much like a free-running clock. In fact this is exactly what happened; the daemon process either completely died or stopped updating the conversion tables at this point. There is nothing particularly noteworthy in the data at the time it stopped; stability was gradually increasing and loop delay had increased to 44 seconds. At this point I don't have an explanation for what happened. See below for better Linux data.
I reran exactly the same test on both machines the following night. No changes were made to the source code or environment, and both tests were conducted at approximately the same time of day. This time there were no anomalies from either machine.
[Plots: Offset, Rate error, Allan deviation]
The most interesting feature is the repeated variation in offset. Note that this offset is calculated by a client program making its own readings of the TSC register and using the daemon's conversion table. The daemon keeps track of its own offset (which it then uses to correct via the PLL). Here is the daemon offset (in red) overlaid on the client's offset (in green):
Informal testing of several clients running off the same daemon shows that each client will have a pattern similar to this one's, but not in synchrony with each other. My current hypothesis is that this is due to jitter in the system clock caused by its own (poor) interpolation of the TSC register to provide sub-10ms times. That the client offsets never exceed 50 microseconds strengthens this argument.
Presumably the maximum client offset will change after a reboot (the interpolation constant is determined by the OS during boot). I have yet to test this, however.
The Allan deviation plot shows a curve typical of systems dominated by clock jitter, with little to no frequency wander. Rate error is bounded within 7 PPM.
Again, the second run of the same test went without a hitch, and data is valid for the entire 12 hour period.
[Plots: Offset, Rate error, Allan deviation]
The clock is quite stable. The client's offset never exceeds 10 microseconds, and is generally less than 2 microseconds. Rate error never exceeds 1 PPM. The daemon's offset is shown below:
The daemon shows similar offset characteristics to the client, though smaller on average. The client-offset variation that appeared in the Mac test is far less significant here, perhaps indicating that the Linux TSC interpolation is better than the Mac's.
Stability and loop delay for the Linux test are plotted below. Note that lower values of stability are better (it is calculated as the RMS of all offsets).
It is likely that longer loop delays can be used, and used earlier; this is merely a matter of tuning how the delay is increased and decreased in response to stability.
Although the PLL hasn't been tuned yet (in fact, it was only tested on the Mac before these tests), it shows good stability over a 12-hour period. As shown in the first Mac test, it also copes well with sudden changes to the system clock, despite not yet having re-synchronisation code.
Assuming my hypothesis is correct, fasttime provides a clock with more stability and less jitter than the system clock on the Mac. This is good news, and makes it potentially useful on this system despite its slower execution time (by about 3x at the moment).
I would really love it if you have the time to run these tests on your own machine. Assuming you use some Unix variant on either PPC or Intel/AMD, follow the directions below.
Note that fasttimed currently prints debug info to stdout, which can safely be redirected to /dev/null (there is no useful information there that isn't replicated in the client's report).