fredrikj.net / blog /

Logarithms as well

May 2, 2012

In the last post, I discussed computing the exponential function. With that being implemented, it’s not much work to do logarithms as well. If we can evaluate the exponential function, then we can compute $y = \log x$ by using Newton’s method to find a root of the equation $\exp(y) – x = 0$, i.e. using the iteration $y_{n+1} + 1 – x / \exp(y_n)$.

Newton iteration is extremely efficient. Starting from a sufficiently accurate initial value, the number of correct digits roughly doubles with each iteration; moreover, the algorithm is “self-correcting”, so we only need to use full precision for the last iteration, 1/2 the precision for the second last iteration, 1/4 the precision for the iteration before that, etc. The total cost turns out to be only a fraction more than the time to evaluate a single exponential to the target precision.

In fact, it seems even better to use Halley’s method, which has the update step $y_{n+1} = y_n + 2 – 4t/(x + t)$ where $t = \exp(y_n)$. The amount of work per step is basically the same (one exponential and one division), but now the number of digits triples with each iteration! (We could avoid the division in the ordinary Newton iteration using $1/\exp(y_n) = \exp(-y_n)$, but it is more convenient to limit the basecase exponential to positive arguments at least for now.)

Newton’s method and Halley’s method are just the first two instances of Householder’s method, of order $d = 1$ and $d = 2$ respectively. Using higher-order Householder iterations, we can obtain arbitrarily rapid convergence, but for $d > 2$ the formulas start to become unwieldy. For $d = 3$, we have

$$y_{n+1} = \frac{6 x \left(x+2 t\right)}{x^2+4 x t+t^2}+t-3.$$

I have now implemented a first version of a basecase logarithm function, which evaluates $\log(1+x)$ on the standard interval $[0,1)$. It starts from a double-precision approximation and refines it to the target precision (modulo some rounding error) using Halley iteration. Here is how it performs (timings are in nanoseconds):

prec system dd qd mpfr mpmath new code
53 67 4900 22000 390
64 4800 22000 390
106 900 6700 24000 590
128 7800 26000 600
192 9200 29000 1100
212 20000 10000 30000 1500
256 13000 32000 1500
320 14000 35000 2100
384 16000 39000 2500

Comparing the numbers to the table from the last post, we find that log runs at around half the speed of exp. This ratio should improve at higher precision, but even at low precision, it is not a bad result, and the code is still about an order of magnitude faster than MPFR. Some savings are possible: spending around 70 nanoseconds on the initial value by calling the system logarithm is a bit excessive (a faster, sloppier double-precision logarithm could be used instead), and the division could probably be done slightly faster. But apart from that, the code is quite lean, and the only way to improve it is to improve the exponential function.

I should mention that the current code typically gives 1-2 bits less than full accuracy. To guarantee full accuracy, it is necessary to add a few guard bits. It shouldn’t actually be all that difficult to compute a rigorous error bound for the Halley iteration to make this automatic, at least in principle (translating it to correct and efficient code is probably a bit hairy), but this will have to wait since it requires error bounding for exp first.

This is actually probably the best way to compute the logarithm, up to a precision of several thousand digits. We could of course use a table lookup and a Taylor series correction, just as for exp, avoiding the overhead of Newton/Halley iteration. But a table lookup requires at least one division to transform the argument, and moreover nested table lookups aren’t possible, so we would need a much larger table for the same speed. If we use the convergence acceleration formula $\log(1+x) = 2^n \log((1+x)^{1/2^n})$), the complexity is roughly the same as for exp (with $\exp(x) = (\exp(x/2^n))^{2^n}$), but square roots are so expensive that this only becomes competitive at very high precision. Either way, it would be difficult to get rid of the 2x overhead compared to exp.