Logarithms as well

May 2, 2012

In the last post, I discussed computing the exponential function. With that being implemented, it’s not much work to do logarithms as well. If we can evaluate the exponential function, then we can compute $y = \log x$ by using Newton’s method to find a root of the equation $\exp(y) – x = 0$, i.e. using the iteration $y_{n+1} + 1 – x / \exp(y_n)$.

Newton iteration is extremely efficient. Starting from a sufficiently accurate initial value, the number of correct digits roughly doubles with each iteration; moreover, the algorithm is “self-correcting”, so we only need to use full precision for the last iteration, 1/2 the precision for the second last iteration, 1/4 the precision for the iteration before that, etc. The total cost turns out to be only a fraction more than the time to evaluate a single exponential to the target precision.

In fact, it seems even better to use Halley’s method, which has the update step $y_{n+1} = y_n + 2 – 4t/(x + t)$ where $t = \exp(y_n)$. The amount of work per step is basically the same (one exponential and one division), but now the number of digits triples with each iteration! (We could avoid the division in the ordinary Newton iteration using $1/\exp(y_n) = \exp(-y_n)$, but it is more convenient to limit the basecase exponential to positive arguments at least for now.)

Newton’s method and Halley’s method are just the first two instances of Householder’s method, of order $d = 1$ and $d = 2$ respectively. Using higher-order Householder iterations, we can obtain arbitrarily rapid convergence, but for $d > 2$ the formulas start to become unwieldy. For $d = 3$, we have

$$y_{n+1} = \frac{6 x \left(x+2 t\right)}{x^2+4 x t+t^2}+t-3.$$

I have now implemented a first version of a basecase logarithm function, which evaluates $\log(1+x)$ on the standard interval $[0,1)$. It starts from a double-precision approximation and refines it to the target precision (modulo some rounding error) using Halley iteration. Here is how it performs (timings are in nanoseconds):

prec	system	dd	qd	mpfr	mpmath	new code
53	67			4900	22000	390
64				4800	22000	390
106		900		6700	24000	590
128				7800	26000	600
192				9200	29000	1100
212			20000	10000	30000	1500
256				13000	32000	1500
320				14000	35000	2100
384				16000	39000	2500

Comparing the numbers to the table from the last post, we find that log runs at around half the speed of exp. This ratio should improve at higher precision, but even at low precision, it is not a bad result, and the code is still about an order of magnitude faster than MPFR. Some savings are possible: spending around 70 nanoseconds on the initial value by calling the system logarithm is a bit excessive (a faster, sloppier double-precision logarithm could be used instead), and the division could probably be done slightly faster. But apart from that, the code is quite lean, and the only way to improve it is to improve the exponential function.

I should mention that the current code typically gives 1-2 bits less than full accuracy. To guarantee full accuracy, it is necessary to add a few guard bits. It shouldn’t actually be all that difficult to compute a rigorous error bound for the Halley iteration to make this automatic, at least in principle (translating it to correct and efficient code is probably a bit hairy), but this will have to wait since it requires error bounding for exp first.

This is actually probably the best way to compute the logarithm, up to a precision of several thousand digits. We could of course use a table lookup and a Taylor series correction, just as for exp, avoiding the overhead of Newton/Halley iteration. But a table lookup requires at least one division to transform the argument, and moreover nested table lookups aren’t possible, so we would need a much larger table for the same speed. If we use the convergence acceleration formula $\log(1+x) = 2^n \log((1+x)^{1/2^n})$), the complexity is roughly the same as for exp (with $\exp(x) = (\exp(x/2^n))^{2^n}$), but square roots are so expensive that this only becomes competitive at very high precision. Either way, it would be difficult to get rid of the 2x overhead compared to exp.

fredrikj.net | Blog index | RSS feed | Follow me on Mastodon | Become a sponsor