# More digits of zetazero(1)

August 4, 2013

As mentioned in the earlier post about computing 100,000 digits of $\rho_1$, the first nontrivial zero of the Riemann zeta function, the code wasn’t quite optimal.

The expensive part of computing $\zeta(s)$ is evaluating $\sum_{k=1}^N k^{-s}$ for some large integer $N$. When not computing any derivatives, I already exploited the fact that the power $k^{-s}$ only needs to be computed from scratch when $k$ is prime, since we can use $(i j)^{-s} = i^{-s} j^{-s}$ when $k = ij$. So by storing powers in a table, we can greatly improve speed at the cost of memory efficiency (the space complexity becomes quadratic in the precision).

I have now implemented this trick for (short) power series as well, along with some small enhancements to the code. Another small speedup comes from computing logarithms of nearby primes using binary splitting. Finally, I improved the computation of the Riemann-Siegel theta function a bit by using the log gamma and digamma functions directly instead of using the less-optimized log gamma power series code (this change has not yet been committed).

Here are the old and new timings (in seconds) for $\rho_1$ and $\zeta(\rho_1)$:

 Digits Old code New code $\rho_1$ $\zeta(\rho_1)$ $\rho_1$ $\zeta(\rho_1)$ 10,000 159 20 24 12 100,000 55,507 5182 5515 2749

Computing $\rho_1$ to 100,000 digits got 10x faster, and the following single evaluation of $\zeta(\rho_1)$ for verification got 1.9x faster. Not bad! At 10,000 digits, Arb is now 141x / 51x faster than Mathematica and 274x / 18x faster than mpmath.

To push the limits a bit further, I’ve now computed $\rho_1$ to 303,000 digits, or just over 1 million bits. This took 73,225 seconds (20 hours) on a CPU about 20% slower (2.00 GHz Xeon E5-2650) than the one used for the benchmarks above. The computation used up to 62 GiB of memory, and involved generating all the Bernoulli numbers up to $B_{325,328}$ as exact fractions! Evaluating $\zeta(\rho_1)$ to the same precision afterwards for verification took 31,772 seconds (9 hours).

Update: here also is a plot of the geometric means of the continued fraction, suggesting convergence to Khinchin’s constant:

It should be possible to calculate 1 million digits, assuming that one has 600 GiB of RAM and 1-2 weeks of patience. A better way to do that would be to write a parallel and more memory-efficient version, though. The power sum could obviously be done in parallel (perhaps without any sieving to avoid the quadratic memory consumption, or with partial sieving to get a better speed/memory tradeoff). The tail in the Euler-Maclaurin summation formula should be possible to parallelize as well, though that’s slightly more involved (Arb already provides the necessary tools to distribute the computation of Bernoulli numbers).

Remaining potential improvements that come to mind are:

• Using Gauss/Karatsuba multiplication to speed up the complex number/power series arithmetic
• Using a more complicated sieving strategy to collect small factors (e.g. 3 and 5) in the power sum
• Using complex Newton iteration for $\rho_1$ to avoid evaluating the log gamma function altogether
• Using approximations instead of exact values for the high-index Bernoulli numbers

More interesting (and useful) would be to thoroughly optimize for low precision (up to a couple hundred digits) and for large imaginary parts (by implementing the Riemann-Siegel expansion). This is a big project, and I doubt I will have time to work on it any time soon!

In related news, I have also added a function that calculates Glaisher’s constant $A = \exp(1/12 – \zeta’(-1))$. Thus another table of timings (in seconds) must be shown:

 Digits Mathematica mpmath Arb 1000 0.59 0.40 0.080 10,000 181 98 5.6 100,000 ? ? 1182

Wee see that evaluating $\zeta’(s)$ at an integer is slightly faster than at a generic complex number.