Timings for the complex gamma function in Arb
February 20, 2013
I’ve now implemented the gamma function for complex arguments in Arb. This is important to have in many number theory applications, and for evaluation of other special functions.
There are actually three functions: for computing $\Gamma(z)$, $1/\Gamma(z)$, and $\log \Gamma(z)$ (with correct branch cuts, i.e. not just first evaluating $\Gamma(z)$ and then taking the principal logarithm), but of course they share most of the internal evaluation code, which uses the asymptotic Stirling series.
How fast is it? The following table shows timings in seconds for evaluating $\Gamma(z)$ where $z = \sqrt 2 + i \sqrt 3$, at various levels of precision. I’ve included Pari/GP 2.5.3 and mpmath 0.17 (both via sage-5.7.beta1) and Mathematica 8.0 for comparison. The test was done on a 64-bit system running on an Intel Xeon X5675 3.07 GHz CPU. A number in parentheses measures the first call (accounting for the time to precompute the series coefficients, i.e. Bernoulli numbers) while a number not in parentheses measures a repeated call (with coefficients cached).
|1000||0.013 (0.26)||0.026 (0.37)||0.020 (0.10)||0.0080 (0.010)|
|3000||0.19 (6.5)||0.36 (3.6)||0.31 (1.7)||0.10 (0.18)|
|10000||2.9 (235)||6.4 (95)||5.8 (44)||1.6 (3.1)|
|30000||38 (6030)||92 (3030)||80 (887)||22 (45)|
As usual, Arb takes a complex interval as input and uses interval computations throughout to produce a rigorous complex interval as output (the other systems output a floating-point number that only heuristically is accurate roughly to full precision).
Arb is the clear champion at high precision, especially for the first call where it is orders of magnitude faster than the other systems.
Pari/GP is about twice as fast as Arb at low precision. This probably comes down to the fact that a handful of functions in Arb currently are very poorly optimized (including addition, logarithm, and arctangent), and I expect that improving these should allow breaking even with Pari.
I also suspect that the Arb timings for repeated evaluation at several thousand digits can be improved much further (perhaps by as much as a factor two) with some tricks that I haven’t implemented yet.