fredrikj.net / blog /

Coping with a big argument

March 26, 2013

Implementing special functions so that they work properly on the entirety of their (representable) domain is a very hard problem. Take for example the digamma function: it behaves like $\psi(z) \sim \log z$, so you would never really expect it to overflow. However, Mathematica (as of version 9.0.1) runs into problems even with a rather small input:

In[2]:= N[PolyGamma[-10 + 10^7 I]]
Out[2]= 16.1181 + 1.5708 I

In[3]:= N[PolyGamma[-10 + 10^8 I]]
Out[3]= 18.4207 + 1.5708 I

In[4]:= N[PolyGamma[-10 + 10^9 I]]
General::ovfl: Overflow occurred in computation.
General::ovfl: Overflow occurred in computation.
Out[4]= Indeterminate

It only throws a worse tantrum if you increase the precision:

In[5]:= N[PolyGamma[-10 + 10^9 I], 1000]
General::ovfl: Overflow occurred in computation.
General::ovfl: Overflow occurred in computation.
General::ovfl: Overflow occurred in computation.
General::stop: Further output of General::ovfl will be suppressed during this calculation.
N::meprec: Internal precision limit $MaxExtraPrecision = 50. reached while evaluating
    PolyGamma[0, -10 + 1000000000 I].
Out[5]= Indeterminate

Strangely enough, it works fine if the -10 is removed:

In[6]:= N[PolyGamma[10^9 I]]
Out[6]= 20.7233 + 1.5708 I

In[7]:= N[PolyGamma[10^30 I]]
Out[7]= 69.0776 + 1.5708 I

The probable explanation is that when the real part is negative, Mathematica uses the functional equation

$$\psi(1 – z) – \psi(z) = \pi\,\!\cot{ \left ( \pi z \right ) }$$

and isn’t clever enough about evaluating the cotangent. Indeed,

In[9]:= Cot[N[Pi (-10 + 10^9 I)]]
General::ovfl: Overflow occurred in computation.

Out[9]= Indeterminate

When $b$ is large, the real and imaginary parts of $\cot(a+bi)$ behave like $O(e^{-2b})$ and $-1 + O(e^{-2b})$ respectively. Most likely, Mathematica evaluates the cotangent using a formula such as

$$\cot(a+bi) = -\frac{\sin (2 a)}{\cos (2 a)-\cosh (2 b)}+\frac{i \sinh (2 b)}{\cos (2 a)-\cosh (2 b)}$$

where if $b$ is huge, the hyperbolic functions overflow, even though the quotient should come out as something small.

We find the same problem in other systems, such as Pari:

? psi(-10 + 10^6 * sqrt(-1))
%1 = 13.81551055801935743743825707 + 1.570806826794896234231321717*I
? psi(-10 + 10^9 * sqrt(-1))
  ***   at top-level: psi(-10+10^9*sqrt(-1
  ***                 ^--------------------
  *** psi: the PARI stack overflows !
  current stack size: 4000000 (3.815 Mbytes)
  [hint] you can increase GP stack with allocatemem()

  ***   Break loop: type 'break' to go back to GP

If we try the evaluation in mpmath, it works a little better:

>>> print digamma(-10 + 1e9j)
(20.7232658369464 + 1.5707963372949j)

It also works with a much bigger imaginary part, but here it takes a long time to complete:

>>> print digamma("-10 + 1e1000000j")
(2302585.09299405 + 1.5707963267949j)

In fact mpmath also uses a naive formula for the cotangent: the difference is just that it has arbitrary-precision exponents. The reason it gets slow is that in order to compute the exponential function with something of order $10^{{10^6}}$ as input, it needs some $10^6$ digits of $\log 2$ for argument reduction, and outputs a number with roughly as many digits in the exponent. It works, but it’s not very practical, and you’re obviously out of luck if you kick the imaginary part up to $10^{10^{15}}$.

The best solution is to evaluate the cotangent using different formulas depending on the location in the complex plane. For example, the imaginary part of $\cot(a+bi)$ can be written as

$$\frac{e^{4b}-1}{1 – 2 e^{2b} \cos(2a) + e^{4b}}.$$

If $b$ is a large negative number, the exponentials become tiny (for large positive $b$ one just rewrites the formula in terms of $e^{-4b}$ and $e^{-2b}$ instead).

I’ve recently been implementing more elementary functions in Arb, paying attention to such issues. Testing the digamma function, it works as expected. The code

#include "fmpcb.h"

int main()
{
    fmpcb_t x, y;

    fmpcb_init(x);
    fmpcb_init(y);

    long prec = 53;

    fmpcb_set_ui(x, 10);
    fmpcb_pow_ui(x, x, 1000000, prec);
    fmpcb_mul_onei(x, x);
    fmpcb_sub_ui(x, x, -10, prec);

    fmpcb_digamma(y, x, prec);

    fmpcb_printd(y, 15);
    printf("\n");

    fmpcb_clear(x);
    fmpcb_clear(y);
}

outputs:

(2302585.09299405 + 1.5707963267949j)  +/-  (1.58e-11, 2.67e-17j)

The computation just takes some 30 microseconds, so it’s no slower than with a tiny argument.

In fact, I have now fixed almost all functions in Arb to work properly with huge exponents. The only missing function is binary-to-decimal conversion.

To illustrate, here are the successive values of $x$ if you set $x = 1/2$ and then repeatedly compute $x = \exp x$ at 53-bit precision:

(7425180500362907 * 2^-52) +/- (1 * 2^-52)
(5855046294128313 * 2^-50) +/- (617423411 * 2^-78)
(6380028057516027 * 2^-45) +/- (941782241 * 2^-71)
(6853548015219477 * 2^209) +/- (358430563 * 2^192)
(+inf) +/- (+inf)
(+inf) +/- (+inf)

When the output becomes expensive (or even physically impossible) to represent, the exponential function of a positive argument automatically overflows to an infinite interval. The maximum allowed exponent is not fixed but varies roughly proportionally with the precision. At 150-bit precision, it becomes possible to do one more iteration:

(588283407380499048463925620267632796905727263 * 2^-148) +/- (1 * 2^-149)
(231942279659869632990725740596674949038034297 * 2^-145) +/- (617423411 * 2^-175)
(1010955799572893508099758397146956100274197425 * 2^-142) +/- (941782241 * 2^-168)
(1085988031898535876444393234034937166751112959 * 2^112) +/- (358430563 * 2^95)
(1334524050391896112657493127448630305501140025 * 2^8135028756631480222606112090277357169611889599342489768863892054745940119766089) +/- (383295431 * 2^8135028756631480222606112090277357169611910086956529576124135528869507204498460)
(+inf) +/- (+inf)

Conversely, with a too big negative number as input, the exponential function outputs something like $0 \pm 2^{-n}$ where $n$ is roughly proportional to the working precision. The output has very poor relative accuracy, but it’s perfectly useful as a magnitude bound. This nice behavior is exploited in the functional equation for the digamma function.

Likewise, for trigonometric functions, if the real part of the input is larger than $2^n$ where $n$ roughly is proportional to the precision, an output similar to $0 \pm 1$ is produced immediately, rather than internally trying to compute $n$ bits of $\pi$ for the argument reduction.

It’s a bit inconvenient to not always be guaranteed the best floating-point result, for example if asking for a 53-bit value of $\cos(10^{100000})$. However, such situations can be dealt with by repeatedly increasing the precision until the output is accurate. Of course, the point of automatic error propagation is that it allows doing such things robustly. What we gain is that the worst-case cost of an evaluation becomes predictable, and we don’t have to fear that the computation might hang or crash just because of large input (the worst thing that can happen is that the output bounding interval never improves to anything better than $0 \pm \infty$).