# A series complexity problem

January 20, 2022

In this post, a curious observation: for certain power series of the form $f = g / h$ (specifically, where $g$ and $h$ are holonomic), the $n$-th coefficient can be computed faster than $\Theta(n \log n)$ — but apparently only barely faster. Is it possible to do even better? Examples of sequences with generating functions of this kind include Bernoulli numbers (exponential generating function $f = g / h$ with $g = X$ and $h = e^X - 1$) and Euler numbers ($g = 1$, $h = \cosh(X)$).

## Background

Given a formula defining a formal power series $f = \sum_{n \ge 0} f_n X^n \in \mathbb{K}[[X]]$, how fast (counting operations in the field $\mathbb{K}$) can we compute the isolated coefficient $f_n$? The result depends on the class of formulas considered. Here are four important complexity results for common classes. (Remark: some of the algorithms assume that the characteristic of the field $\mathbb{K}$ is zero or growing with $n$, and compositions are assumed to be formal compositions.)

1. $O(\log n)$ if $f$ is a rational function $f \in \mathbb{K}(X)$
• Examples: sequences $f_n$ in this class include powers $c^n$ and Fibonacci numbers.
• Algorithm: the sequence $f_n$ satisfies a linear difference equation with constant coefficients over $\mathbb{K}$. Write this recurrence in matrix form and compute the $n$-th power of the matrix.
2. $O(n^{1/2+o(1)})$ if $f$ is a holonomic function (a solution of a linear differential equation with polynomial coefficients)
• Examples: sequences $f_n$ in this class include factorials, hypergeometric sequences, harmonic numbers, and Motzkin numbers. Algebraic functions are a subset of this class.
• Algorithm: write down a linear difference equation with rational function coefficients for the coefficients. Write the recurrence in matrix form and compute the matrix product using a baby-step giant-step technique, applying FFT-based multipoint evaluation to polynomials of length $O(\sqrt{n})$.
3. $O(n \log n)$ if $f$ is an elementary function, or more generally if $f = g(h_1,\ldots,h_k)$ where $g$ is an elementary function and $h_1,\ldots,h_k$ are arbitrary power series where the first $n$ coefficients are given.
• Examples: sequences in this class include Bernoulli numbers and Bell numbers (elementary exponential generating functions), and partition numbers (generating function $1 / h$ where $h$ can be constructed in $O(n)$ operations).
• Algorithm: write $f$ as a composition of arithmetic operations, exponentials and logarithms. Each composition can be performed in $O(n \log n)$ using FFT multiplication and Newton iteration.
4. $O(n^{3/2+o(1)})$ if $f = g(h)$ where $g$ and $h$ are arbitrary power series and the first $n$ coefficients of $g$ and $h$ are given.
• Algorithm: a divide-and-conquer method by Brent and Kung.
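To make the class-1 bound concrete, here is a minimal sketch (plain Python, with exact integers standing in for field elements) of the matrix-power method for the Fibonacci numbers: binary powering of the companion matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ uses $O(\log n)$ matrix multiplications, hence $O(\log n)$ field operations.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def fib(n):
    """n-th Fibonacci number via binary powering of the companion matrix."""
    result = [[1, 0], [0, 1]]  # identity
    m = [[1, 1], [1, 0]]
    while n:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    # M^n = [[F_{n+1}, F_n], [F_n, F_{n-1}]]
    return result[0][1]
```

The same scheme works for any linear recurrence with constant coefficients, with the companion matrix of the recurrence in place of the Fibonacci matrix.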

The bound in class 1 is optimal since $\Omega(\log n)$ is a lower bound for the arithmetic complexity of exponentiation. Without specific assumptions, improving the upper bounds or proving nontrivial lower bounds in the other cases seems hard. However, we can get better complexity bounds by restricting to special cases, either making stronger assumptions about the field $\mathbb{K}$ or considering more restricted classes of functions $f$. For example, Bostan, Caruso, Christol and Dumas have shown that it is possible to compute the $n$-th coefficient of an algebraic power series (a subclass of holonomic power series - class 2) using $O(p^{1/2+o(1)} \log n)$ operations when $\mathbb{K}$ has characteristic $p$. For generic composition (class 4), Kedlaya and Umans have given an $O((n \log q)^{1+o(1)})$ algorithm in the case $\mathbb{K} = \mathbb{F}_q$, and a recently discovered randomized (Las Vegas) algorithm achieves $O(n^{1.43})$ expected complexity assuming that $\mathbb{K}$ is sufficiently large.

## Sums and products of holonomic functions

Holonomic functions satisfy certain closure properties: for example, if $g$ and $h$ are holonomic functions, then $g + h$ and $g h$ are also holonomic. In other words, holonomic functions generate a ring, which contains functions like $f = \sin(X) \cdot \sqrt{1+X} \cdot \log(1+X)$.

Suppose that we are given any function of the form $f \in \mathbb{K}[h_1, \ldots, h_k]$ where $h_1, \ldots, h_k$ are holonomic. Using closure properties, we can construct an explicit recurrence relation and initial values defining $f$. Using the method of class 2 above, we thus need at most $O(n^{1/2+o(1)})$ operations to compute the $n$-th coefficient of $f$. This is much better than the $O(n \log n)$ method of expanding the power series $h_1, \ldots, h_k$ and evaluating the multivariate polynomial using power series additions and multiplications (the method of class 3).
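As a toy illustration, consider the concrete example $f = e^X/(1-X)$, a product of the holonomic series $e^X$ and $1/(1-X)$ (this choice of example is mine, not from any reference). From $(1-X) f' = (2-X) f$ one reads off the recurrence $(n+1) f_{n+1} = (n+2) f_n - f_{n-1}$. The sketch below merely unrolls the recurrence in $O(n)$ operations; the $O(n^{1/2+o(1)})$ method of class 2 would instead evaluate the associated matrix product by baby-step giant-step.

```python
from fractions import Fraction

def coeff_exp_over_1mx(n):
    """n-th power series coefficient of exp(X)/(1-X), i.e. sum_{k<=n} 1/k!."""
    # The ODE (1-X) f' = (2-X) f gives (n+1) f_{n+1} = (n+2) f_n - f_{n-1}.
    if n == 0:
        return Fraction(1)
    f_prev, f_cur = Fraction(1), Fraction(2)  # f_0 = 1, f_1 = 1 + 1 = 2
    for k in range(1, n):
        f_prev, f_cur = f_cur, (Fraction(k + 2) * f_cur - f_prev) / (k + 1)
    return f_cur
```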

The same is true for $g(A)$ where $A$ is an algebraic function, but not in general for $A(g)$. For example, $\exp(X/(1+X))$ is holonomic, but $X / (\exp(X) - 1)$ (the generating function of Bernoulli numbers) is not.

## Quotients of holonomic functions

Holonomic functions are not closed with respect to division, so how can we deal with quotients $f = g / h$, or equivalently, functions of the form $f = R(h_1, \ldots, h_k)$ where $R \in \mathbb{K}(X_1, \ldots, X_k)$ is a multivariate rational function and $h_1, \ldots, h_k$ are holonomic functions? The obvious method of expanding the numerator and denominator into power series of length $n$ and performing the power series division using FFT and Newton iteration costs $\Theta(n \log n)$. The idea for speeding things up is to use series multisection. The simplest case is bisection: we can split $f = g/h$ into its even and odd parts $$f_{\text{even}}(X) = \frac{f(X) + f(-X)}{2} = \frac{g(X) h(-X) + g(-X) h(X)}{2 h(X) h(-X)},$$ $$f_{\text{odd}}(X) = \frac{f(X) - f(-X)}{2} = \frac{g(X) h(-X) - g(-X) h(X)}{2 h(X) h(-X)}.$$ To compute a single coefficient $f_n$, we only need one of the parts (according to whether $n$ is even or odd). After deflating the numerator and denominator, we are left with a power series division of length $n/2$ instead of length $n$. More generally, we can write each $m$-section of $f$ as a linear combination of $f$ evaluated at $m$-th roots of unity. This leads to the following algorithm for extracting a single coefficient:

1. Choose $m$ and compute recurrence operators and initial values for $g'$ and $h'$ such that $[X^i] (g'/h') = [X^{mi+j}] (g/h)$ for some $j < m$ and $i < n/m$. (Closure properties ensure that we can find such deflated numerator and denominator series $g'$ and $h'$, which are holonomic.)
2. Expand $g'$ and $h'$ to length $n/m$ using recurrences.
3. Do a power series division of length $n/m$.
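For the Bernoulli numbers ($g = X$, $h = e^X - 1$), the bisection works out explicitly: the even part simplifies to $f_{\text{even}}(X) = X \sinh X / (2(\cosh X - 1))$, and substituting $Y = X^2$ deflates both series. Below is a sketch in Python with exact rational arithmetic; it uses a naive $O(N^2)$ series division where the real algorithm would use FFT-based division of length about $n/2$, and it writes down the deflated series directly rather than deriving them from recurrence operators.

```python
from fractions import Fraction
from math import factorial

def bernoulli_even(n):
    """B_n for even n >= 2, via the even bisection of X/(exp(X) - 1)."""
    assert n % 2 == 0 and n >= 2
    N = n // 2 + 1  # coefficients needed in the deflated variable Y = X^2
    # X*sinh(X) = sum_{k>=0} X^(2k+2)/(2k+1)!; dividing out one factor Y
    # gives num[k] = 1/(2k+1)!.  Likewise 2*(cosh(X) - 1) deflates to
    # den[k] = 2/(2k+2)!.
    num = [Fraction(1, factorial(2 * k + 1)) for k in range(N)]
    den = [Fraction(2, factorial(2 * k + 2)) for k in range(N)]
    # Power series division q = num/den to length N (naive O(N^2) here).
    q = [Fraction(0)] * N
    for k in range(N):
        q[k] = (num[k] - sum(den[j] * q[k - j] for j in range(1, k + 1))) / den[0]
    # [X^n] X/(exp(X)-1) = [Y^(n/2)] (num/den), and B_n = n! * [X^n] f.
    return factorial(n) * q[n // 2]
```

The division only runs to length $n/2 + 1$ instead of $n + 1$, which is exactly the saving the bisection buys.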

Multisection is an old trick that has been used specifically for computations of Bernoulli numbers, mainly with small $m$ ($m = 2$ or $m = 4$, say). A detailed analysis can be found in Kevin Hare's master's thesis. Hare considers quotients of "poly-exponential functions", but it is easy to see that the idea generalizes to holonomic functions.

The new (to my knowledge) observation I'm going to add here is that we can choose $m$ as a function of $n$ to get an asymptotic speedup. Hare does not analyze this kind of algorithm, writing: "Unfortunately for large $m$ it becomes impractical to determine what these lacunary recursion formulae are as the time to determine the recursion formulae and the complexity of these recursion formulae far exceeds the time to calculate these values with smaller gaps". Nevertheless, here is a heuristic analysis of what happens when we vary $m$. Assume that the order and degree of the recurrences increase exponentially with $m$ so that step 1 costs $O(\exp(A m))$ and step 2 costs $O(n \exp(A m))$ for some constant $A$. The total complexity of the algorithm is then $O(\exp(A m) + n \exp(A m) + (n/m) \log(n/m))$. Now, if we choose $m = \log \log \log n$, then the complexity is $$O\left(n (\log \log n)^A + \frac{n}{\log \log \log n} \log\left(\frac{n}{\log \log \log n}\right)\right) = O\left(\frac{n \log n}{\log \log \log n}\right)$$ which is asymptotically faster than $\Theta(n \log n)$! The order and degree bounds for closure properties of holonomic functions (and the running times of algorithms to compute closure properties) can be made explicit, so it should be possible to do this analysis rigorously and with explicit constants.
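As a sanity check on the heuristic, one can tabulate the cost model numerically. This is only a toy: the constant $A = 1$ and the implied unit costs are arbitrary assumptions, and floating-point evaluation stands in for a real operation count.

```python
# Toy evaluation of the heuristic cost model
#   cost(n, m) = exp(A*m) + n*exp(A*m) + (n/m)*log(n/m)
# with the arbitrary choice A = 1, comparing m = 1 (plain power series
# division, ~ n log n) against m = log log log n.
from math import exp, log

A = 1.0

def cost(n, m):
    return exp(A * m) + n * exp(A * m) + (n / m) * log(n / m)

def ratio(n):
    """Cost with m = log log log n relative to plain division (m = 1)."""
    m = max(1.0, log(log(log(n))))
    return cost(n, m) / cost(n, 1.0)

for e in (10, 50, 100, 300):
    print(f"n = 1e{e}:  cost ratio = {ratio(10.0 ** e):.3f}")
```

The ratio falls below 1 and keeps shrinking as $n$ grows, consistent with the claimed $O(n \log n / \log \log \log n)$ bound, though the decay is (as expected) extremely slow.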

## Open problems

A small research project could be to perform a rigorous complexity analysis of the multisection method and determine the theoretically optimal value of $m$ as a function of $n$. (Hopefully, the basic analysis above is correct so that there really is a speedup!) Here are some harder problems to think about:

• Is there an even faster algorithm? I don't see a way to do better using multisection alone, but if there is a "dumb" algorithm that performs barely better than $\Theta(n \log n)$, perhaps there is a more clever algorithm that is substantially better (sublinear in $n$?).
• What can be done about the bit complexity, e.g. over $\mathbb{K} = \mathbb{Q}$?
• Can this be generalized to other operations? What about algebraic functions of holonomic functions, exponentials of holonomic functions, etc.? It is worth noting that a method for coefficient extraction of quotients immediately translates to a method for coefficient extraction of logarithms (since the logarithm of a power series is just a quotient followed by term-by-term integration), but I have no idea how to do this for exponentials.
• A special case: what is the arithmetic complexity of computing the $n$-th Bernoulli number? (Here one might have to be a bit careful about the model of computation.)

I have mentioned this problem to a few experts in the field, but so far without much response.