# A FLINT example: Lambert W function power series

March 11, 2011

Two days ago, a new version of the Fast Library for Number Theory (FLINT) was released. I contributed a lot of new code to this release, including linear algebra speed improvements and new functionality for fast power series arithmetic and computation of special numbers and polynomials (see the release announcement and some of my benchmarking results).

In this blog post I’ll demonstrate how to do power series arithmetic with FLINT, using its fmpq_poly module which implements polynomials over the rational numbers Q. Standard operations (addition, multiplication and division) were available before; the functions I’ve added include square root, log, exp, sin, tan, atan, etc. (all the usual elementary functions). The same functions are also available for power series over a finite field Z/nZ (with word-size n). Everything is asymptotically fast (the running time is linear in the size of the output, up to logarithmic factors).

Of course, transcendental functions are a bit restricted when considered over Q or Z/nZ, since it’s only possible to obtain power series expansions at specific rational points (in most cases just x = 0). So at present, some very interesting numerical applications of fast power series arithmetic are not supported. But some time in the future, we’ll probably add support for numerical power series over the reals and complexes as well.

As today’s example, let us implement the Lambert W function for the power series ring Q[[x]]. The Lambert W function is defined implicitly by the equation x = W(x) exp(W(z)), which can be solved using Newton iteration with the update step w = w – (w exp(w) – x) / ((w+1) exp(w)).

Power series Newton iteration is just like numerical Newton iteration, except that the convergence behavior is much simpler: starting with a correct first-order expansion, each iteration at least doubles the number of correct coefficients.

A simple recursive implementation with asymptotically optimal performance (up to constant factors) looks as follows:

#include
#include "flint.h"
#include "fmpq_poly.h"

void lambertw(fmpq_poly_t w, fmpq_poly_t x, long n)
{
if (n == 1)
{
fmpq_poly_zero(w);
}
else
{
fmpq_poly_t t, u, v;
lambertw(w, x, (n + 1) / 2);
fmpq_poly_init(t);
fmpq_poly_init(u);
fmpq_poly_init(v);
fmpq_poly_exp_series(t, w, n);
fmpq_poly_mullow(u, t, w, n);
fmpq_poly_sub(v, u, x);
fmpq_poly_div_series(u, v, t, n);
fmpq_poly_sub(w, w, u);
fmpq_poly_clear(t);
fmpq_poly_clear(u);
fmpq_poly_clear(v);
}
}


Beyond the base case W(x) = 0 + O(x), the function just computes w to accuracy ceil(n/2), and then extends it to accuracy n using a single Newton step. As we can see, C code directly using the FLINT library interface gets a bit verbose, but this style has the advantage of giving precise control over temporary memory allocation, polynomial lengths, etc. (it is very similar to the interface of GMP/MPIR).

We add a simple test main routine:

int main()
{
fmpq_poly_t x;
fmpq_poly_t w;
fmpq_poly_init(x);
fmpq_poly_init(w);
fmpq_poly_set_coeff_ui(x, 1, 1);
lambertw(w, x, 10);
fmpq_poly_print_pretty(w, "x");
printf("\n");
fmpq_poly_clear(x);
fmpq_poly_clear(w);
}


The output of the program is:

531441/4480*x^9 - 16384/315*x^8 + 16807/720*x^7 - 54/5*x^6 + 125/24*x^5 - 8/3*x^4 + 3/2*x^3 - 1*x^2 + 1*x


It is well known that the coefficients in this series are given in closed form by (-k)k-1 / k!, so we can check that the output is correct.

Computing 1000 terms takes just a few seconds. If this sounds like much, remember that the coefficients grow rapidly: together, the computed numerators and denominators have over 2 million digits!

So far this is perhaps not so interesting, as we could compute the coefficients faster using a direct formula. But the nice thing is that arbitrary compositions are allowed, i.e we can compute W(f(x)) for any given power series f, and this will still be just as fast.

Let’s consider a nontrivial example: the infinite “power tower” T(z) = zzzz... A moment’s reflection shows that this is an analytic function with a rational power series expansion around z = 1. In fact, we have explicitly T(z) = W(-log(z))/(-log(z)). We can compute this series expansion (in the shifted variable x = z – 1) as follows:

int main()
{
fmpq_poly_t x;
fmpq_poly_t w;
long n = 10;
fmpq_poly_init(x);
fmpq_poly_init(w);
fmpq_poly_set_coeff_ui(x, 0, 1);
fmpq_poly_set_coeff_ui(x, 1, 1);
fmpq_poly_log_series(x, x, n + 1);
fmpq_poly_neg(x, x);
lambertw(w, x, n + 1);
fmpq_poly_shift_right(w, w, 1);
fmpq_poly_shift_right(x, x, 1);
fmpq_poly_div_series(w, w, x, n);
fmpq_poly_print_pretty(w, "x");
printf("\n");
fmpq_poly_clear(x);
fmpq_poly_clear(w);
}


The only complication is that fmpq_poly_div_series requires a nonzero leading coefficient in the denominator, so we must shift both series down one power.

The program outputs:

118001/2520*x^9 + 123101/5040*x^8 + 4681/360*x^7 + 283/40*x^6 + 4*x^5 + 7/3*x^4 + 3/2*x^3 + 1*x^2 + 1*x + 1

To make things nicer, we assume that the coefficients have the form ak / k! (i.e. that T(z) is the exponential generating function for ak) and change the output code to something like the following:

long k;
mpq_t t;
mpz_t u;
mpq_init(t);
mpz_init(u);
for (k = 0; k < n; k++)
{
fmpq_poly_get_coeff_mpq(t, w, k);
mpz_fac_ui(u, k);
mpz_mul(mpq_numref(t), mpq_numref(t), u);
mpq_canonicalize(t);
gmp_printf("%Qd ", t);
}
mpq_clear(t);
mpz_clear(u);


This indeed gives us an integer sequence:

1 1 2 9 56 480 5094 65534 984808 16992144

Now what is the value of the 1000th coefficient (to be precise, a1000, the initial one being the 0th!) in this sequence? After a simple modification of the program, 2.9 seconds of computation gives:

116088723416360877058165139472971678305682655888757200617040183288023530426756681791214166146936295338906200405380900565797054717998071778437757582562676432270594729770831984037179011167877182932317695683927346108840781529292782919617415801022889763531998203556748720236870472740313747820376836354056589570878404139562784693762331122998711070595645913436447537334994232839721368275902686875807251095288080395306471091025409811078916244347343343375806012255865925818202775569656436509351036076228649393400187670469063215003559774586495010151736330831006687588008043886163633208133324925968354018598718396321446522507297042269011590554350050765064097808856685726892919091844572545581642428942983342505179168857619230316014346424101371730872734534492192176599495608409492914591040791939356414531202971705769303257234151456918871942207889248610196901459400077483577940763454422516589494589386979762908326280910675714898537511196619258057757601829560715165754755469941168861084140499195252056413724265130518619966880917401902668151574186675809680229260294868082194497633384642944873208313626575767679265889756445878063163639282166245308180447623432893312520697087313187138285220141409331942812710129867491990841736391939490562342870154316209797955556381777937576606896211985949120247041122030144008552040487919104081821646288468944794572548379308285499126418611400713712447555062853630274495412279277142852027491666742488186890767945371565766096452794814548702964428648297660149787638501522977387119357596043059939423242161640102515280896797542967829629757402705726445239053261557399630212654678115919485631223995547355297477425151029625304838666187951874709256802926224889173882107084716891403043088761748938211657131479578425767585519331805968937010542495567221591600504522701519356853332139872512204043830445131201157613311750725449188186072484468315734307808390196624736783193070534665116557731933519958498663270193078704185994119446629783305199163258244436211827836670241745954935539341498910525641015621246608253851978785829794919003347187955531964814287965653050322140399695072998272983889906823049155302053273484019653833081580196857296769881600411144855641888964455021209598897362668473406912526816735047448372816163718832244604054261282083620649731423678182582137133666912162187578149277916758677659326221406922607543435597637586885441804409524773454375858826053548656981688502940651435148227696208156279868460423027051552771077659399889469617306015354335528530235916712574337562579736559278351853549825129834280128952701817672970606139463650468155476330275845066948765336085851188608302309056603401440047692698200295529572915618836122163118770906896634410940116898688481585685180958996837198544863615413808321802623327256966120967255251353141629521865937921459938657771439492527626159018195922050167504883881038997644963556212956342228712695352450134112412161126957056000000000000000000000000000000000000000000

In fact, if we look up the first 10 coefficients in the On-Line Encyclopedia of Integer Sequences, we find http://oeis.org/A033917. This OEIS entry lists the representation

a(n) = Sum_{k=0..n} Stirling1(n, k)*(k+1)^(k-1)

Since FLINT supports fast vector computation of Stirling numbers, this formula can be implemented efficiently:

#include "fmpz.h"
#include "fmpz_vec.h"
#include "arith.h"

void coefficient(fmpz_t a, long n)
{
long k;
fmpz * s;
fmpz_t t;

s = _fmpz_vec_init(n + 1);
fmpz_stirling1_vec(s, n, n + 1);
fmpz_init(t);
fmpz_zero(a);

for (k = 1; k <= n; k++)
{
fmpz_set_ui(t, k + 1);
fmpz_pow_ui(t, t, k - 1);
}

_fmpz_vec_clear(s, n + 1);
fmpz_clear(t);
}

int main()
{
fmpz_t a;
fmpz_init(a);
coefficient(a, 1000);
fmpz_print(a);
printf("\n");
fmpz_clear(a);
}


And indeed, the output turns out to be the same!

This program is faster, taking only 0.1 seconds to run. But of course, it only gives us a single coefficient, and would be slower for computing a range of values by making repeated calls.

Similar ideas to those presented here (basically, reducing a problem to fast polynomial multiplication using generating functions, Newton iteration, etc.) are used internally by FLINT for computation of the standard elementary functions themselves as well as various special numbers and polynomials (Bernoulli numbers and polynomials, partitions, Stirling numbers, Bell numbers, etc). The internal code uses a lot of tricks to reduce overhead and handle special cases faster, however. (See the previous blog post Fast combinatorial and number-theoretic functions with FLINT 2, and for more recent information the release announcement and benchmarks page linked at the top of this post.)

In other news, I haven’t written a lot of code for mpmath or Sage recently. Of course, my hope is that FLINT (2) will make it into Sage in the not too distant future. The fast polynomial and power series arithmetic support in FLINT will also be very useful for future special functions applications (in mpmath and elsewhere).