Generics in FLINT

July 19, 2022

My main project this year (apart from making performance improvements to Arb) is a generics system for FLINT written in plain C. The goals of this system include:

to provide a unified, consistent interface to all FLINT, Arb, Antic and Calcium types
to allow constructing new, fully recursive domains (polynomials, matrices, series, etc.) over existing domains, with high performance
to extend the scope of Arb- and Calcium-style mathematically rigorous computation over noncomputable domains
to reduce bloat while maintaining and improving performance of the aforementioned libraries

The generics code is currently part of a separate git repository called generic-rings (rough documentation here) but my intention is to merge the code with FLINT, Arb, etc. when it is fully usable. I'm probably half way towards that goal right now, and there could be a stable release before the end of the year if things go smoothly. This post will discuss the motivation and design of the system; as usual, feedback and code contributions are appreciated.

Contents

Quick overview
Interfaces are hard
Aiming for correctness
Simplifying and extending FLINT and friends

Quick overview

The approach is pretty standard. I'm using a parent-element model rather similar to SageMath or Nemo. A parent (mathematical domain) is represented by a context object of type gr_ctx_t which stores a method table (for all the usual operations like init, clear, set, equal, add, mul, sqrt, exp, ...) and information like the size and type of elements, mathematical properties of the domain, precision, modulus, precomputed data, option flags, etc.

Elements of basic rings are represented by the usual FLINT/Arb/Antic/Calcium types (fmpz_t, arb_t, etc.) which are passed around by void pointers with type alias gr_ptr. This means that there is no space overhead (a vector of 100 fmpz_t elements takes up exactly 100 words), and existing FLINT functions can be used freely together with generic functions when the particular type is known. There is also a type-agnostic interface for stack-based or heap-based allocation of single elements or vectors of elements. There are then types for structures with generic elements (gr_mat_t, gr_poly_t, gr_mpoly_t, etc.) which themselves can be used as generic elements, making the system fully recursive.

Here is a short C program to illustrate the syntax:

#include "gr.h"

int main()
{
    gr_ctx_t QQ, RR;
    gr_ptr x, y;
    int status = GR_SUCCESS;

    gr_ctx_init_fmpq(QQ);
    gr_ctx_init_real_arb(RR, 53);

    /* efficient stack based allocation of variables */
    GR_TMP_INIT(x, QQ);
    GR_TMP_INIT(y, RR);

    /* compute 1/3 in QQ */
    status |= gr_set_ui(x, 1, QQ);
    status |= gr_div_ui(x, x, 3, QQ);

    /* convert to RR -- we can either use the knowledge that x is
       an fmpq_t, or use a generic conversion */
#if 0
    status |= gr_set_fmpq(y, x, RR);
#else
    status |= gr_set_other(y, x, QQ, RR);
#endif

    gr_ctx_println(QQ);
    gr_println(x, QQ);
    gr_ctx_println(RR);
    gr_println(y, RR);

    assert (status == GR_SUCCESS);

    GR_TMP_CLEAR(x, QQ);
    GR_TMP_CLEAR(y, QQ);

    gr_ctx_clear(RR);
    gr_ctx_clear(QQ);

    return 0;
}

This is the output:

Rational field (fmpq)
1/3
Real numbers (arb, prec = 53)
[0.3333333333333333 +/- 7.04e-17]

The C code is quite verbose, but things look nicer when you wrap the generics interface in a higher-level language. The repository includes a work-in-progress Python wrapper using ctypes; on the Python side, it takes only a few hundred lines of code to expose all the basic functionality for all currently available types and domains. This wrapper is only meant for testing right now; when things are more stable, I plan to do a more efficient Cython rewrite which should be able to replace the current Python-FLINT package. Here is the previous example in Python:

>>> from flint import *
>>>
>>> # already defined by default:
>>> # QQ = RationalField_fmpq(); RR = RealField_arb(prec=53)
>>>
>>> x = QQ(1) / 3
>>> y = RR(x)
>>> 
>>> QQ
Rational field (fmpq)
>>> x
1/3
>>> RR
Real numbers (arb, prec = 53)
>>> y
[0.3333333333333333 +/- 7.04e-17]

For a less trivial example, we compute $b = a^2$ where $$a = \begin{pmatrix} 1+(i/3)x & 1 \\ 1 & 1 \end{pmatrix}$$ in $\operatorname{Mat}_{2\times 2}(\overline{\mathbb{Q}}[x])$, then convert the result to $c = b$ with $c \in \operatorname{Mat}_{2\times 2}(\mathbb{C}[x])$, and finally compute $\det(b)$ and $\det(c)$ in the respective domains:

>>> Rx = PolynomialRing(QQbar)
>>> x = Rx([0, 1])
>>> M = MatrixRing(Rx, 2)
>>> i = QQbar(-1).sqrt()
>>> a = M([[1 + (i/3)*x, 1], [1, 1]])
>>> b = a**2
>>> c = MatrixRing(PolynomialRing(CC), 2)(b)
>>> b
[[[2, Root a = 0.666667*I of 9*a^2+4, -1/9], [2, Root a = 0.333333*I of 9*a^2+1]],
[[2, Root a = 0.333333*I of 9*a^2+1], [2]]]
>>> c
[[[2.000000000000000, [0.666666666666667 +/- 4.82e-16]*I, [-0.1111111111111111 +/- 1.89e-17]], [2.000000000000000, [0.3333333333333333 +/- 7.04e-17]*I]],
[[2.000000000000000, [0.3333333333333333 +/- 7.04e-17]*I], [2.000000000000000]]]
>>> b.det()
[0, 0, -1/9]
>>> c.det()
[0, [+/- 4.45e-16]*I, [-0.1111111111111111 +/- 9.75e-17]]

(Polynomials currently print as lists, so the result is $-x^2/9$.)

The interface is still too incomplete to do anything really useful, but this should give an idea of what the end result will look like. There is also work-in-progress support for groups, residue rings and finite fields.

Interfaces are hard

Why bother with this? One reason is user-friendliness. There are currently many ways to use FLINT and related libraries, none perfect:

The C interface itself (low-level, inconsistent)
Nemo.jl/Oscar (the most complete high-level interface, but requires Julia which makes it a non-option e.g. for Python users; also suffers from some historical design decisions and design flaws in Julia)
SageMath (only wraps some types, often buggy, design flaws, requires bloated Sage library)
Python-FLINT (only wraps some types, some design flaws, hard to install and doesn't work on Windows, poorly maintained the last few years)
A poorly-maintained, incomplete and overly complicated C++ interface
Arblib.jl (Arb types only)
PariTwine for interfacing with Pari/GP (highly incomplete)
Other third-party bindings

When I was younger and even more naive than today, I thought that the challenge in mathematical software is to implement really fancy algorithms to solve really fancy mathematical problems really efficiently. Once you have code to solve $X$, it is a simple matter for people who need to solve $X$ to to call your code.

That turns out to be wrong. A lot of the time, people will not use your fancy code, for one of the following reasons: they don't understand the interface, they don't like the interface, there is no interface in their favorite programming language, or the software is too complicated to install. Worse, they may use your code incorrectly and conclude that your code is incorrect (or perhaps they never notice, and end up with a wrong theorem in a paper or an exploding rocket or something).

The real challenge is to make code accessible, functional and integrated across library, language and system barriers. This requires massive developer effort. Projects like SageMath and Oscar depend on many, many person-years spent solely on developing and maintaining interfaces between components, with inadequate funding and mixed results.

Until the DeepAIs and OpenMinds of the world figure out how to create AI-based tools that solve the integration problem automatically (and I doubt they'll see any profit incentives to get into computer algebra any time soon), it seems the most constructive thing to do is to try to make interfaces simpler for human developers and users. What does this mean for FLINT and its wrappers?

FLINT and its descendants expose hundreds of C types and thousands of functions, with many inconsistencies in the API or the semantics. Here are some examples of inconsistencies:

Some types (e.g. fmpz) are passed by reference, others (e.g. ulong) by value.
Some types depend on a context object to encode information about the parent domain (e.g. fmpz_mpoly_t); others store such information as part of the type (e.g. nmod_poly_t) while others require passing around extra parameters (e.g. the precision for arb_t).
Some functions return error codes, some throw exceptions, some abort (flint_abort), some allow nonstop computation by incorporating NaN-like values (e.g. arb_t).
Some mathematical domains have a canonical dedicated type (e.g. fmpq for rational numbers); others have multiple implementations (e.g. fq, fq_nmod, fq_zech).
Most types are exact, but some are inexact (e.g. arb_t) while others are exact but implement undecidable operations which can fail (e.g. ca_t). Predicates are handled differently (representation-based comparisons, pessimistic interpretations, triple-valued logic, errors).
There are interface differences between homologous functions; for example fmpz_set_si takes one input parameter but fmpq_set_si takes two (both a numerator and a denominator). Some functions take the arguments in a nonstandard order.
Some types miss useful type variants of standard functions (e.g. add_ui).
Most but not all functions allow aliasing.
Matrices, unlike other types, require allocating results with the right shape.
Varying support for text- and file-based input and output and conversions between different types.

Many of these inconsistencies exist for good technical reasons while others are historical accidents; either way, inconsistencies pose a problems both for C users and for wrappers in other languages. Adding to this complexity, wrappers in languages like Julia and Python introduce their own problems:

They need to handle conversions between types (potentially $N^2$ conversions for $N$ types).
They need to handle interactions with external types and with language constructs that often weren't designed with mathematics in mind.
Lack of static type checking makes it difficult to ensure correctness.
The C calls require duplicating type information from C headers. This often requires a lot of manual work and can be error-prone when there is no type checking built into the C FFI. (You'd wish there was a standardized automatic solution that didn't involve the clusterfun of parsing header files. Arblib.jl has an interesting approach, generating type signatures automatically from Arb's documentation instead!)

I've spent a lot of time myself working on several wrappers for FLINT and it just isn't fun. All of this has led me to conclude that FLINT needs a drastically simpler C interface. Changing the existing interfaces and breaking compatibility is not an option, but a generics layer that provides a uniform interface to the existing types (in essence, an interface to C in C) can help solve the problem.

Aiming for correctness

The FLINT generics system will not be perfect in terms of mathematical correctness, but it will benefit from several lessons I've learned working on Arb and Calcium and studying the design of computer algebra system and proof assistants. Correctness is not strictly a question of semantics: the interface design is closely related.

Consistent error handling

One of the major design decisions is that all operations return a status code (GR_SUCCESS, GR_DOMAIN or GR_UNABLE). This makes it possible to catch and gracefully recover from a variety of errors: overflow, conversions to the wrong type, undecidable predicates, missing implementations, and so on. One of the lessons that came out of Calcium is that it is useful to distinguish between precisely two classes of errors:

a "domain" error is a mathematical problem (division by zero, attempt to convert a non-integer to an integer, etc.)
an "unable" error is an implementation problem (undecidable comparison because of inexact representations, result too large to store in memory, missing algorithm, etc.)

For example, a "domain" error computing the inverse $A^{-1}$ of a matrix proves that the matrix is singular while an "unable" error signifies that the implementation does not know how to compute or prove non-existence of an inverse (for numerical or algebraic reasons). To illustrate this, note the different exceptions thrown by the Python wrapper when trying to invert a singular matrix using exact and inexact arithmetic:

>>> a = MatrixRing(QQ, 2)([[1, QQ(1)/3], [3, 1]])
>>> b = MatrixRing(RR_arb, 2)(a)
>>> 
>>> a ** (-1)
  ...
ValueError: x ** y is not defined for x = [[1, 1/3],
[3, 1]]
, y = -1 over Ring of 2 x 2 matrices over Rational field (fmpq)
>>> b ** (-1)
  ...
NotImplementedError: unable to compute x ** y for x = [[1.000000000000000, [0.3333333333333333 +/- 7.04e-17]],
[3.000000000000000, 1.000000000000000]]
, y = -1 over Ring of 2 x 2 matrices over Real numbers (arb, prec = 53)

These two status flags subsume a lot of special-purpose error-handling logic currently used in FLINT and especially in Arb and Calcium which in many cases do not distinguish between "domain" errors and "unable" errors.

Many applications will just want to intercept any errors and throw appropriate exceptions or terminate execution, but it is possible to use the flags as part of fine-grained control flow.

While it may seem extraneous to return an error code from a function like gr_add which rarely should be expected to fail, one of the interesting applications is that one can implement interruptible computations. For example, it is possible to have a size-checked domain of integers limited to 1000 digits so that one can try a computation over $\mathbb{Z}$ which safely aborts if the coefficients become too large.

Why return codes and not (1) proper exceptions, or (2) special (NaN-like) values? First, C doesn't support proper exceptions; they can be simulated, but this isn't memory safe. For (2), I'm increasingly coming to realize that it is a very bad idea to have elements of a mathematical structure that are not actually elements of that structure (see also below). And while you can have a domain like $\mathbb{Q} \cup \{ \operatorname{NaN} \}$ where errors can be represented as in-domain results, you still need some way to handle errors when you're specifically working in $\mathbb{Q}$, when converting to machine types (overflowing a ulong), etc.

What about the danger of unhandled error flags? The results still remain to be seen, but for C programming, I've found that __attribute__((warn_unused_result)) works extremely well. (Of course, wrappers will need to be careful about this.)

Mathematical domains

Domains in the FLINT generics system will be based on math. This means, for example, that there will be a clear distinctions between actual rings and pseudo-rings.

This is most important when it comes to types with inexact representation. There is a field of real numbers represented by Arb balls which really represents the mathematical domain $\mathbb{R}$: the field axioms hold and the operator "=" compares equality of real numbers, not equality of ball representations. You can certainly also have a structure of Arb balls viewed as balls (i.e. as sets), with pointwise operations and structural comparisons, but this structure will necessarily not be a mathematical ring or field!

Similarly, an Arb-based "real field" does not contains special values like $\pm \infty$ and NaN, even though the underlying Arb types allows representing such values. You can have a non-finite interval like $[\pm \infty]$ in such a real field, but this will be interpreted as an unknown real number and not as an unknown extended real number. (You can certainly have a structure representing the extended real numbers as well, but it will again not be a field!)

Building on the principles of interval arithmetic

The generics system follows the principle of either returning errors or computing inclusions (in the sense of interval arithmetic) for results that cannot be computed exactly. Just like operations can return "unable to perform the computation", predicates return True, False or Unknown to represent inclusions of boolean values. The generics system uses triple-valued logic internally.

Since most programming languages have poor support for triple-valued logic, probably the safest way to handle unknown truth values in a wrapper is to throw an exception. This is what the Python wrapper does currently:

>>> a = (RR_arb(1) / 3) * 3
>>> a
[1.00000000000000 +/- 3.89e-16]
>>> a == 2
False
>>> a == 1
  ...
Undecidable: unable to decide x == y for x = [1.00000000000000 +/- 3.89e-16], y = 1.000000000000000 over Real numbers (arb, prec = 53)

Exceptions are inconvenient when you want to branch on definite truth values (i.e. with one branch for True and another for False/Unknown), but it is rather easy to make this behavior configurable. A cute possibility is to have context managers that translate Unknown to False (this is how most Arb interfaces currently handle predicates) and True respectively:

>>> with pessimistic_logic:
...     a == 1
...
False

>>> with optimistic_logic:
...     a == 1
...
True

Domain stability

The generics system is completely type-stable: results stay in the domains constructed by the user; they cannot silently change value to a larger domain. On the C level, stability is enforced by the fact that standard functions take a single context parameter and require all operands to have the same type (except for explicit conversion functions).

In the Python interface, operands can be coerced to the larger domain when one domain is included in the other exactly or via a homomorphism (the logic for such conversions is implemented at the C level), but operations will not (by default) construct larger domains. For example, an inexact division in $\mathbb{Z}$ gives an error.

>>> ZZ(1) / 2
  ...
ValueError: x / y is not defined for x = 1, y = 2 over Integer ring (fmpz)

To produce a fraction in the Python interface, at least one operand needs to be an element of $\mathbb{Q}$:

>>> ZZ(1) / QQ(2)
1/2

We can go further. Irrational numbers are not elements of $\mathbb{Q}$:

>>> a = ZZ(1) / QQ(2)
>>> a ** a
  ...
ValueError: x ** y is not defined for x = 1/2, y = 1/2 over Rational field (fmpq)

The field of algebraic number does the job, however:

>>> a ** QQbar(a)
Root a = 0.707107 of 2*a^2-1

For transcendental numbers, we need to pass to the real field (either an inexact Arb-based representation or an exact Calcium-based representation will work):

>>> b = a ** QQbar(a)
>>> b ** b
  ...
ValueError: x ** y is not defined for x = Root a = 0.707107 of 2*a^2-1, y = Root a = 0.707107 of 2*a^2-1 over Complex algebraic numbers (qqbar)
>>> 
>>> b ** RR(b)
[0.782654027355680 +/- 5.20e-16]
>>> b ** RR_ca(b)
0.782654 {a where a = 0.782654 [Pow(0.707107 {(b)/2}, 0.707107 {(b)/2})], b = 1.41421 [b^2-2=0]}

The same story holds for promotions to complex numbers:

>>> RR(-1).sqrt()
  ...
ValueError: sqrt(x) is not defined for x = -1.000000000000000 over Real numbers (arb, prec = 53)
>>> CC(-1).sqrt()
1.000000000000000*I

More generally, "complex" coercions like $\mathbb{Z}[x] + \mathbb{Q} \to \mathbb{Q}[x]$ are not done automatically:

>>> ZZx([1,2,3]) + QQ(1) / 2
  ...
ValueError
>>> QQx([1,2,3]) + QQ(1) / 2
[3/2, 2, 3]

Some people will find lack of automatic domain extensions and complex coercions a limitation. Supporting them in some form in the future is certainly a possibility, but for now, I'm inclined to think of domain stability as a feature which makes it easier to reason algebraically about code. There is not always an obvious canonical choice of extension (is the value of $\sqrt{-1}$ an element of $\mathbb{Z}[i]$, $\mathbb{Q}[x] / \langle x^2 + 1\rangle$, $\overline{\mathbb{Q}}$, $\mathbb{C}$, or the algebraic closure of a $p$-adic field?), and value-dependent return types come with a bunch of issues.

Randomized testing

Projects like FLINT and Arb rely heavily on randomized testing. The FLINT generics system comes with a test system that can check that builtin or user-defined implementations of mathematical structures satisfy the expected properties (testing aliasing rules, type conversions, commutativity, associativity, distributivity, homomorphisms, predicate inclusions, etc.). Randomized testing does not rigorously guarantee correctness, but practically speaking provides very strong protection against bugs.

Simplifying and extending FLINT and friends

The goal is not only to wrap existing functionality. As the name "generics" suggests, the goal is to be able to do generic programming directly in C. This opens the door for things like multivariate polynomials with Arb coefficients, power series types, and sparse matrices, which until now have been too much trouble to do in C.

In fact, there is already plenty of ad-hoc generic code in FLINT and its descendants:

Function pointer generics are used in some multivariate polynomial code and in the fmpz_mod module.
Template-based generics for the different finite field (fq) implementations.
Type union + switch statement generics are used for fq_default and Antic's nf.
Some Arb functions use ugly hacks to distinguish between constants and power series, etc.
Good old code duplication (copy-pasted implementations of the same algorithm for different rings, with minor variations).

The generics module could potentially replace such code with a single implementation, eliminating hundreds of thousands of lines of code in these libraries. This should simplify maintenance and improve build times (one of many reasons why I'm not interested in using C++ templates for generics). There is some overhead working with function pointers, but I'm optimistic that the performance impact will be negligible; crucially, the generics system allows in-place operations and generic algorithms can be overridden with specialized versions. For example, generic implementations of polynomials and matrices should perform very close to optimally when the base ring provides overrides for vector operations, polynomial multiplication and matrix multiplication. Fully generic code will probably not be a viable replacement for some core types like the nmod, fmpz and fmpq polynomials and matrices in FLINT, but it could replace the implementations for more complex rings.

I have not done much systematic profiling yet, but here is a quick example of timings (in seconds) for solving a random size-$n$ linear system $Ax = b$ over $\mathbb{Z}/p\mathbb{Z}$ where $p$ is a random 100-bit prime:

n         fmpz_mod_mat_solve    gr_mat_nonsingular_solve_lu
--------------------------------------------------------------
1         1.32e-06              7.02e-07
2         2.9e-06               2.21e-06
4         7.2e-06               6.03e-06
8         2.15e-05              1.88e-05
16        7.8e-05               6.87e-05
32        0.000329              0.000285
64        0.00169               0.00138
128       0.00883               0.00721
256       0.0469                0.0402
512       0.251                 0.215
1024      1.472                 1.432

The generic matrix code (gr_mat_t on top of the fmpz_mod base ring) is even a little bit faster than fmpz_mod_mat! (More commonly there should be a few percent slowdown, but I suppose fmpz_mod_mat has not been optimized all that well.) The fmpz_mod_mat module in FLINT is currently more than 3000 lines of code; once the generics system is in place, it could be replaced with a thin wrapper around the generics code for backward compatibility.

There is also the prospect of further specializing methods at runtime for performance. For example, Arb's variable-precision ball arithmetic could be complemented with fixed-precision ball and floating-point arithmetic that enables more low-level optimization and vectorization, and small residue rings and finite fields could be packed into smaller structures. Indeed, there is an experimental nmod8 type in the generics code for integers modulo 8-bit $n$, requiring only 1/8 as much space as FLINT's nmod code which always works with 64-bit words.

fredrikj.net | Blog index | RSS feed | Follow me on Mastodon | Become a sponsor