fredrikj.net / blog /

Calcium development update

May 13, 2021

I'm making steady progress on Calcium. In this post, I will discuss some of the highlights in the recently released version 0.3 and give a preview of the improvements coming up soon in 0.4.

Symbolic expressions and LaTeX output

I have already written about symbolic expressions, more about symbolic expressions, and about printing algebraic numbers, so this will be very brief. In short, Calcium now supports working unevaluated symbolic expressions, including conversions back and forth between number types and symbolic expressions.

A neat feature is that Calcium objects automatically render to LaTeX in Jupyter notbooks:

A = ca_mat([[5, pi], [1, -1]])**4
A
$$\displaystyle{\begin{pmatrix}a^{2}_{1} + 66 a_{1} + 625 & 8 a^{2}_{1} + 104 a_{1} \\8 a_{1} + 104 & a^{2}_{1} + 18 a_{1} + 1\end{pmatrix}}\; \text{ where } a_{1} = \pi$$
A.charpoly()
$$x^{2}+ \left(-2 a^{2}_{1}-84 a_{1}-626\right) x+ \left(a^{4}_{1} + 20 a^{3}_{1} + 150 a^{2}_{1} + 500 a_{1} + 625\right)\; \text{ where } a_{1} = \pi$$
A.charpoly()(A)
$$\displaystyle{\begin{pmatrix}0 & 0 \\0 & 0\end{pmatrix}}$$

On that note, there is now a demo notebook for Calcium (interactive version) which is a good starting point for new users.

Stronger zero testing and simplification

Calcium is getting significantly better at simplifying or proving equality of numbers involving a mixture of algebraic numbers, elementary functions (exp, log, trigonometric functions), and complex parts (absolute value, sign, argument, real and imaginary parts, conjugate). Here are some examples of equalities that previously didn't work, but now do:

$$\log\!\left(\frac{1}{e^{\sqrt{2} + 1}}\right) = -\sqrt{2} - 1$$ $$\arg\!\left(\sqrt{-\pi i}\right) = -\frac{\pi}{4}$$ $$\arg\!\left(1 - \pi i\right) - i \log\!\left(\frac{1 + \pi i}{\left|1 - \pi i\right|}\right) - i \log\!\left(\operatorname{sgn}\!\left(1 + \pi i\right)\right) = \operatorname{atan}(\pi) $$ $$\left|e^{\sqrt{1 + i}}\right| = e^{\operatorname{Re}\left(\sqrt{1 + i}\right)}$$ $$\frac{\pi + \sqrt{2} + \sqrt{3}}{\pi + \sqrt{5 + 2 \sqrt{6}}} = 1$$ $$\sin\!\left(2 \operatorname{atan}(\pi)\right) + {\left(15 \sqrt{3} + 26\right)}^{1 / 3} = \frac{2 \pi}{{\pi}^{2} + 1} + \sqrt{3} + 2$$ $$\sin\!\left(\frac{1 + \sqrt{2}}{2}\right) = \sqrt{\frac{1 - \cos\!\left(1 + \sqrt{2}\right)}{2}}$$ $$\tan\!\left(\pi \sqrt{2}\right) \tan\!\left(\pi \sqrt{3}\right) = \frac{\cos\!\left(\pi \sqrt{5 - 2 \sqrt{6}}\right) - \cos\!\left(\pi \sqrt{5 + 2 \sqrt{6}}\right)}{\cos\!\left(\pi \sqrt{5 - 2 \sqrt{6}}\right) + \cos\!\left(\pi \sqrt{5 + 2 \sqrt{6}}\right)}$$

Such evaluations basically come down to stronger heuristics for flattening nested field extensions (powers of powers, logarithms of powers) and use of stronger normal forms (factorization, function rewriting) for zero testing. The goal is that Calcium should be able to prove any constant identity involving elementary functions, but that is still a long way off.

A small, related quality-of-life improvement: numerical evaluation for the ca type now has the ability to use symbolic tests to determine (rigorously) whether the real or imaginary part is exactly zero:

>>> z = log(1+i) + log(1-i)
>>> z.nstr(30)                                  # Compute z to 30-digit accuracy as a complex number
'0.693147180559945309417232121458 + 0e-40*I'
>>> z.nstr(30, parts=True)                      # Compute both real and imaginary parts accurately
'0.693147180559945309417232121458'

Performance improvements

More simplifications = more expenses, but Calcium is also getting faster. For example, the huge_expr test problem (https://ask.sagemath.org/question/52653) previously took 24 seconds with qqbar_t arithmetic and 100 seconds with ca_t arithmetic; it now takes 8 seconds with qqbar_t and slightly less than 8 seconds with ca_t:

fredrik@agm:~/src/calcium$ build/examples/huge_expr -ca
Evaluating N...
cpu/wall(s): 0.189 0.189
Evaluating M...
cpu/wall(s): 0.022 0.022
Evaluating E = -(1-|M|^2)^2...
cpu/wall(s): 0.008 0.009
N ~ -0.16190853053311203695842869991458578203473645660641
E ~ -0.16190853053311203695842869991458578203473645660641
Testing E = N...
cpu/wall(s): 7.297 7.297

Equal = T_TRUE

Total: cpu/wall(s): 7.518 7.517
virt/peak/res/peak(MB): 61.76 65.40 34.23 37.79

Performance on some of the DFT benchmarks have also improved by an order of magnitude.

A major speedup came from fixing an incredibly stupid bug: because of a missing assignment of a derivative, numerical evaluation of algebraic numbers always used a slow fallback (computing all roots from scratch) instead of fast Newton refinement. This fix alone made the Calcium test suite as a whole run 2x faster! Other performance improvements have been more incremental. There is still a lot of low-hanging fruit for future optimization.

Series expansions

I'm adding more power series methods for the ca_poly type, making it useful for computing exact Taylor series expansions. For example, exponentials and logarithms of power series are now supported. As a stupid and artificial benchmark problem, let $f = \sum_{n=0}^{\infty} \sqrt{2+n} \, x^n = \sqrt{2} + \sqrt{3} x + \sqrt{4} x^2 + \sqrt{5} x^3 + \ldots$, compute the expansion $g = \exp(f) + O(x^N)$, and then compute $h = \log(g) + O(x^N)$ and verify that $f = h$ (to order $x^N$). Pycalcium snippet:

%time N=10; x = ca_poly([0,1]); f = sum(sqrt(2+n)*x**n for n in range(N)); g = f.exp_series(N); h = g.log_series(N); f == h

Sage/Pynac snippet:

%time N=10; f = sum(sqrt(n+2) * x**n for n in range(N)); g = taylor(exp(f), x, 0, N); h = taylor(log(g), x, 0, N); expand(f - h)
N Sage/Pynac Calcium
10 0.055 s 0.0066 s
20 0.48 s 0.053 s
30 3.0 s 0.27 s
40 15 s 1.1 s
50 63 s 3.4 s
100 413 s

Series manipulations are particularly efficient for formal power series over simple algebraic number fields, where Calcium exploits fast polynomial multiplication. The same benchmark but with $f = \sum_{n=1}^{\infty} (1 + \sqrt{2})^n x^n$:

N Sage/Pynac Calcium
50 0.21 s 0.0077 s
100 0.91 s 0.019 s
500 61 s 0.30 s
1000 537 s 1.5 s

Right now, only univariate series are supported; I plan to add multivariate polynomials and power series eventually, but it's not one of my short-term development goals.



fredrikj.net  |  Blog index  |  RSS feed  |  Follow me on Mastodon  |  Become a sponsor