
# What is the most common real number?

*September 21, 2019*

Time for a development update concerning
the Mathematical Functions Grimoire (Fungrim).
I've done relatively little work expanding the database of formulas
in the last month, instead prioritizing the backend code.
Most of the source code has now been converted
from scripts to a Python library (`pygrim`).
This library provides the following:

- Code to represent and manipulate Fungrim S-expressions, plus all the symbols defined in Fungrim (as Python symbols):

  ```python
  >>> import pygrim
  >>> expr = pygrim.GammaFunction(pygrim.Div(1,2)) ** 2
  >>> expr
  Pow(GammaFunction(Div(1, 2)), 2)
  >>> expr.head()
  Pow
  >>> expr.args()
  (GammaFunction(Div(1, 2)), 2)
  >>> expr.replace({pygrim.GammaFunction : pygrim.Log})
  Pow(Log(Div(1, 2)), 2)
  ```

- The formula database itself, with entries accessible by ID:

  ```python
  >>> import pygrim.formulas
  >>> pygrim.formulas.entries_dict["98a765"]
  Entry(ID("98a765"), Formula(Equal(GoldenRatio, Mul(2, Cos(Div(ConstPi, 5))))))
  ```

- Conversion to LaTeX:

  ```python
  >>> expr.latex()
  '{\\left(\\Gamma\\!\\left(\\frac{1}{2}\\right)\\right)}^{2}'
  ```

- Numerical evaluation:

  ```python
  >>> expr.n(30)
  RealBall(Decimal("3.14159265358979323846264338328"), Decimal("4.98e-31"))
  ```

The library is still very much a work in progress and has nothing resembling an API yet; it will probably be several weeks before I make a public release with a version number, but as usual, anyone who is interested can go and play with the git version right now.

The built-in numerical evaluation will be extremely useful for further development of the Fungrim formula database. Indeed, testing identities numerically is a good way to catch errors in the formulas, and the code can also be used to generate new data for inclusion in Fungrim. I also expect that having an interface to Arb based on symbolic expressions will be useful for things not directly related to Fungrim.
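To illustrate the idea of catching errors by numerical testing, here is a minimal sketch that checks the identity from entry `98a765` above using only Python's standard `math` module. This is not how `pygrim` does it: plain double-precision floats with a tolerance stand in for the rigorous ball arithmetic that Arb provides, and the helper name `check_identity` is made up for this example.

```python
import math


def check_identity(lhs, rhs, rel_tol=1e-12):
    """Return True if two floating-point values agree to within rel_tol.

    Hypothetical helper for illustration; a float comparison can only
    suggest that an identity holds, it cannot prove it.
    """
    return math.isclose(lhs, rhs, rel_tol=rel_tol)


# Entry 98a765: GoldenRatio = 2 * Cos(ConstPi / 5)
golden_ratio = (1 + math.sqrt(5)) / 2
identity_holds = check_identity(golden_ratio, 2 * math.cos(math.pi / 5))

# A deliberately wrong variant (Sin instead of Cos), as a typo might produce.
typo_holds = check_identity(golden_ratio, 2 * math.sin(math.pi / 5))

print(identity_holds, typo_holds)
```

A mismatch like the second one is exactly the kind of error that a numerical test flags immediately.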

## Introducing Ordner

The point of having a database of mathematical formulas in symbolic
form is that you can process this data algorithmically.
The first tool to demonstrate this is **Ordner**,
a new section of the Fungrim website.
What is Ordner? Essentially, it is a directory of the real numbers $\mathbb{R}$.
Bill Hart suggested
naming this index *Ordner* for *Online Real Decimal Number Encyclopedia Reference*
and that sounded silly enough to me (*ordner* also means *folder* in German).

Ordner lets you look up real numbers by their 30-digit decimal approximations. This is not necessarily the final design, but it's a convenient way to implement things for now. For each known 30-digit key, Ordner lists all constant symbolic expressions appearing in the Fungrim formula database with this numerical value, and for each such expression, Ordner provides links to all entries in Fungrim where that expression appears.
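The structure described above (key, then expressions, then entries) can be pictured as a nested mapping. The following is only a sketch of that shape, not the actual Ordner code: expressions are stored as plain strings, the entry IDs are placeholders, and the names `add_occurrence` and `lookup` are made up for this example.

```python
from collections import defaultdict

# Hypothetical layout: decimal key -> expression -> set of entry IDs.
index = defaultdict(lambda: defaultdict(set))


def add_occurrence(key, expr, entry_id):
    """Record that `expr`, whose value has decimal key `key`, occurs in an entry."""
    index[key][expr].add(entry_id)


def lookup(prefix):
    """Return all keys starting with the given digits, with their expressions."""
    return {
        key: {expr: set(ids) for expr, ids in exprs.items()}
        for key, exprs in index.items()
        if key.startswith(prefix)
    }


PI_KEY = "3.14159265358979323846264338328"

# Placeholder entry IDs, purely for illustration.
add_occurrence(PI_KEY, "ConstPi", "entry-1")
add_occurrence(PI_KEY, "ConstPi", "entry-3")
add_occurrence(PI_KEY, "Pow(GammaFunction(Div(1, 2)), 2)", "entry-2")

matches = lookup("3.14159")
for key, exprs in matches.items():
    for expr, ids in sorted(exprs.items()):
        print(key, expr, sorted(ids))
```

Matching on a prefix of the key is one simple way to handle an input that is known to fewer than 30 digits.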

Let's say that you do a numerical computation and come across a real number with the value 0.768225422... that you suspect might be interesting. Browsing Ordner, you will find the entry for this number,
which currently lists two matching symbolic expressions: `DedekindEta(ConstI)`
($\eta\!\left(i\right)$ rendered to LaTeX), appearing in 11 Fungrim entries,
and
`Div(GammaFunction(Div(1, 4)), Mul(2, Pow(ConstPi, Div(3, 4))))`
($\Gamma(1/4) / (2 {\pi}^{3 / 4})$),
appearing in 1 entry. Choose your preferred closed form, or follow the links to find out more!
A more common number like $\pi$ will have dozens of expressions.

Ordner is not the first tool to provide something like this: most importantly, there are the OEIS, the Inverse Symbolic Calculator (ISC) and Plouffe's Inverter. Ordner is currently missing a direct search function (the entire Fungrim website is still static), and it is also much less comprehensive than the others: right now Ordner comprises 2821 keys and 4818 symbolic expressions, while OEIS has hundreds of thousands of entries, ISC has millions, and Plouffe's Inverter has billions.

What distinguishes Ordner is that it is generated *completely automatically*
from the Fungrim formula database.
For example, if an entry in Fungrim reads $A = (\pi / 2) B$, then
the constant subexpressions $\pi$, $2$ and $(\pi/2)$ get added to Ordner automatically.
This means that Ordner will grow organically as the Fungrim database is expanded,
asymptotically covering all the real numbers that are important enough to
appear as explicit constants in "real-world" formulas! (As well
as real numbers that have been added manually to Fungrim as part of
numerical tables, e.g. the table of Bernoulli numbers
$B_n$ for $0 \le n \le 50$.)
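The extraction step can be sketched with a small tree walk. This is an illustration under simplifying assumptions, not the actual Ordner code: expressions are modeled as nested tuples `(head, arg1, arg2, ...)` rather than `pygrim` objects, the set `CONSTANT_SYMBOLS` is a made-up stand-in for Fungrim's constant symbols, and any other symbol name is treated as a free variable.

```python
# Hypothetical stand-in for the symbols that denote constants.
CONSTANT_SYMBOLS = {"ConstPi", "ConstE", "ConstI"}


def is_constant(expr):
    """True if expr contains no free variables (integers and known constants only)."""
    if isinstance(expr, int):
        return True
    if isinstance(expr, str):
        return expr in CONSTANT_SYMBOLS
    _head, *args = expr
    return all(is_constant(arg) for arg in args)


def constant_subexpressions(expr):
    """Collect every constant subexpression, including nested ones."""
    found = set()

    def walk(e):
        if is_constant(e):
            found.add(e)
        if isinstance(e, tuple):
            for arg in e[1:]:
                walk(arg)

    walk(expr)
    return found


# A = (pi / 2) B
grouped = ("Equal", "A", ("Mul", ("Div", "ConstPi", 2), "B"))
# A = (pi B) / 2
regrouped = ("Equal", "A", ("Div", ("Mul", "ConstPi", "B"), 2))

print(constant_subexpressions(grouped))
print(constant_subexpressions(regrouped))
```

The first form yields $\pi$, $2$ and $\pi/2$; the second yields only $\pi$ and $2$, because no subtree of the second form equals $\pi/2$.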

A few implementation details:

- The precise structure of the expressions matters. For example, if the entry $A = (\pi / 2) B$ had read $A = (\pi B) / 2$, then $\pi / 2$ would not be counted. The code behind Ordner could be extended to perform automatic rearrangements of this kind in order to discover more real numbers.
- All decimal keys in Ordner are normalized to be nonnegative, so expressions `x` representing negative values are indexed as `Neg(x)` in Ordner.
- Complex numbers are indexed by the real and imaginary parts (`Re(x)`, `Im(x)`), as well as the absolute value and complex argument (`Abs(x)`, `Arg(x)`) when both the real and imaginary parts are nonzero.
- The number 0 is a special case: a vanishing expression is only included when the numerical evaluation code can prove that the expression exactly represents 0. Some trivially zero-valued expressions are excluded to prevent bloat.
- Finally, since the Fungrim formula language normally uses `Exp(x)` instead of `Pow(ConstE, x)` to represent the exponential function, formulas containing `Exp(...)` are listed under `2.71828182845904523536028747135` as a special case, so as to represent this fundamental constant fairly!
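The sign normalization in the list above can be sketched as follows. This is a simplified illustration rather than the real implementation: it assumes keys are obtained by rounding to 30 significant digits (the exact rounding rule Ordner uses is not stated here), it takes the value as a `decimal.Decimal` instead of an Arb ball, and it leaves out the complex and zero cases. The function name `real_key` is hypothetical.

```python
from decimal import ROUND_HALF_EVEN, Context, Decimal

# Assumption: keys carry 30 significant digits, rounded half-even.
KEY_CONTEXT = Context(prec=30, rounding=ROUND_HALF_EVEN)


def real_key(value, expr):
    """Return (key, indexed_expr) for a nonzero real value.

    Negative values are indexed under their absolute value,
    with the expression wrapped in Neg(...).
    """
    if value < 0:
        # copy_abs() is exact; unary minus would round to the default context.
        value, expr = value.copy_abs(), f"Neg({expr})"
    return str(KEY_CONTEXT.plus(value)), expr


print(real_key(Decimal("-0.5"), "x"))
print(real_key(Decimal("3.14159265358979323846264338327950288"), "ConstPi"))
```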

## Ranking the reals

One of the fun things we can do with this data is to rank the interesting real numbers by frequency (how many Fungrim entries contain a constant expression matching this number). Such rankings are available on the website.

Small integers dominate, as you would expect; hence the "exclusive" version of the ranking, which features more "interesting" numbers. Here is a plot of the top 30, including integers:

The top of the ranking comes as no surprise: the most common (nonnegative) real numbers by far are 1, 0 and 2, followed by $\pi$ as the first noninteger; $e$ and $1/2$ are not far behind, trailing 3 and 4. (Of course, which of 0 and 1 comes first is going to be an artifact of the rules for counting expressions.)

Here is a plot of the top 30 excluding integers:

Since the Fungrim formula database is quite small at this point, lots of
important real numbers are of course missing entirely.
Some are probably ranked too low; for example `Log(2)` and `Div(Sqrt(2), 2)` are
just outside of the top 30.
Others are ranked too high;
for example `ConstCatalan` and `JacobiTheta(3, 0, ConstI)`
are near the top since I added lots of formulas specifically for
Catalan's constant and special values of Jacobi theta functions,
and these particular constants will surely drop in the rankings
as the topics become more diverse.
What will the distribution look like when Fungrim has 10000 or 100000 entries,
if it ever reaches that point?
Prediction: more simple fractions and square roots
of simple fractions.
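The frequency ranking described in this section amounts to counting, for each key, the distinct entries in which any matching expression appears. Here is a minimal sketch of that count, with an option to skip integers as in the "exclusive" ranking. The toy index and its entry IDs are placeholders, not real Fungrim data, and the function names are made up for this example.

```python
from collections import Counter
from decimal import Decimal

PI_KEY = "3.14159265358979323846264338328"
E_KEY = "2.71828182845904523536028747135"

# Toy index: key -> expression -> set of (placeholder) entry IDs.
toy_index = {
    "1": {"1": {"e1", "e2", "e3", "e4"}},
    "2": {"2": {"e1", "e2", "e3"}},
    PI_KEY: {"ConstPi": {"e1", "e3"}, "Pow(GammaFunction(Div(1, 2)), 2)": {"e3"}},
    E_KEY: {"ConstE": {"e4"}},
}


def is_integer_key(key):
    """True if the decimal key represents an integer."""
    value = Decimal(key)
    return value == value.to_integral_value()


def rank(index, exclude_integers=False):
    """Return (key, number of distinct entries) pairs, most frequent first."""
    counts = Counter()
    for key, exprs in index.items():
        if exclude_integers and is_integer_key(key):
            continue
        # An entry counts once per key, even if it matches several expressions.
        counts[key] = len(set().union(*exprs.values()))
    return counts.most_common()


inclusive = rank(toy_index)
exclusive = rank(toy_index, exclude_integers=True)
print(inclusive)
print(exclusive)
```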

## What's next

There's still a ton of work to be done on the Fungrim backend library, not to mention lots of possible additions to the website interface. For example: now that Ordner exists, each formula entry in Fungrim could list the numerical values it contains and cross-link these numerical values to Ordner. Another idea is to write a clever search function for Ordner that tries transformations (given $x$, also try to look up $x + 1, 1/x, x^2, e^x, \ldots$).
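The transformation-based search idea could look roughly like the sketch below. It is only a thought experiment: the function names and the toy index are hypothetical, arithmetic is done with `decimal.Decimal` at 30 significant digits instead of Arb, and lookups use exact key equality, which presumes the input is known to full precision. A practical version would need prefix or interval matching, since each transformation loses some accuracy.

```python
from decimal import Context, Decimal

CTX = Context(prec=30)  # assumption: 30 significant digits, like the keys
ONE = Decimal(1)

E_KEY = "2.71828182845904523536028747135"
# Toy index: key -> list of expressions (placeholder data).
toy_index = {E_KEY: ["ConstE"], "2": ["2"]}


def make_key(value):
    """Round a Decimal to the working precision and format it as a key."""
    return str(CTX.plus(value))


def transformations(x):
    """Return (label, value) pairs for x and a few simple transforms of x."""
    forms = [("x", x), ("x + 1", CTX.add(x, ONE))]
    if x != 0:
        forms.append(("1/x", CTX.divide(ONE, x)))
    forms.append(("x^2", CTX.multiply(x, x)))
    forms.append(("exp(x)", CTX.exp(x)))
    return forms


def search(x, index):
    """Return (label, key, expressions) for every transform found in the index."""
    hits = []
    for label, value in transformations(x):
        key = make_key(value)
        if key in index:
            hits.append((label, key, index[key]))
    return hits


hits = search(Decimal(1), toy_index)
for hit in hits:
    print(hit)
```

With the input $x = 1$, the toy index is hit twice: once through $x + 1 = 2$ and once through $e^x = e$.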

I'm currently revising the Fungrim S-expression language (specifically, the call sequences for operators and the associated variable-binding rules), trying to find a good balance between human legibility and ease of semantic parsing. This might be the subject of a future writeup.