Experiments with low-level ball arithmetic

May 16, 2012

In the implementation of ball arithmetic I wrote about previously, arithmetic is done naively using FLINT (fmpz) integers. This is easy to implement but adds quite a lot of overhead at low precision (up to a precision of several hundred digits).

I’ve started writing some experimental code for a low level implementation (based on mpn functions). The goal is that a single ball addition or multiplication should be as cheap as or cheaper than a single MPFR operation, ideally close to a mpz fixed-point arithmetic with no error bounding at all.

There are many different ways to do the low-level implementation, giving various tradeoffs between simplicity, speed, and accuracy. So far I’ve tried two approaches.

The first is to do straight up floating-point arithmetic with a full-limb radix (i.e. the digits are base 2³² or 2⁶⁴, and the exponent measures the number of digits), like the GMP mpf type, but with a single limb (aligned with the bottom of the mantissa) for the radius. When the radius overflows one limb, the precision of the midpoint is reduced one limb. This has the advantage of requiring few adjustments, and adding numbers can be done with a single mpn_add without temporary space or extra shifts. One disadvantage is that since the top limb can be nearly empty, about half an extra limb is needed on average to obtain the same effective precision as with base 2 arithmetic.

The second is to do base-2 floating-point arithmetic, i.e. with a base-2 exponent, but with the mantissa still having a precision measured in a whole number of limbs, and with a single-limb radius aligned with the last limb as in the first representation. This requires some more adjustment code to make sure that the top bit of the top limb always is set. Advantages of this representation include the fact that the full representable precision is used, and that the error propagated in multiplication can be bounded very easily if one accepts always losing 1-2 bits of precision (this is too much in general, but can be useful for dot products).

Both representations have some minor disadvantages: whenever the radius overflows a limb, the error is amplified by a large factor (however, this is a rare event on average), the results are machine-dependent (this is not a problem if one uses the ball arithmetic as a lower-level routine for a computation intended to produce an exact or precisely defined result in the end, using adaptive precision or Ziv’s strategy for correct rounding), and the lack of precise semantics makes it somewhat difficult to write strong test code.

Other possible approaches include using base-2 arithmetic with a precision measured in bits, perhaps with the radius taking up less than a whole limb (making radius calculations even cheaper). This is likely to require more code, with more (potentially expensive) adjustment steps, but might make the arithmetic more well-behaved with fewer strange worst-case situations.

In any case, I’ve compared how long time it takes to multiply two n-limb numbers using mpn, mpz, FLINT fmpz, and MPFR functions, as well as the working fmpz-based ball arithmetic code (arb), and the two experimental implementations mentioned above (arb1 and arb2). To be precise, the mpz and fmpz timings indicate the combined time to do a multiplication and a shift, necessary to do a fixed-point multiplication.

limbs	mpn	mpz	fmpz	mpfr	arb	arb1	arb2
1	10	30	57	48	139	26	32
2	18	44	51	58	230	57	51
3	30	54	68	89	259	70	64
4	44	68	83	100	279	88	82
5	62	88	100	129	300	110	100
6	83	110	129	150	340	130	130
7	110	129	150	180	380	160	150
8	139	170	180	200	400	180	190
9	170	210	220	240	440	230	220
10	210	259	270	300	510	280	270
11	259	310	320	490	550	320	310
12	300	340	360	530	610	360	360
13	350	400	420	519	660	410	410
14	410	450	470	640	700	480	470
15	470	519	530	640	769	540	530

We see that arb1 and arb2 are nearly equal, and both are faster than MPFR and nearly as fast as mpz, accomplishing the goal. There are a few special cases left (e.g. zero radius) that would need handling in a proper implementation, which should add a few nanoseconds to each, but it’s also possible that the code for either could be streamlined a bit.

I also implemented addition for the arb1 type:

limbs	mpn	mpz	fmpz	mpfr	arb	arb1
1	7	17	30	83	87	31
2	6	17	29	88	95	39
3	7	17	29	92	97	45
4	8	20	32	79	100	48
5	11	20	32	82	100	51
6	11	23	36	83	100	58
7	12	23	34	100	100	59
8	16	26	40	98	110	61
9	17	31	41	110	120	62
10	14	32	44	110	110	67
11	20	27	45	120	110	72
12	15	27	39	110	110	71
13	18	28	40	110	120	73
14	18	31	43	110	110	78
15	19	28	41	120	120	83

Again, speed is much better for arb1 than for MPFR and arb. The timings for arb1 should actually be a bit better, closer to the mpz numbers. Right now I’m actually not sure why the difference is so large. The numbers get better if you only time it with a fixed set of inputs.

So far, the code is to incomplete to draw any conclusions about which approach is better. Code complexity is probably the biggest issue. In both versions, multiplication takes 150 lines or so of very dense code, and this code does not even handle all cases yet. It would be great to get rid of more branches, but that alone is difficult. Addition is not any easier (the arb1 implementation, which is supposed to be the one where addition is simple, has 150 lines of code and about 20 branches).

fredrikj.net | Blog index | RSS feed | Follow me on Mastodon | Become a sponsor

limbs	mpn	mpz	fmpz	mpfr	arb	arb1	arb2
1	10	30	57	48	139	26	32
2	18	44	51	58	230	57	51
3	30	54	68	89	259	70	64
4	44	68	83	100	279	88	82
5	62	88	100	129	300	110	100
6	83	110	129	150	340	130	130
7	110	129	150	180	380	160	150
8	139	170	180	200	400	180	190
9	170	210	220	240	440	230	220
10	210	259	270	300	510	280	270
11	259	310	320	490	550	320	310
12	300	340	360	530	610	360	360
13	350	400	420	519	660	410	410
14	410	450	470	640	700	480	470
15	470	519	530	640	769	540	530

limbs	mpn	mpz	fmpz	mpfr	arb	arb1
1	7	17	30	83	87	31
2	6	17	29	88	95	39
3	7	17	29	92	97	45
4	8	20	32	79	100	48
5	11	20	32	82	100	51
6	11	23	36	83	100	58
7	12	23	34	100	100	59
8	16	26	40	98	110	61
9	17	31	41	110	120	62
10	14	32	44	110	110	67
11	20	27	45	120	110	72
12	15	27	39	110	110	71
13	18	28	40	110	120	73
14	18	31	43	110	110	78
15	19	28	41	120	120	83

limbs	mpn	mpz	fmpz	mpfr	arb	arb1	arb2
1	10	30	57	48	139	26	32
2	18	44	51	58	230	57	51
3	30	54	68	89	259	70	64
4	44	68	83	100	279	88	82
5	62	88	100	129	300	110	100
6	83	110	129	150	340	130	130
7	110	129	150	180	380	160	150
8	139	170	180	200	400	180	190
9	170	210	220	240	440	230	220
10	210	259	270	300	510	280	270
11	259	310	320	490	550	320	310
12	300	340	360	530	610	360	360
13	350	400	420	519	660	410	410
14	410	450	470	640	700	480	470
15	470	519	530	640	769	540	530

limbs	mpn	mpz	fmpz	mpfr	arb	arb1
1	7	17	30	83	87	31
2	6	17	29	88	95	39
3	7	17	29	92	97	45
4	8	20	32	79	100	48
5	11	20	32	82	100	51
6	11	23	36	83	100	58
7	12	23	34	100	100	59
8	16	26	40	98	110	61
9	17	31	41	110	120	62
10	14	32	44	110	110	67
11	20	27	45	120	110	72
12	15	27	39	110	110	71
13	18	28	40	110	120	73
14	18	31	43	110	110	78
15	19	28	41	120	120	83

limbs	mpn	mpz	fmpz	mpfr	arb	arb1	arb2
1	10	30	57	48	139	26	32
2	18	44	51	58	230	57	51
3	30	54	68	89	259	70	64
4	44	68	83	100	279	88	82
5	62	88	100	129	300	110	100
6	83	110	129	150	340	130	130
7	110	129	150	180	380	160	150
8	139	170	180	200	400	180	190
9	170	210	220	240	440	230	220
10	210	259	270	300	510	280	270
11	259	310	320	490	550	320	310
12	300	340	360	530	610	360	360
13	350	400	420	519	660	410	410
14	410	450	470	640	700	480	470
15	470	519	530	640	769	540	530

limbs	mpn	mpz	fmpz	mpfr	arb	arb1
1	7	17	30	83	87	31
2	6	17	29	88	95	39
3	7	17	29	92	97	45
4	8	20	32	79	100	48
5	11	20	32	82	100	51
6	11	23	36	83	100	58
7	12	23	34	100	100	59
8	16	26	40	98	110	61
9	17	31	41	110	120	62
10	14	32	44	110	110	67
11	20	27	45	120	110	72
12	15	27	39	110	110	71
13	18	28	40	110	120	73
14	18	31	43	110	110	78
15	19	28	41	120	120	83