fexpr.h – flat-packed symbolic expressions¶
This module supports working with symbolic expressions.
Introduction¶
Formally, a symbolic expression is either:
An atom, being one of the following:
An integer, for example 0 or -34.
A symbol, for example
x
,Mul
,SomeUserNamedSymbol
. Symbols should be valid C identifiers (containing only the charactersA-Z
,a-z
,0-9
,_
, and not starting with a digit).A string, for example
"Hello, world!"
. For the moment, we only consider ASCII strings, but there is no obstacle in principle to supporting UTF-8.
A non-atomic expression, representing a function call \(e_0(e_1, \ldots, e_n)\) where \(e_0, \ldots, e_n\) are symbolic expressions.
The meaning of an expression depends on the interpretation
of symbols in a given context.
For example, with a standard intepretation (used within Calcium) of the symbols
Mul
, Add
and
Neg
, the expression Mul(3, Add(Neg(x), y))
encodes the formula \(3 \cdot ((-x)+y)\)
where x
and y
are symbolic variables.
See fexpr_builtin.h – builtin symbols for documentation of builtin symbols.
Computing and embedding data¶
Symbolic expressions are usually not the best data structure to use directly for heavy-duty computations. Functions acting on symbolic expressions will typically convert to a dedicated data structure (e.g. polynomials) internally and (optionally) convert the final result back to a symbolic expression.
Symbolic expressions do not allow embedding arbitrary binary objects such as Flint/Arb/Antic/Calcium types as atoms. This is done on purpose to make symbolic expressions easy to use as a data exchange format. To embed an object in an expression, one has the following options:
Represent the object structurally using atoms supported natively by symbolic expressions (for example, an integer polynomial can be represented as a list of coefficients or as an arithmetic expression tree).
Introduce a dummy symbol to represent the object, maintaining an external translation table mapping this symbol to the intended value.
Encode the object using a string or symbol name. This is generally not recommended, as it requires parsing; properly used, symbolic expressions have the benefit of being able to represent the parsed structure.
Flat-packed representation¶
Symbolic expressions are often implemented using trees of pointers
(often together with hash tables for uniqueness),
requiring some form of memory management.
The fexpr_t
type, by contrast, stores a symbolic expression
using a “flat-packed” representation without internal pointers.
The expression data is just an array of words (ulong
).
The first word is a header encoding type information (whether
the expression is a function call or an atom, and the type
of the atom) and the total number of words
in the expression.
For atoms, the data is stored either in the header word itself (small
integers and short symbols/strings) or in the following words.
For function calls, the header is followed by
the expressions \(e_0\), …, \(e_n\) packed contiguously in memory.
Pros:
Memory management is trivial.
Copying an expression is just copying an array of words.
Comparing expressions for equality is just comparing arrays of words.
Merging expressions is basically just concatenating arrays of words.
Expression data can be shared freely in binary form between threads and even between machines (as long as all machines have the same word size and endianness).
Cons:
Repeated instances of the same subexpression cannot share memory (a workaround is to introduce local dummy symbols for repeated subexpressions).
Extracting a subexpression for modification generally requires making a complete copy of that subxepression (however, for read-only access to subexpressions, one can use “view” expressions which have zero overhead).
Manipulating a part of an expression generally requires rebuilding the whole expression.
Building an expression incrementally is typically \(O(n^2)\). As a workaround, it is a good idea to work with balanced (low-depth) expressions and try to construct an expression in one go (for example, to create a sum, create a single
Add
expression with many arguments instead of chaining binaryAdd
operations).
Types and macros¶
-
type
fexpr_struct
¶
-
type
fexpr_t
¶ An fexpr_struct consists of a pointer to an array of words along with a record of the number of allocated words.
An fexpr_t is defined as an array of length one of type fexpr_struct, permitting an fexpr_t to be passed by reference.
-
type
fexpr_ptr
¶ Alias for
fexpr_struct *
, used for arrays of expressions.
-
type
fexpr_srcptr
¶ Alias for
const fexpr_struct *
, used for arrays of expressions when passed as constant input to functions.
-
type
fexpr_vec_struct
¶
-
type
fexpr_vec_t
¶ A type representing a vector of expressions with managed length. The structure contains an
fexpr_ptr
entries for the entries, an integer length (the size of the vector), and an integer alloc (the number of allocated entries).
-
fexpr_vec_entry
(vec, i)¶ Returns a pointer to entry i in the vector vec.
Memory management¶
-
fexpr_ptr
_fexpr_vec_init
(slong len)¶ Returns a heap-allocated vector of len initialized expressions.
Size information¶
-
slong
fexpr_num_leaves
(const fexpr_t expr)¶ Returns the number of leaves (atoms, counted with repetition) in the expression expr.
-
slong
fexpr_size
(const fexpr_t expr)¶ Returns the number of words in the internal representation of expr.
Comparisons¶
-
int
fexpr_equal
(const fexpr_t a, const fexpr_t b)¶ Checks if a and b are exactly equal as expressions.
Atoms¶
-
void
fexpr_set_si
(fexpr_t res, slong c)¶ -
void
fexpr_set_ui
(fexpr_t res, ulong c)¶ -
void
fexpr_set_fmpz
(fexpr_t res, const fmpz_t c)¶ Sets res to the atomic integer c.
-
void
fexpr_get_fmpz
(fmpz_t res, const fexpr_t expr)¶ Sets res to the atomic integer in expr. This aborts if expr is not an atomic integer.
-
void
fexpr_set_symbol_builtin
(fexpr_t res, slong id)¶ Sets res to the builtin symbol with internal index id (see fexpr_builtin.h – builtin symbols).
-
int
fexpr_is_builtin_symbol
(const fexpr_t expr, slong id)¶ Returns whether expr is the builtin symbol with index id (see fexpr_builtin.h – builtin symbols).
-
int
fexpr_is_any_builtin_symbol
(const fexpr_t expr)¶ Returns whether expr is any builtin symbol (see fexpr_builtin.h – builtin symbols).
Input and output¶
-
void
fexpr_write
(calcium_stream_t stream, const fexpr_t expr)¶ Writes expr to stream.
LaTeX output¶
-
void
fexpr_write_latex
(calcium_stream_t stream, const fexpr_t expr, ulong flags)¶ Writes the LaTeX representation of expr to stream.
-
void
fexpr_print_latex
(const fexpr_t expr, ulong flags)¶ Prints the LaTeX representation of expr to standard output.
-
char *
fexpr_get_str_latex
(const fexpr_t expr, ulong flags)¶ Returns a string of the LaTeX representation of expr. The string must be freed with
flint_free()
.Warning: string literals appearing in expressions are currently not escaped.
The flags parameter allows specifying options for LaTeX output. The following flags are supported:
-
FEXPR_LATEX_SMALL
¶ Generate more compact formulas, most importantly by printing fractions inline as \(p/q\) instead of as \(\displaystyle{\frac{p}{q}}\). This flag is automatically activated within subscripts and superscripts and in certain other parts of formulas.
-
FEXPR_LATEX_LOGIC
¶ Use symbols for logical operators such as Not, And, Or, which by default are rendered as words for legibility.
Function call structure¶
-
slong
fexpr_nargs
(const fexpr_t expr)¶ Returns the number of arguments n in the function call \(f(e_1,\ldots,e_n)\) represented by expr. If expr is an atom, returns -1.
-
void
fexpr_func
(fexpr_t res, const fexpr_t expr)¶ Assuming that expr represents a function call \(f(e_1,\ldots,e_n)\), sets res to the function expression f.
-
void
fexpr_view_func
(fexpr_t view, const fexpr_t expr)¶ As
fexpr_func()
, but sets view to a shallow view instead of copying the expression. The variable view must not be initialized before use or cleared after use, and expr must not be modified or cleared as long as view is in use.
-
void
fexpr_arg
(fexpr_t res, const fexpr_t expr, slong i)¶ Assuming that expr represents a function call \(f(e_1,\ldots,e_n)\), sets res to the argument \(e_{i+1}\). Note that indexing starts from 0. The index must be in bounds, with \(0 \le i < n\).
-
void
fexpr_view_arg
(fexpr_t view, const fexpr_t expr, slong i)¶ As
fexpr_arg()
, but sets view to a shallow view instead of copying the expression. The variable view must not be initialized before use or cleared after use, and expr must not be modified or cleared as long as view is in use.
-
void
fexpr_view_next
(fexpr_t view)¶ Assuming that view is a shallow view of a function argument \(e_i\) in a function call \(f(e_1,\ldots,e_n)\), sets view to a view of the next argument \(e_{i+1}\). This function can be called when view refers to the last argument \(e_n\), provided that view is not used afterwards. This function can also be called when view refers to the function f, in which case it will make view point to \(e_1\).
-
int
fexpr_is_builtin_call
(const fexpr_t expr, slong id)¶ Returns whether expr has the form \(f(\ldots)\) where f is a builtin function defined by id (see fexpr_builtin.h – builtin symbols).
-
int
fexpr_is_any_builtin_call
(const fexpr_t expr)¶ Returns whether expr has the form \(f(\ldots)\) where f is any builtin function (see fexpr_builtin.h – builtin symbols).
Composition¶
-
void
fexpr_call0
(fexpr_t res, const fexpr_t f)¶ -
void
fexpr_call1
(fexpr_t res, const fexpr_t f, const fexpr_t x1)¶ -
void
fexpr_call2
(fexpr_t res, const fexpr_t f, const fexpr_t x1, const fexpr_t x2)¶ -
void
fexpr_call3
(fexpr_t res, const fexpr_t f, const fexpr_t x1, const fexpr_t x2, const fexpr_t x3)¶ -
void
fexpr_call4
(fexpr_t res, const fexpr_t f, const fexpr_t x1, const fexpr_t x2, const fexpr_t x3, const fexpr_t x4)¶ -
void
fexpr_call_vec
(fexpr_t res, const fexpr_t f, fexpr_srcptr args, slong len)¶ Creates the function call \(f(x_1,\ldots,x_n)\). The vec version takes the arguments as an array args and n is given by len. Warning: aliasing between inputs and outputs is not implemented.
Subexpressions and replacement¶
-
int
fexpr_contains
(const fexpr_t expr, const fexpr_t x)¶ Returns whether expr contains the expression x as a subexpression (this includes the case where expr and x are equal).
-
int
fexpr_replace
(fexpr_t res, const fexpr_t expr, const fexpr_t x, const fexpr_t y)¶ Sets res to the expression expr with all occurrences of the subexpression x replaced by the expression y. Returns a boolean value indicating whether any replacements have been performed. Aliasing is allowed between res and expr but not between res and x or y.
-
int
fexpr_replace2
(fexpr_t res, const fexpr_t expr, const fexpr_t x1, const fexpr_t y1, const fexpr_t x2, const fexpr_t y2)¶ Like
fexpr_replace()
, but simultaneously replaces x1 by y1 and x2 by y2.
-
int
fexpr_replace_vec
(fexpr_t res, const fexpr_t expr, const fexpr_vec_t xs, const fexpr_vec_t ys)¶ Sets res to the expression expr with all occurrences of the subexpressions given by entries in xs replaced by the corresponding expressions in ys. It is required that xs and ys have the same length. Returns a boolean value indicating whether any replacements have been performed. Aliasing is allowed between res and expr but not between res and the entries of xs or ys.
Arithmetic expressions¶
-
void
fexpr_set_fmpq
(fexpr_t res, const fmpq_t x)¶ Sets res to the rational number x. This creates an atomic integer if the denominator of x is one, and otherwise creates a division expression.
-
void
fexpr_set_arf
(fexpr_t res, const arf_t x)¶ -
void
fexpr_set_d
(fexpr_t res, double x)¶ Sets res to an expression for the value of the floating-point number x. NaN is represented as
Undefined
. For a regular value, this creates an atomic integer or a rational fraction if the exponent is small, and otherwise creates an expression of the formMul(m, Pow(2, e))
.
-
void
fexpr_set_re_im_d
(fexpr_t res, double x, double y)¶ Sets res to an expression for the complex number with real part x and imaginary part y.
-
void
fexpr_neg
(fexpr_t res, const fexpr_t a)¶ -
void
fexpr_add
(fexpr_t res, const fexpr_t a, const fexpr_t b)¶ -
void
fexpr_sub
(fexpr_t res, const fexpr_t a, const fexpr_t b)¶ -
void
fexpr_mul
(fexpr_t res, const fexpr_t a, const fexpr_t b)¶ -
void
fexpr_div
(fexpr_t res, const fexpr_t a, const fexpr_t b)¶ -
void
fexpr_pow
(fexpr_t res, const fexpr_t a, const fexpr_t b)¶ Constructs an arithmetic expression with given arguments. No simplifications whatsoever are performed.
-
int
fexpr_is_arithmetic_operation
(const fexpr_t expr)¶ Returns whether expr is of the form \(f(e_1,\ldots,e_n)\) where f is one of the arithmetic operators
Pos
,Neg
,Add
,Sub
,Mul
,Div
.
-
void
fexpr_arithmetic_nodes
(fexpr_vec_t nodes, const fexpr_t expr)¶ Sets nodes to a vector of subexpressions of expr such that expr is an arithmetic expression with nodes as leaves. More precisely, expr will be constructed out of nested application the arithmetic operators
Pos
,Neg
,Add
,Sub
,Mul
,Div
with integers and expressions in nodes as leaves. PowersPow
with an atomic integer exponent are also allowed. The nodes are output without repetition but are not automatically sorted in a canonical order.
-
int
fexpr_get_fmpz_mpoly_q
(fmpz_mpoly_q_t res, const fexpr_t expr, const fexpr_vec_t vars, const fmpz_mpoly_ctx_t ctx)¶ Sets res to the expression expr as a formal rational function of the subexpressions in vars. The vector vars must have the same length as the number of variables specified in ctx. To build vars automatically for a given expression,
fexpr_arithmetic_nodes()
may be used.Returns 1 on success and 0 on failure. Failure can occur for the following reasons:
A subexpression is encountered that cannot be interpreted as an arithmetic operation and does not appear (exactly) in vars.
Overflow (too many terms or too large exponent).
Division by zero (a zero denominator is encountered).
It is important to note that this function views expr as a formal rational function with vars as formal indeterminates. It does thus not check for algebraic relations between vars and can implicitly divide by zero if vars are not algebraically independent.
-
void
fexpr_set_fmpz_mpoly
(fexpr_t res, const fmpz_mpoly_t poly, const fexpr_vec_t vars, const fmpz_mpoly_ctx_t ctx)¶ -
void
fexpr_set_fmpz_mpoly_q
(fexpr_t res, const fmpz_mpoly_q_t frac, const fexpr_vec_t vars, const fmpz_mpoly_ctx_t ctx)¶ Sets res to an expression for the multivariate polynomial poly (or rational function frac), using the expressions in vars as the variables. The length of vars must agree with the number of variables in ctx. If NULL is passed for vars, a default choice of symbols is used.
-
int
fexpr_expanded_normal_form
(fexpr_t res, const fexpr_t expr, ulong flags)¶ Sets res to expr converted to expanded normal form viewed as a formal rational function with its non-arithmetic subexpressions as terminal nodes. This function first computes nodes with
fexpr_arithmetic_nodes()
, sorts the nodes, evaluates to a rational function withfexpr_get_fmpz_mpoly_q()
, and then converts back to an expression withfexpr_set_fmpz_mpoly_q()
. Optional flags are reserved for future use.
Vectors¶
-
void
fexpr_vec_init
(fexpr_vec_t vec, slong len)¶ Initializes vec to a vector of length len. All entries are set to the atomic integer 0.
-
void
fexpr_vec_clear
(fexpr_vec_t vec)¶ Clears the vector vec.
-
void
fexpr_vec_print
(const fexpr_vec_t vec)¶ Prints vec to standard output.
-
void
fexpr_vec_swap
(fexpr_vec_t x, fexpr_vec_t y)¶ Swaps x and y efficiently.
-
void
fexpr_vec_fit_length
(fexpr_vec_t vec, slong len)¶ Ensures that vec has space for len entries.
-
void
fexpr_vec_set
(fexpr_vec_t dest, const fexpr_vec_t src)¶ Sets dest to a copy of src.
-
void
fexpr_vec_append
(fexpr_vec_t vec, const fexpr_t expr)¶ Appends expr to the end of the vector vec.
-
slong
fexpr_vec_insert_unique
(fexpr_vec_t vec, const fexpr_t expr)¶ Inserts expr without duplication into vec, returning its position. If this expression already exists, vec is unchanged. If this expression does not exist in vec, it is appended.
-
void
fexpr_vec_set_length
(fexpr_vec_t vec, slong len)¶ Sets the length of vec to len, truncating or zero-extending as needed.
-
void
_fexpr_vec_sort_fast
(fexpr_ptr vec, slong len)¶ Sorts the len entries in vec using the comparison function
fexpr_cmp_fast()
.