mirror of
https://github.com/vale981/arb
synced 2025-03-06 09:51:39 -05:00
361 lines
15 KiB
ReStructuredText
361 lines
15 KiB
ReStructuredText
**gamma.h** -- support for the gamma function
|
|
===============================================================================
|
|
|
|
This module implements various algorithms for evaluating the
|
|
gamma function and related functions. The functions provided here are mainly
|
|
intended for internal use, though they may be useful to call directly in some
|
|
applications where the default algorithm choices are suboptimal.
|
|
Most applications should use the standard, user-friendly top-level functions
|
|
in the *fmprb* and *fmpcb* modules.
|
|
|
|
|
|
Evaluation using Taylor series
|
|
--------------------------------------------------------------------------------
|
|
|
|
.. function :: long gamma_taylor_bound_mag(long n)
|
|
|
|
Letting `1/\Gamma(x) = \sum_{n=1}^{\infty} c_n x^n`, returns an integer
|
|
`p` such that `|c_n| \le 2^{-p}`.
|
|
Uses the bound `\log |c_n| \le b(n) = ((\pi-1)n + (3-5n) \log(n/6))/6`
|
|
(to be published).
|
|
|
|
.. function :: void gamma_taylor_bound_ratio(fmpr_t r, long n)
|
|
|
|
Sets `r` to a bound for `B(n+1) / B(n)` where `B(n) = \exp b(n)`.
|
|
We use `r(n) = 3n^{-5/6}`.
|
|
|
|
.. function :: void gamma_taylor_bound_remainder(fmpr_t err, const fmpr_t z, long n)
|
|
|
|
Given a nonnegative `z` and `n \ge 1`, computes a bound for
|
|
`\sum_{k=n}^{\infty} |c_{k+1}| z^k`. This is given by
|
|
`B(n+1) z^n / (1 - r(n+1) z)`, provided that the denominator is
|
|
positive (otherwise the bound is infinity).
|
|
|
|
.. function :: void gamma_taylor_fmprb(fmprb_t y, const fmprb_t x, long prec)
|
|
|
|
Sets `y = \Gamma(x)`, computed by translating `x` to the interval
|
|
`[-1/2,1/2]` and then evaluating the Taylor series for the
|
|
reciprocal gamma function.
|
|
|
|
Assumes that the absolute value of the midpoint of `x` is not too large
|
|
in order for the translation to be reasonably efficient (too large
|
|
values will result in extreme slowness and eventually an exception).
|
|
|
|
Evaluation using the Stirling series
|
|
--------------------------------------------------------------------------------
|
|
|
|
.. function:: void gamma_stirling_choose_param_fmprb(int * reflect, long * r, long * n, const fmprb_t x, int allow_reflection, int digamma, long prec)
|
|
|
|
.. function:: void gamma_stirling_choose_param_fmpcb(int * reflect, long * r, long * n, const fmprb_t x, int allow_reflection, int digamma, long prec)
|
|
|
|
Compute parameters `r`, `n` such that the remainder in the Stirling
|
|
series with `z = x+yi` approximately is bounded by `2^{-\mathrm{prec}}`.
|
|
If *digamma* is nonzero, the calculation is done for the digamma
|
|
function.
|
|
|
|
The parameter `n` is the truncation point in the asymptotic
|
|
Stirling series. If `|z|` is too small for the Stirling series
|
|
to give sufficient accuracy directly, we first translate to `z + r`
|
|
using the formula `\Gamma(z) = \Gamma(z+r) /
|
|
(z (z+1) (z+2) \cdots (z+r-1))`.
|
|
|
|
If *allow_reflection* is nonzero, the *reflect* flag is set if `z`
|
|
should be replaced with `1-z` using the reflection formula.
|
|
|
|
This function uses double precision arithmetic internally,
|
|
and does not guarantee the error bound rigorously;
|
|
a rigorous error bound, which also accounts for the radius of `z`,
|
|
is computed a posteriori when evaluating the Stirling series.
|
|
However, in practice, this function does estimate the bound
|
|
very accurately.
|
|
|
|
To obtain a remainder smaller than `2^{-b}`, we must choose an `r` such
|
|
that, in the real case, `x + r > \beta b`, where
|
|
`\beta > \log(2) / (2 \pi) \approx 0.11`.
|
|
In practice, a slightly larger factor `\beta \approx 0.2` more closely
|
|
balances `n` and `r`. A much larger `\beta` (e.g. `\beta = 1`) could be
|
|
used to reduce the number of Bernoulli numbers that have to be
|
|
precomputed, at the expense of slower repeated evaluation.
|
|
|
|
.. function :: void gamma_stirling_coeff(fmprb_t b, ulong k, int digamma, long prec)
|
|
|
|
Sets `b = B_{2k} / (2k (2k-1))`, rounded to *prec* bits, or if *digamma*
|
|
is nonzero, sets `b = B_{2k} / (2k)`.
|
|
|
|
.. function :: void gamma_stirling_eval_series_fmprb(fmprb_t s, const fmprb_t z, long n, int digamma, long prec)
|
|
|
|
.. function :: void gamma_stirling_eval_series_fmpcb(fmpcb_t s, const fmpcb_t z, long n, int digamma, long prec)
|
|
|
|
Evaluates the Stirling series
|
|
|
|
.. math ::
|
|
|
|
\log \Gamma(z) - R(n,z) = \left(z-\frac{1}{2}\right)\log z - z +
|
|
\frac{\ln {2 \pi}}{2} + \sum_{k=1}^{n-1} t_k
|
|
|
|
where
|
|
|
|
.. math ::
|
|
|
|
t_k = \frac{B_{2k}}{2k(2k-1)z^{2k-1}}.
|
|
|
|
If *digamma* is nonzero, the derivative of this series (i.e. the
|
|
expansion for the digamma function) is evaluated.
|
|
|
|
The error bound
|
|
|
|
.. math ::
|
|
|
|
|R(n,z)| \le \frac{|t_n|}{\cos(0.5 \arg(z))^{2n}}
|
|
|
|
is included in the output (when evaluating the digamma function, the
|
|
expression is the same except that `t_n` changes to the
|
|
differentiated term and the exponent `2n` changes to `2n+1`
|
|
(section 5.11 in [NIST2012]_).
|
|
|
|
|
|
Rising factorials
|
|
--------------------------------------------------------------------------------
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_bsplit_simple(fmprb_t y, const fmprb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_bsplit_eight(fmprb_t y, const fmprb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_bsplit_rectangular(fmprb_t y, const fmprb_t x, ulong n, ulong step, long prec)
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_bsplit(fmprb_t y, const fmprb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_rising_fmpcb_ui_bsplit_simple(fmpcb_t y, const fmpcb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_rising_fmpcb_ui_bsplit_eight(fmpcb_t y, const fmpcb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_rising_fmpcb_ui_bsplit_rectangular(fmpcb_t y, const fmpcb_t x, ulong n, ulong step, long prec)
|
|
|
|
.. function :: void gamma_rising_fmpcb_ui_bsplit(fmpcb_t y, const fmpcb_t x, ulong n, long prec)
|
|
|
|
Sets `y` to the rising factorial `x (x+1) (x+2) \cdots (x+n-1)`,
|
|
computed using binary splitting.
|
|
|
|
The different versions of this function process the basecase differently.
|
|
The *simple* version simply multiplies together several factors
|
|
one after another.
|
|
|
|
The *eight* version processes eight factors at a time using the formula
|
|
|
|
.. math ::
|
|
|
|
x(x+1)\cdots(x+7) = (28 + 98x + 63x^2 + 14x^3 + x^4)^2 - 16 (7+2x)^2,
|
|
|
|
replacing 7 full-precision multiplications with 3 squarings,
|
|
1 multiplication, and several linear operations ([CP2005]_, page 316).
|
|
Empirically, if `x` is a full-precision number, this is about twice as
|
|
fast as the *simple* version at high precision. Numerical stability is
|
|
slightly worse.
|
|
|
|
The *rectangular* version processes *step* factors at a time by
|
|
expanding the polynomial `f(t) = t (t+1) (t+2) \cdots (t+\mathrm{step}-1)`
|
|
and evaluating each factor `f(x + \mathrm{step} \, k)`
|
|
using rectangular splitting. At very high precision, if `x` is a
|
|
full-precision number, this asymptotically reduces the number of
|
|
full-precision multiplications required. If the *step* parameter
|
|
is set to zero, a default value is used.
|
|
|
|
The functions *gamma_rising_fmprb_ui_bsplit* and
|
|
*gamma_rising_fmpcb_ui_bsplit* automatically choose
|
|
an algorithm depending on the inputs.
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_delta(fmprb_t y, const fmprb_t x, ulong n, ulong m, long prec)
|
|
|
|
.. function :: void gamma_rising_fmpcb_ui_delta(fmpcb_t y, const fmpcb_t x, ulong n, ulong m, long prec)
|
|
|
|
Sets `y` to the rising factorial `x (x+1) (x+2) \cdots (x+n-1)`,
|
|
computed as a product of partial products
|
|
`(x+k)(x+k+1)\cdots(x+k+m-1)`. Each partial product is obtained
|
|
from the previous by using a precomputed table of powers of `x` to
|
|
evaluate the difference
|
|
|
|
.. math ::
|
|
|
|
\Delta_m(x,k) = (x+k+m)_{(m)} - (x+k)_{(m)}.
|
|
|
|
The instance `m = 4` of this algorithm was used by Smith ([Smi2001]_),
|
|
but we generalize it to a variable `m` which can be chosen nearly
|
|
optimally depending on the precision and `n`.
|
|
|
|
The polynomials `\Delta_m(x,k) \in \mathbb{Z}[k][x]` are generated dynamically.
|
|
Expanding the rising factorials, applying the binomial theorem
|
|
a couple of times, and doing several rearrangements of the sums, we
|
|
find the closed form
|
|
|
|
.. math ::
|
|
|
|
\Delta_m(x,k) = \sum_{v=0}^{m-1} x^v \sum_{i=0}^{m-v-1} k^i C_m(v,i),
|
|
|
|
where
|
|
|
|
.. math ::
|
|
|
|
C_m(v,i) = \sum_{j=i+1}^{m-v} m^{j-i} \left[{m \atop v+j}\right] {{v+j} \choose v} {j \choose i}
|
|
|
|
in which the square bracket denotes an unsigned Stirling number
|
|
of the first kind.
|
|
|
|
|
|
.. function :: void gamma_rising_fmprb_ui_multipoint(fmprb_t f, const fmprb_t c, ulong n, long prec)
|
|
|
|
Sets `y` to the rising factorial `x (x+1) (x+2) \cdots (x+n-1)`,
|
|
computed using fast multipoint evaluation. This only requires
|
|
`O(n^{1/2+\varepsilon})` multiplications, but has high overhead
|
|
and poor numerical stability (adding `O(n)` guard bits to the input
|
|
might be necessary to achieve full accuracy). It can be expected to
|
|
be faster than the binary splitting algorithm if the input is a
|
|
full-precision number, the precision is at least 100000 bits,
|
|
and *n* is of the same order of magnitude as (perhaps slightly
|
|
smaller than) the number of bits.
|
|
|
|
.. function :: void gamma_harmonic_sum_fmprb_ui_bsplit_simple(fmprb_t y, const fmprb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_harmonic_sum_fmprb_ui_bsplit_rectangular(fmprb_t y, const fmprb_t x, ulong n, ulong step, long prec)
|
|
|
|
.. function :: void gamma_harmonic_sum_fmprb_ui_bsplit(fmprb_t y, const fmprb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_harmonic_sum_fmpcb_ui_bsplit_simple(fmpcb_t y, const fmpcb_t x, ulong n, long prec)
|
|
|
|
.. function :: void gamma_harmonic_sum_fmpcb_ui_bsplit_rectangular(fmpcb_t y, const fmpcb_t x, ulong n, ulong step, long prec)
|
|
|
|
.. function :: void gamma_harmonic_sum_fmpcb_ui_bsplit(fmpcb_t y, const fmpcb_t x, ulong n, long prec)
|
|
|
|
Sets `y` to the harmonic sum `1/x + 1/(x+1) + \ldots + 1/(x+n-1)`,
|
|
computed using division-avoiding binary splitting.
|
|
|
|
The *rectangular* version processes *step* terms at a time in
|
|
analogy with the rising factorial algorithm.
|
|
Letting `f(t) = t (t+1) (t+2) \cdots (t+n-1)`, we have
|
|
`1/x + \ldots + 1/(x+n-1) = f'(x) / f(x)`.
|
|
If the *step* parameter is set to zero, a default value is used.
|
|
|
|
The functions *gamma_harmonic_sum_fmprb_ui_bsplit* and
|
|
*gamma_harmonic_sum_fmpcb_ui_bsplit* automatically choose
|
|
an algorithm depending on the inputs.
|
|
|
|
|
|
Rational arguments
|
|
--------------------------------------------------------------------------------
|
|
|
|
.. function:: void gamma_small_frac(fmprb_t y, unsigned int p, unsigned int q, long prec)
|
|
|
|
Efficiently evaluates `y = \Gamma(p/q)` where `p/q` (assumed to be reduced)
|
|
is one of `1, 1/2, 1/3, 2/3, 1/4, 3/4, 1/6, 5/6`.
|
|
|
|
The cases `\Gamma(1) = 1` and `\Gamma(1/2) = \sqrt \pi` are trivial.
|
|
We reduce all remaining cases to `\Gamma(1/3)` or `\Gamma(1/4)`
|
|
using the following relations:
|
|
|
|
.. math ::
|
|
|
|
\Gamma(2/3) = \frac{2 \pi}{3^{1/2} \Gamma(1/3)}, \quad \quad
|
|
\Gamma(3/4) = \frac{2^{1/2} \pi}{\Gamma(1/4)},
|
|
|
|
\Gamma(1/6) = \frac{\Gamma(1/3)^2}{(\pi/3)^{1/2} 2^{1/3}}, \quad \quad
|
|
\Gamma(5/6) = \frac{2 \pi (\pi/3)^{1/2} 2^{1/3}}{\Gamma(1/3)^2}.
|
|
|
|
The values of `\Gamma(1/3)` and `\Gamma(1/4)` are cached for fast
|
|
repeated evaluation. We compute them rapidly to high precision using
|
|
|
|
.. math ::
|
|
|
|
\Gamma(1/3) = \left( \frac{12 \pi^4}{\sqrt{10}}
|
|
\sum_{k=0}^{\infty}
|
|
\frac{(6k)!(-1)^k}{(k!)^3 (3k)! 3^k 160^{3k}} \right)^{1/6}, \quad \quad
|
|
\Gamma(1/4) = \sqrt{\frac{(2\pi)^{3/2}}{\operatorname{agm}(1, \sqrt 2)}}.
|
|
|
|
An alternative formula which could be used for `\Gamma(1/3)` is
|
|
|
|
.. math ::
|
|
|
|
\Gamma(1/3) = \frac{2^{4/9} \pi^{2/3}}{3^{1/12} \left( \operatorname{agm}\left(1,\frac{1}{2} \sqrt{2+\sqrt{3}}\right)\right)^{1/3}},
|
|
|
|
but this appears to be slightly slower in practice.
|
|
|
|
.. function:: void gamma_series_fmpq_hypgeom(fmprb_struct * res, const fmpq_t a, long len, long prec)
|
|
|
|
Given a rational number `0 < a \le 1`, uses binary splitting to compute
|
|
*len* coefficients in the Taylor series of `\Gamma(a+x)`, i.e. computes
|
|
`\Gamma(a), \Gamma'(a) ... \Gamma^{(\mathrm{len}-1)}(a) / (\mathrm{len}-1)!`.
|
|
In particular, with *len* = 1, this function computes `\Gamma(a)`
|
|
efficiently for small rational *a*.
|
|
|
|
The *len* = 1 case of this algorithm dates back to Brent [Bre1978]_,
|
|
and the extension to higher derivatives was done by Karatsuba [Kar1998]_.
|
|
Karatsuba's original algorithm is suboptimal for large *len*;
|
|
we use the faster algorithm given without error bounds
|
|
by Borwein, Bradley and Crandall [BBC2000]_.
|
|
|
|
The algorithm consists of evaluating the finite part of
|
|
|
|
.. math ::
|
|
|
|
\Gamma(s) = \int_0^{\infty} e^{-t} t^{s-1} dt =
|
|
N^s \left( \sum_{k=0}^R \frac{(-1)^k N^k}{(k + s) k!} + S \right) + I
|
|
|
|
where
|
|
|
|
.. math ::
|
|
|
|
S = \sum_{k=R+1}^{\infty} \frac{(-1)^k N^k}{(k + s) k!}
|
|
|
|
and
|
|
|
|
.. math ::
|
|
|
|
I = \int_N^{\infty} e^{-t} t^{s-1} dt.
|
|
|
|
This formula is valid for complex `s` with `\Re{s} > 0`.
|
|
It is therefore also valid if `s` is a power series argument `s = a + x`
|
|
where `\Re{a} > 0`, so doing the arithmetic with truncated power
|
|
series gives us the derivatives.
|
|
|
|
We now discuss choosing the parameters `R` and `N`, and bounding
|
|
the error terms `S` and `I`. We assume that `0 < \Re{a} \le 1`, `N \ge 1`
|
|
and `R \ge 2 N`. The coefficients of `I` are given by
|
|
|
|
.. math ::
|
|
|
|
I = \int_N^{\infty} e^{-t} t^{a+x-1} dt =
|
|
\sum_{j=0}^{\infty} \frac{x^j}{j!}
|
|
\int_N^{\infty} e^{-t} t^{a-1} \log^j t \; dt.
|
|
|
|
As shown by Karatsuba, the integrals are bounded in absolute value by
|
|
`2 e^{-N} \log^j N`. Thus, for a precision of `p` bits, `N` should be
|
|
about `p \log 2`.
|
|
|
|
Expanding the terms in `S` as geometric series gives
|
|
|
|
.. math ::
|
|
|
|
S = \sum_{j=0}^{\infty} \, x^j \,
|
|
\sum_{k=R+1}^{\infty} \frac{(-1)^{k+j} N^k}{(k+a)^{j+1} k!}.
|
|
|
|
By the assumption that `R \ge 2 N`, the sums are bounded by
|
|
|
|
.. math ::
|
|
|
|
\frac{N^R}{R^{j+1} R!} \left(\frac{1}{2} + \frac{1}{4} + \ldots\right) =
|
|
\frac{N^R}{R^{j+1} R!} \le \frac{1}{R^{j+1}} N^R \left(\frac{e}{R}\right)^R.
|
|
|
|
Let `R = cN` where `c` is to be determined. Expanding the
|
|
logarithm of `N^R \left(\frac{e}{R}\right)^R` around `N = \infty`
|
|
gives the approximate magnitude `(c - c \log c) N`. Setting this equal
|
|
to `-p \log 2`, we find that we should take
|
|
`c = 1/W(1/e) \approx 3.59112147666862` where `W(x)` is the
|
|
Lambert W-function. (Karatsuba gives the incorrect value `c = 3`).
|
|
|
|
We also estimate the working precision needed in the binary splitting
|
|
stage (the binary splitting could be done with exact arithmetic, but
|
|
this is unnecessarily costly). Assume that the sum is around unity in
|
|
magnitude. The binary logarithm of term `k` is roughly
|
|
`b(k) = k \log_2 N + k \log_2 e - k \log_2 k`. Since
|
|
`b'(k) = \log_2 N - \log_2 k`, the largest term magnitude occurs
|
|
roughly at `k = N`, so we need to increase the working precision
|
|
by about `b(N) = N / \log 2` bits.
|
|
|