Core arithmetic
-------------------------------------------------------------------------------

* Consider changing the interface of functions such as X_set_Y, X_neg_Y
  to always take a precision parameter (and get rid of X_set_round_Y,
  X_neg_Y etc.). Perhaps have X_setexact_Y methods for convenience,
  or make an exception for _set_ in particular.

* Make sure that excessive shifts in add/sub are detected
  with exact precision. Write tests for correctness of overlaps/contains
  in huge-exponent cases.

* Double-check correctness of add/sub code with large shifts (rounding x+eps).

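  A plain-double illustration of the x+eps issue (not arb code; the values
  are arbitrary): when the exponent gap exceeds the working precision, the
  small term is absorbed entirely, and a directed rounding should instead
  nudge the last ulp of x.

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 1.0, eps = 1e-300;  /* exponent gap far beyond 53 bits */
        printf("%d\n", x + eps == x);  /* prints 1: eps is absorbed */
        /* rounding x + eps upward would instead give the next
           representable number above x */
        printf("%.17g\n", nextafter(x, 2.0));
        return 0;
    }
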
* Work out semantics for comparisons/overlap/containment checks
  when NaNs are involved, and write test code.

* Add adjustment code for balls (when the mantissa is much more precise than
  the error bound, it can be truncated). Also, try to work out more consistent
  semantics for ball arithmetic (with regard to extra working precision, etc.)

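  A rough sketch of the truncation criterion (the helper name and the guard
  bit count are assumptions, not arb code): if the midpoint has magnitude
  about 2^mexp and the radius about 2^rexp, only roughly mexp - rexp leading
  bits of the midpoint are significant, so the midpoint can be rounded to
  that many bits plus a few guard bits and the rounding error added to the
  radius.

    #include <stdio.h>

    /* hypothetical helper: how many midpoint bits are worth keeping */
    static long useful_bits(long mexp, long rexp, long guard)
    {
        long bits = mexp - rexp + guard;
        return (bits > 0) ? bits : 0;
    }

    int main(void)
    {
        /* midpoint ~ 2^0 stored at 1000-bit precision, radius ~ 2^-40:
           keeping ~44 midpoint bits loses nothing useful */
        printf("%ld\n", useful_bits(0, -40, 4));
        return 0;
    }
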
* Do a low-level rewrite of the fmpr type.

  The mantissa should probably be changed to an unsigned, top-aligned fraction
  (i.e. the exponent will point to the top rather than the bottom, and
  the top bit of the array of limbs will always be set).

  This requires a separate sign field, increasing the struct size from
  2 to 3 words, but ought to lead to simpler code and slightly less overhead.
  (Optionally, the sign could be encoded as a bit of the exponent.)

  The unsigned fraction can be stored directly in a ulong when it has at most
  64 bits. A zero top bit can be used to tag the field as a pointer.
  The pointer could either be to an mpz struct or directly to a limb array
  where the first two limbs encode the allocated and used sizes.
  There should probably be a recycling mechanism as for fmpz.

  Required work:

    memory allocation code
    conversions to/from various integer types
    rounding/normalization
    addition
    subtraction
    comparison
    multiplication
    fix any code accessing the exponent and mantissa directly as integers

  Lower priority:

    low-level division, square root (these are not as critical for
    performance -- it is ok to do them by converting to integers and back)

    direct low-level code for addmul, mul_ui etc

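  A minimal sketch of the layout described above (field names, the tag
  convention and the size check are illustrative assumptions, not a settled
  design):

    #include <stdint.h>
    #include <stdio.h>

    /* Candidate 3-word layout: unsigned, top-aligned mantissa with the
       exponent pointing to the top of the fraction and a separate sign.
       A small mantissa is stored inline with its top bit set; a zero top
       bit tags the word as a pointer to a limb array whose first two
       limbs record the allocated and used sizes. */
    typedef struct
    {
        uintptr_t man;  /* inline fraction, or tagged pointer to limbs */
        long exp;       /* exponent (top-aligned) */
        int sign;       /* separate sign; could be folded into exp */
    }
    fmpr_sketch_struct;

    /* inline if the top bit of the mantissa word is set */
    #define FMPR_SKETCH_IS_INLINE(x) \
        (((x)->man >> (8 * sizeof(uintptr_t) - 1)) != 0)

    int main(void)
    {
        printf("%zu words\n", sizeof(fmpr_sketch_struct) / sizeof(long));
        return 0;
    }
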
* Native string conversion code instead of relying on mpfr (so we can have
  big exponents, etc.).

* Add functions for sloppy arithmetic (non-exact rounding). This could be
  used to speed up some ball operations with inexact output, where we don't
  need the best possible result, just a correct error bound.

* Write functions that ignore the possibility that exponents might be
  large, and use where appropriate (e.g. polynomial and matrix multiplication
  where one bounds magnitudes in an initial pass).

* Rewrite fmprb_div (similar to fmprb_mul)

Polynomial and power series arithmetic
-------------------------------------------------------------------------------

* Verify that mullow and power series methods always truncate the inputs to
  length n.

* Handle all input of the special form ax^n + b quickly in composition and
  powering.

* Implement the addition and convolution methods for Taylor shifts.

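  For reference, the convolution method rewrites the shift p(x) -> p(x+c)
  as a cross-correlation: with f_k = a_k * k! and g_m = c^m / m!, the
  shifted coefficients satisfy b_j * j! = sum_m f_{j+m} * g_m, so a single
  polynomial multiplication (against the reversed f) produces all of them.
  A naive double-precision sketch of the formula (illustrative only; the
  real code would use exact coefficients and fast multiplication):

    #include <stdio.h>

    #define N 4

    int main(void)
    {
        double a[N] = {1.0, 2.0, 0.0, 1.0};      /* p(x) = 1 + 2x + x^3 */
        double fact[N] = {1.0, 1.0, 2.0, 6.0};   /* k! */
        double c = 2.0, b[N];

        for (int j = 0; j < N; j++)
        {
            double s = 0.0, cpow = 1.0;
            for (int m = 0; j + m < N; m++)      /* sum_m f_{j+m} * g_m */
            {
                s += (a[j + m] * fact[j + m]) * (cpow / fact[m]);
                cpow *= c;
            }
            b[j] = s / fact[j];
        }

        for (int j = 0; j < N; j++)   /* p(x+2) = 13 + 14x + 6x^2 + x^3 */
            printf("b[%d] = %g\n", j, b[j]);
        return 0;
    }
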
* Add polynomial mulmid, and use in Newton iteration

* Tune basecase/Newton selection for exp/sin/cos series (the basecase
  algorithms are more stable, and faster for quite large n)

* Look at using the exponential to compute the complex sine/cosine series

* Improve block multiplication, e.g. by discarding blocks that don't contribute
  to the result, and scaling individual blocks.

Elementary functions
-------------------------------------------------------------------------------

* Add more transcendental functions.

* Double-check error bounds used in the fixed-point exponential code

* Faster elementary functions at low precision (especially log/arctan).
  Use Brent's algorithm (http://maths-people.anu.edu.au/~brent/pd/RNC7t4.pdf):

    atan(x) = atan(p/q) + atan((q*x-p)/(q+p*x))

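  A quick double-precision check of the identity (plain C, not arb code;
  p/q is just a convenient rational approximation of x, leaving a small
  remaining argument that a short series handles cheaply):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 0.73, p = 3.0, q = 4.0;   /* p/q = 0.75 approximates x */
        double lhs = atan(x);
        double rhs = atan(p / q) + atan((q * x - p) / (q + p * x));
        printf("%.17g\n%.17g\n", lhs, rhs);  /* agree to rounding error */
        return 0;
    }
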
* Use the complex Newton iteration for cos(pi p/q) when appropriate.
  Double-check the proof of correctness of the complex Newton iteration
  and make it work when the polynomial is not exact.

* For small cos(pi p/q) and sin(pi p/q) use a lookup table of the
  1/q values and then do complex binary exponentiation.

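  The idea in plain C with double complex (illustrative only; the ball
  arithmetic and the actual 1/q table are stand-ins): look up
  z = cos(pi/q) + i*sin(pi/q) and raise it to the p-th power by binary
  exponentiation; the real and imaginary parts of z^p are cos(pi*p/q)
  and sin(pi*p/q).

    #include <complex.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        const double pi = acos(-1.0);
        unsigned long p = 5, q = 7;
        double complex z = cexp(pi * I / q);   /* table entry for 1/q */
        double complex r = 1.0;

        for (unsigned long e = p; e; e >>= 1)  /* binary exponentiation */
        {
            if (e & 1) r *= z;
            z *= z;
        }

        printf("%.15f %.15f\n", creal(r), cimag(r));
        printf("%.15f %.15f\n", cos(pi * p / q), sin(pi * p / q));
        return 0;
    }
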
* Investigate using Chebyshev polynomials for elefun_cos_minpoly.
  This is certainly faster when n is prime, but might be faster for all n,
  at least if implemented cleverly.

Special functions
-------------------------------------------------------------------------------

* Write a faster logarithmic rising factorial (with correct branch
  cuts) for reducing the complex log gamma function. Also implement
  the logarithmic reflection formula.

* Tune zeta parameter selection. Also make it robust: for example,
  if FMPRB_RAD_PREC is changed to 2 (a testing tactic suggested
  by Paul Zimmermann), the parameter selection code currently hangs
  in an infinite loop for many inputs.

* Extend Stirling series code to compute polygamma functions (i.e. starting
  the series from some derivative), and optimize for a small number of
  derivatives by using a direct recurrence instead of binary splitting.

* Fall back to the real code when evaluating gamma functions (or their
  power series) at points that happen to be real.

* Implement more functions: error functions, Bessel functions,
  theta functions, etc.

Other
-------------------------------------------------------------------------------

* Document fmpz_extras