Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456 Some publications
Rapports de recherche INRIA/INRIA Research Reports -
search or register LORIA publications
Until version 4.1.4, GNU MP
(GMP for short) division has complexity
O(M(n) log(n)), which is not asymptotically optimal.
We propose here some division algorithms that achieve O(M(n)) with small
constants.
This short note shows the nasty effects of patents for the development of free
software, even for patents that were not written with software applications
in mind.
The only primitive trinomials of degree 6972593 over GF(2)
are x6972593 + x3037958 + 1 and its reciprocal.
Gal's Accurate Tables Method Revisited,
with Damien Stehlé, INRIA Research Report RR-5359, October 2004.
An improved version
will appear in the Proceedings of
Arith'17.
Those ideas are demonstrated by an implementation of the
exp2 function
in double precision.
Erratum in the final version of the paper: in Section 4, the
simultaneous worst case for sin and cos is t0=1f09c0c6cde5e3
and not t0=31a93fddd45e3.
Gal's accurate tables algorithm aims at providing an efficient implementation of
elementary functions with correct rounding as often as possible. This method
requires an expensive pre-computation of a table made of the values taken by the
function or by several related functions at some distinguished points. Our
improvements of Gal's method are two-fold: on the one hand we describe what is
the arguably best set of distinguished values and how it improves the efficiency
and correctness of the implementation of the function, and on the other hand we
give an algorithm which drastically decreases the cost of the pre-computation.
These improvements are related to the worst cases for the correct rounding of
mathematical functions and to the algorithms for finding them. We show that the
whole method can be turned into practice by giving complete tables for
2x and sin(x) for x in [1/2,1[, in double precision.
A naive digital plane is a subset of
points (x,y,z) in Z3 verifying
h ≤ ax+by+cz < h+max{ | a|,|b|, |c| } where
(a,b,c,h) in Z4. Given a finite unstructured subset
of Z3, determine whether there
exists a naive digital plane containing it is called
digital plane recognition. This question is
rather classical in the field of digital geometry (also called
discrete geometry). We suggest in this paper a new algorithm to
solve it. Its asymptotic complexity
is bounded by O(n7) but its behavior seems to be linear in
practice. It uses an original strategy of optimization in a set of
triangular facets (triangles). The code is short and elementary
(less than 300 lines) and available on
http://www.loria.fr/~debled/plane.
On March 10, 2004, Dan Bernstein announced a revised draft of his paper
Removing redundancy in high-precision Newton iteration,
with algorithms that compute a reciprocal
of order n over C[[x]] 1.5+o(1) times longer than a product;
a quotient or logarithm 2.16666...+o(1) times longer;
a square root 1.83333...+o(1) times longer;
an exponential 2.83333...+o(1) times longer.
We give better algorithms.
Note added on March 24, 2004: the 1.5+o(1) reciprocal algorithm was already
published by Schönhage (Information Processing Letters 74, 2000, p. 41-46).
Ce document rassemble des notes d'un cours donné en 2003 dans
la filière Algorithmique Numérique et
Symbolique du DEA d'Informatique de l'Université Henri Poincaré
Nancy 1.
Ces notes sont basées en grande partie sur le livre
Elementary Functions. Algorithms and Implementation de Jean-Michel
Muller.
Aussi disponible sur LibreCours.
A new proof of the ``accurate summation'' algorithm proposed by Demmel and
Hida is presented.
The main part of that proof
has been written in the Coq language and verified by the Coq proof assistant.
The binary algorithm is a variant of the Euclidean algorithm that
performs well in practice. We present a quasi-linear time
recursive algorithm that computes the greatest common divisor of
two integers by simulating a slightly modified version of the
binary algorithm. The structure of the
recursive algorithm is very close to the one of the well-known
Knuth-Schönhage fast gcd algorithm, but the description and the
proof of correctness are significantly simpler in our case. This
leads to a simplification of the implementation and to better
running times.
Consider polynomials over GF(2).
We describe efficient
algorithms for finding trinomials with
large irreducible (and possibly primitive) factors, and give examples
of trinomials
having a primitive factor of degree r for all Mersenne exponents
r = + 3 mod 8 in the range 5 < r < 2976221,
although there is no irreducible trinomial of degree r.
We also give
trinomials with a primitive factor of degree r = 2k
for 3 <= k <= 12. These trinomials enable efficient
representations of the finite field GF(2r).
We show how trinomials with large primitive factors can
be used efficiently in applications where primitive trinomials
would normally be used.
This paper provides a simpler proof of the ``accurate summation'' algorithm
proposed by Demmel and Hida in
DeHi02.
It also gives improved bounds in some cases, and examples showing that those
new bounds are optimal.
This simpler proof will be used to obtain a computer-generated proof
of Demmel-Hida's algorithm, using a proof assistant like HOL, PVS or Coq.
Some aspects of what a standard for the implementation of the elementary
functions could be are presented. Firstly, the need for such a standard is
motivated. Then the proposed standard is given. The question of roundings
constitutes an important part of this paper: three levels are proposed,
ranging from a level relatively easy to attain (with fixed maximal relative
error) up to the best quality one, with correct rounding on the whole range
of every function.
We do not claim that we always suggest the right choices, or that we have
thought about all relevant issues. The mere goal of this paper is to raise
questions and to launch the discussion towards a standard.
Cet article décrit les premiers résultats de la recherche
(avec Richard Brent et Samuli Larvala) de trinômes primitifs de
degré 6972593 sur GF(2), et indique quelques conséquences
amusantes de la loi de Moore.
The short product of two power series is
the meaningful part of the product of these objects,
i.e., sum(a[i] b[j] xi+j, i+j < n). In [Mulders00], Mulders gives an
algorithm to compute a short product faster than the full product in the
case of Karatsuba's multiplication [KaOf62].
This algorithm works by selecting a
cutoff point k and performing a full k x k product and two
(n-k) x (n-k) short products recursively. Mulders also gives a
heuristically optimal cutoff point beta n. In this paper, we determine
the optimal cutoff point in Mulders' algorithm. We also give a slightly
more general description of Mulders' method.
We propose a new algorithm to find worst cases for correct rounding
of an analytic function.
We first reduce this problem to the real small value problem
--- i.e. for polynomials with real coefficients.
Then we show that this second problem can be solved efficiently,
by extending Coppersmith's work
on the integer small value problem
--- for polynomials with integer coefficients ---
using lattice reduction
[Coppersmith96a,Coppersmith96b,Coppersmith01].
For floating-point numbers with a mantissa less than $N$,
and a polynomial approximation of degree $d$,
our algorithm finds all worst cases at distance $< N^{\frac{-d^2}{2d+1}}$
from a machine number in time $O(N^{\frac{d+1}{2d+1}+\varepsilon})$.
For $d=2$, this improves on the $O(N^{2/3+\varepsilon})$ complexity
from Lef\`evre's algorithm [Lefevre00,LeMu01] to $O(N^{3/5+\varepsilon})$.
We exhibit some new worst cases found using our algorithm,
for double-extended and quadruple precision.
For larger $d$, our algorithm can be used to check that there exist no
worst cases at distance $< N^{-k}$ in time $O(N^{\frac{1}{2}+O(\frac{1}{k})})$.
Symbolic computation and computer algebra systems are usually known to be
either very slow, or memory expensive.
However, some specific symbolic computation problems have
received in the last years new algorithmic solutions,
which enabled to push further the limits of what is doable within a
reasonable amount of time and space.
Some noticeable examples are polynomial factorisation,
lattice reduction, Groebner basis computation.
We will present a few such algorithms, together with a
state-of-the-art of what problems computer algebra systems can
(or cannot) solve, and for each problem what the current frontiers are.
A proof of GMP square root using the Coq assistant, INRIA Research Report 4475, June 2002.
To appear in Journal of Automated Reasoning, Special Issue on Automating and Mechanising Mathematics: In honour of N.G. de Bruijn.
We present a formal proof (at the implementation level) of an efficient
algorithm proposed by Paul Zimmermann
[Zimmermann00]
to compute square roots
of arbitrary large integers.
This program, which is part of the GNU Multiple Precision
Arithmetic Library (GMP) is completely proven within the
Coq system. Proofs are developed using the Correctness tool
to deal with imperative features of the program.
The formalization is rather large (more than 13000 lines) and requires
some advanced techniques for proof management and reuse.
Note: Vincent Lefèvre found a potential problem in the GMP implementation,
which is fixed by the following patch. This does not
contradicts our proof: the problem is due to the different C data types
(signed or not, different width), whereas our proof assumed a unique type.
This document presents my research contributions from 1988 to 2001,
performed first at INRIA Rocquencourt within the Algo project (1988 to 1992),
then at INRIA Lorraine and LORIA within the projects Euréca (1993-1997),
PolKA (1998-2000), and Spaces (2001).
Three main periods can be roughly
distinguished: from 1988 to 1992 where my research
focused on analysis of algorithms and random generation,
from 1993 to 1997 where I worked on computer algebra and related algorithms,
finally from 1998 to 2001 where I was interested in arbitrary precision
floating-point arithmetic with well-defined semantics.
Arithmétique en précision
arbitraire, rapport de recherche INRIA 4272,
septembre 2001, à paraître dans la revue
"Calculateurs parallèles" [in french].
This paper surveys the available algorithms for integer or
floating-point arbitrary precision calculations. After a brief
discussion about possible memory representations, known algorithms
for multiplication, division, square root, greatest common
divisor, input and output, are presented, together with their
complexity and usage. For each operation, we present the naïve
algorithm, the asymptotically optimal one, and also intermediate
«divide and conquer» algorithms, which often are very useful. For
floating-points computations, some general-purpose methods are
presented for algebraic, elementary, hypergeometric and special
functions.
Recently, van Hoeij's published a new algorithm for factoring
polynomials over the rational integers. This algorithms rests
on the same principle as Berlekamp-Zassenhaus, but uses
lattice basis reduction to improve drastically on the
recombination phase. The efficiency of the LLL algorithm is very
dependent on fine tuning; in this paper, we present such tuning to
achieve better performance. Simultaneously, we describe a
generalization of van Hoeij's algorithm to factor polynomials over
number fields.
This paper gives new results for the isolation of real roots of a univariate
polynomial using Descartes' rule of signs, following work of Vincent,
Uspensky, Collins and Akritas, Johnson, Krandick.
The first contribution is a generic algorithm which enables one to describe
all the existing strategies in a unified framework.
Using that framework, a new algorithm is presented, which is optimal
in terms of memory usage, while doing no more computations than other
algorithms based on Descartes' rule of signs.
We show that these critical optimizations have important consequences
by proposing a full efficient solution for isolating the real roots
of zero-dimensional polynomial systems.
Density results on floating-point invertible numbers, with Guillaume Hanrot, Joël Rivat and Gérald Tenenbaum, to appear
in TCS, January 2001.
(The slides of a related talk I gave in January 2002 at the workshop
"Number Theory and Applications" in Luminy are
here.)
Let F_k denote the k-bit mantissa
floating-point (FP) numbers.
We prove a conjecture of J.-M. Muller according to which
the proportion of numbers in F_k with no FP-reciprocal (for rounding to the
nearest element) approaches
1/2-3/2 log(4/3) i.e. about 0.06847689 as
k goes to infinity.
We investigate a similar question for the inverse square root.
The standard algorithm for testing irreducibility of a trinomial of prime degree r over GF(2) requires 2r + O(1) bits of memory and of order r2
bit-operations. We describe an algorithm which requires only 3r/2 + O(1) bits of memory and less bit-operations than the standard algorithm.
Using the algorithm, we have found several new irreducible trinomials of high degree.
If r is a Mersenne exponent (i.e. 2r - 1 is a Mersenne prime), then an irreducible trinomial of degree r is necessarily primitive and can be used
to give a pseudo-random number generator with period at least 2r - 1. We give examples of primitive trinomials for r = 756839, 859433, and
3021377. The results for r = 859433 extend and correct some computations of Kumada et al [Mathematics of Computation 69 (2000),
811-814]. The two results for r = 3021377 are primitive trinomials of the highest known degree.
This short note gives a detailed correctness proof of
fast (i.e. subquadratic) versions of the
GNU MP
mpn_bz_divrem_n and mpn_sqrtrem functions, together with
complete GMP code.
The mpn_bz_divrem_n function divides (with remainder)
a number of 2n limbs by a divisor of n limbs in 2K(n), where K(n) is the
time spent in a (n times n) multiplication, using the
Moenck-Borodin-Jebelean-Burnikel-Ziegler algorithm.
The mpn_sqrtrem computes the square root and the remainder
of a number of 2n limbs (square root and remainder have about n limbs each)
in time 3K(n)/2; it uses Karatsuba Square Root.
We present new algorithms for the inverse, quotient, or
square root of power series.
The key trick is a new algorithm --- RecursiveMiddleProduct or RMP
--- computing the n middle coefficients of a
(2n * n) product in essentially the same number of operations --- K(n)
--- than a full (n * n) product with Karatsuba's method.
This improves previous work of Mulders, Karp and Markstein,
Burnikel and Ziegler.
These results apply both to series, polynomials, and multiple precision
floating-point numbers.
A Maple implementation is available here,
together with slides of a talk given at ENS Paris
in January 2004.
In this paper we describe ideas used to accelerate the Searching Phase
of the Berlekamp-Zassenhaus algorithm, the algorithm most widely used
for computing factorizations in Z[x]. Our ideas do not alter the
theoretical worst-case complexity, but they do have a significant effect in
practice: especially in those cases where the cost of the Searching
Phase completely dominates the rest of the algorithm. A complete
implementation of the ideas in this paper is publicly available in the
library NTL. We give timings of this implementation on
some difficult factorization problems.