Some publications
This book collects in the same document all state-of-the-art
algorithms in multiple precision arithmetic (integers, integers modulo n,
floating-point numbers). The best current reference on that topic is
volume 2 of Knuth's The Art of Computer Programming,
which however misses some important newer algorithms (divide-and-conquer
division, other variants of FFT multiplication, floating-point algorithms, ...)
Our aim is to give detailed algorithms:
for all operations (not just multiplication, as in many textbooks),
for all size ranges (not just schoolbook methods or FFT-based methods),
and including all details (for example how to properly deal with carries
for integer algorithms, or a rigorous analysis of roundoff errors for
floating-point algorithms).
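As a small illustration of the kind of detail meant here (the code and names below are ours, not the book's), schoolbook multiple-precision addition on little-endian limb lists, with the carry propagation spelled out:

```python
def mp_add(a, b, base=2**64):
    """Schoolbook multiple-precision addition of little-endian limb lists,
    with explicit carry propagation."""
    r, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        r.append(s % base)     # low limb of the digit sum
        carry = s // base      # carry into the next position (0 or 1)
    if carry:
        r.append(carry)
    return r

# (2**64 - 1) + 1 propagates a carry into a new limb:
print(mp_add([2**64 - 1], [1]))   # [0, 1]
```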
The book would be useful for graduate students in computer science and
mathematics (perhaps too specialized for most undergraduates, at least
in its present state), researchers in discrete mathematics, computer
algebra, number theory, cryptography, and developers of multiple-precision
libraries.
This book is a free version of the book of the same name published by Masson in
1995. The examples use an obsolete version of Maple (V.3), but most of the text
still applies to Maple and other modern computer algebra systems.
We give simple and efficient methods to compute and/or estimate the
predecessor and successor of a floating-point number using only floating-point
operations in rounding to nearest. This may be used to simulate interval
operations, in which case the
quality in terms of the diameter of the result is
significantly improved compared to existing approaches.
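The flavor of the method can be conveyed by a small Python sketch; the constants below are illustrative choices of ours (an inflated relative step plus the smallest subnormal), not the tuned and proven constants of the paper:

```python
import math

PHI = 2.0 ** -52 * (1 + 2.0 ** -52)  # slightly above the relative spacing 2^-52
ETA = 5e-324                         # smallest positive subnormal, for x near 0

def succ_upper(x):
    """Estimate of the successor of x >= 0 using only floating-point
    +, * in rounding to nearest; never below the true successor."""
    return x + (PHI * x + ETA)

for x in [0.0, 1.0, 0.1, 3.14, 1e300]:
    exact = math.nextafter(x, math.inf)   # Python >= 3.9
    est = succ_upper(x)
    assert exact <= est                   # an enclosure, frequently exact
    print(x, est == exact)
```

Such an enclosure is exactly what interval simulation needs: rounding-to-nearest results can be widened to guaranteed bounds at the cost of a few floating-point operations.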
We searched for the worst cases for correct rounding of the exponential
function in the IEEE 754r decimal64 format, and computed all the bad cases
whose distance from a breakpoint (for all rounding modes) is less than
10^-15 ulp, and we give the worst ones. In particular, the worst case
for |x| >= 3·10^-11 is exp(9.407822313572878·10^-2)
= 1.09864568206633850000000000000000278... This work can be
extended to other elementary functions in the decimal64 format and allows
the design of reasonably fast routines that will evaluate these functions
with correct rounding, at least in some domains.
[Complete lists of worst cases for the exponential are available for the
IEEE 754r decimal32 and decimal64 formats.]
We exhibit ten new primitive
trinomials over GF(2)
of record degrees 24036583, 25964951, 30402457, and 32582657.
This completes the search for the currently known Mersenne prime exponents.
We describe the implementation of the reciprocal square root --- also called
inverse square root --- as a native function
in the MPFR library. The difficulty is to implement
Newton's iteration for the reciprocal square root on top of
GNU MP's mpn layer, while guaranteeing a rigorous 1/2 ulp bound on the
roundoff error.
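In MPFR the iteration runs on integers via the mpn layer, with the working precision roughly doubling at each step; the underlying iteration can be sketched in plain binary64 (the seed choice here is ours):

```python
import math

def rsqrt(a):
    """1/sqrt(a) for a > 0 by Newton's iteration x <- x*(3 - a*x*x)/2,
    which uses no division after the initial seed."""
    m, e = math.frexp(a)            # a = m * 2**e with 0.5 <= m < 1
    x = math.ldexp(1.0, -(e // 2))  # seed within a small factor of the result
    for _ in range(10):             # quadratic convergence
        x *= (3.0 - a * x * x) / 2.0
    return x

print(rsqrt(2.0), 1 / math.sqrt(2.0))
```

The real difficulty, addressed in the paper, is not the iteration itself but bounding all intermediate truncation errors so that the final result is within 1/2 ulp.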
Let S_n denote the symmetric group on n letters, and g(n) the maximal order
of an element of S_n. If the standard factorization of M into primes is
M = q_1^{a_1} q_2^{a_2} ... q_k^{a_k}, we define l(M) to be
q_1^{a_1} + q_2^{a_2} + ... + q_k^{a_k};
one century ago, E. Landau proved that g(n) = max { M : l(M) <= n }
and that, when n goes to infinity, log g(n) ~ sqrt(n log(n)).
There exists a basic algorithm to compute g(n) for 1 <= n <= N;
its running time is O(N^{3/2} / sqrt(log N)) and the needed memory is O(N);
it allows computing g(n) up to, say, one million. We describe an
algorithm to calculate g(n) for n up to 10^15.
The main idea is to use the so-called l-superchampion numbers. Similar
numbers, the superior highly composite numbers, were introduced by
S. Ramanujan to study large values of the divisor function
tau(n) = sum_{d | n} 1.
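The basic algorithm mentioned above is essentially a knapsack-style dynamic program over prime powers (each prime contributes one power q^a of cost q^a to l(M) and factor q^a to M); a small Python sketch for modest N, far from the 10^15 method of the paper:

```python
def landau(N):
    """g(n) for 0 <= n <= N: maximal order of an element of S_n,
    by dynamic programming over prime powers."""
    sieve = [True] * (N + 1)
    primes = []
    for p in range(2, N + 1):
        if sieve[p]:
            primes.append(p)
            for q in range(p * p, N + 1, p):
                sieve[q] = False
    g = [1] * (N + 1)
    for p in primes:                    # each prime used at most once
        for n in range(N, p - 1, -1):   # descending: 0/1-knapsack pattern
            q = p
            while q <= n:
                g[n] = max(g[n], q * g[n - q])
                q *= p
    return g

print(landau(10))   # [1, 1, 2, 3, 4, 6, 6, 12, 15, 20, 30]
```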
Faster Multiplication in
GF(2)[x], with Richard P. Brent, Pierrick Gaudry and Emmanuel Thomé,
Proceedings of the
Eighth Algorithmic Number Theory
Symposium (ANTS-VIII), May 17-22, 2008, Banff Centre, Banff, Alberta
(Canada), A. J. van der Poorten and A. Stein, editors, pages 153--166,
LNCS 5011, 2008.
A preliminary version appeared as INRIA Research Report, November 2007.
In this paper, we discuss an implementation of various algorithms for
multiplying polynomials in GF(2)[x]: variants of the window methods,
Karatsuba's, Toom-Cook's, Schönhage's and Cantor's algorithms. For most
of them, we propose improvements that lead to practical speedups.
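For readers unfamiliar with the setting: polynomials over GF(2) pack naturally into machine words, and the schoolbook product is just shift-and-XOR, with no carries. A toy Python version of the operation being optimized (none of the fast algorithms of the paper):

```python
def gf2x_mul(a, b):
    """Multiply polynomials over GF(2), packed as integers (bit i of a
    is the coefficient of x^i): shift-and-XOR, carry-free."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# (x + 1)^2 = x^2 + 1 over GF(2):
print(bin(gf2x_mul(0b11, 0b11)))   # 0b101
```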
We give a new algorithm for performing the distinct-degree factorization of
a polynomial P(x) over GF(2), using a multi-level
blocking strategy. The coarsest level of blocking replaces GCD computations
by multiplications, as suggested by Pollard (1975), von zur Gathen and Shoup
(1992), and others.
The novelty of our approach is that
a finer level of blocking replaces multiplications by squarings,
which speeds up the computation
in GF(2)[x]/P(x) of certain interval polynomials
when P(x) is sparse.
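The reason squarings are so much cheaper is that squaring is linear over GF(2): a(x)^2 = a(x^2), so it amounts to spreading the coefficient bits apart with zeros (real implementations use table lookups for the spreading). A sketch:

```python
def gf2x_sqr(a):
    """Square a polynomial over GF(2): a(x)^2 = a(x^2), i.e. interleave
    the coefficient bits with zeros -- linear work, no multiplication."""
    r, i = 0, 0
    while a:
        if a & 1:
            r |= 1 << (2 * i)
        a >>= 1
        i += 1
    return r

print(bin(gf2x_sqr(0b111)))   # (x^2+x+1)^2 = x^4+x^2+1 -> 0b10101
```

Reduction modulo a sparse P(x) then costs only a few shifts and XORs.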
As an application we give a fast
algorithm to search for all irreducible trinomials
x^r + x^s + 1 of degree
r over GF(2), while producing a certificate that can be checked
in less time than the full search.
Naive algorithms cost O(r^2) per trinomial, thus O(r^3) to search
over all trinomials of given degree r.
Under a plausible assumption about the
distribution of factors of trinomials, the new algorithm has
complexity O(r^2 (log r)^{3/2} (log log r)^{1/2})
for the search over all trinomials of degree r.
Our implementation achieves a speedup by a factor of more than 560
over the naive algorithm
in the case r = 24036583 (a Mersenne exponent).
Using our program, we have found two new primitive trinomials
of degree 24036583 over GF(2) (the previous record degree was 6972593).
Schönhage-Strassen's algorithm is one of the best known algorithms for
multiplying large integers. Implementing it efficiently is of utmost
importance, since
many other algorithms rely on it as a subroutine. We present here an improved
implementation, based on the one distributed within the GMP library.
The following ideas and
techniques were used or tried: faster arithmetic modulo 2^n+1,
improved cache
locality, Mersenne transforms, Chinese Remainder Reconstruction, the
sqrt(2) trick, Harley's and Granlund's tricks, improved tuning. We
also discuss some ideas we plan to try in the future.
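One ingredient is easy to show: arithmetic modulo 2^n+1 needs no division, since 2^n = -1 lets one fold n-bit chunks with alternating signs. A Python sketch of this folding (the actual implementation works in place on limb vectors):

```python
def mod_fermat(x, n):
    """Reduce x modulo 2^n + 1 using 2^n = -1: fold the n-bit chunks
    of x with alternating signs, no division needed."""
    mask = (1 << n) - 1
    r, sign = 0, 1
    while x:
        r += sign * (x & mask)
        x >>= n
        sign = -sign
    return r % ((1 << n) + 1)

n, x = 8, 123456789
print(mod_fermat(x, n), x % (2**n + 1))   # equal
```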
The currently best known algorithms for the numerical evaluation of
hypergeometric constants such as zeta(3) to d decimal digits have time
complexity O(M(d) log^2 d) and space complexity O(d log d) or
O(d). Following work of Cheng, Gergel, Kim and Zima, we present a new
algorithm with the same asymptotic complexity, but more efficient in practice.
Our implementation of this algorithm improves over existing programs
for the computation of Pi, and we announce a new record of 2 billion digits
for zeta(3).
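The series involved are hypergeometric: consecutive terms differ by a fixed rational function of n, which is what binary splitting exploits. For instance, Apéry's series for zeta(3), summed here naively with exact rationals (the record computations use binary splitting instead; this sketch only illustrates the series):

```python
from fractions import Fraction
from math import comb

def zeta3(digits):
    """zeta(3) = (5/2) * sum_{n>=1} (-1)^(n-1) / (n^3 * C(2n,n));
    each term contributes about log10(4) ~ 0.6 digits."""
    terms = int(digits / 0.6) + 5
    s = Fraction(0)
    for n in range(1, terms + 1):
        s += Fraction((-1) ** (n - 1), n ** 3 * comb(2 * n, n))
    return Fraction(5, 2) * s

print(float(zeta3(15)))   # 1.2020569031595942...
```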
We consider the problem of finding hard-to-round cases of a periodic
function for large floating-point inputs, more precisely when the function
cannot be efficiently approximated by a polynomial.
This is one of the last few issues preventing an efficient computation of
correctly rounded transcendentals for the whole IEEE-754 double precision
format.
The first non-naive algorithm for this problem is presented, with a heuristic
complexity of O(2^{0.676 p}) for a precision of p bits.
The efficiency of the algorithm is shown on the largest IEEE-754 double
precision binade for the sine function, and some corresponding bad cases are
given.
We can hope that all the worst cases of the trigonometric functions
in their whole domain will be found within a few years, a task that
was considered out of reach until now.
Until version 4.2.1, GNU MP
(GMP for short) division has complexity
O(M(n) log(n)), which is not asymptotically optimal.
We propose here some division algorithms that achieve O(M(n)) with small
constants.
Code is available too: invert.c computes an approximate
inverse within 2 ulps in 3M(n).
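The underlying idea -- invert the top half of the divisor recursively, then perform a single Newton step at full size -- can be sketched on integers. This is the shape of the approach only, not the code of invert.c:

```python
def invert(d, n):
    """Approximate floor(2^(2n) / d) for 2^(n-1) <= d < 2^n: invert the
    top half of d recursively, then one Newton step at full size."""
    if n <= 4:
        return (1 << (2 * n)) // d
    k = (n + 1) // 2
    x = invert(d >> (n - k), k) << (n - k)    # correct to about k bits
    return 2 * x - ((d * x * x) >> (2 * n))   # Newton step doubles accuracy

n, d = 16, 40503
print(invert(d, n), (1 << (2 * n)) // d)      # agree within a few ulps
```

Since the top-level Newton step costs O(M(n)) and the recursive sizes halve, the total cost is O(M(n)) with a small constant.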
Error Bounds on Complex Floating-Point Multiplication, with Richard Brent
and Colin Percival,
Mathematics
of Computation volume 76 (2007), pages 1469-1481.
Some technical details are given
in
INRIA Research Report 6068, December 2006.
Given floating-point arithmetic with t-digit base-β significands
in which all arithmetic operations are performed as if calculated to infinite
precision and rounded to a nearest representable value, we prove that
the product of complex values z0 and z1
can be computed with
maximum absolute error |z0| |z1| (1/2) β^{1-t} sqrt(5). In particular,
this provides relative error bounds of 2^{-24}
sqrt(5) and 2^{-53} sqrt(5) for IEEE 754 single and double precision
arithmetic respectively, provided that overflow, underflow, and
denormals do not occur.
We also provide the numerical worst cases for IEEE 754 single and double
precision arithmetic.
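The bound is easy to test empirically in binary64 (t = 53), using exact rational arithmetic as a referee; a sketch of ours, comparing squared quantities to avoid square roots:

```python
from fractions import Fraction
import random

def within_bound(a, b, c, d):
    """Check the error of naive complex multiplication (a+bi)(c+di) in
    binary64 against the bound |z0| |z1| sqrt(5) 2^-53, squared."""
    re = a * c - b * d                # one rounding per operation
    im = a * d + b * c
    A, B, C, D = map(Fraction, (a, b, c, d))
    eR = Fraction(re) - (A * C - B * D)
    eI = Fraction(im) - (A * D + B * C)
    err2 = eR * eR + eI * eI
    bound2 = Fraction(5, 2 ** 106) * (A * A + B * B) * (C * C + D * D)
    return err2 <= bound2

random.seed(0)
print(all(within_bound(*(random.uniform(-1, 1) for _ in range(4)))
          for _ in range(10 ** 4)))   # True
```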
The Elliptic Curve Method for integer factorization (ECM) was invented by
H. W. Lenstra, Jr., in 1985 [Lenstra87].
In the past 20 years, many improvements of ECM were proposed
on the mathematical, algorithmic, and implementation sides.
This paper summarizes the current state-of-the-art,
as implemented in the GMP-ECM software.
Erratum: on page 541 we write ``Computer experiments indicate that these curves
have, on average, 3.49 powers of 2 and 0.78 powers of 3, while Suyama's family
has 3.46 powers of 2 and 1.45 powers of 3''. As noticed by Romain Cosset,
those experiments done by the
first author are wrong, since he ran 1000 random curves with the same
prime input p = 10^10 + 19, and the results differ according to the
congruence of p modulo powers of 2 and 3. With 10000 random curves on the
10000 primes just above 10^20, we get an average of
2^{3.34} * 3^{1.68} ≈ 63.9 for Suyama's family, and
2^{3.36} * 3^{0.67} ≈ 21.5 for curves of the form
(16d + 18) y^2 = x^3 + (4d + 2) x^2 + x.
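For the unfamiliar reader, the mechanism of ECM fits in a few lines: do scalar multiplication on a random curve modulo the composite n, and a modular inversion that fails exposes a factor. A toy stage 1 of ours follows; GMP-ECM uses far better curve parametrizations, fast arithmetic, and a stage 2:

```python
from math import gcd
from random import randrange, seed

def inv_mod(u, n):
    g = gcd(u % n, n)
    if g != 1:
        raise ZeroDivisionError(g)   # non-invertible: g divides n
    return pow(u, -1, n)             # Python >= 3.8

def ec_add(P, Q, a, n):
    """Addition on y^2 = x^3 + a*x + b mod n; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if (x1 - x2) % n == 0:
        if (y1 + y2) % n == 0:
            return None
        lam = (3 * x1 * x1 + a) * inv_mod(2 * y1, n) % n
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1, n) % n
    x3 = (lam * lam - x1 - x2) % n
    return (x3, (lam * (x1 - x3) - y1) % n)

def ec_mul(k, P, a, n):
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, n)
        P = ec_add(P, P, a, n)
        k >>= 1
    return R

def ecm_stage1(n, B1=1000, curves=50):
    primes = [p for p in range(2, B1 + 1)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    for _ in range(curves):
        x, y, a = randrange(n), randrange(n), randrange(n)
        P = (x, y)            # b = y^2 - x^3 - a*x is implicit
        try:
            for p in primes:
                q = p
                while q * p <= B1:   # largest power of p below B1
                    q *= p
                P = ec_mul(q, P, a, n)
        except ZeroDivisionError as e:
            if 1 < e.args[0] < n:
                return e.args[0]
    return None

seed(1)
print(ecm_stage1(1000003 * 1000033))   # one of the two prime factors
```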
This paper presents a multiple-precision binary
floating-point library,
written in the ISO C language, and based on the GNU MP library.
Its particularity is to extend ideas from the IEEE-754 standard
to arbitrary precision, by providing correct rounding
and exceptions.
We demonstrate how these strong semantics are achieved ---
with no significant slowdown with respect to other tools ---
and discuss a few applications where such a library can be useful.
This short note shows the nasty effects of patents for the development of free
software, even for patents that were not written with software applications
in mind.
Obtaining a single result for a given computation: at first sight this
seems obvious; it is in fact a vast research subject to which researchers
contribute little by little. A new step is reached today thanks
to MPFR, a multiple-precision computation library
for floating-point numbers.
A naive digital plane is a subset of
points (x,y,z) in Z^3 verifying
h <= ax+by+cz < h + max{ |a|, |b|, |c| }, where
(a,b,c,h) is in Z^4. Given a finite unstructured subset
of Z^3, determining whether there
exists a naive digital plane containing it is called
digital plane recognition. This question is
rather classical in the field of digital geometry (also called
discrete geometry). We suggest in this paper a new algorithm to
solve it. Its asymptotic complexity
is bounded by O(n^7), but its behavior seems to be linear in
practice. It uses an original strategy of optimization in a set of
triangular facets (triangles). The code is short and elementary
triangular facets (triangles). The code is short and elementary
(less than 300 lines) and available on
http://www.loria.fr/~debled/plane and here.
We propose a new algorithm to find worst cases for the correct rounding of a
mathematical function of one variable. We first reduce this problem to the
real small value problem--i.e., for polynomials with real coefficients. Then,
we show that this second problem can be solved efficiently by extending
Coppersmith's work on the integer small value problem--for polynomials with
integer coefficients--using lattice reduction. For floating-point numbers with
a mantissa less than N and a polynomial approximation of degree d, our
algorithm finds all worst cases at distance less than N^{-d^2/(2d+1)}
from a machine number in time O(N^{(d+1)/(2d+1)+epsilon}). For d=2, a
detailed study improves on the O(N^{2/3+epsilon}) complexity from
Lefèvre's algorithm to O(N^{4/7+epsilon}). For larger d, our
algorithm can be used to check that there exist no worst cases at distance
less than N^{-k} in time O(N^{1/2+epsilon}).
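To make the notion of "worst case" concrete, here is an exhaustive toy search (the naive method the algorithm above supersedes, not the lattice-based one), ranking t-bit inputs of exp by their distance to a round-to-nearest breakpoint; binary64 stands in for exact arithmetic, which is acceptable only because t is tiny:

```python
import math

def hardest_to_round(t=16, count=3):
    """Rank the t-bit inputs x in [1,2) by the distance (in ulps) of
    exp(x) to a midpoint between consecutive t-bit numbers."""
    ranked = []
    for m in range(1 << (t - 1), 1 << t):
        x = m / float(1 << (t - 1))       # t-bit significand in [1,2)
        y = math.exp(x)
        ulp = math.ldexp(1.0, math.floor(math.log2(y)) - t + 1)
        frac = y / ulp - math.floor(y / ulp)
        ranked.append((abs(frac - 0.5), x))
    ranked.sort()
    return ranked[:count]

print(hardest_to_round())
```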
Gal's Accurate Tables Method Revisited,
with Damien Stehlé, INRIA Research Report RR-5359, October 2004.
An improved version
appeared in the Proceedings of
Arith'17.
Those ideas are demonstrated by an implementation of the
exp2 function
in double precision.
Erratum in the final version of the paper: in Section 4, the
simultaneous worst case for sin and cos is t0=1f09c0c6cde5e3
and not t0=31a93fddd45e3.
See also
my coauthor page.
Gal's accurate tables algorithm aims at providing an efficient implementation of
elementary functions with correct rounding as often as possible. This method
requires an expensive pre-computation of a table made of the values taken by the
function or by several related functions at some distinguished points. Our
improvements of Gal's method are two-fold: on the one hand we describe what is
arguably the best set of distinguished values and how it improves the efficiency
and correctness of the implementation of the function, and on the other hand we
give an algorithm which drastically decreases the cost of the pre-computation.
These improvements are related to the worst cases for the correct rounding of
mathematical functions and to the algorithms for finding them. We show that the
whole method can be turned into practice by giving complete tables for
2^x and sin(x) for x in [1/2,1[, in double precision.
On March 10, 2004, Dan Bernstein announced a revised draft of his paper
Removing redundancy in high-precision Newton iteration,
with algorithms that compute a reciprocal
of order n over C[[x]] in 1.5+o(1) times the cost of a product;
a quotient or a logarithm in 2.16666...+o(1) times;
a square root in 1.83333...+o(1) times;
an exponential in 2.83333...+o(1) times.
We give better algorithms.
Note added on March 24, 2004: the 1.5+o(1) reciprocal algorithm was already
published by Schönhage (Information Processing Letters 74, 2000, p. 41-46).
Note added on July 24, 2006: in a preprint
Newton's method and FFT trading, Joris van der Hoeven gives
better constants for the exponential (2.333...) and the quotient (1.666...).
Note added on April 20, 2009: as noticed by David Harvey, in Section 3,
in the Divide algorithm, Step 4 should read
q <- q0 + g0 (h1 - ε) x^n, where
h = h0 + x^n h1.
Indeed, after Step 3 we have q0 f = h0 +
ε x^n + O(x^{2n}),
i.e., q0 = h0/f + (ε/f) x^n + O(x^{2n}).
Thus h/f = q0 + ((h1 - ε)/f) x^n
+ O(x^{2n}) = q0 + g0 (h1 - ε)
x^n + O(x^{2n}).
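The object these constants measure is Newton's iteration on truncated power series; here is a plain Python sketch of the reciprocal, with naive multiplication (so none of the transform-sharing the constants above come from; names are ours):

```python
def mul(a, b, k):
    """Product of power series a, b truncated to k coefficients (naive)."""
    c = [0] * k
    for i, ai in enumerate(a[:k]):
        for j, bj in enumerate(b[:k - i]):
            c[i + j] += ai * bj
    return c

def inv(f, n):
    """g with f*g = 1 + O(x^n), by Newton's iteration g <- g*(2 - f*g);
    the number of correct coefficients doubles at each step."""
    g = [1.0 / f[0]]
    k = 1
    while k < n:
        k = min(2 * k, n)
        e = mul(f, g, k)                       # f*g = 1 + error
        g = mul(g, [2 - e[0]] + [-c for c in e[1:]], k)
    return g

print(inv([1.0, -1.0], 6))   # 1/(1-x) = 1 + x + x^2 + ... -> [1.0]*6
```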
Arithmétique
flottante,
with Vincent Lefèvre, INRIA Research Report RR-5105,
February 2004 (in French).
This document collects the notes of a course given in 2003 in the
Numerical and Symbolic Algorithmics track of the DEA in computer science
at the Université Henri Poincaré Nancy 1.
These notes are largely based on the book
Elementary Functions. Algorithms and Implementation by Jean-Michel
Muller.
Also available on LibreCours.
A new proof of the ``accurate summation'' algorithm proposed by Demmel and
Hida is presented.
The main part of that proof
has been written in the Coq language and verified by the Coq proof assistant.
We present new algorithms for the inverse, division, and
square root of power series.
The key trick is a new algorithm --- MiddleProduct or, for
short, MP --- computing the n middle coefficients of a
(2n-1) * n full product in the same number of multiplications
as a full n * n product.
This improves previous work of Brent, Mulders, Karp and Markstein,
Burnikel and Ziegler.
These results apply both to series and polynomials.
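The specification is short: here is a naive Python version of the middle product (the point of the paper is computing these n coefficients in the cost of an n x n product, which this sketch does not attempt):

```python
def middle_product(a, b):
    """Coefficients n-1 .. 2n-2 of the product a*b, where a has 2n-1
    coefficients and b has n: exactly the part a Newton step consumes."""
    n = len(b)
    assert len(a) == 2 * n - 1
    return [sum(a[k - j] * b[j] for j in range(n))
            for k in range(n - 1, 2 * n - 1)]

print(middle_product([1, 2, 3, 4, 5], [1, 1, 1]))   # [6, 9, 12]
```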
Some aspects of what a standard for the implementation of the elementary
functions could be are presented. Firstly, the need for such a standard is
motivated. Then the proposed standard is given. The question of roundings
constitutes an important part of this paper: three levels are proposed,
ranging from a level relatively easy to attain (with fixed maximal relative
error) up to the best quality one, with correct rounding on the whole range
of every function.
We do not claim that we always suggest the right choices, or that we have
thought about all relevant issues. The mere goal of this paper is to raise
questions and to launch the discussion towards a standard.
A long note on Mulders'
short product, with Guillaume Hanrot,
November 2002. A revised version appeared in the
Journal of Symbolic Computation, volume 37, pages 391-401, 2004.
The short product of two power series is
the meaningful part of the product of these objects,
i.e., sum(a[i] b[j] x^{i+j}, i+j < n). In [Mulders00], Mulders gives an
algorithm to compute a short product faster than the full product in the
case of Karatsuba's multiplication [KaOf62].
This algorithm works by selecting a
cutoff point k and performing a full k x k product and two
(n-k) x (n-k) short products recursively. Mulders also gives a
heuristically optimal cutoff point beta n. In this paper, we determine
the optimal cutoff point in Mulders' algorithm. We also give a slightly
more general description of Mulders' method.
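The recursion is easy to state in code; a Python sketch of ours with a fixed cutoff ratio (the paper's contribution is precisely the optimal choice of k, which this sketch does not reproduce):

```python
def short_product(a, b, n, beta=0.7):
    """First n coefficients of a*b by Mulders' scheme: one full k x k
    product plus two recursive short products of size n-k, k ~ beta*n."""
    if n <= 4:                        # naive base case
        return [sum(a[j] * b[i - j] for j in range(i + 1)) for i in range(n)]
    k = min(max(int(beta * n), (n + 1) // 2), n - 1)
    c = [0] * n
    for i in range(k):                # full k x k product, truncated to n
        for j in range(min(k, n - i)):
            c[i + j] += a[i] * b[j]
    hi_a = short_product(a[k:n], b[:n - k], n - k, beta)  # terms with i >= k
    hi_b = short_product(b[k:n], a[:n - k], n - k, beta)  # terms with j >= k
    for t in range(n - k):
        c[k + t] += hi_a[t] + hi_b[t]
    return c

a = b = [1] * 8
print(short_product(a, b, 8))   # [1, 2, 3, 4, 5, 6, 7, 8]
```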
The binary algorithm is a variant of the Euclidean algorithm that
performs well in practice. We present a quasi-linear time
recursive algorithm that computes the greatest common divisor of
two integers by simulating a slightly modified version of the
binary algorithm. The structure of the
recursive algorithm is very close to the one of the well-known
Knuth-Schönhage fast gcd algorithm, but the description and the
proof of correctness are significantly simpler in our case. This
leads to a simplification of the implementation and to better
running times.
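For reference, the plain binary algorithm that the recursive variant simulates uses only shifts and subtractions:

```python
def binary_gcd(a, b):
    """Greatest common divisor using only shifts and subtractions."""
    if a == 0:
        return b
    if b == 0:
        return a
    shift = ((a | b) & -(a | b)).bit_length() - 1   # shared power of two
    a >>= (a & -a).bit_length() - 1                 # make a odd
    while b:
        b >>= (b & -b).bit_length() - 1             # make b odd
        if a > b:
            a, b = b, a
        b -= a                                      # even again (or zero)
    return a << shift

print(binary_gcd(48, 180))   # 12
```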
Consider polynomials over GF(2).
We describe efficient
algorithms for finding trinomials with
large irreducible (and possibly primitive) factors, and give examples
of trinomials
having a primitive factor of degree r for all Mersenne exponents
r = ±3 mod 8 in the range 5 < r < 2976221,
although there is no irreducible trinomial of degree r.
We also give
trinomials with a primitive factor of degree r = 2^k
for 3 <= k <= 12. These trinomials enable efficient
representations of the finite field GF(2^r).
We show how trinomials with large primitive factors can
be used efficiently in applications where primitive trinomials
would normally be used.
Note added April 22, 2009: this paper is mentioned in Divisibility of
Trinomials by Irreducible Polynomials over F_2, by Ryul Kim and
Wolfram Koepf, International Journal of Algebra, Vol. 3, 2009, no. 4, 189-197.
This paper provides a simpler proof of the ``accurate summation'' algorithm
proposed by Demmel and Hida in
DeHi02.
It also gives improved bounds in some cases, and examples showing that those
new bounds are optimal.
This simpler proof will be used to obtain a computer-generated proof
of Demmel-Hida's algorithm, using a proof assistant like HOL, PVS or Coq.
This article describes the first results of the search
(with Richard Brent and Samuli Larvala) for primitive trinomials of
degree 6972593 over GF(2), and points out some amusing consequences
of Moore's law.
A Fast Algorithm for Testing Reducibility of Trinomials mod 2
and Some New Primitive Trinomials of Degree 3021377,
with Richard Brent and Samuli Larvala, Mathematics of Computation,
volume 72, number 243, pages 1443-1452, 2003.
A preliminary version appeared as
Report
PRG TR-13-00, Oxford
University Computing Laboratory, December 2000.
The standard algorithm for testing irreducibility of a trinomial of prime degree r over GF(2) requires 2r + O(1) bits of memory and of order r^2
bit-operations. We describe an algorithm which requires only 3r/2 + O(1) bits of memory and fewer bit-operations than the standard algorithm.
Using the algorithm, we have found several new irreducible trinomials of high degree.
If r is a Mersenne exponent (i.e. 2^r - 1 is a Mersenne prime), then an irreducible trinomial of degree r is necessarily primitive and can be used
to give a pseudo-random number generator with period at least 2^r - 1. We give examples of primitive trinomials for r = 756839, 859433, and
3021377. The results for r = 859433 extend and correct some computations of Kumada et al [Mathematics of Computation 69 (2000),
811-814]. The two results for r = 3021377 are primitive trinomials of the highest known degree.
We propose a new algorithm to find worst cases for correct rounding
of an analytic function.
We first reduce this problem to the real small value problem
--- i.e. for polynomials with real coefficients.
Then we show that this second problem can be solved efficiently,
by extending Coppersmith's work
on the integer small value problem
--- for polynomials with integer coefficients ---
using lattice reduction
[Coppersmith96a,Coppersmith96b,Coppersmith01].
For floating-point numbers with a mantissa less than $N$,
and a polynomial approximation of degree $d$,
our algorithm finds all worst cases at distance $< N^{\frac{-d^2}{2d+1}}$
from a machine number in time $O(N^{\frac{d+1}{2d+1}+\varepsilon})$.
For $d=2$, this improves on the $O(N^{2/3+\varepsilon})$ complexity
from Lef\`evre's algorithm [Lefevre00,LeMu01] to $O(N^{3/5+\varepsilon})$.
We exhibit some new worst cases found using our algorithm,
for double-extended and quadruple precision.
For larger $d$, our algorithm can be used to check that there exist no
worst cases at distance $< N^{-k}$ in time $O(N^{\frac{1}{2}+O(\frac{1}{k})})$.
Symbolic computation and computer algebra systems are usually known to be
either very slow or memory-expensive.
However, some specific symbolic computation problems have received new
algorithmic solutions in recent years,
which made it possible to push further the limits of what is doable within a
reasonable amount of time and space.
Some noticeable examples are polynomial factorisation,
lattice reduction, Groebner basis computation.
We will present a few such algorithms, together with a
state of the art of which problems computer algebra systems can
(or cannot) solve, and for each problem what the current frontiers are.
We present a formal proof (at the implementation level) of an efficient
algorithm proposed by Paul Zimmermann
[Zimmermann00]
to compute square roots
of arbitrarily large integers.
This program, which is part of the GNU Multiple Precision
Arithmetic Library (GMP), is completely proven within the
Coq system. Proofs are developed using the Correctness tool
to deal with imperative features of the program.
The formalization is rather large (more than 13000 lines) and requires
some advanced techniques for proof management and reuse.
Note: Vincent Lefèvre found a potential problem in the GMP implementation,
which is fixed by the following patch. This does not
contradict our proof: the problem is due to the different C data types
(signed or not, different widths), whereas our proof assumed a unique type.
Aliquot Sequence 3630 Ends After Reaching 100 Digits, with M. Benito,
W. Creyaufmüller and J. L. Varona, Experimental Mathematics, volume 11,
number 2, pages 201-206.
In this paper we present a new computational record: the aliquot sequence
starting at 3630 converges to 1 after reaching a hundred decimal digits.
Also, we show the current status of all the aliquot sequences starting with a
number smaller than 10,000; we have reached at least 95 digits for all of them.
In particular, we have reached at least 112 digits for the so-called
"Lehmer five sequences," and 101 digits for the "Godwin twelve sequences."
Finally, we give a summary showing the number of aliquot sequences of unknown
end starting with a number less than or equal to 10^6.
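The iteration itself is elementary -- the hard part, at a hundred digits, is factoring each term in order to evaluate sigma. A Python sketch with trial division (names are ours):

```python
def sigma(n):
    """Sum of all divisors of n, by trial division."""
    s, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            s += d + (n // d if d * d != n else 0)
        d += 1
    return s

def aliquot(n, steps=10):
    """The aliquot sequence n, s(n), s(s(n)), ... with s(n) = sigma(n) - n."""
    seq = [n]
    for _ in range(steps):
        if n <= 1:
            break
        n = sigma(n) - n
        seq.append(n)
    return seq

print(aliquot(3630, 5))   # [3630, 5946, ...]
```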
Note added 29 July 2008: in this paper, we say (page 203, middle of right
column) "It is curious to note that the driver 2^9 * 3 * 11 * 31
has appeared in no place in any of the sequences given in Table 1";
Clifford Stern notes that this driver appears at index 215 of sequence 165744,
which gives 2^9 * 3 * 7 * 11 * 31^2 * 37 * 10594304241173.
This document presents my research contributions from 1988 to 2001,
performed first at INRIA Rocquencourt within the Algo project (1988 to 1992),
then at INRIA Lorraine and LORIA within the projects Euréca (1993-1997),
PolKA (1998-2000), and Spaces (2001).
Three main periods can be roughly
distinguished: from 1988 to 1992, when my research
focused on analysis of algorithms and random generation;
from 1993 to 1997, when I worked on computer algebra and related algorithms;
and finally from 1998 to 2001, when I was interested in arbitrary precision
floating-point arithmetic with well-defined semantics.
Arithmétique en précision
arbitraire, INRIA research report 4272,
September 2001, to appear in the journal
"Calculateurs parallèles" [in French].
This paper surveys the available algorithms for integer or
floating-point arbitrary precision calculations. After a brief
discussion about possible memory representations, known algorithms
for multiplication, division, square root, greatest common
divisor, input and output, are presented, together with their
complexity and usage. For each operation, we present the naïve
algorithm, the asymptotically optimal one, and also intermediate
«divide and conquer» algorithms, which are often very useful. For
floating-point computations, some general-purpose methods are
presented for algebraic, elementary, hypergeometric and special
functions.
Recently, van Hoeij published a new algorithm for factoring
polynomials over the rational integers. This algorithm rests
on the same principle as Berlekamp-Zassenhaus, but uses
lattice basis reduction to improve drastically on the
recombination phase. The efficiency of the LLL algorithm is very
dependent on fine tuning; in this paper, we present such tuning to
achieve better performance. Simultaneously, we describe a
generalization of van Hoeij's algorithm to factor polynomials over
number fields.
This paper gives new results for the isolation of real roots of a univariate
polynomial using Descartes' rule of signs, following work of Vincent,
Uspensky, Collins and Akritas, Johnson, Krandick.
The first contribution is a generic algorithm which enables one to describe
all the existing strategies in a unified framework.
Using that framework, a new algorithm is presented, which is optimal
in terms of memory usage, while doing no more computations than other
algorithms based on Descartes' rule of signs.
We show that these critical optimizations have important consequences
by proposing a full efficient solution for isolating the real roots
of zero-dimensional polynomial systems.
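The basic tool all these strategies share is the sign-variation count; a minimal Python illustration of Descartes' rule (the algorithms in the paper apply it to transformed polynomials to isolate roots in intervals):

```python
def sign_variations(coeffs):
    """Number of sign changes in a coefficient sequence; by Descartes'
    rule of signs it bounds the number of positive real roots, and has
    the same parity as that number."""
    nz = [c for c in coeffs if c != 0]
    return sum(1 for u, v in zip(nz, nz[1:]) if (u > 0) != (v > 0))

# x^2 - 3x + 2 = (x-1)(x-2): two variations, two positive roots
print(sign_variations([1, -3, 2]))   # 2
```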
Density results on floating-point invertible numbers, with Guillaume Hanrot, Joël Rivat and Gérald Tenenbaum,
Theoretical Computer Science, volume 291, number 2, 2003, pages 135-141.
(The slides of a related talk I gave in January 2002 at the workshop
"Number Theory and Applications" in Luminy are
here.)
Let F_k denote the set of k-bit mantissa
floating-point (FP) numbers.
We prove a conjecture of J.-M. Muller according to which
the proportion of numbers in F_k with no FP-reciprocal (for rounding to the
nearest element) approaches
1/2 - (3/2) log(4/3), i.e. about 0.06847689, as
k goes to infinity.
We investigate a similar question for the inverse square root.
This short note gives a detailed correctness proof of
fast (i.e. subquadratic) versions of the
GNU MP
mpn_bz_divrem_n and mpn_sqrtrem functions, together with
complete GMP code.
The mpn_bz_divrem_n function divides (with remainder)
a number of 2n limbs by a divisor of n limbs in time 2K(n), where K(n) is the
time spent in an (n times n) multiplication, using the
Moenck-Borodin-Jebelean-Burnikel-Ziegler algorithm.
The mpn_sqrtrem function computes the square root and the remainder
of a number of 2n limbs (square root and remainder have about n limbs each)
in time 3K(n)/2; it uses the Karatsuba square root algorithm.
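A Python sketch of the recursion behind the Karatsuba square root, with brute-force correction loops in place of the normalization argument that the note uses to bound the corrections:

```python
def sqrtrem(n):
    """Integer square root with remainder by the Karatsuba-square-root
    recursion: take the square root of the top half of n, then one
    division of half size fixes the low part."""
    if n < 16:
        s = 0
        while (s + 1) ** 2 <= n:
            s += 1
        return s, n - s * s
    l = n.bit_length() // 4
    a0 = n & ((1 << l) - 1)
    a1 = (n >> l) & ((1 << l) - 1)
    sp, rp = sqrtrem(n >> (2 * l))             # root of the top half
    q, u = divmod((rp << l) + a1, 2 * sp)      # one half-size division
    s, r = (sp << l) + q, (u << l) + a0 - q * q
    while r < 0:                               # correction steps
        s -= 1
        r += 2 * s + 1
    while r > 2 * s:
        r -= 2 * s + 1
        s += 1
    return s, r

s, r = sqrtrem(10 ** 20)
print(s, r, s * s + r == 10 ** 20)   # 10000000000 0 True
```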
We present new algorithms for the inverse, quotient, or
square root of power series.
The key trick is a new algorithm --- RecursiveMiddleProduct or RMP
--- computing the n middle coefficients of a
(2n * n) product in essentially the same number of operations --- K(n)
--- as a full (n * n) product with Karatsuba's method.
This improves previous work of Mulders, Karp and Markstein,
Burnikel and Ziegler.
These results apply to series, polynomials, and multiple precision
floating-point numbers.
A Maple implementation is available here,
together with slides of a talk given at ENS Paris
in January 2004.
In this paper we describe ideas used to accelerate the Searching Phase
of the Berlekamp-Zassenhaus algorithm, the algorithm most widely used
for computing factorizations in Z[x]. Our ideas do not alter the
theoretical worst-case complexity, but they do have a significant effect in
practice: especially in those cases where the cost of the Searching
Phase completely dominates the rest of the algorithm. A complete
implementation of the ideas in this paper is publicly available in the
library NTL. We give timings of this implementation on
some difficult factorization problems.
On Sums of Seven Cubes,
with Francois Bertault and Olivier
Ramaré, Mathematics of Computation, volume 68, number 227,
pages 1303-1310, 1999.
Uniform Random Generation of
Decomposable Structures Using Floating-Point Arithmetic, with Alain
Denise, Theoretical Computer Science, volume 218, number 2, 219--232, 1999.
A preliminary version appeared as
INRIA Research Report 3242, September 1997.
The recursive method formalized by Nijenhuis and Wilf [NiWi78] and
systematized by Flajolet, Van Cutsem and Zimmermann [FlZiVa94] is extended
here to floating-point arithmetic. The resulting ADZ method enables one to
generate decomposable data structures --- labelled or unlabelled ---
uniformly at random, in expected O(n^{1+\epsilon}) time and space, after a
preprocessing phase of O(n^{2+\epsilon}) time, which reduces to
O(n^{1+\epsilon}) for context-free grammars.
We study here a quantity related to the number of walks with North and East steps staying under the line of slope d starting from the origin. We give an asymptotic analysis of this quantity with respect to both the width n and the slope d, answering a question asked by Bernard Mourrain.
Cinq algorithmes de calcul
symbolique,
lecture notes for a specialization module of the DEA in computer science
at the Université Henri Poincaré Nancy 1, 1997.
These are lecture notes of a course entitled «Some computer algebra
algorithms» given by the author at the University of Nancy 1 in 1997. Five
fundamental algorithms used by computer algebra systems are briefly described:
Gosper's algorithm for computing indefinite sums, Zeilberger's algorithm for
definite sums, Berlekamp's algorithm for factoring polynomials over finite
fields, Zassenhaus' algorithm for factoring polynomials with integer
coefficients, and Lenstra's integer factorization algorithm using elliptic
curves. All these algorithms were implemented --- or improved --- by the
author in the computer algebra system MuPAD.
MuPAD is a general purpose computer algebra system with two programming concepts for parallel processing: micro-parallelism for shared-memory machines and macro-parallelism for distributed architectures. This article describes language instructions for both concepts, the current state of implementation, together with some examples.
Polynomial Factorization Challenges,
with L. Bernardin and M. Monagan, poster presented at
the International Symposium on Symbolic and Algebraic Computation
(ISSAC), July 1996, 4 pages.
We describe the GFUN package which contains functions for manipulating sequences, linear recurrences or differential equations and generating functions of various types. This document is intended both as an elementary introduction to the subject and as a reference manual for the package.
A Calculus of Random
Generation, with Philippe Flajolet and Bernard Van Cutsem, Proceedings
of European Symposium on Algorithms (ESA'93), LNCS 726, pages 169-180, 1993.
This report describes the algorithm used by the epelle program, together with its implementation in the C language. This program is able to check about 30,000 words per second on modern computers, without any error, contrary to the Unix spell program, which uses hashing methods and can thus accept wrong words. The main principle of epelle is to use digital trees (also called dictionary trees), which in addition reduce the space needed to store the list of words (by a factor of about 5 for the French dictionary). Creating a new digital tree for the French language (about 240,000 words) takes only a dozen seconds. The same program is directly usable for other languages and more generally for any list of alphanumeric keys.
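The data structure in question is the classical trie; a minimal Python version (not the C implementation of epelle) showing why lookups are exact and prefixes are shared:

```python
def build_trie(words):
    """Digital tree: nested dictionaries, one level per letter; common
    prefixes are stored once, which also compresses the word list."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True                 # end-of-word marker
    return root

def contains(trie, w):
    node = trie
    for ch in w:
        if ch not in node:
            return False                 # exact rejection, no hashing
        node = node[ch]
    return "$" in node

t = build_trie(["chat", "cheval", "chien"])
print(contains(t, "chat"), contains(t, "cha"))   # True False
```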