diff options
author | Kevin Ryde <user42@zip.com.au> | 2000-04-15 09:15:38 +0200 |
---|---|---|
committer | Kevin Ryde <user42@zip.com.au> | 2000-04-15 09:15:38 +0200 |
commit | a80e17ab82901df71012b2f95f7be13304b3bcd9 (patch) | |
tree | 6b7e2421dcb57b9902b3eefc4ca10d7658403590 | |
parent | 8f2cb0a54ec2958e0a25e2e4ab9cb45e303ccc10 (diff) | |
download | gmp-a80e17ab82901df71012b2f95f7be13304b3bcd9.tar.gz |
Updates for things now completed or progressed.
A few typos corrected.
-rw-r--r-- | projects/projects.html | 64 |
1 files changed, 35 insertions, 29 deletions
diff --git a/projects/projects.html b/projects/projects.html index b1526e342..c35b8ddda 100644 --- a/projects/projects.html +++ b/projects/projects.html @@ -15,11 +15,11 @@ <comment> An up-to-date version of this file is available at - http://www.swox.com/gmp/projects.html. + <a href="http://www.swox.com/gmp/projects.html">http://www.swox.com/gmp/projects.html</a>. </comment> <p> This file lists projects suitable for volunteers. Please see the file - tasks, or <a href="http://www.swox.com/gmp/tasks.html"> + tasks, or <a href="tasks.html"> http://www.swox.com/gmp/tasks.html</a> for smaller tasks. <p> If you want to work on any of the projects below, please let tege@swox.com @@ -46,9 +46,8 @@ <p> <li> <strong>Faster multiplication</strong> - <p> The current multiplication code runs at O(n^1.58) when the operand are both - around n digits large. Robert Harley has contributed better code, - running at O(n^1.46), based on the 3-way Toom algorithm. + <p> The current multiplication code is 3-way Toom-Cook based and + runs at O(n^1.465) when the operands are both around n digits large. Several new developments are desirable: @@ -56,11 +55,21 @@ <li> Handle multiplication of operands with different digit count better than today. We now split the operands in a very inefficient way, see mpn/generic/mul.c. - <li> Implement the n-way Toom algorithm. See Knuth's Seminumerical - algorithms for details on Toom's algorithm. + + <li> Consider N-way Toom-Cook. See Knuth's Seminumerical + Algorithms for details on the method. Code implementing it + exists. This is asyptotically inferior to FFTs, but is finer + grained since it can split into any number of pieces, not + just powers of 2 (or of 2 and 3 or whatever). + <li> Implement some variant of FFT-based multiplication. I believe the most promising variant is the original Schönhage-Strassen mod 2^n+1 algorithm. - Paul Zimmermann and Robert Harley are working on this. + Paul Zimmermann and Robert Harley are working on this, Paul Zimmerman + has working code. + + <li> It's possible CPU dependent effects like cache locality will + have a greater impact on speed than algorithmic improvements. + </ol> <p> <li> <strong>Division and modular multiplication</strong> @@ -107,7 +116,7 @@ </ol> <p> The most important routines are mpn_addmul_1, mpn_mul_basecase and - mpn_sqr_basecase. The latter two doesn't exist for all machines, while + mpn_sqr_basecase. The latter two don't exist for all machines, while mpn_addmul_1 exists for almost all machines. <p> Standard techniques for these routines are unrolling, software @@ -125,12 +134,9 @@ <p> <li> <strong>Configuration</strong> - <p> Switching to autoconf is a highly desirable project, but not as simple - as one might think. - <p> We want to recognize the processor very accurately, more accurately than other GNU packages. config.guess does not currently make the distinctions we - would like it to do. + would like it to do and a --target often needs to be set explicitly. <p> For example, "sparc" is not very useful as a machine architecture denotation. We want to distinguish old 32-bit SPARC without multiply support @@ -144,24 +150,22 @@ therefore sometimes need to choose GMP configuration depending both on processor and operating system. - <p> Other examples of things autoconf needs to care about: + <p> Other things autoconf needs to care about: <ol> <li> Identify Mips processor under Irix: `hinv -c processor' <li> Identify Alpha processor under OSF: "/usr/sbin/sizer -c". Unfortunately, sizer is not available before some revision of Dec Unix 4.0, and it also returns some rather cryptic names for processors. Perhaps the <code>implver</code> and <code>amask</code> assembly - instructions are better, but that doesn't differeniate between ev5 and + instructions are better, but that doesn't differentiate between ev5 and ev56. <li> Get lots of information about a Solaris system: prtconf -vp - <li> Figure out `local label prefix', the character sequence that makes the - chosen assembler omit the label from the object file. - <li> Figure out if identifiers need a `_' prefix. <li> For specific target machines, figure out assembly language variant - (x86 <code>shld</code>/<code>shrd</code>, and some other). + (like for x86 <code>shldl</code>). <li> For some target machines and some compilers, specific options are needed (sparcv8/gcc needs -mv8, sparcv8/cc needs -cg92, Irix64/cc needs -64, - Irix32/cc might need -n32, etc) see top-level config directory. + Irix32/cc might need -n32, etc). Some are set already, add more, see + configure.in. <li> Options to be passed to the assembler (via the compiler, using whatever syntax the compiler uses for passing options to the assembler). <li> On Solaris 7, check if gcc supports native v9 64-bit arithmetic. @@ -172,11 +176,11 @@ <p> In general, getting the exact right configuration, passing the exact right options to the compiler, etc, might mean that the GMP performance more than - double. + doubles. - <p> When testing the ready result, make sure to test at least the + <p> When testing, make sure to test at least the following for all out target machines: (1) Both gcc and cc (and c89). - (2) Both 32-bit more and 64-bit mode (such as -n32 vs -64 under + (2) Both 32-bit mode and 64-bit mode (such as -n32 vs -64 under Irix). (3) Both the system `make' and GNU `make'. (4) With and without GNU binutils. @@ -184,17 +188,19 @@ under certain systems. An example is -n32, -o32, and -64 on Irix. <p> Improve config.guess. It should say mips2, mips3, and mips4. - It should say supersparc, microsparc, ultrasparc1, ultrasparc2. Etc. - Likewise for HPPA. (Alpha and x86 works already.) + It should say supersparc, microsparc, ultrasparc1, ultrasparc2. + Etc. Likewise for HPPA. (Alpha and x86 work already. Ensure + config.sub accepts the guesses.) <p> <li> <strong>Faster extended GCD</strong> - <p> We currently have extended GCD based on Lehmer's method in the GMP working - sources. But the binary method can quite easily be improved for bignums + <p> We currently have extended GCD based on Lehmer's method. + But the binary method can quite easily be improved for bignums in a similar way Lehmer improved Euclid's algorithm. The result should - beat Lehmer's method by about a factor 2. Furthermore, the multiprecision + beat Lehmer's method by about a factor of 2. Furthermore, the multiprecision step of Lehmer's method and the binary method will be identical, so the - current code can mostly be reused. + current code can mostly be reused. It should be possible to share code + between GCD and GCDEXT, and probably with Jacobi symbols too. <p> <li> <strong>Math functions for the mpf layer</strong> |