summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorKevin Ryde <user42@zip.com.au>2000-04-15 09:15:38 +0200
committerKevin Ryde <user42@zip.com.au>2000-04-15 09:15:38 +0200
commita80e17ab82901df71012b2f95f7be13304b3bcd9 (patch)
tree6b7e2421dcb57b9902b3eefc4ca10d7658403590
parent8f2cb0a54ec2958e0a25e2e4ab9cb45e303ccc10 (diff)
downloadgmp-a80e17ab82901df71012b2f95f7be13304b3bcd9.tar.gz
Updates for things now completed or progressed.
A few typos corrected.
-rw-r--r--projects/projects.html64
1 files changed, 35 insertions, 29 deletions
diff --git a/projects/projects.html b/projects/projects.html
index b1526e342..c35b8ddda 100644
--- a/projects/projects.html
+++ b/projects/projects.html
@@ -15,11 +15,11 @@
<comment>
An up-to-date version of this file is available at
- http://www.swox.com/gmp/projects.html.
+ <a href="http://www.swox.com/gmp/projects.html">http://www.swox.com/gmp/projects.html</a>.
</comment>
<p> This file lists projects suitable for volunteers. Please see the file
- tasks, or <a href="http://www.swox.com/gmp/tasks.html">
+ tasks, or <a href="tasks.html">
http://www.swox.com/gmp/tasks.html</a> for smaller tasks.
<p> If you want to work on any of the projects below, please let tege@swox.com
@@ -46,9 +46,8 @@
<p> <li> <strong>Faster multiplication</strong>
- <p> The current multiplication code runs at O(n^1.58) when the operand are both
- around n digits large. Robert Harley has contributed better code,
- running at O(n^1.46), based on the 3-way Toom algorithm.
+ <p> The current multiplication code is 3-way Toom-Cook based and
+ runs at O(n^1.465) when the operands are both around n digits large.
Several new developments are desirable:
@@ -56,11 +55,21 @@
<li> Handle multiplication of operands with different digit count better than
today. We now split the operands in a very inefficient way, see
mpn/generic/mul.c.
- <li> Implement the n-way Toom algorithm. See Knuth's Seminumerical
- algorithms for details on Toom's algorithm.
+
+ <li> Consider N-way Toom-Cook. See Knuth's Seminumerical
+ Algorithms for details on the method. Code implementing it
+ exists. This is asyptotically inferior to FFTs, but is finer
+ grained since it can split into any number of pieces, not
+ just powers of 2 (or of 2 and 3 or whatever).
+
<li> Implement some variant of FFT-based multiplication. I believe the most
promising variant is the original Schönhage-Strassen mod 2^n+1 algorithm.
- Paul Zimmermann and Robert Harley are working on this.
+ Paul Zimmermann and Robert Harley are working on this, Paul Zimmerman
+ has working code.
+
+ <li> It's possible CPU dependent effects like cache locality will
+ have a greater impact on speed than algorithmic improvements.
+
</ol>
<p> <li> <strong>Division and modular multiplication</strong>
@@ -107,7 +116,7 @@
</ol>
<p> The most important routines are mpn_addmul_1, mpn_mul_basecase and
- mpn_sqr_basecase. The latter two doesn't exist for all machines, while
+ mpn_sqr_basecase. The latter two don't exist for all machines, while
mpn_addmul_1 exists for almost all machines.
<p> Standard techniques for these routines are unrolling, software
@@ -125,12 +134,9 @@
<p> <li> <strong>Configuration</strong>
- <p> Switching to autoconf is a highly desirable project, but not as simple
- as one might think.
-
<p> We want to recognize the processor very accurately, more accurately than
other GNU packages. config.guess does not currently make the distinctions we
- would like it to do.
+ would like it to do and a --target often needs to be set explicitly.
<p> For example, "sparc" is not very useful as a machine architecture
denotation. We want to distinguish old 32-bit SPARC without multiply support
@@ -144,24 +150,22 @@
therefore sometimes need to choose GMP configuration depending both on
processor and operating system.
- <p> Other examples of things autoconf needs to care about:
+ <p> Other things autoconf needs to care about:
<ol>
<li> Identify Mips processor under Irix: `hinv -c processor'
<li> Identify Alpha processor under OSF: "/usr/sbin/sizer -c".
Unfortunately, sizer is not available before some revision of Dec Unix
4.0, and it also returns some rather cryptic names for processors.
Perhaps the <code>implver</code> and <code>amask</code> assembly
- instructions are better, but that doesn't differeniate between ev5 and
+ instructions are better, but that doesn't differentiate between ev5 and
ev56.
<li> Get lots of information about a Solaris system: prtconf -vp
- <li> Figure out `local label prefix', the character sequence that makes the
- chosen assembler omit the label from the object file.
- <li> Figure out if identifiers need a `_' prefix.
<li> For specific target machines, figure out assembly language variant
- (x86 <code>shld</code>/<code>shrd</code>, and some other).
+ (like for x86 <code>shldl</code>).
<li> For some target machines and some compilers, specific options are needed
(sparcv8/gcc needs -mv8, sparcv8/cc needs -cg92, Irix64/cc needs -64,
- Irix32/cc might need -n32, etc) see top-level config directory.
+ Irix32/cc might need -n32, etc). Some are set already, add more, see
+ configure.in.
<li> Options to be passed to the assembler (via the compiler, using whatever
syntax the compiler uses for passing options to the assembler).
<li> On Solaris 7, check if gcc supports native v9 64-bit arithmetic.
@@ -172,11 +176,11 @@
<p> In general, getting the exact right configuration, passing the exact right
options to the compiler, etc, might mean that the GMP performance more than
- double.
+ doubles.
- <p> When testing the ready result, make sure to test at least the
+ <p> When testing, make sure to test at least the
following for all out target machines: (1) Both gcc and cc (and c89).
- (2) Both 32-bit more and 64-bit mode (such as -n32 vs -64 under
+ (2) Both 32-bit mode and 64-bit mode (such as -n32 vs -64 under
Irix). (3) Both the system `make' and GNU `make'. (4) With and without
GNU binutils.
@@ -184,17 +188,19 @@
under certain systems. An example is -n32, -o32, and -64 on Irix.
<p> Improve config.guess. It should say mips2, mips3, and mips4.
- It should say supersparc, microsparc, ultrasparc1, ultrasparc2. Etc.
- Likewise for HPPA. (Alpha and x86 works already.)
+ It should say supersparc, microsparc, ultrasparc1, ultrasparc2.
+ Etc. Likewise for HPPA. (Alpha and x86 work already. Ensure
+ config.sub accepts the guesses.)
<p> <li> <strong>Faster extended GCD</strong>
- <p> We currently have extended GCD based on Lehmer's method in the GMP working
- sources. But the binary method can quite easily be improved for bignums
+ <p> We currently have extended GCD based on Lehmer's method.
+ But the binary method can quite easily be improved for bignums
in a similar way Lehmer improved Euclid's algorithm. The result should
- beat Lehmer's method by about a factor 2. Furthermore, the multiprecision
+ beat Lehmer's method by about a factor of 2. Furthermore, the multiprecision
step of Lehmer's method and the binary method will be identical, so the
- current code can mostly be reused.
+ current code can mostly be reused. It should be possible to share code
+ between GCD and GCDEXT, and probably with Jacobi symbols too.
<p> <li> <strong>Math functions for the mpf layer</strong>