Update/remove several itemised tasks.

author: Torbjorn Granlund <tege@gmplib.org> 2011-03-03 15:25:47 +0100
committer: Torbjorn Granlund <tege@gmplib.org> 2011-03-03 15:25:47 +0100
commit: d1dc2d2d0b0ff4ace87f0f68200e7fa04b2913f3 (patch)
tree: 9b62805102655929b883c5e19c0ad82e2584b473 /doc
parent: dd74e3080c7a55d2e2bbbe1899918d432232a6e2 (diff)
download: gmp-d1dc2d2d0b0ff4ace87f0f68200e7fa04b2913f3.tar.gz
1 files changed, 13 insertions, 30 deletions
diff --git a/doc/tasks.html b/doc/tasks.html
index d86e79428..dc5a5361f 100644
--- a/doc/tasks.html
+++ b/doc/tasks.html
@@ -37,7 +37,7 @@ along with the GNU MP Library.  If not, see http://www.gnu.org/licenses/.
 
 <hr>
 <!-- NB. timestamp updated automatically by emacs -->
-  This file current as of 28 Dec 2009.  An up-to-date version is available at
+  This file current as of 3 Mar 2011.  An up-to-date version is available at
   <a href="http://gmplib.org/tasks.html">http://gmplib.org/tasks.html</a>.
   Please send comments about this page to gmp-devel<font>@</font>gmplib.org.
 
@@ -122,9 +122,6 @@ either already been taken care of, or have become irrelevant.
      subsequent operations, especially if the value is otherwise only small.
      If low bits of the low limb are zero, use <code>mpn_rshift</code> so as
      to not increase the size.
-<li> <code>mpn_dc_sqrtrem</code>: Don't use <code>mpn_addmul_1</code> with
-     multiplier==2, instead either <code>mpn_addlsh1_n</code> when available,
-     or <code>mpn_lshift</code>+<code>mpn_add_n</code> if not.
 <li> <code>mpn_dc_sqrtrem</code>, <code>mpn_sqrtrem2</code>: Don't use
      <code>mpn_add_1</code> and <code>mpn_sub_1</code> for 1 limb operations,
      instead <code>ADDC_LIMB</code> and <code>SUBC_LIMB</code>.
@@ -133,20 +130,12 @@ either already been taken care of, or have become irrelevant.
      aliasing between <code>sp</code> and <code>rp</code>.
 <li> <code>mpn_sqrtrem</code>: Some work can be saved in the last step when
      the remainder is not required, as noted in Paul's paper.
-<li> <code>mpq_add</code>, <code>mpq_add</code>: The division "op1.den / gcd"
-     is done twice, where of course only once is necessary.  Reported by Larry
-     Lambe.
 <li> <code>mpq_add</code>, <code>mpq_sub</code>: The gcd fits a single limb
-     with high probability and in this case <code>modlimb_invert</code> could
+     with high probability and in this case <code>binvert_limb</code> could
      be used to calculate the inverse just once for the two exact divisions
      "op1.den / gcd" and "op2.den / gcd", rather than letting
-     <code>mpn_divexact_1</code> do it each time.  This would require a new
-     <code>mpn_preinv_divexact_1</code> interface.  Not sure if it'd be worth
-     the trouble.
-<li> <code>mpq_add</code>, <code>mpq_sub</code>: The use of
-     <code>mpz_mul(x,y,x)</code> causes temp allocation or copying in
-     <code>mpz_mul</code> which can probably be avoided.  A rewrite using
-     <code>mpn</code> might be best.
+     <code>mpn_bdiv_q_1</code> do it each time.  This would require calling
+     <code>mpn_pi1_bdiv_q_1</code>.
 <li> <code>mpn_gcdext</code>: Don't test <code>count_leading_zeros</code> for
      zero, instead check the high bit of the operand and avoid invoking
      <code>count_leading_zeros</code>.  This is an optimization on all
@@ -173,26 +162,20 @@ either already been taken care of, or have become irrelevant.
      since there's no apparent way to get <code>SHRT_MAX</code> with an
      expression (since <code>short</code> and <code>unsigned short</code> can
      be different sizes).
-<li> <code>mpz_powm</code> and <code>mpz_powm_ui</code> aren't very
-     fast on one or two limb moduli, due to a lot of function call
-     overheads.  These could perhaps be handled as special cases.
-<li> <code>mpz_powm</code> and <code>mpz_powm_ui</code> want better
-     algorithm selection, and the latter should use REDC.  Both could
-     change to use an <code>mpn_powm</code> and <code>mpn_redc</code>.
+<li> <code>mpz_powm</code> and <code>mpz_powm_ui</code> aren't very fast on one
+     or two limb moduli, due to a lot of function call overheads.  These could
+     perhaps be handled as special cases.
+<li> Make sure <code>mpz_powm_ui</code> is never slower than the corresponding
+     computation using <code>mpz_powm</code>.
 <li> <code>mpz_powm</code> REDC should do multiplications by <code>g[]</code>
      using the division method when they're small, since the REDC form of a
      small multiplier is normally a full size product.  Probably would need a
      new tuned parameter to say what size multiplier is "small", as a function
      of the size of the modulus.
-<li> <code>mpz_powm</code> REDC should handle even moduli if possible.  Maybe
-     this would mean for m=n*2^k doing mod n using REDC and an auxiliary
-     calculation mod 2^k, then putting them together at the end.
-<li> <code>mpn_gcd</code> might be able to be sped up on small to
-     moderate sizes by improving <code>find_a</code>, possibly just by
-     providing an alternate implementation for CPUs with slowish
+<li> <code>mpn_gcd</code> might be able to be sped up on small to moderate
+     sizes by improving <code>find_a</code>, possibly just by providing an
+     alternate implementation for CPUs with slowish
      <code>count_leading_zeros</code>.
-<li> Toom3 could use a low to high cache localized evaluate and interpolate.
-     The necessary <code>mpn_divexact_by3c</code> exists.
 <li> <code>mpf_set_str</code> produces low zero limbs when a string has a
      fraction but is exactly representable, eg. 0.5 in decimal.  These could be
      stripped to save work in later operations.
@@ -371,7 +354,7 @@ either already been taken care of, or have become irrelevant.
 <li> UltraSPARC/32: <code>mpn_divexact_by3c</code> can work 64-bits at a time
      using <code>mulx</code>, in assembler.  This would be the same as for
      sparc64.
-<li> UltraSPARC: <code>modlimb_invert</code> might save a few cycles from
+<li> UltraSPARC: <code>binvert_limb</code> might save a few cycles from
      masking down to just the useful bits at each point in the calculation,
      since <code>mulx</code> speed depends on the highest bit set.  Either
      explicit masks or small types like <code>short</code> and