Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask".

The comment following used to say:

    /* We use ~hash instead of hash, as degenerate hash functions, such
       as for ints <sigh>, can have lots of leading zeros. It's not
       really a performance risk, but better safe than sorry.
       12-Dec-00 tim: so ~hash produces lots of leading ones instead --
       what's the gain? */

That is, there was never a good reason for doing it. And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.

Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.

Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items(). A number
of std tests suffered bogus failures as a result. For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,

    >>> d = {}
    >>> for i in range(10):
    ...     d[i] = i
    ...
    >>> d
    {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
    >>>

Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.

test_support.py
    Moved test_extcall's sortdict() into test_support, made it stronger,
    and imported sortdict into other std tests that needed it.
test_unicode.py
    Excluded cp875 from the "roundtrip over range(128)" test, because
    cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
    See Python-Dev for excruciating details.
Cookie.py
    Changed various output functions to sort dicts before building
    strings from them.
test_extcall
    Fiddled the expected-result file. This remains sensitive to native
    dict ordering, because, e.g., if there are multiple errors in a
    keyword-arg dict (and test_extcall sets up many cases like that),
    the specific error Python complains about first depends on native
    dict ordering.
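
The claim about (i + incr) & mask being the same "too often" can be illustrated with a small sketch. It is written in modern Python rather than C, and it rests on an assumption this commit does not state: that the probe increment was derived as (hash ^ (hash >> 3)) & mask, with 0 replaced by mask. The function name first_collision_probe and its invert parameter are hypothetical, and only non-negative hashes are modelled.

```python
def first_collision_probe(hash_value, mask, invert):
    """Return the first table index examined after a collision.

    ASSUMPTION: incr is derived as (hash ^ (hash >> 3)) & mask, with 0
    replaced by mask. This formula is not given in the commit message.
    """
    # Old scheme: i = (~hash) & mask.  New scheme: i = hash & mask.
    i = (~hash_value if invert else hash_value) & mask
    incr = (hash_value ^ (hash_value >> 3)) & mask
    if incr == 0:
        incr = mask  # the increment must never be 0
    return (i + incr) & mask


mask = 7  # a table with 8 slots
old_probes = [first_collision_probe(h, mask, True) for h in range(8)]
new_probes = [first_collision_probe(h, mask, False) for h in range(8)]

print("with ~hash:", old_probes)
print("with  hash:", new_probes)
```

Under that assumption, small hashes give incr == hash & mask, so (~hash + hash) & mask is always mask and nearly every key's second probe lands on the last slot. Without the "~" the second probes spread over several slots.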
author: Tim Peters <tim.peters@gmail.com> 2001-05-13 00:19:31 +0000
committer: Tim Peters <tim.peters@gmail.com> 2001-05-13 00:19:31 +0000
commit: 2f228e75e4d5ac8c3eb4a6334dbc43243bff1095 (patch)
tree: ce1923e23fad608ef3d5749ed5a0e59f08530182 /Lib/test/test_extcall.py
parent: 0194ad5c7d2a0ffe473b87933768cb509417ff59 (diff)
download: cpython-git-2f228e75e4d5ac8c3eb4a6334dbc43243bff1095.tar.gz
1 file changed, 4 insertions, 11 deletions
diff --git a/Lib/test/test_extcall.py b/Lib/test/test_extcall.py
index 274e943ec6..9effac7585 100644
--- a/Lib/test/test_extcall.py
+++ b/Lib/test/test_extcall.py
@@ -1,14 +1,6 @@
-from test_support import verify, verbose, TestFailed
+from test_support import verify, verbose, TestFailed, sortdict
 from UserList import UserList
 
-def sortdict(d):
-    keys = d.keys()
-    keys.sort()
-    lst = []
-    for k in keys:
-        lst.append("%r: %r" % (k, d[k]))
-    return "{%s}" % ", ".join(lst)
-
 def f(*a, **k):
     print a, sortdict(k)
 
@@ -228,8 +220,9 @@ for args in ['', 'a', 'ab']:
                     lambda x: '%s="%s"' % (x, x), defargs)
                 if vararg: arglist.append('*' + vararg)
                 if kwarg: arglist.append('**' + kwarg)
-                decl = 'def %s(%s): print "ok %s", a, b, d, e, v, k' % (
-                    name, ', '.join(arglist), name)
+                decl = (('def %s(%s): print "ok %s", a, b, d, e, v, ' +
+                         'type(k) is type ("") and k or sortdict(k)')
+                         % (name, ', '.join(arglist), name))
                 exec(decl)
                 func = eval(name)
                 funcs.append(func)
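
For readers on current Python, the helper removed from test_extcall above can be sketched roughly as follows. This mirrors only the deleted lines shown in the diff. The "stronger" version that landed in test_support is not shown here, so nothing below should be read as its actual implementation.

```python
def sortdict(d):
    """Return a repr-like string for d with entries in sorted key order.

    A Python 3 sketch of the helper deleted in the diff above; it
    assumes the keys are mutually comparable.
    """
    parts = ["%r: %r" % (key, d[key]) for key in sorted(d)]
    return "{%s}" % ", ".join(parts)


# The output no longer depends on the dict's native ordering.
example = sortdict({"b": 2, "a": 1, "c": 3})
print(example)
```

Formatting through a helper like this keeps expected-output files stable when a change to hashing alters the native iteration order.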