author    Nathaniel J. Smith <njs@pobox.com> 2014-02-23 23:08:59 -0500
committer Nathaniel J. Smith <njs@pobox.com> 2014-02-23 23:08:59 -0500
commit    703fcc60c69974e2ec860e39583dc5d2dccb788c (patch)
tree      4bb4a1a3e4d4c4d2fbf4d9cad5885657a70e1fde /doc
parent    64473572d9ce6c981c921667e5c558a2f1612e1f (diff)
download  numpy-703fcc60c69974e2ec860e39583dc5d2dccb788c.tar.gz
Many updates to draft PEP incorporating feedback
Diffstat (limited to 'doc')
-rw-r--r-- doc/neps/return-of-revenge-of-matmul-pep.rst | 669
1 files changed, 361 insertions, 308 deletions
diff --git a/doc/neps/return-of-revenge-of-matmul-pep.rst b/doc/neps/return-of-revenge-of-matmul-pep.rst
index 799a0eef3..758a5dd92 100644
--- a/doc/neps/return-of-revenge-of-matmul-pep.rst
+++ b/doc/neps/return-of-revenge-of-matmul-pep.rst
@@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Nathaniel J. Smith <njs@pobox.com>
Status: Draft
Type: Standards Track
-python-Version: 3.5
+Python-Version: 3.5
Content-Type: text/x-rst
Created: 20-Feb-2014
Post-History:
@@ -21,8 +21,8 @@ respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
Specification
=============
-Two new binary operators are added, together with corresponding
-in-place versions:
+Two new binary operators are added to the Python language, together
+with corresponding in-place versions:
======= ========================= ===============================
Op Precedence/associativity Methods
@@ -33,59 +33,118 @@ in-place versions:
``@@=`` n/a ``__imatpow__``
======= ========================= ===============================
-The intention is that these will be overridden by numpy (and other
-libraries that define array-like objects, e.g. pandas, Theano,
-scipy.sparse, blaze, OpenCV, ...) to perform matrix multiplication, in
-contrast with ``*``'s elementwise multiplication.
+No implementations of these methods are added to the builtin or
+standard library types.
-For scalar/scalar operations, matrix, scalar, and elementwise
-multiplication all coincide, so we also add the following methods to
-``numbers.Complex`` and all built-in numeric types::
- def __matmul__(self, other):
- if isinstance(other, numbers.Number):
- return self * other
- else:
- return NotImplemented
-
- def __matpow__(self, other):
- return self ** other
-
- # The reverse version isn't really needed given the above, but
- # doesn't hurt either, and improves forwards compatibility with
- # any 3rd-party numbers.Number types that merely .register as
- # subclasses of Complex without actually inheriting.
- def __rmatmul__(self, other):
- if isinstance(other, numbers.Number):
- return other * self
- else:
- return NotImplemented
+Intended use
+------------
+
+This section is informative, rather than normative -- it documents the
+consensus of a number of 3rd party libraries on how the ``@`` and
+``@@`` operators will be implemented. Not all matrix-like data types
+will provide all of the different dimensionalities described here; in
+particular, many will implement only the 2d or 1d+2d subsets.
+
+The recommended semantics for ``@`` are:
+
+* 0d (scalar) inputs raise an error. Scalar * matrix multiplication
+ is a mathematically distinct operation, and should go through ``*``
+ instead. (This is consistent both with the dominant convention that
+ ``*`` refer to elementwise multiplication with broadcasting
+ [#broadcasting], and with the minority convention that ``*`` be used
+ for both scalar and matrix multiplication, decided on a call-by-call
+ basis.)
+
+* 1d vector inputs are promoted to 2d by appending a '1' to the shape
+ on the appropriate side, the operation is performed, and then this
+ added dimension is removed from the output. The result is that
+ matrix @ vector and vector @ matrix are both legal (assuming
+ compatible shapes), and both return vectors. This is clearer with
+ examples. If ``arr(2, 3)`` represents a 2x3 array, and ``arr(3)``
+ represents a 1d vector with 3 elements, then:
+
+ * ``arr(2, 3) @ arr(3, 1)`` is a regular matrix product, and returns
+ an array with shape (2, 1), i.e., a column vector.
+
+ * ``arr(2, 3) @ arr(3)`` performs the same computation as the
+ previous, but returns the result with shape (2,), i.e., a 1d
+ vector.
+
+ * ``arr(1, 3) @ arr(3, 2)`` is a regular matrix product, and returns
+ an array with shape (1, 2), i.e., a row vector.
+
+ * ``arr(3) @ arr(3, 2)`` performs the same computation as the
+ previous, but returns the result with shape (2,), i.e., a 1d
+ vector.
+
+ * ``arr(1, 3) @ arr(3, 1)`` is a regular matrix product, and returns
+ an array with shape (1, 1), i.e., a single value in matrix form.
+
+ * ``arr(3) @ arr(3)`` performs the same computation as the
+ previous, but returns the result with shape (), i.e., a single
+ scalar value, not in matrix form. So this is the standard inner
+ product on vectors.
- # Likewise.
- def __rmatpow__(self, other):
- if isinstance(other, numbers.Number):
- return other ** self
+* 2d inputs are conventional matrices, and treated in the obvious
+ way.
+
+* For higher dimensional inputs, we treat the last two dimensions as
+ being the dimensions of the matrices to multiply, and 'broadcast'
+ [#broadcasting] across the other dimensions. This provides a
+ convenient way to quickly compute many matrix products in a single
+ operation. For example, ``arr(10, 2, 3) @ arr(10, 3, 4)`` performs
+ 10 separate matrix multiplies, each between a 2x3 and a 3x4 matrix,
+ and returns the results together in an array with shape (10, 2, 4).
+
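As a sanity check on the shape rules above, here is a small pure-Python
sketch (ours, not part of the PEP text; the function name is
illustrative only) that computes the output shape of ``a @ b`` from the
operands' shapes, including the 0d error, the 1d promotion rule, and
broadcasting over leading dimensions:

```python
from itertools import zip_longest

def matmul_shape(a, b):
    """Illustrative sketch: the output shape of ``a @ b`` under the
    recommended semantics, given the two input shapes."""
    a, b = tuple(a), tuple(b)
    if not a or not b:
        # 0d (scalar) operands are an error under these semantics.
        raise ValueError("@ is undefined for 0d (scalar) operands")
    # 1d promotion: prepend a 1 to a left operand, append to a right one.
    a_promoted, b_promoted = len(a) == 1, len(b) == 1
    if a_promoted:
        a = (1,) + a
    if b_promoted:
        b = b + (1,)
    if a[-1] != b[-2]:
        raise ValueError("shapes %r and %r are not aligned" % (a, b))
    # Broadcast the leading ("batch") dimensions against each other.
    batch = []
    for x, y in zip_longest(reversed(a[:-2]), reversed(b[:-2]), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError("cannot broadcast %r against %r" % (a, b))
        batch.append(max(x, y))
    out = tuple(reversed(batch)) + (a[-2], b[-1])
    # Strip the dimensions that 1d promotion added.
    if a_promoted:
        out = out[:-2] + out[-1:]
    if b_promoted:
        out = out[:-1]
    return out
```

Running this over the examples in the bullet list reproduces each of
them, e.g. ``matmul_shape((2, 3), (3,))`` gives ``(2,)`` and
``matmul_shape((3,), (3,))`` gives ``()``.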
+The recommended semantics for ``@@`` are::
+
+ def __matpow__(self, n):
+ if n == 0:
+ return identity_matrix_with_shape(self.shape)
else:
- return NotImplemented
+ return self @ (self @@ (n - 1))
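The recursive definition above cannot be run today (``@@`` does not yet
exist), but its semantics can be sketched with plain functions over
nested-list matrices; ``mat_mul`` and ``mat_pow`` are our illustrative
names, not part of the proposal:

```python
def mat_mul(a, b):
    # Plain nested-list matrix product, for illustration only.
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a)
    return [[sum(a[i][j] * b[j][t] for j in range(k)) for t in range(m)]
            for i in range(n)]

def mat_pow(a, n):
    # Direct transcription of the recommended ``@@`` semantics:
    # a @@ 0 is the identity matrix, a @@ n is a @ (a @@ (n - 1)).
    if n == 0:
        return [[1 if i == j else 0 for j in range(len(a))]
                for i in range(len(a))]
    return mat_mul(a, mat_pow(a, n - 1))
```

For example, raising the Fibonacci matrix ``[[1, 1], [1, 0]]`` to the
5th power this way yields ``[[8, 5], [5, 3]]``.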
+
+The following projects have expressed an intention to implement ``@``
+and ``@@`` on their matrix types in a manner consistent with the above
+definitions:
+
+* numpy
+
+* scipy.sparse
-And for builtin types, statements like ``a @= b`` will perform
-``a = (a @ b)`` via the usual mechanism.
+* XX (try: pandas, Theano, blaze, OpenCV, cvxopt, any others?
+ QTransform in PyQt? PyOpenGL doesn't seem to provide a matrix
+ type. panda3d?)
Motivation
==========
-The main motivation for this PEP is the addition of a binary operator
-``@`` for matrix multiplication. No-one cares terribly much about the
-matrix power operator ``@@`` -- it's useful and well-defined, but not
-really necessary. It is included here for aesthetic reasons: if we
-have an ``@`` that is like ``*``, then it would be weird and
-surprising to *not* have an ``@@`` that is like ``**``. Similarly,
-the in-place operators ``@=`` and ``@@=`` are of marginal utility --
-it is not generally possible to implement in-place matrix
-multiplication any more efficiently than by doing ``a = (a @ b)`` --
-but are included for completeness and symmetry. So let's focus on the
-motivation for ``@``; everything else follows from that.
+Executive summary
+-----------------
+
+Matrix multiplication is uniquely deserving of a new, dedicated infix
+operator:
+
+* Adding an infix matrix multiplication operator brings Python into
+ alignment with universal notational practice across all fields of
+ mathematics, science, and engineering.
+
+* ``@`` greatly clarifies real-world code.
+
+* ``@`` provides a smoother onramp for less experienced users.
+
+* ``@`` benefits a large and growing user community.
+
+* ``@`` will be used frequently -- quite possibly more frequently than
+ ``//`` or the bitwise operators.
+
+* ``@`` helps this community finally standardize on a single duck type
+ for all matrix-like objects.
+
+And, given the existence of ``@``, it makes more sense than not to
+have ``@@``, ``@=``, and ``@@=``, so they are added as well.
Why should matrix multiplication be infix?
@@ -99,23 +158,25 @@ multiplication::
[2, 3] * [4, 5] = [2 * 4, 3 * 5] = [8, 15]
and the other is the `matrix product`_. For various reasons, the
-numerical Python ecosystem has universally settled on the convention
-that ``*`` refers to elementwise multiplication. However, this leaves
-us with no convenient notation for matrix multiplication.
+numerical Python ecosystem has settled on the convention that ``*``
+refers to elementwise multiplication. However, this leaves us with no
+convenient notation for matrix multiplication.
.. _matrix product: https://en.wikipedia.org/wiki/Matrix_multiplication
Matrix multiplication is similar to ordinary arithmetic operations
-like addition and scalar multiplication in two ways: (a) it is used
-very heavily in numerical programs -- often multiple times per line of
-code -- and (b) it has an ancient and universally adopted tradition of
-being written using infix syntax with varying precedence. This is
-because, for typical formulas, this notation is dramatically more
-readable than any function syntax. For example, one of the most
-useful tools for testing a statistical hypothesis is the linear
-hypothesis test for OLS regression models. If we want to implement
-this, we will look up some textbook or paper on it, and encounter many
-mathematical formulas that look like:
+like addition and multiplication on scalars in two ways: (a) it is
+used very heavily in numerical programs -- often multiple times per
+line of code -- and (b) it has an ancient and universally adopted
+tradition of being written using infix syntax with varying precedence.
+This is because, for typical formulas, this notation is dramatically
+more readable than any function syntax.
+
+Here's a concrete example. One of the most useful tools for testing a
+statistical hypothesis is the linear hypothesis test for OLS
+regression models. If we want to implement this, we will look up some
+textbook or paper on it, and encounter many mathematical formulas that
+look like:
.. math::
@@ -124,10 +185,10 @@ mathematical formulas that look like:
Here the various variables are all vectors or matrices (details for
the curious: [#lht]).
-Our job is to write code to perform this calculation. In
-current numpy, matrix multiplication can be performed using either the
-function numpy.dot, or the .dot method on arrays. Neither provides a
-particularly readable translation of the formula::
+Now we need to write code to perform this calculation. In current
+numpy, matrix multiplication can be performed using either the
+function ``numpy.dot``, or the ``.dot`` method on arrays. Neither
+provides a particularly readable translation of the formula::
import numpy as np
from numpy.linalg import inv, solve
@@ -139,7 +200,8 @@ particularly readable translation of the formula::
# Using dot method:
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
-With the ``@`` operator, the direct translation of the above formula is::
+With the ``@`` operator, the direct translation of the above formula
+becomes::
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
@@ -147,11 +209,11 @@ Notice that there is now a transparent, 1-to-1 mapping between symbols
in the original formula and the code.
Of course, a more sophisticated programmer will probably notice that
-this is not the best way to compute this expression. The repeated
+this is not the best way to compute this expression. The repeated
computation of :math:`H \beta - r` should perhaps be factored out;
and, expressions of the form ``dot(inv(A), B)`` should almost always
be replaced by the more numerically stable ``solve(A, B)``. When
-using ``@``, performing these refactorings give us::
+using ``@``, performing these refactorings gives us::
# Version 1 (as above)
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
@@ -186,23 +248,24 @@ readability: when using function-call syntax, the required parentheses
on every operation create visual clutter that makes it very difficult
to parse out the overall structure of the formula by eye, even for a
relatively simple formula like this one. I made and caught many
-errors while trying to write out the 'dot' formulas above, and I'm
-still not certain I got all the parentheses right. (Exercise: check
-my parentheses.) But the ``@`` examples are obviously correct.
+errors while trying to write out the 'dot' formulas above. They still
+contain at least one error. (Exercise: find it, or them.) In
+comparison, the ``@`` examples are not only correct, they're obviously
+correct at a glance.
-Importance for teaching
------------------------
+Simple syntax is especially critical for non-expert programmers
+---------------------------------------------------------------
A large proportion of scientific code is written by people who are
-experts in their domain, but not experts in programming. And there
-are many university courses with titles like "Data analysis for social
-scientists" which assume no programming background, and teach some
-combination of mathematical techniques, introduction to programming,
-and the use of programming to implement these mathematical techniques,
-all within a 10-15 week period. These courses are more and more often
-being taught in Python rather than special-purpose languages like R or
-Matlab.
+experts in their domain, but are not experts in programming. And
+there are many university courses run each year with titles like "Data
+analysis for social scientists" which assume no programming
+background, and teach some combination of mathematical techniques,
+introduction to programming, and the use of programming to implement
+these mathematical techniques, all within a 10-15 week period. These
+courses are more and more often being taught in Python rather than
+special-purpose languages like R or Matlab.
For these kinds of users, whose programming knowledge is fragile, the
existence of a transparent mapping between formulas and code often
@@ -210,10 +273,10 @@ means the difference between succeeding and failing to write that code
at all. This is so important that such classes often use the
``numpy.matrix`` type which defines ``*`` to mean matrix
multiplication, even though this type is buggy and heavily deprecated
-by the rest of the numpy community. Adding ``@`` will benefit both
-beginning and advanced users; and furthermore, it will allow both
-groups to standardize on the same notation, providing a smoother
-on-ramp to expertise.
+by the rest of the numpy community for the fragmentation that it
+causes. Adding ``@`` will benefit both beginning and advanced users;
+and furthermore, it will allow both groups to standardize on the same
+notation from the start, providing a smoother on-ramp to expertise.
But isn't matrix multiplication a pretty niche requirement?
@@ -225,8 +288,8 @@ lingua franca of finance, machine learning, 3d graphics, computer
vision, robotics, operations research, econometrics, meteorology,
computational linguistics, recommendation systems, neuroscience,
bioinformatics (including genetics, cancer research, drug discovery,
-etc.), physics simulation, quantum mechanics, network analysis, and
-many other application areas.
+etc.), physics engines, quantum mechanics, network analysis, and many
+other application areas.
In most or all of these areas, Python is rapidly becoming a dominant
player, in large part because of its ability to elegantly mix
@@ -259,75 +322,74 @@ adding a new operator.
When the going gets tough, the tough get empirical. To get a rough
estimate of how useful the ``@`` operator will be, this table shows
the rate at which different Python operators are used in the stdlib,
-and also in two high-profile numerical projects -- the sklearn machine
-learning library, and the nipy neuroimaging library. Units are
-(rounded) usages per 10,000 source lines of code (SLOC). Rows are
-sorted by the 'combined' column, which gives the usage per 10,000 SLOC
-when the three code bases are pooled together. The combined column is
-thus strongly weighted towards the stdlib, which is much larger than
-both projects put together (stdlib: 411575 SLOC, sklearn: 50924 SLOC,
-nipy: 37078 SLOC). [#sloc-details]
-
-The ``dot`` row counts matrix multiply operations, estimated by
-assuming there to be zero matrix multiplies in the stdlib, and in
-sklearn/nipy assuming -- reasonably -- that all instances of the token
-``dot`` are calls to ``np.dot``.
-
-======= ======= ======= ======= ========
- Op stdlib sklearn nipy combined
-======= ======= ======= ======= ========
- ``(`` 6979 6861 7644 7016
- ``)`` 6979 6861 7644 7016
- ``=`` 2969 5536 4932 3376
- ``-`` 218 444 496 261
- ``+`` 224 201 348 231
- ``==`` 177 248 334 196
- ``*`` 156 284 465 192
- ``%`` 121 114 107 119
- ``}`` 106 56 63 98
- ``{`` 106 56 63 98
- ``**`` 59 111 118 68
- ``!=`` 40 56 74 44
- ``/`` 18 121 183 41
- ``>`` 29 70 110 39
- ``+=`` 34 61 67 39
- ``<`` 32 62 76 38
- ``>=`` 19 17 17 18
- ``<=`` 18 27 12 18
- ``|`` 18 1 2 15
-``dot`` 0 80 74 14
- ``&`` 14 0 6 12
- ``<<`` 10 1 1 8
- ``//`` 9 9 1 8
- ``-=`` 5 21 14 8
- ``*=`` 2 19 22 5
- ``/=`` 0 23 16 4
- ``>>`` 4 0 0 3
- ``^`` 3 0 0 3
- ``~`` 2 4 5 2
- ``|=`` 3 0 0 2
- ``&=`` 1 0 0 1
-``//=`` 1 0 0 1
- ``^=`` 1 0 0 0
-``**=`` 0 2 0 0
- ``%=`` 0 0 0 0
-``<<=`` 0 0 0 0
-``>>=`` 0 0 0 0
-======= ======= ======= ======= ========
-
-We see that sklearn and nipy together contain nearly 700 uses of
-matrix multiplication. Within these two libraries, matrix
-multiplication is used more heavily than most comparison operators
-(``<`` ``>`` ``!=`` ``<=`` ``>=``), and more heavily even than ``{``
-and ``}``. In total across all three of the codebases examined here,
-matrix multiplication is used more often than almost all the bitwise
-operators (only ``|`` just barely edges it out), and ~2x as often as
-``//``. This is true even though the stdlib, which contains a fair
+and also in two high-profile numerical packages -- the scikit-learn
+machine learning library, and the nipy neuroimaging library --
+normalized by source lines of code (SLOC). Rows are sorted by the
+'combined' column, which pools all three code bases together. The
+combined column is thus strongly weighted towards the stdlib, which is
+much larger than both projects put together (stdlib: 411575 SLOC,
+scikit-learn: 50924 SLOC, nipy: 37078 SLOC). [#sloc-details]
+
+The ``dot`` row estimates how common matrix multiply operations are
+in each codebase, by counting occurrences of the ``dot`` token.
+
+Table units: Average occurrences per 10,000 SLOC.
+
+======= ======= ============ ======= ========
+ Op stdlib scikit-learn nipy combined
+======= ======= ============ ======= ========
+ ``=`` 2969 5536 4932 3376
+ ``:`` 3011 2380 2658 2921
+ ``-`` 218 444 496 261
+ ``+`` 224 201 348 231
+ ``==`` 177 248 334 196
+ ``*`` 156 284 465 192
+ ``%`` 121 114 107 119
+ ``}`` 106 56 63 98
+ ``{`` 106 56 63 98
+ ``**`` 59 111 118 68
+ ``!=`` 40 56 74 44
+ ``/`` 18 121 183 41
+ ``>`` 29 70 110 39
+ ``+=`` 34 61 67 39
+ ``<`` 32 62 76 38
+ ``>=`` 19 17 17 18
+ ``<=`` 18 27 12 18
+``dot`` 0 99 74 16
+ ``|`` 18 1 2 15
+ ``&`` 14 0 6 12
+ ``<<`` 10 1 1 8
+ ``//`` 9 9 1 8
+``...`` 7 2 32 8
+ ``-=`` 5 21 14 8
+ ``*=`` 2 19 22 5
+ ``/=`` 0 23 16 4
+ ``>>`` 4 0 0 3
+ ``^`` 3 0 0 3
+ ``~`` 2 4 5 2
+ ``|=`` 3 0 0 2
+ ``&=`` 1 0 0 1
+``//=`` 1 0 0 1
+ ``^=`` 1 0 0 0
+``**=`` 0 2 0 0
+ ``%=`` 0 0 0 0
+``<<=`` 0 0 0 0
+``>>=`` 0 0 0 0
+======= ======= ============ ======= ========
+
+These numerical packages together contain ~780 uses of matrix
+multiplication. Within these packages, matrix multiplication is used
+more heavily than most comparison operators (``<`` ``!=`` ``<=``
+``>=``), and more heavily even than ``{`` and ``}``. When we include
+the stdlib into our comparisons, matrix multiplication is still used
+more often in total than any of the bitwise operators, and 2x as often
+as ``//``. This is true even though the stdlib, which contains a fair
amount of integer arithmetic and no matrix operations, is ~4x larger
than the numeric libraries put together. While it's impossible to
-know for certain, from this data it seems plausible that on net across
-the whole Python ecosystem, matrix multiplication is currently used
-more often than ``//`` or other integer operations.
+know for certain, from this data it seems plausible -- even likely --
+that on net across all Python code currently being written, matrix
+multiplication is used more often than ``//`` or other integer
+operations.
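The counting methodology behind the table can be approximated with the
stdlib ``tokenize`` module; this is our own sketch, not the actual
script used for the table [#sloc-details], and it normalizes by
non-blank source lines:

```python
import io
import token
import tokenize

def op_rate(source, op):
    # Rough sketch of the table's methodology: count occurrences of one
    # operator token, normalized to uses per 10,000 source lines.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    count = sum(1 for t in tokens if t.type == token.OP and t.string == op)
    sloc = sum(1 for line in source.splitlines() if line.strip())
    return round(count / sloc * 10000)
```

Run over a real codebase, the per-file results would then be summed
before normalizing, as in the table's 'combined' column.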
But isn't it weird to add an operator with no stdlib uses?
@@ -342,23 +404,18 @@ by helping the Python numerical community finally standardize on a
single duck type for all matrix-like objects.
-Summary
--------
-
-Matrix multiplication is uniquely deserving of a new, dedicated infix
-operator. The addition of ``@`` will:
-
-* bring Python into alignment with universal notational practice
- across all fields of mathematics, science, and engineering,
-
-* greatly clarify a large quantity of real-world code,
+Matrix power and in-place operators
+-----------------------------------
-* provide a smoother onramp for new users,
-
-* benefit a large and growing user community,
-
-* and help this community finally standardize on a single duck type
- for all matrix-like objects.
+No-one cares terribly much about the other operators proposed in this
+PEP. The matrix power operator ``@@`` is useful and well-defined, but
+not really necessary. It is included here for consistency: if we have
+an ``@`` that is analogous to ``*``, then it would be weird and
+surprising to *not* have an ``@@`` that is analogous to ``**``.
+Similarly, the in-place operators ``@=`` and ``@@=`` are of marginal
+utility -- it is not generally possible to implement in-place matrix
+multiplication any more efficiently than by doing ``a = (a @ b)`` --
+but are included for completeness and symmetry.
Compatibility considerations
@@ -386,44 +443,29 @@ Choice of operator
''''''''''''''''''
Why ``@`` instead of some other punctuation symbol? It doesn't matter
-much, but ``@`` has a few advantages:
+much, and there isn't any consensus across languages about how this
+operator should be named [#matmul-other-langs], but ``@`` has a few
+advantages:
* ``@`` is a friendly character that Pythoneers are already used to
typing in decorators, and its use in email addresses means it is
more likely to be easily accessible across keyboard layouts than
some other characters (e.g. $).
-* The mATrices mnemonic is cute.
-* The swirly shape is reminiscent of the simultaneous sweeps over rows
- and columns that define matrix multiplication.
+* The mATrices mnemonic is cute.
-Built-ins
-'''''''''
+* It's round like ``*`` and :math:`\cdot`.
-Why are the new special methods defined the way they are for Python
-builtins? The three goals are:
+* The swirly shape is reminiscent of the simultaneous sweeps over rows
+ and columns that define matrix multiplication.
-* Define a meaningful ``@`` and ``@@`` for builtin and user-defined
- numeric types, to maximize duck compatibility between Python scalars
- and 1x1 matrices, single-element vectors, and zero-dimensional
- arrays.
-* Do this in as forward-compatible a way as possible.
-* Ensure that ``scalar @ matrix`` does *not* delegate to ``scalar *
- matrix``; ``scalar * matrix`` is well-defined, but ``scalar @
- matrix`` should raise an error.
-Therefore, we implement these methods so that numbers.Number objects
-will in general delegate ``@`` to ``*``, but only when dealing with
-other numbers.Number objects. In other cases NotImplemented is
-returned.
+Definition for built-ins
+''''''''''''''''''''''''
-An alternative approach would be for these methods on builtin types to
-always return NotImplemented. It probably doesn't make much
-difference which we choose, since we still won't have full duck
-compatibility between Python builtins and numpy scalars (e.g.,
-builtins will still miss the very common ``.T`` transpose operator).
-But the approach taken here seems marginally more semantically
-consistent.
+No ``__matmul__`` or ``__matpow__`` are defined for builtin numeric
+types, because these are scalars, and the consensus semantics for
+``@`` are that it should raise an error on scalars.
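As a concrete illustration (runnable only on a Python that has grown the
``@`` operator), leaving the builtin numeric types without these methods
means scalar ``@`` fails during ordinary binary-operator dispatch:

```python
def scalar_matmul_result():
    # int defines no __matmul__ / __rmatmul__, so binary-operator
    # dispatch finds no handler and raises TypeError.
    try:
        return 2 @ 3
    except TypeError as exc:
        return type(exc).__name__
```

This is exactly the error-on-scalars behavior that the consensus
semantics call for, with no special-casing needed.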
We do not (for now) define a ``__matmul__`` operator on the standard
``memoryview`` or ``array.array`` objects, for several reasons. There
@@ -433,18 +475,18 @@ information needed to interpret their contents numerically (e.g., as
float32 versus int32). Array objects are typed, but cannot represent
multidimensional data. And finally, providing a quality
implementation of matrix multiplication is highly non-trivial. The
-naive nested loop implementation is very slow and would become an
-attractive nuisance; but, providing a competitive matrix multiply
-would require that Python link to a BLAS library, which brings a set
-of new complications -- among them that several popular BLAS libraries
-(including the one that ships by default on OS X) currently break the
-use of ``multiprocessing`` [#blas-fork]. Thus we'll continue to
-delegate dealing with these problems to numpy and friends, at least
-for now.
+naive nested loop implementation is very slow and its use would create
+a dangerous trap for users.  But the alternative of providing a
+competitive matrix multiply would require that Python link to a BLAS
+library, which brings a set of new complications -- among them that
+several popular BLAS libraries (including the one that ships by
+default on OS X) currently break the use of ``multiprocessing``
+[#blas-fork]. Thus we'll continue to delegate dealing with these
+problems to numpy and friends, at least for now.
-While there are non-numeric Python builtins that define ``__mul__``
-(``str``, ``list``, ...), we do not define ``__matmul__`` for these
-types either, because that makes no sense and TOOWTDI.
+There are also non-numeric Python builtins which define ``__mul__``
+(``str``, ``list``, ...). We do not define ``__matmul__`` for these
+types either, because matrix multiplication makes no sense for them.
Alternatives to adding a new operator at all
@@ -452,97 +494,92 @@ Alternatives to adding a new operator at all
Over the past 15+ years, the Python numeric community has explored a
variety of ways to handle the tension between matrix and elementwise
-multiplication operations. PEP 211 and PEP 225, both proposed in
-2000, were early attempts to add new operators to solve this problem,
-but suffered from serious flaws; in particular, at that time the
-Python numerical community had not yet reached consensus on the proper
-API for array objects, or on what operators might be needed or useful
-(e.g., PEP 225 proposes 6 new operators with underspecified
-semantics). Experience since then has eventually led to consensus
-among the numerical community that the best solution is to add a
-single infix operator for matrix multiply (together with any other new
-operators this implies like ``@=``).
-
-We review some of these alternatives here.
-
-Use a type that defines ``__mul__`` as matrix multiplication:
- Numpy has had such a type for many years: ``np.matrix``. And
- based on this experience, a strong consensus has developed that it
- should essentially never be used. The problem is that the
- presence of two different duck-types for numeric data -- one where
- ``*`` means matrix multiply, and one where ``*`` means elementwise
- multiplication -- makes it impossible to write generic functions
- that can operate on arbitrary data. In practice, the entire
- Python numeric ecosystem has standardized on using ``*`` for
- elementwise multiplication, and deprecated the use of
- ``np.matrix``. Most 3rd-party libraries which receive a
- ``matrix`` as input will either error out, return incorrect
- results, or simply convert the input into a standard ``ndarray``,
- and return ``ndarray``s as well. The only reason ``np.matrix``
- survives is because of strong arguments from some educators who
- find that its problems are outweighed by the need to provide a
- simple and clear mapping between mathematical notation and code
- for novices.
-
-Add a new ``@`` (or whatever) operator that has some other meaning in
-general Python, and then overload it in numeric code:
- This was the approach proposed by PEP 211, which suggested
- defining ``@`` to be the equivalent of ``itertools.product``. The
- problem with this is that when taken on its own terms, adding an
- infix operator for ``itertools.product`` is just silly. Matrix
- multiplication has a uniquely strong rationale for inclusion as an
- infix operator. There almost certainly don't exist any other
- binary operations that will ever justify adding another infix
- operator.
-
-Add a ``.dot`` method to array types so as to allow "pseudo-infix"
-A.dot(B) syntax:
- This has been in numpy for some years, and in many cases it's
- better than dot(A, B). But it's still much less readable than
- real infix notation, and in particular still suffers from an
- extreme overabundance of parentheses. See `Motivation`_ above.
-
-Add lots of new operators / add a new generic syntax for defining
-infix operators:
- In addition to this being generally un-Pythonic and repeatedly
- rejected by BDFL fiat, this would be using a sledgehammer to smash
- a fly. There is a strong consensus in the scientific python
- community that matrix multiplication really is the only missing
- infix operator that matters enough to bother about. (In
- retrospect, we all think PEP 225 was a bad idea too.)
-
-Use a language preprocessor that adds extra operators and perhaps
-other syntax (as per recent BDFL suggestion [#preprocessor]):
- Aside from matrix multiplication, there are no other operators or
- syntax that anyone cares enough about to bother adding. But
- defining a new language (presumably with its own parser which
- would have to be kept in sync with Python's, etc.), just to
- support a single binary operator, is neither practical nor
- desireable. In the scientific context, Python's competition is
- special-purpose numerical languages (Matlab, R, IDL, etc.).
- Compared to these, Python's killer feature is exactly that one can
- mix specialized numerical code with general-purpose code for XML
- parsing, web page generation, database access, network
- programming, GUI libraries, etc., and we also gain major benefits
- from the huge variety of tutorials, reference material,
- introductory classes, etc., which use Python. Fragmenting
- "numerical Python" from "real Python" would be a major source of
- confusion. Having to set up a preprocessor would be an especially
- prohibitive complication for unsophisticated users. And we use
- Python because we like Python! We don't want
- almost-but-not-quite-Python.
-
-Use overloading hacks to define a "new infix operator" like ``*dot*``,
-as in a well-known Python recipe [#infix-hack]:
- Beautiful is better than ugly. This solution is so ugly that most
- developers will simply refuse to consider it for use in serious,
- reusable code. This isn't just speculation -- a variant of this
- recipe is actually distributed as a supported part of a major
- Python mathematics system [#sage-infix], so it's widely available,
- yet still receives minimal use. OTOH, the fact that people even
- consider such a 'solution', and are supporting it in shipping
- code, could be taken as further evidence for the need for a proper
- infix operator for matrix product.
+multiplication operations. PEP 211 and PEP 225, both proposed in 2000
+and last seriously discussed in 2008 [#threads-2008], were early
+attempts to add new operators to solve this problem, but suffered from
+serious flaws; in particular, at that time the Python numerical
+community had not yet reached consensus on the proper API for array
+objects, or on what operators might be needed or useful (e.g., PEP 225
+proposes 6 new operators with underspecified semantics). Experience
+since then has eventually led to consensus among the numerical
+community that the best solution is to add a single infix operator for
+matrix multiply (together with any other new operators this implies
+like ``@=``).
+
+We review some of the rejected alternatives here.
+
+**Use a type that defines __mul__ as matrix multiplication:**
+Numpy has had such a type for many years: ``np.matrix``. And based on
+this experience, a strong consensus has developed that it should
+essentially never be used. The problem is that the presence of two
+different duck-types for numeric data -- one where ``*`` means matrix
+multiply, and one where ``*`` means elementwise multiplication --
+makes it impossible to write generic functions that can operate on
+arbitrary data. In practice, the vast majority of the Python numeric
+ecosystem has standardized on using ``*`` for elementwise
+multiplication, and deprecated the use of ``np.matrix``. Most
+3rd-party libraries which receive a ``matrix`` as input will either
+error out, return incorrect results, or simply convert the input into
+a standard ``ndarray``, and return ``ndarray`` objects as well. The only
+reason ``np.matrix`` survives is because of strong arguments from some
+educators who find that its problems are outweighed by the need to
+provide a simple and clear mapping between mathematical notation and
+code for novices; and this, as described above, causes its own
+problems.
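The incompatibility described above is easy to demonstrate directly: the very same expression computes two different things depending on which of the two duck-types it receives, so no "generic" numeric function can be written against both. A minimal illustration using the real numpy API:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # standard ndarray: * is elementwise
m = np.matrix([[1, 2], [3, 4]])  # np.matrix: * is matrix multiply

def square(x):
    # Looks generic, but silently computes different things
    # depending on which duck-type the caller passes in.
    return x * x

print(square(a))  # elementwise: [[ 1  4], [ 9 16]]
print(square(m))  # matrix product: [[ 7 10], [15 22]]
```

This is exactly why third-party code must either reject ``matrix`` inputs or coerce them to ``ndarray`` before doing any arithmetic.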
+
+**Add a new ``@`` (or whatever) operator that has some other meaning
+in general Python, and then overload it in numeric code:** This was
+the approach proposed by PEP 211, which suggested defining ``@`` to be
+the equivalent of ``itertools.product``. The problem with this is
+that when taken on its own terms, adding an infix operator for
+``itertools.product`` is just silly. Matrix multiplication has a
+uniquely strong rationale for inclusion as an infix operator. There
+almost certainly don't exist any other binary operations that will
+ever justify adding another infix operator.
+
+**Add a ``.dot`` method to array types so as to allow "pseudo-infix"
+A.dot(B) syntax:** This has been in numpy for some years, and in many
+cases it's better than ``dot(A, B)``. But it's still much less readable
+than real infix notation, and in particular still suffers from an
+extreme overabundance of parentheses. See `Motivation`_ above.
+
+**Add lots of new operators / add a new generic syntax for defining
+infix operators:** In addition to this being generally un-Pythonic and
+repeatedly rejected by BDFL fiat, this would be using a sledgehammer
+to smash a fly. There is a strong consensus in the scientific Python
+community that matrix multiplication really is the only missing infix
+operator that matters enough to bother about. (In retrospect, we all
+think PEP 225 was a bad idea too.)
+
+**Use a language preprocessor that adds extra operators and perhaps
+other syntax (as per recent BDFL suggestion [#preprocessor]):** Aside
+from matrix multiplication, there are no other operators or syntax
+that anyone cares enough about to bother adding. But defining a new
+language (presumably with its own parser which would have to be kept
+in sync with Python's, etc.), just to support a single binary
+operator, is neither practical nor desirable. In the scientific
+context, Python's competition is special-purpose numerical languages
+(Matlab, R, IDL, etc.). Compared to these, Python's killer feature is
+exactly that one can mix specialized numerical code with
+general-purpose code for XML parsing, web page generation, database
+access, network programming, GUI libraries, etc., and we also gain
+major benefits from the huge variety of tutorials, reference material,
+introductory classes, etc., which use Python. Fragmenting "numerical
+Python" from "real Python" would be a major source of confusion.
+Having to set up a preprocessor would be an especially prohibitive
+complication for unsophisticated users. And we use Python because we
+like Python! We don't want almost-but-not-quite-Python.
+
+**Use overloading hacks to define a "new infix operator" like
+``*dot*``, as in a well-known Python recipe [#infix-hack]:** Beautiful
+is better than ugly. This solution is so ugly that most developers
+will simply refuse to consider it for use in serious, reusable code.
+This isn't just speculation -- a variant of this recipe is actually
+distributed as a supported part of a major Python mathematics system
+[#sage-infix]_, so it's widely available, yet still receives minimal
+use. OTOH, the fact that people even consider such a 'solution', and
+are supporting it in shipping code, could be taken as further evidence
+for the need for a proper infix operator for matrix product.
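For concreteness, recipes in this family work roughly as follows (an illustrative sketch, not the exact published code): a wrapper object abuses ``__rmul__``/``__mul__`` so that ``a *dot* b`` parses as ``(a * dot) * b``:

```python
class Infix:
    """Sketch of the classic 'fake infix operator' overloading hack."""
    def __init__(self, func):
        self.func = func

    def __rmul__(self, left):
        # a * dot  -->  a partially-applied Infix awaiting the right operand
        return Infix(lambda right: self.func(left, right))

    def __mul__(self, right):
        # (a * dot) * b  -->  finally call the wrapped two-argument function
        return self.func(right)

# Toy inner product for demonstration; real variants wrap numpy.dot.
dot = Infix(lambda u, v: sum(x * y for x, y in zip(u, v)))

print([1, 2, 3] *dot* [4, 5, 6])  # 32
```

The trick works, but the reliance on operator-dispatch side effects and the visual noise of ``*dot*`` illustrate why such recipes see little uptake in serious code.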
References
@@ -608,33 +645,49 @@ References
See: https://us.pycon.org/2014/schedule/tutorials/
-.. [#sloc-details] SLOCs are defined as physical lines which contain
+.. [#sloc-details] SLOCs were defined as physical lines which contain
at least one token that is not a COMMENT, NEWLINE, ENCODING,
INDENT, or DEDENT. Counts were made by using ``tokenize`` module
from Python 3.2.3 to examine the tokens in all files ending ``.py``
underneath some directory. Only tokens which occur at least once
- in the source trees are included in the table. Several distracting
- rows were trimmed by hand (e.g. ``.``, ``:``, ``...``). The
- counting script will be available as an auxiliary file once this
- PEP is submitted; until then, it can be found here:
+ in the source trees are included in the table. The counting script
+ will be available as an auxiliary file once this PEP is submitted;
+ until then, it can be found here:
https://gist.github.com/njsmith/9157645
+ Matrix multiply counts were estimated by counting how often certain
+ tokens which are used as matrix multiply function names occurred in
+ each package. In principle this could create false positives, but
+ as far as I know the counts are exact; it's unlikely that anyone is
+ using ``dot`` as a variable name when it's also the name of one of
+ the most widely-used numpy functions.
+
All counts were made using the latest development version of each
project as of 21 Feb 2014.
'stdlib' is the contents of the Lib/ directory in commit
- d6aa3fa646e2 to the cpython hg repository.
+ d6aa3fa646e2 to the cpython hg repository, and treats the following
+ tokens as indicating matrix multiply: n/a.
- 'sklearn' is the contents of the sklearn/ directory in commit
- 69b71623273ccfc1181ea83d8fb9e05ae96f57c7 to the scikits-learn
- repository: https://github.com/scikit-learn/scikit-learn
+ 'scikit-learn' is the contents of the sklearn/ directory in commit
+ 69b71623273ccfc1181ea83d8fb9e05ae96f57c7 to the scikit-learn
+ repository (https://github.com/scikit-learn/scikit-learn), and
+ treats the following tokens as indicating matrix multiply: ``dot``,
+ ``fast_dot``, ``safe_sparse_dot``.
'nipy' is the contents of the nipy/ directory in commit
- 5419911e99546401b5a13bd8ccc3ad97f0d31037 to the nipy repository:
- https://github.com/nipy/nipy/
+ 5419911e99546401b5a13bd8ccc3ad97f0d31037 to the nipy repository
+ (https://github.com/nipy/nipy/), and treats the following tokens as
+ indicating matrix multiply: ``dot``.
.. [#blas-fork]: BLAS libraries have a habit of secretly spawning
threads, even when used from single-threaded programs. And threads
play very poorly with ``fork()``; the usual symptom is that
attempting to perform linear algebra in a child process causes an
immediate deadlock.
+
+.. [#threads-2008] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html
+
+.. [#broadcasting] http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
+
+.. [#matmul-other-langs] http://mail.scipy.org/pipermail/scipy-user/2014-February/035499.html