author: Serhiy Storchaka <storchaka@gmail.com> 2015-02-03 11:04:19 +0200
committer: Serhiy Storchaka <storchaka@gmail.com> 2015-02-03 11:04:19 +0200
commit: 83e802796c80f46be616b48020356f7f51be533d
tree: e896b143abc3523f96e20d88ebcc22512af16aa7 /Doc
parent: 32ca3dcb97a75c05dc2b90c88bbf82a541c57c61
Issue #22818: Splitting on a pattern that could match an empty string now
raises a warning. Patterns that can only match empty strings are now rejected.
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/re.rst 32
-rw-r--r-- Doc/whatsnew/3.5.rst 7
2 files changed, 33 insertions, 6 deletions
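As a rough illustration of the compatibility advice in the whatsnew hunk of this diff (use ``'x+'`` instead of ``'x*'``), the sketch below splits on a pattern that can never match an empty string. It is not part of the commit. The input strings are only examples, and the result should not depend on how a given Python version treats empty matches, because ``'x+'`` never produces one.

```python
import re

# 'x+' requires at least one 'x', so it can never produce an empty match.
# This is the replacement the whatsnew entry suggests for 'x*'.
parts = re.split('x+', 'axbc')
print(parts)  # ['a', 'bc']

# A run of several separator characters is consumed as a single match.
run_parts = re.split('x+', 'axxxbc')
print(run_parts)  # ['a', 'bc']
```

Writing the separator so that it always consumes at least one character avoids both the warning and the error described in the commit message.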
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index 60ded8b2f4..8e20496011 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -626,17 +626,37 @@ form.
That way, separator components are always found at the same relative
indices within the result list.
- Note that *split* will never split a string on an empty pattern match.
- For example:
+ .. note::
+
+ :func:`split` doesn't currently split a string on an empty pattern match.
+ For example:
+
+ >>> re.split('x*', 'axbc')
+ ['a', 'bc']
- >>> re.split('x*', 'foo')
- ['foo']
- >>> re.split("(?m)^$", "foo\n\nbar\n")
- ['foo\n\nbar\n']
+ Even though ``'x*'`` also matches 0 'x' before 'a', between 'b' and 'c',
+ and after 'c', currently these matches are ignored. The correct behavior
+ (i.e. splitting on empty matches too and returning ``['', 'a', 'b', 'c',
+ '']``) will be implemented in future versions of Python, but since this
+ is a backward incompatible change, a :exc:`FutureWarning` will be raised
+ in the meanwhile.
+
+ Patterns that can only match empty strings currently never split the
+ string. Since this doesn't match the expected behavior, a
+ :exc:`ValueError` will be raised starting from Python 3.5::
+
+ >>> re.split("^$", "foo\n\nbar\n", flags=re.M)
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ ...
+ ValueError: split() requires a non-empty pattern match.
.. versionchanged:: 3.1
Added the optional flags argument.
+ .. versionchanged:: 3.5
+ Splitting on a pattern that could match an empty string now raises
+ a warning. Patterns that can only match empty strings are now rejected.
.. function:: findall(pattern, string, flags=0)
diff --git a/Doc/whatsnew/3.5.rst b/Doc/whatsnew/3.5.rst
index c309aa80c7..f7b9a83c2e 100644
--- a/Doc/whatsnew/3.5.rst
+++ b/Doc/whatsnew/3.5.rst
@@ -482,6 +482,13 @@ Changes in the Python API
simply define :meth:`~importlib.machinery.Loader.create_module` to return
``None`` (:issue:`23014`).
+* :func:`re.split` always ignored empty pattern matches, so the ``'x*'``
+ pattern worked the same as ``'x+'``, and the ``'\b'`` pattern never worked.
+ Now :func:`re.split` raises a warning if the pattern could match
+ an empty string. For compatibility use patterns that never match an empty
+ string (e.g. ``'x+'`` instead of ``'x*'``). Patterns that could only match
+ an empty string (such as ``'\b'``) now raise an error.
+
Changes in the C API
--------------------