author | Serhiy Storchaka <storchaka@gmail.com> | 2015-02-03 11:04:19 +0200 |
---|---|---|
committer | Serhiy Storchaka <storchaka@gmail.com> | 2015-02-03 11:04:19 +0200 |
commit | 83e802796c80f46be616b48020356f7f51be533d (patch) | |
tree | e896b143abc3523f96e20d88ebcc22512af16aa7 /Doc | |
parent | 32ca3dcb97a75c05dc2b90c88bbf82a541c57c61 (diff) | |
Issue #22818: Splitting on a pattern that could match an empty string now
raises a warning. Patterns that can only match empty strings are now
rejected.
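
The behavior described in the commit message can be probed with a small sketch. The helper name `try_split` is hypothetical and not part of this commit. It turns `FutureWarning` into an exception so that all three outcomes (a normal result, a warning, a rejected pattern) can be reported uniformly. Which outcome the empty-matching patterns produce depends on the Python version, so no result is claimed for them here.

```python
import re
import warnings


def try_split(pattern, string, flags=0):
    """Hypothetical helper: report how re.split() reacts to a pattern.

    Returns a (kind, payload) tuple where kind is "ok", "warning" or "error".
    """
    with warnings.catch_warnings():
        # Escalate FutureWarning so it can be caught like an exception.
        warnings.simplefilter("error", FutureWarning)
        try:
            return ("ok", re.split(pattern, string, flags=flags))
        except FutureWarning as exc:
            return ("warning", str(exc))
        except ValueError as exc:
            return ("error", str(exc))


# 'x+' can never match an empty string, so it splits without complaint.
print(try_split("x+", "axbc"))
# 'x*' may match an empty string; the outcome is version dependent.
print(try_split("x*", "axbc"))
# '^$' with re.M can only match an empty string; also version dependent.
print(try_split("^$", "foo\n\nbar\n", flags=re.M))
```

Only the `'x+'` call has a version-independent result, `('ok', ['a', 'bc'])`.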
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/re.rst | 32 |
-rw-r--r-- | Doc/whatsnew/3.5.rst | 7 |
2 files changed, 33 insertions, 6 deletions
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index 60ded8b2f4..8e20496011 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -626,17 +626,37 @@ form.
    That way, separator components are always found at the same relative
    indices within the result list.
 
-   Note that *split* will never split a string on an empty pattern match.
-   For example:
+   .. note::
+
+      :func:`split` doesn't currently split a string on an empty pattern match.
+      For example:
+
+         >>> re.split('x*', 'axbc')
+         ['a', 'bc']
 
-      >>> re.split('x*', 'foo')
-      ['foo']
-      >>> re.split("(?m)^$", "foo\n\nbar\n")
-      ['foo\n\nbar\n']
+      Even though ``'x*'`` also matches 0 'x' before 'a', between 'b' and 'c',
+      and after 'c', currently these matches are ignored. The correct behavior
+      (i.e. splitting on empty matches too and returning ``['', 'a', 'b', 'c',
+      '']``) will be implemented in future versions of Python, but since this
+      is a backward incompatible change, a :exc:`FutureWarning` will be raised
+      in the meanwhile.
+
+      Patterns that can only match empty strings currently never split the
+      string. Since this doesn't match the expected behavior, a
+      :exc:`ValueError` will be raised starting from Python 3.5::
+
+         >>> re.split("^$", "foo\n\nbar\n", flags=re.M)
+         Traceback (most recent call last):
+           File "<stdin>", line 1, in <module>
+           ...
+         ValueError: split() requires a non-empty pattern match.
 
    .. versionchanged:: 3.1
       Added the optional flags argument.
 
+   .. versionchanged:: 3.5
+      Splitting on a pattern that could match an empty string now raises
+      a warning. Patterns that can only match empty strings are now rejected.
 
 .. function:: findall(pattern, string, flags=0)
 
diff --git a/Doc/whatsnew/3.5.rst b/Doc/whatsnew/3.5.rst
index c309aa80c7..f7b9a83c2e 100644
--- a/Doc/whatsnew/3.5.rst
+++ b/Doc/whatsnew/3.5.rst
@@ -482,6 +482,13 @@ Changes in the Python API
   simply define :meth:`~importlib.machinery.Loader.create_module` to return
   ``None`` (:issue:`23014`).
 
+* :func:`re.split` always ignored empty pattern matches, so the ``'x*'``
+  pattern worked the same as ``'x+'``, and the ``'\b'`` pattern never worked.
+  Now :func:`re.split` raises a warning if the pattern could match
+  an empty string. For compatibility use patterns that never match an empty
+  string (e.g. ``'x+'`` instead of ``'x*'``). Patterns that could only match
+  an empty string (such as ``'\b'``) now raise an error.
+
 Changes in the C API
 --------------------
 
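
The compatibility advice in the whatsnew entry (prefer `'x+'` over `'x*'`) can be checked before calling `re.split`. The sketch below uses a hypothetical helper, `has_empty_match`, which is not part of this commit. It inspects one concrete input string with `re.finditer` rather than analysing the pattern itself, so it is only an approximation of whatever check the commit performs internally.

```python
import re


def has_empty_match(pattern, string, flags=0):
    """Hypothetical helper: True if pattern yields a zero-length match in string."""
    return any(m.start() == m.end() for m in re.finditer(pattern, string, flags))


# 'x*' and '\b' both produce zero-length matches on these inputs.
print(has_empty_match("x*", "axbc"))   # True
print(has_empty_match(r"\b", "a b"))   # True

# 'x+' never matches an empty string, so it is safe to split on.
print(has_empty_match("x+", "axbc"))   # False
print(re.split("x+", "axbc"))          # ['a', 'bc']
```

Because the check depends on the sample string, a pattern that could match an empty string on other input may still pass it.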