author: Gregory P. Smith <gps@google.com>, 2022-09-02 09:35:08 -0700
committer: GitHub <noreply@github.com>, 2022-09-02 09:35:08 -0700
commit: 511ca9452033ef95bc7d7fc404b8161068226002
tree: cefd49e0c9c75f912fa28d05eae15335273aaa8e /Doc
parent: 656167db81a53934da55d90ed431449d8a4fc14b
gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96499)
Integer to and from text conversions via CPython's bignum `int` type are not safe against denial of service attacks due to malicious input. Very large input strings with hundreds of thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).

* Issue: gh-95778

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backport PRs already exist. See the issue for links.
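A minimal sketch of the behavior this change introduces, for illustration only. It assumes an interpreter that ships the `sys.get_int_max_str_digits` / `sys.set_int_max_str_digits` APIs documented in the diff below; `try_parse_decimal` and `demo_limit` are hypothetical helper names, not part of the patch. On an interpreter without the limit, `demo_limit` returns `None`.

```python
import sys

# The limit APIs exist on 3.12 and on patched security releases of older
# versions, so feature-detect instead of checking the version number.
HAS_LIMIT = hasattr(sys, "set_int_max_str_digits")


def try_parse_decimal(text):
    """Hypothetical helper: return (value, None) or (None, error message)."""
    try:
        return int(text), None
    except ValueError as exc:
        return None, str(exc)


def demo_limit():
    """Hypothetical demo: exercise the limit, then restore the old setting."""
    if not HAS_LIMIT:
        return None
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(4300)  # the documented default
    try:
        # Exactly at the limit: the conversion is allowed.
        at_limit_value, at_limit_error = try_parse_decimal("2" * 4300)
        # Over the limit: int() raises ValueError instead of doing the work.
        over_value, over_error = try_parse_decimal("2" * 5432)
        # Power-of-two bases use a linear-time algorithm and are not limited.
        hex_value = int("f" * 10_000, 16)
    finally:
        sys.set_int_max_str_digits(previous)
    return {
        "at_limit_error": at_limit_error,
        "over_value": over_value,
        "over_error": over_error,
        "hex_bits": hex_value.bit_length(),
    }


result = demo_limit()
```

Catching `ValueError` at the parsing boundary, as the helper does, is one way for code that handles untrusted input to turn an oversized number into an ordinary validation failure.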
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/functions.rst | 7
-rw-r--r-- Doc/library/json.rst | 11
-rw-r--r-- Doc/library/stdtypes.rst | 166
-rw-r--r-- Doc/library/sys.rst | 57
-rw-r--r-- Doc/library/test.rst | 10
-rw-r--r-- Doc/using/cmdline.rst | 13
-rw-r--r-- Doc/whatsnew/3.12.rst | 11
7 files changed, 262 insertions, 13 deletions
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index e86e1857c7..b9cf02e87e 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -910,6 +910,13 @@ are always available. They are listed here in alphabetical order.
.. versionchanged:: 3.11
The delegation to :meth:`__trunc__` is deprecated.
+ .. versionchanged:: 3.12
+ :class:`int` string inputs and string representations can be limited to
+ help avoid denial of service attacks. A :exc:`ValueError` is raised when
+ the limit is exceeded while converting a string *x* to an :class:`int` or
+ when converting an :class:`int` into a string would exceed the limit.
+ See the :ref:`integer string conversion length limitation
+ <int_max_str_digits>` documentation.
.. function:: isinstance(object, classinfo)
diff --git a/Doc/library/json.rst b/Doc/library/json.rst
index 467d5d9e15..d05d62e78c 100644
--- a/Doc/library/json.rst
+++ b/Doc/library/json.rst
@@ -23,6 +23,11 @@ is a lightweight data interchange format inspired by
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).
+.. warning::
+ Be cautious when parsing JSON data from untrusted sources. A malicious
+ JSON string may cause the decoder to consume considerable CPU and memory
+ resources. Limiting the size of data to be parsed is recommended.
+
:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.
@@ -253,6 +258,12 @@ Basic Usage
be used to use another datatype or parser for JSON integers
(e.g. :class:`float`).
+ .. versionchanged:: 3.12
+ The default *parse_int* of :func:`int` now limits the maximum length of
+ the integer string via the interpreter's :ref:`integer string
+ conversion length limitation <int_max_str_digits>` to help avoid denial
+ of service attacks.
+
*parse_constant*, if specified, will be called with one of the following
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
This can be used to raise an exception if invalid JSON numbers
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index f68cf46a6c..163ac70413 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -622,6 +622,13 @@ class`. float also has the following additional methods.
:exc:`OverflowError` on infinities and a :exc:`ValueError` on
NaNs.
+ .. note::
+
+ The values returned by ``as_integer_ratio()`` can be huge. Attempts
+ to render such integers into decimal strings may bump into the
+ :ref:`integer string conversion length limitation
+ <int_max_str_digits>`.
+
.. method:: float.is_integer()
Return ``True`` if the float instance is finite with integral
@@ -5460,6 +5467,165 @@ types, where they are relevant. Some of these are not reported by the
[<class 'bool'>]
+.. _int_max_str_digits:
+
+Integer string conversion length limitation
+===========================================
+
+CPython has a global limit for converting between :class:`int` and :class:`str`
+to mitigate denial of service attacks. This limit *only* applies to decimal or
+other non-power-of-two number bases. Hexadecimal, octal, and binary conversions
+are unlimited. The limit can be configured.
+
+The :class:`int` type in CPython is an arbitrary length number stored in binary
+form (commonly known as a "bignum"). There exists no algorithm that can convert
+a string to a binary integer or a binary integer to a string in linear time,
+*unless* the base is a power of 2. Even the best known algorithms for base 10
+have sub-quadratic complexity. Converting a large value such as ``int('1' *
+500_000)`` can take over a second on a fast CPU.
+
+Limiting conversion size offers a practical way to avoid `CVE-2020-10735
+<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
+
+The limit is applied to the number of digit characters in the input or output
+string when a non-linear conversion algorithm would be involved. Underscores
+and the sign are not counted towards the limit.
+
+When an operation would exceed the limit, a :exc:`ValueError` is raised:
+
+.. doctest::
+
+ >>> import sys
+ >>> sys.set_int_max_str_digits(4300) # Illustrative, this is the default.
+ >>> _ = int('2' * 5432)
+ Traceback (most recent call last):
+ ...
+ ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
+ >>> i = int('2' * 4300)
+ >>> len(str(i))
+ 4300
+ >>> i_squared = i*i
+ >>> len(str(i_squared))
+ Traceback (most recent call last):
+ ...
+ ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
+ >>> len(hex(i_squared))
+ 7144
+ >>> assert int(hex(i_squared), base=16) == i*i # Hexadecimal is unlimited.
+
+The default limit is 4300 digits as provided in
+:data:`sys.int_info.default_max_str_digits <sys.int_info>`.
+The lowest limit that can be configured is 640 digits as provided in
+:data:`sys.int_info.str_digits_check_threshold <sys.int_info>`.
+
+Verification:
+
+.. doctest::
+
+ >>> import sys
+ >>> assert sys.int_info.default_max_str_digits == 4300, sys.int_info
+ >>> assert sys.int_info.str_digits_check_threshold == 640, sys.int_info
+ >>> msg = int('578966293710682886880994035146873798396722250538762761564'
+ ... '9252925514383915483333812743580549779436104706260696366600'
+ ... '571186405732').to_bytes(53, 'big')
+ ...
+
+.. versionadded:: 3.12
+
+Affected APIs
+-------------
+
+The limitation only applies to potentially slow conversions between :class:`int`
+and :class:`str` or :class:`bytes`:
+
+* ``int(string)`` with default base 10.
+* ``int(string, base)`` for all bases that are not a power of 2.
+* ``str(integer)``.
+* ``repr(integer)``.
+* any other string conversion to base 10, for example ``f"{integer}"``,
+ ``"{}".format(integer)``, or ``b"%d" % integer``.
+
+The limitations do not apply to functions with a linear algorithm:
+
+* ``int(string, base)`` with base 2, 4, 8, 16, or 32.
+* :func:`int.from_bytes` and :func:`int.to_bytes`.
+* :func:`hex`, :func:`oct`, :func:`bin`.
+* :ref:`formatspec` for hex, octal, and binary numbers.
+* :class:`str` to :class:`float`.
+* :class:`str` to :class:`decimal.Decimal`.
+
+Configuring the limit
+---------------------
+
+Before Python starts up you can use an environment variable or an interpreter
+command line flag to configure the limit:
+
+* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
+ ``PYTHONINTMAXSTRDIGITS=640 python3`` to set the limit to 640 or
+ ``PYTHONINTMAXSTRDIGITS=0 python3`` to disable the limitation.
+* :option:`-X int_max_str_digits <-X>`, e.g.
+ ``python3 -X int_max_str_digits=640``
+* :data:`sys.flags.int_max_str_digits` contains the value of
+ :envvar:`PYTHONINTMAXSTRDIGITS` or :option:`-X int_max_str_digits <-X>`.
+ If both the env var and the ``-X`` option are set, the ``-X`` option takes
+ precedence. A value of *-1* indicates that both were unset, thus a value of
+ :data:`sys.int_info.default_max_str_digits` was used during initialization.
+
+From code, you can inspect the current limit and set a new one using these
+:mod:`sys` APIs:
+
+* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
+ a getter and setter for the interpreter-wide limit. Subinterpreters have
+ their own limit.
+
+Information about the default and minimum can be found in :attr:`sys.int_info`:
+
+* :data:`sys.int_info.default_max_str_digits <sys.int_info>` is the compiled-in
+ default limit.
+* :data:`sys.int_info.str_digits_check_threshold <sys.int_info>` is the lowest
+ accepted value for the limit (other than 0 which disables it).
+
+.. versionadded:: 3.12
+
+.. caution::
+
+ Setting a low limit *can* lead to problems. While rare, code exists that
+ contains integer constants in decimal in their source that exceed the
+ minimum threshold. A consequence of setting the limit is that Python source
+ code containing decimal integer literals longer than the limit will
+ encounter an error during parsing, usually at startup time or import time or
+ even at installation time - anytime an up to date ``.pyc`` does not already
+ exist for the code. A workaround for source that contains such large
+ constants is to convert them to ``0x`` hexadecimal form as it has no limit.
+
+ Test your application thoroughly if you use a low limit. Ensure your tests
+ run with the limit set early via the environment or flag so that it applies
+ during startup and even during any installation step that may invoke Python
+ to precompile ``.py`` sources to ``.pyc`` files.
+
+Recommended configuration
+-------------------------
+
+The default :data:`sys.int_info.default_max_str_digits` is expected to be
+reasonable for most applications. If your application requires a different
+limit, set it from your main entry point using Python version agnostic code as
+these APIs were added in security patch releases in versions before 3.12.
+
+Example::
+
+ >>> import sys
+ >>> if hasattr(sys, "set_int_max_str_digits"):
+ ... upper_bound = 68000
+ ... lower_bound = 4004
+ ... current_limit = sys.get_int_max_str_digits()
+ ... if current_limit == 0 or current_limit > upper_bound:
+ ... sys.set_int_max_str_digits(upper_bound)
+ ... elif current_limit < lower_bound:
+ ... sys.set_int_max_str_digits(lower_bound)
+
+If you need to disable it entirely, set it to ``0``.
+
+
.. rubric:: Footnotes
.. [1] Additional information on these special methods may be found in the Python
diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst
index 43db4baf62..cc41b996d2 100644
--- a/Doc/library/sys.rst
+++ b/Doc/library/sys.rst
@@ -502,9 +502,9 @@ always available.
The :term:`named tuple` *flags* exposes the status of command line
flags. The attributes are read only.
- ============================= ================================================================
+ ============================= ==============================================================================================================
attribute flag
- ============================= ================================================================
+ ============================= ==============================================================================================================
:const:`debug` :option:`-d`
:const:`inspect` :option:`-i`
:const:`interactive` :option:`-i`
@@ -521,7 +521,8 @@ always available.
:const:`dev_mode` :option:`-X dev <-X>` (:ref:`Python Development Mode <devmode>`)
:const:`utf8_mode` :option:`-X utf8 <-X>`
:const:`safe_path` :option:`-P`
- ============================= ================================================================
+ :const:`int_max_str_digits` :option:`-X int_max_str_digits <-X>` (:ref:`integer string conversion length limitation <int_max_str_digits>`)
+ ============================= ==============================================================================================================
.. versionchanged:: 3.2
Added ``quiet`` attribute for the new :option:`-q` flag.
@@ -543,6 +544,9 @@ always available.
.. versionchanged:: 3.11
Added the ``safe_path`` attribute for :option:`-P` option.
+ .. versionchanged:: 3.12
+ Added the ``int_max_str_digits`` attribute.
+
.. data:: float_info
@@ -723,6 +727,13 @@ always available.
.. versionadded:: 3.6
+.. function:: get_int_max_str_digits()
+
+ Returns the current value for the :ref:`integer string conversion length
+ limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.
+
+ .. versionadded:: 3.12
+
.. function:: getrefcount(object)
Return the reference count of the *object*. The count returned is generally one
@@ -996,19 +1007,31 @@ always available.
.. tabularcolumns:: |l|L|
- +-------------------------+----------------------------------------------+
- | Attribute | Explanation |
- +=========================+==============================================+
- | :const:`bits_per_digit` | number of bits held in each digit. Python |
- | | integers are stored internally in base |
- | | ``2**int_info.bits_per_digit`` |
- +-------------------------+----------------------------------------------+
- | :const:`sizeof_digit` | size in bytes of the C type used to |
- | | represent a digit |
- +-------------------------+----------------------------------------------+
+ +----------------------------------------+-----------------------------------------------+
+ | Attribute | Explanation |
+ +========================================+===============================================+
+ | :const:`bits_per_digit` | number of bits held in each digit. Python |
+ | | integers are stored internally in base |
+ | | ``2**int_info.bits_per_digit`` |
+ +----------------------------------------+-----------------------------------------------+
+ | :const:`sizeof_digit` | size in bytes of the C type used to |
+ | | represent a digit |
+ +----------------------------------------+-----------------------------------------------+
+ | :const:`default_max_str_digits` | default value for |
+ | | :func:`sys.get_int_max_str_digits` when it |
+ | | is not otherwise explicitly configured. |
+ +----------------------------------------+-----------------------------------------------+
+ | :const:`str_digits_check_threshold` | minimum non-zero value for |
+ | | :func:`sys.set_int_max_str_digits`, |
+ | | :envvar:`PYTHONINTMAXSTRDIGITS`, or |
+ | | :option:`-X int_max_str_digits <-X>`. |
+ +----------------------------------------+-----------------------------------------------+
.. versionadded:: 3.1
+ .. versionchanged:: 3.12
+ Added ``default_max_str_digits`` and ``str_digits_check_threshold``.
+
.. data:: __interactivehook__
@@ -1308,6 +1331,14 @@ always available.
.. availability:: Unix.
+.. function:: set_int_max_str_digits(n)
+
+ Set the :ref:`integer string conversion length limitation
+ <int_max_str_digits>` used by this interpreter. See also
+ :func:`get_int_max_str_digits`.
+
+ .. versionadded:: 3.12
+
.. function:: setprofile(profilefunc)
.. index::
diff --git a/Doc/library/test.rst b/Doc/library/test.rst
index f3bc7e7560..eff3751323 100644
--- a/Doc/library/test.rst
+++ b/Doc/library/test.rst
@@ -1011,6 +1011,16 @@ The :mod:`test.support` module defines the following functions:
.. versionadded:: 3.10
+.. function:: adjust_int_max_str_digits(max_digits)
+
+ This function returns a context manager that will change the global
+ :func:`sys.set_int_max_str_digits` setting for the duration of the
+ context to allow execution of test code that needs a different limit
+ on the number of digits when converting between an integer and string.
+
+ .. versionadded:: 3.12
+
+
The :mod:`test.support` module defines the following classes:
diff --git a/Doc/using/cmdline.rst b/Doc/using/cmdline.rst
index fa2b07e468..6a33d98a05 100644
--- a/Doc/using/cmdline.rst
+++ b/Doc/using/cmdline.rst
@@ -505,6 +505,9 @@ Miscellaneous options
stored in a traceback of a trace. Use ``-X tracemalloc=NFRAME`` to start
tracing with a traceback limit of *NFRAME* frames. See the
:func:`tracemalloc.start` for more information.
+ * ``-X int_max_str_digits`` configures the :ref:`integer string conversion
+ length limitation <int_max_str_digits>`. See also
+ :envvar:`PYTHONINTMAXSTRDIGITS`.
* ``-X importtime`` to show how long each import takes. It shows module
name, cumulative time (including nested imports) and self time (excluding
nested imports). Note that its output may be broken in multi-threaded
@@ -583,6 +586,9 @@ Miscellaneous options
The ``-X frozen_modules`` option.
.. versionadded:: 3.12
+ The ``-X int_max_str_digits`` option.
+
+ .. versionadded:: 3.12
The ``-X perf`` option.
@@ -763,6 +769,13 @@ conflict.
.. versionadded:: 3.2.3
+.. envvar:: PYTHONINTMAXSTRDIGITS
+
+ If this variable is set to an integer, it is used to configure the
+ interpreter's global :ref:`integer string conversion length limitation
+ <int_max_str_digits>`.
+
+ .. versionadded:: 3.12
.. envvar:: PYTHONIOENCODING
diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst
index f9fa8ac312..70a1104127 100644
--- a/Doc/whatsnew/3.12.rst
+++ b/Doc/whatsnew/3.12.rst
@@ -83,6 +83,17 @@ Other Language Changes
mapping is hashable.
(Contributed by Serhiy Storchaka in :gh:`87995`.)
+* Converting between :class:`int` and :class:`str` in bases other than 2
+ (binary), 4, 8 (octal), 16 (hexadecimal), or 32 such as base 10 (decimal)
+ now raises a :exc:`ValueError` if the number of digits in string form is
+ above a limit to avoid potential denial of service attacks due to the
+ algorithmic complexity. This is a mitigation for `CVE-2020-10735
+ <https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
+ This limit can be configured or disabled by environment variable, command
+ line flag, or :mod:`sys` APIs. See the :ref:`integer string conversion
+ length limitation <int_max_str_digits>` documentation. The default limit
+ is 4300 digits in string form.
+
New Modules
===========