diff options
Diffstat (limited to 'Doc/library/stdtypes.rst')
-rw-r--r-- | Doc/library/stdtypes.rst | 297 |
1 files changed, 275 insertions, 22 deletions
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index ce3c5320d8..f274edb880 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -354,7 +354,7 @@ Notes: The numeric literals accepted include the digits ``0`` to ``9`` or any Unicode equivalent (code points with the ``Nd`` property). - See http://www.unicode.org/Public/6.3.0/ucd/extracted/DerivedNumericType.txt + See http://www.unicode.org/Public/8.0.0/ucd/extracted/DerivedNumericType.txt for a complete list of code points with the ``Nd`` property. @@ -1950,6 +1950,16 @@ expression support in the :mod:`re` module). >>> 'www.example.com'.strip('cmowz.') 'example' + The outermost leading and trailing *chars* argument values are stripped + from the string. Characters are removed from the leading end until + reaching a string character that is not contained in the set of + characters in *chars*. A similar action takes place on the trailing end. + For example:: + + >>> comment_string = '#....... Section 3.2.1 Issue #32 .......' + >>> comment_string.strip('.#! ') + 'Section 3.2.1 Issue #32' + .. method:: str.swapcase() @@ -2303,6 +2313,19 @@ the bytes type has an additional class method to read data in that format: >>> bytes.fromhex('2Ef0 F1f2 ') b'.\xf0\xf1\xf2' +A reverse conversion function exists to transform a bytes object into its +hexadecimal representation. + +.. method:: bytes.hex() + + Return a string object containing two hexadecimal digits for each + byte in the instance. + + >>> b'\xf0\xf1\xf2'.hex() + 'f0f1f2' + + .. versionadded:: 3.5 + Since bytes objects are sequences of integers (akin to a tuple), for a bytes object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes object of length 1. (This contrasts with text strings, where both indexing @@ -2358,6 +2381,19 @@ the bytearray type has an additional class method to read data in that format: >>> bytearray.fromhex('2Ef0 F1f2 ') bytearray(b'.\xf0\xf1\xf2') +A reverse conversion function exists to transform a bytearray object into its +hexadecimal representation. + +.. method:: bytearray.hex() + + Return a string object containing two hexadecimal digits for each + byte in the instance. + + >>> bytearray(b'\xf0\xf1\xf2').hex() + 'f0f1f2' + + .. versionadded:: 3.5 + Since bytearray objects are sequences of integers (akin to a list), for a bytearray object *b*, ``b[0]`` will be an integer, while ``b[0:1]`` will be a bytearray object of length 1. (This contrasts with text strings, where @@ -3103,6 +3139,203 @@ place, and instead produce new objects. always produces a new object, even if no changes were made. +.. _bytes-formatting: + +``printf``-style Bytes Formatting +---------------------------------- + +.. index:: + single: formatting, bytes (%) + single: formatting, bytearray (%) + single: interpolation, bytes (%) + single: interpolation, bytearray (%) + single: bytes; formatting + single: bytearray; formatting + single: bytes; interpolation + single: bytearray; interpolation + single: printf-style formatting + single: sprintf-style formatting + single: % formatting + single: % interpolation + +.. note:: + + The formatting operations described here exhibit a variety of quirks that + lead to a number of common errors (such as failing to display tuples and + dictionaries correctly). If the value being printed may be a tuple or + dictionary, wrap it in a tuple. + +Bytes objects (``bytes``/``bytearray``) have one unique built-in operation: +the ``%`` operator (modulo). +This is also known as the bytes *formatting* or *interpolation* operator. +Given ``format % values`` (where *format* is a bytes object), ``%`` conversion +specifications in *format* are replaced with zero or more elements of *values*. +The effect is similar to using the :c:func:`sprintf` in the C language. + +If *format* requires a single argument, *values* may be a single non-tuple +object. [5]_ Otherwise, *values* must be a tuple with exactly the number of +items specified by the format bytes object, or a single mapping object (for +example, a dictionary). + +A conversion specifier contains two or more characters and has the following +components, which must occur in this order: + +#. The ``'%'`` character, which marks the start of the specifier. + +#. Mapping key (optional), consisting of a parenthesised sequence of characters + (for example, ``(somename)``). + +#. Conversion flags (optional), which affect the result of some conversion + types. + +#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the + actual width is read from the next element of the tuple in *values*, and the + object to convert comes after the minimum field width and optional precision. + +#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If + specified as ``'*'`` (an asterisk), the actual precision is read from the next + element of the tuple in *values*, and the value to convert comes after the + precision. + +#. Length modifier (optional). + +#. Conversion type. + +When the right argument is a dictionary (or other mapping type), then the +formats in the bytes object *must* include a parenthesised mapping key into that +dictionary inserted immediately after the ``'%'`` character. The mapping key +selects the value to be formatted from the mapping. For example: + + >>> print(b'%(language)s has %(number)03d quote types.' % + ... {b'language': b"Python", b"number": 2}) + b'Python has 002 quote types.' + +In this case no ``*`` specifiers may occur in a format (since they require a +sequential parameter list). + +The conversion flag characters are: + ++---------+---------------------------------------------------------------------+ +| Flag | Meaning | ++=========+=====================================================================+ +| ``'#'`` | The value conversion will use the "alternate form" (where defined | +| | below). | ++---------+---------------------------------------------------------------------+ +| ``'0'`` | The conversion will be zero padded for numeric values. | ++---------+---------------------------------------------------------------------+ +| ``'-'`` | The converted value is left adjusted (overrides the ``'0'`` | +| | conversion if both are given). | ++---------+---------------------------------------------------------------------+ +| ``' '`` | (a space) A blank should be left before a positive number (or empty | +| | string) produced by a signed conversion. | ++---------+---------------------------------------------------------------------+ +| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion | +| | (overrides a "space" flag). | ++---------+---------------------------------------------------------------------+ + +A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it +is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``. + +The conversion types are: + ++------------+-----------------------------------------------------+-------+ +| Conversion | Meaning | Notes | ++============+=====================================================+=======+ +| ``'d'`` | Signed integer decimal. | | ++------------+-----------------------------------------------------+-------+ +| ``'i'`` | Signed integer decimal. | | ++------------+-----------------------------------------------------+-------+ +| ``'o'`` | Signed octal value. | \(1) | ++------------+-----------------------------------------------------+-------+ +| ``'u'`` | Obsolete type -- it is identical to ``'d'``. | \(8) | ++------------+-----------------------------------------------------+-------+ +| ``'x'`` | Signed hexadecimal (lowercase). | \(2) | ++------------+-----------------------------------------------------+-------+ +| ``'X'`` | Signed hexadecimal (uppercase). | \(2) | ++------------+-----------------------------------------------------+-------+ +| ``'e'`` | Floating point exponential format (lowercase). | \(3) | ++------------+-----------------------------------------------------+-------+ +| ``'E'`` | Floating point exponential format (uppercase). | \(3) | ++------------+-----------------------------------------------------+-------+ +| ``'f'`` | Floating point decimal format. | \(3) | ++------------+-----------------------------------------------------+-------+ +| ``'F'`` | Floating point decimal format. | \(3) | ++------------+-----------------------------------------------------+-------+ +| ``'g'`` | Floating point format. Uses lowercase exponential | \(4) | +| | format if exponent is less than -4 or not less than | | +| | precision, decimal format otherwise. | | ++------------+-----------------------------------------------------+-------+ +| ``'G'`` | Floating point format. Uses uppercase exponential | \(4) | +| | format if exponent is less than -4 or not less than | | +| | precision, decimal format otherwise. | | ++------------+-----------------------------------------------------+-------+ +| ``'c'`` | Single byte (accepts integer or single | | +| | byte objects). | | ++------------+-----------------------------------------------------+-------+ +| ``'b'`` | Bytes (any object that follows the | \(5) | +| | :ref:`buffer protocol <bufferobjects>` or has | | +| | :meth:`__bytes__`). | | ++------------+-----------------------------------------------------+-------+ +| ``'s'`` | ``'s'`` is an alias for ``'b'`` and should only | \(6) | +| | be used for Python2/3 code bases. | | ++------------+-----------------------------------------------------+-------+ +| ``'a'`` | Bytes (converts any Python object using | \(5) | +| | ``repr(obj).encode('ascii','backslashreplace)``). | | ++------------+-----------------------------------------------------+-------+ +| ``'r'`` | ``'r'`` is an alias for ``'a'`` and should only | \(7) | +| | be used for Python2/3 code bases. | | ++------------+-----------------------------------------------------+-------+ +| ``'%'`` | No argument is converted, results in a ``'%'`` | | +| | character in the result. | | ++------------+-----------------------------------------------------+-------+ + +Notes: + +(1) + The alternate form causes a leading zero (``'0'``) to be inserted between + left-hand padding and the formatting of the number if the leading character + of the result is not already a zero. + +(2) + The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether + the ``'x'`` or ``'X'`` format was used) to be inserted between left-hand padding + and the formatting of the number if the leading character of the result is not + already a zero. + +(3) + The alternate form causes the result to always contain a decimal point, even if + no digits follow it. + + The precision determines the number of digits after the decimal point and + defaults to 6. + +(4) + The alternate form causes the result to always contain a decimal point, and + trailing zeroes are not removed as they would otherwise be. + + The precision determines the number of significant digits before and after the + decimal point and defaults to 6. + +(5) + If precision is ``N``, the output is truncated to ``N`` characters. + +(6) + ``b'%s'`` is deprecated, but will not be removed during the 3.x series. + +(7) + ``b'%r'`` is deprecated, but will not be removed during the 3.x series. + +(8) + See :pep:`237`. + +.. note:: + + The bytearray version of this method does *not* operate in place - it + always produces a new object, even if no changes were made. + +.. seealso:: :pep:`461`. +.. versionadded:: 3.5 + .. _typememoryview: Memory Views @@ -3131,10 +3364,8 @@ copying. the view. The :class:`~memoryview.itemsize` attribute will give you the number of bytes in a single element. - A :class:`memoryview` supports slicing to expose its data. If - :class:`~memoryview.format` is one of the native format specifiers - from the :mod:`struct` module, indexing will return a single element - with the correct type. Full slicing will result in a subview:: + A :class:`memoryview` supports slicing and indexing to expose its data. + One-dimensional slicing will result in a subview:: >>> v = memoryview(b'abcefg') >>> v[1] @@ -3146,25 +3377,29 @@ copying. >>> bytes(v[1:4]) b'bce' - Other native formats:: + If :class:`~memoryview.format` is one of the native format specifiers + from the :mod:`struct` module, indexing with an integer or a tuple of + integers is also supported and returns a single *element* with + the correct type. One-dimensional memoryviews can be indexed + with an integer or a one-integer tuple. Multi-dimensional memoryviews + can be indexed with tuples of exactly *ndim* integers where *ndim* is + the number of dimensions. Zero-dimensional memoryviews can be indexed + with the empty tuple. + + Here is an example with a non-byte format:: >>> import array >>> a = array.array('l', [-11111111, 22222222, -33333333, 44444444]) - >>> a[0] + >>> m = memoryview(a) + >>> m[0] -11111111 - >>> a[-1] + >>> m[-1] 44444444 - >>> a[2:3].tolist() - [-33333333] - >>> a[::2].tolist() + >>> m[::2].tolist() [-11111111, -33333333] - >>> a[::-1].tolist() - [44444444, -33333333, 22222222, -11111111] - - .. versionadded:: 3.3 - If the underlying object is writable, the memoryview supports slice - assignment. Resizing is not allowed:: + If the underlying object is writable, the memoryview supports + one-dimensional slice assignment. Resizing is not allowed:: >>> data = bytearray(b'abcefg') >>> v = memoryview(data) @@ -3197,12 +3432,16 @@ copying. True .. versionchanged:: 3.3 + One-dimensional memoryviews can now be sliced. One-dimensional memoryviews with formats 'B', 'b' or 'c' are now hashable. .. versionchanged:: 3.4 memoryview is now registered automatically with :class:`collections.abc.Sequence` + .. versionchanged:: 3.5 + memoryviews can now be indexed with tuple of integers. + :class:`memoryview` has several methods: .. method:: __eq__(exporter) @@ -3269,6 +3508,17 @@ copying. supports all format strings, including those that are not in :mod:`struct` module syntax. + .. method:: hex() + + Return a string object containing two hexadecimal digits for each + byte in the buffer. :: + + >>> m = memoryview(b"abc") + >>> m.hex() + '616263' + + .. versionadded:: 3.5 + .. method:: tolist() Return the data in the buffer as a list of elements. :: @@ -3324,10 +3574,10 @@ copying. Cast a memoryview to a new format or shape. *shape* defaults to ``[byte_length//new_itemsize]``, which means that the result view will be one-dimensional. The return value is a new memoryview, but - the buffer itself is not copied. Supported casts are 1D -> C-contiguous + the buffer itself is not copied. Supported casts are 1D -> C-:term:`contiguous` and C-contiguous -> 1D. - Both formats are restricted to single element native formats in + The destination format is restricted to a single element native format in :mod:`struct` syntax. One of the formats must be a byte format ('B', 'b' or 'c'). The byte length of the result must be the same as the original length. @@ -3408,6 +3658,9 @@ copying. .. versionadded:: 3.3 + .. versionchanged:: 3.5 + The source format is no longer restricted when casting to a byte view. + There are also several readonly attributes available: .. attribute:: obj @@ -3512,19 +3765,19 @@ copying. .. attribute:: c_contiguous - A bool indicating whether the memory is C-contiguous. + A bool indicating whether the memory is C-:term:`contiguous`. .. versionadded:: 3.3 .. attribute:: f_contiguous - A bool indicating whether the memory is Fortran contiguous. + A bool indicating whether the memory is Fortran :term:`contiguous`. .. versionadded:: 3.3 .. attribute:: contiguous - A bool indicating whether the memory is contiguous. + A bool indicating whether the memory is :term:`contiguous`. .. versionadded:: 3.3 |