| author | Seth M Morton <seth.m.morton@gmail.com> | 2015-05-17 19:23:27 -0700 |
|---|---|---|
| committer | Seth M Morton <seth.m.morton@gmail.com> | 2015-05-17 19:23:27 -0700 |
| commit | 04f4fd8e22ca755d3b1fcf06c3089f0797d3a872 | |
| tree | 0b6ddc1463203496ff7ee50ca4fcdcb28508f105 | |
| parent | 7ad21a2d1671e7fec5e7bf785bda369a6eee3bda | |
| download | natsort-04f4fd8e22ca755d3b1fcf06c3089f0797d3a872.tar.gz | |

Updated version and documentation.

| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | README.rst | 108 |
| -rw-r--r-- | docs/source/changelog.rst | 11 |
| -rw-r--r-- | docs/source/examples.rst | 28 |
| -rw-r--r-- | docs/source/intro.rst | 45 |
| -rw-r--r-- | natsort/_version.py | 2 |
| -rw-r--r-- | natsort/natsort.py | 31 |
| -rw-r--r-- | natsort/ns_enum.py | 3 |
| -rw-r--r-- | setup.py | 1 |

8 files changed, 162 insertions, 67 deletions
@@ -11,10 +11,10 @@ Natural sorting for python.

     - Source Code: https://github.com/SethMMorton/natsort
     - Downloads: https://pypi.python.org/pypi/natsort
-    - Documentation: http://pythonhosted.org/natsort/
+    - Documentation: http://pythonhosted.org/natsort

-Please see `Deprecation Notices`_ for an `important` backwards incompatibility notice
-for ``natsort`` version 4.0.0.
+Please see `Moving from older Natsort versions`_ to see if this update requires
+you to modify your ``natsort`` calls in your code (99% of users will not).

 Quick Description
 -----------------
@@ -47,7 +47,7 @@ Using ``natsorted`` is simple:
 ``natsorted`` identifies real numbers anywhere in a string and
 sorts them naturally.

-Sorting is handled properly by default (as of ``natsort`` version >= 4.0.0):
+Sorting versions is handled properly by default (as of ``natsort`` version >= 4.0.0):

 .. code-block:: python

@@ -55,6 +55,9 @@ Sorting is handled properly by default (as of ``natsort`` version >= 4.0.0):
     >>> natsorted(a)
     ['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0']

+If you need to sort release candidates, please see
+`this useful hack <http://pythonhosted.org//natsort/examples.htm#rc-sorting>`_ .
+
 You can also perform locale-aware sorting (or "human sorting"), where the
 non-numeric characters are ordered based on their meaning, not on their
 ordinal value; this can be achieved with the ``humansorted`` function:
@@ -139,9 +142,9 @@ from the command line with ``python -m natsort``.
 Requirements
 ------------

-``natsort`` requires python version 2.6 or greater
-(this includes python 3.x). To run version 2.6, 3.0, or 3.1 the
-`argparse <https://pypi.python.org/pypi/argparse>`_ module is required.
+``natsort`` requires Python version 2.7 or greater or Python 3.3 or greater.
+Python 2.6 and 3.2 are no longer officially supported (no unit tests are performed)
+but it should work.

 .. _optional:

@@ -163,36 +166,56 @@ PyICU
 '''''

 On BSD-based systems (this includes Mac OS X), the underlying ``locale`` library
-can be buggy (please see http://bugs.python.org/issue23195), so ``natsort`` will use
-`PyICU <https://pypi.python.org/pypi/PyICU>`_ under the hood if it is installed
-on your computer; this will give more reliable cross-platform results.
-``natsort`` will not require (or check) that
-`PyICU <https://pypi.python.org/pypi/PyICU>`_ is installed at installation
-since in Linux-based systems and Windows systems ``locale`` should work just fine.
-Please visit https://github.com/SethMMorton/natsort/issues/21 for more details and
-how to install on Mac OS X.
+can be buggy (please see http://bugs.python.org/issue23195); ``locale`` is
+used for the ``ns.LOCALE`` option and ``humansorted`` function.. To remedy this,
+one can
+
+    1. Use "\*.ISO8859-1" locale (i.e. 'en_US.ISO8859-1') rather than "\*.UTF-8"
+       encoding. These encodings do not suffer from as many problems as "UTF-8"
+       and thus should give expected results.
+    2. Use `PyICU <https://pypi.python.org/pypi/PyICU>`_. If
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is installed, ``natsort``
+       will use it under the hood if it is installed; this will give more
+       reliable cross-platform results in the long run. ``natsort`` will not
+       require (or check) that `PyICU <https://pypi.python.org/pypi/PyICU>`_
+       is installed at installation. Please visit
+       https://github.com/SethMMorton/natsort/issues/21 for more details and
+       how to install on Mac OS X. **Please note** that using
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is the only way to
+       guarantee correct results for all input on BSD-based systems, since
+       every other suggestion is a workaround.
+    3. Do nothing. As of ``natsort`` version 4.0.0, ``natsort`` is configured
+       to compensate for a broken ``locale`` library in terms of case-handling;
+       if you do not need to be able to properly handle non-ASCII characters
+       then this may be the best option for you.
+
+Note that the above solutions *should not* be required for Windows or
+Linux since in Linux-based systems and Windows systems ``locale`` *should* work
+just fine.

 .. _deprecate:

-Deprecation Notices
--------------------
-
-    - The default sorting algorithm for ``natsort`` will change in version 4.0.0
-      from signed floats (with exponents) to unsigned integers. The motivation
-      for this change is that it will cause ``natsort`` to return results that
-      pass the "least astonishment" test for the most common use case, which is
-      sorting version numbers. If you currently rely on the default behavior
-      to be signed floats, it is recommend that you add ``alg=ns.F`` to your
-      ``natsort`` calls or switch to the new ``realsorted`` function which
-      behaves identically to the current ``natsorted`` with default values.
-      This will also affect the default behavior of the ``natsort`` shell script.
-    - In ``natsort`` version 4.0.0, the ``number_type``, ``signed``, ``exp``,
-      ``as_path``, and ``py3_safe`` options will be removed from the (documented)
-      API, in favor of the ``alg`` option and ``ns`` enum. They will remain as
-      keyword-only arguments after that (for the foreseeable future).
-    - In ``natsort`` version 4.0.0, the ``natsort_key`` function will be removed
-      from the public API. All future development should use ``natsort_keygen``
-      in preparation for this.
+Moving from older Natsort versions
+----------------------------------
+
+    - The default sorting algorithm for ``natsort`` has changed in version 4.0.0
+      from signed floats (with exponents) to unsigned integers. The motivation
+      for this change is that it will cause ``natsort`` to return results that
+      pass the "least astonishment" test for the most common use case, which is
+      sorting version numbers. If you relied on the default behavior
+      to be signed floats, it is add ``alg=ns.F | ns.S`` to your
+      ``natsort`` calls or switch to the new ``realsorted`` function which
+      behaves identically to the current ``natsorted`` with default values.
+      For 99% of users this will have no effect... it is only expected that this
+      will effect users using ``natsort`` for science and engineering. What it
+      will do is make it so you no longer need ``ns.V`` or ``ns.I | ns.U`` to sort
+      version-like strings.
+      This will also affect the default behavior of the ``natsort`` shell script.
+    - In ``natsort`` version 4.0.0, the ``number_type``, ``signed``, ``exp``,
+      ``as_path``, and ``py3_safe`` options have be removed from the (documented)
+      API in favor of the ``alg`` option and ``ns`` enum.
+    - In ``natsort`` version 4.0.0, the ``natsort_key`` function has be removed
+      from the public API.

 Author
 ------
@@ -205,6 +228,17 @@ History
 These are the last three entries of the changelog. See the package documentation
 for the complete `changelog <http://pythonhosted.org//natsort/changelog.html>`_.

+05-17-2015 v. 4.0.0
+'''''''''''''''''''
+
+    - Made default behavior of 'natsort' search for unsigned ints,
+      rather than signed floats. This is a backwards-incompatible
+      change but in 99% of use cases it should not required any
+      end-user changes.
+    - Improved handling of locale-aware sorting on systems where the
+      underlying locale library is broken.
+    - Greatly improved all unit tests by adding the hypothesis library.
+
 04-06-2015 v. 3.5.6
 '''''''''''''''''''

@@ -219,9 +253,3 @@ for the complete `changelog <http://pythonhosted.org//natsort/changelog.html>`_.
     - Added 'realsorted' and 'index_realsorted' functions for
       forward-compatibility with >= 4.0.0.
     - Made explanation of when to use "TYPESAFE" more clear in the docs.
-
-04-02-2015 v. 3.5.4
-'''''''''''''''''''
-
-    - Fixed bug where a 'TypeError' was raised if a string containing a leading
-      number was sorted with alpha-only strings when 'LOCALE' is used.
diff --git a/docs/source/changelog.rst b/docs/source/changelog.rst
index 2803377..834373a 100644
--- a/docs/source/changelog.rst
+++ b/docs/source/changelog.rst
@@ -3,6 +3,17 @@
 Changelog
 ---------

+05-17-2015 v. 4.0.0
+'''''''''''''''''''
+
+    - Made default behavior of 'natsort' search for unsigned ints,
+      rather than signed floats. This is a backwards-incompatible
+      change but in 99% of use cases it should not required any
+      end-user changes.
+    - Improved handling of locale-aware sorting on systems where the
+      underlying locale library is broken.
+    - Greatly improved all unit tests by adding the hypothesis library.
+
 04-06-2015 v. 3.5.6
 '''''''''''''''''''

diff --git a/docs/source/examples.rst b/docs/source/examples.rst
index 53aa6f9..02783f4 100644
--- a/docs/source/examples.rst
+++ b/docs/source/examples.rst
@@ -29,6 +29,8 @@ As of :mod:`natsort` version >= 4.0.0, :func:`~natsorted` will now properly sort
 version numbers. The old function :func:`~versorted` exists for backwards
 compatibility but new development should use :func:`~natsorted`.

+.. _rc_sorting:
+
 Sorting with Alpha, Beta, and Release Candidates
 ++++++++++++++++++++++++++++++++++++++++++++++++

@@ -107,6 +109,32 @@ with the ``locale`` module from the standard library that are solved when
 using `PyICU <https://pypi.python.org/pypi/PyICU>`_; you can read about
 them here: http://bugs.python.org/issue23195.

+If you have problems with ``ns.LOCALE`` (or :func:`~humansorted`),
+especially on BSD-based systems, you can try the following:
+
+    1. Use "\*.ISO8859-1" locale (i.e. 'en_US.ISO8859-1') rather than "\*.UTF-8"
+       encoding. These encodings do not suffer from as many problems as "UTF-8"
+       and thus should give expected results.
+    2. Use `PyICU <https://pypi.python.org/pypi/PyICU>`_. If
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is installed, ``natsort``
+       will use it under the hood if it is installed; this will give more
+       reliable cross-platform results in the long run. ``natsort`` will not
+       require (or check) that `PyICU <https://pypi.python.org/pypi/PyICU>`_
+       is installed at installation. Please visit
+       https://github.com/SethMMorton/natsort/issues/21 for more details and
+       how to install on Mac OS X. **Please note** that using
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is the only way to
+       guarantee correct results for all input on BSD-based systems, since
+       every other suggestion is a workaround.
+    3. Do nothing. As of ``natsort`` version 4.0.0, ``natsort`` is configured
+       to compensate for a broken ``locale`` library in terms of case-handling;
+       if you do not need to be able to properly handle non-ASCII characters
+       then this may be the best option for you.
+
+Note that the above solutions *should not* be required for Windows or
+Linux since in Linux-based systems and Windows systems ``locale`` *should* work
+just fine.
+
 Controlling Case When Sorting
 -----------------------------

diff --git a/docs/source/intro.rst b/docs/source/intro.rst
index 86c6fbf..d454094 100644
--- a/docs/source/intro.rst
+++ b/docs/source/intro.rst
@@ -50,7 +50,7 @@ or as versions. Using :func:`~natsorted` is simple::
 :func:`~natsorted` identifies numbers anywhere in a string and
 sorts them naturally.

-Sorting is handled properly by default (as of :mod:`natsort` version >= 4.0.0):
+Sorting versions is handled properly by default (as of :mod:`natsort` version >= 4.0.0):

 .. code-block:: python

@@ -58,6 +58,9 @@ Sorting is handled properly by default (as of :mod:`natsort` version >= 4.0.0):
     >>> natsorted(a)
     ['version-1.9', 'version-1.10', 'version-1.11', 'version-2.0']

+If you need to sort release candidates, please see :ref:`rc_sorting` for
+a useful hack.
+
 You can also perform locale-aware sorting (or "human sorting"), where the
 non-numeric characters are ordered based on their meaning, not on their
 ordinal value; this can be achieved with the :func:`~humansorted` function::
@@ -155,9 +158,9 @@ If you want to build this documentation, enter::

     python setup.py build_sphinx

-:mod:`natsort` requires python version 2.6 or greater
-(this includes python 3.x). To run version 2.6, 3.0, or 3.1 the
-`argparse <https://pypi.python.org/pypi/argparse>`_ module is required.
+:mod:`natsort` requires Python version 2.7 or greater or Python 3.3 or greater.
+Python 2.6 and 3.2 are no longer officially supported (no unit tests are performed)
+but it should work.

 The most efficient sorting can occur if you install the
 `fastnumbers <https://pypi.python.org/pypi/fastnumbers>`_ package (it helps
@@ -167,14 +170,32 @@ recommended you include this as a dependency. ``natsort`` will not require (or check) that
 `fastnumbers <https://pypi.python.org/pypi/fastnumbers>`_ is installed.

 On BSD-based systems (this includes Mac OS X), the underlying ``locale`` library
-can be buggy (please see http://bugs.python.org/issue23195), so ``natsort`` will use
-`PyICU <https://pypi.python.org/pypi/PyICU>`_ under the hood if it is installed
-on your computer; this will give more reliable cross-platform results.
-``natsort`` will not require (or check) that
-`PyICU <https://pypi.python.org/pypi/PyICU>`_ is installed at installation
-since in Linux-based systems and Windows systems ``locale`` should work just fine.
-Please visit https://github.com/SethMMorton/natsort/issues/21 for more details and
-how to install on Mac OS X.
+can be buggy (please see http://bugs.python.org/issue23195); ``locale`` is
+used for the ``ns.LOCALE`` option and ``humansorted`` function.. To remedy this,
+one can
+
+    1. Use "\*.ISO8859-1" locale (i.e. 'en_US.ISO8859-1') rather than "\*.UTF-8"
+       encoding. These encodings do not suffer from as many problems as "UTF-8"
+       and thus should give expected results.
+    2. Use `PyICU <https://pypi.python.org/pypi/PyICU>`_. If
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is installed, ``natsort``
+       will use it under the hood if it is installed; this will give more
+       reliable cross-platform results in the long run. ``natsort`` will not
+       require (or check) that `PyICU <https://pypi.python.org/pypi/PyICU>`_
+       is installed at installation. Please visit
+       https://github.com/SethMMorton/natsort/issues/21 for more details and
+       how to install on Mac OS X. **Please note** that using
+       `PyICU <https://pypi.python.org/pypi/PyICU>`_ is the only way to
+       guarantee correct results for all input on BSD-based systems, since
+       every other suggestion is a workaround.
+    3. Do nothing. As of ``natsort`` version 4.0.0, ``natsort`` is configured
+       to compensate for a broken ``locale`` library in terms of case-handling;
+       if you do not need to be able to properly handle non-ASCII characters
+       then this may be the best option for you.
+
+Note that the above solutions *should not* be required for Windows or
+Linux since in Linux-based systems and Windows systems ``locale`` *should* work
+just fine.

 :mod:`natsort` comes with a shell script called :mod:`natsort`, or can also be called
 from the command line with ``python -m natsort``. The command line script is
diff --git a/natsort/_version.py b/natsort/_version.py
index eea91d6..cc26564 100644
--- a/natsort/_version.py
+++ b/natsort/_version.py
@@ -2,4 +2,4 @@
 from __future__ import (print_function, division,
                         unicode_literals, absolute_import)

-__version__ = '3.5.6'
+__version__ = '4.0.0'
diff --git a/natsort/natsort.py b/natsort/natsort.py
index 0cdb6af..78c0c24 100644
--- a/natsort/natsort.py
+++ b/natsort/natsort.py
@@ -152,7 +152,7 @@ def natsort_keygen(key=None, alg=0, **_kwargs):
     alg : ns enum, optional
         This option is used to control which algorithm `natsort`
         uses when sorting. For details into these options, please see
-        the :class:`ns` class documentation. The default is `ns.FLOAT`.
+        the :class:`ns` class documentation. The default is `ns.INT`.

     Returns
     -------
@@ -206,7 +206,7 @@ def natsorted(seq, key=None, reverse=False, alg=0, **_kwargs):
     alg : ns enum, optional
         This option is used to control which algorithm `natsort`
         uses when sorting. For details into these options, please see
-        the :class:`ns` class documentation. The default is `ns.FLOAT`.
+        the :class:`ns` class documentation. The default is `ns.INT`.

     Returns
     -------
@@ -277,7 +277,8 @@ def humansorted(seq, key=None, reverse=False, alg=0):
                  C library that Python's locale module uses is broken.
                  On these systems it is recommended that you install
                  `PyICU <https://pypi.python.org/pypi/PyICU>`_
-                 if you wish to use ``humansorted``. If you are on
+                 if you wish to use ``humansorted``, especially if you need
+                 to handle non-ASCII characters. If you are on
                  one of systems and get unexpected results, please
                  try using `PyICU <https://pypi.python.org/pypi/PyICU>`_
                  before filing a bug report to `natsort`.
@@ -313,10 +314,11 @@ def humansorted(seq, key=None, reverse=False, alg=0):
     Notes
     -----
     You may find that if you do not explicitly set
-    the locale your results may not be as you expect... I have found that
-    it depends on the system you are on. To do this is straightforward
-    (in the below example I use 'en_US.UTF-8', but you should use your
-    locale)::
+    the locale your results may not be as you expect, although
+    as of ``natsort`` version 4.0.0 the sorting algorithm has been
+    updated to account for a buggy ``locale`` installation.
+    In the below example 'en_US.UTF-8' is used, but you should use your
+    locale::

         >>> import locale
         >>> # The 'str' call is only to get around a bug on Python 2.x
@@ -327,7 +329,7 @@ def humansorted(seq, key=None, reverse=False, alg=0):

     It is preferred that you do this before importing `natsort`.
     If you use `PyICU <https://pypi.python.org/pypi/PyICU>`_ (see warning
-    above) then you should not need to do this.
+    above) then you should not need to do explicitly set a locale.

     Examples
     --------
@@ -510,6 +512,8 @@ def index_humansorted(seq, key=None, reverse=False, alg=0):
     of the given sequence.

     This is a wrapper around ``index_natsorted(seq, alg=ns.LOCALE)``.
+    Please see the ``humansorted`` documentation for caveats of
+    using ``index_humansorted``.

     Parameters
     ----------
@@ -543,10 +547,11 @@ def index_humansorted(seq, key=None, reverse=False, alg=0):
     Notes
     -----
     You may find that if you do not explicitly set
-    the locale your results may not be as you expect... I have found that
-    it depends on the system you are on. To do this is straightforward
-    (in the below example I use 'en_US.UTF-8', but you should use your
-    locale)::
+    the locale your results may not be as you expect, although
+    as of ``natsort`` version 4.0.0 the sorting algorithm has been
+    updated to account for a buggy ``locale`` installation.
+    In the below example 'en_US.UTF-8' is used, but you should use your
+    locale::

         >>> import locale
         >>> # The 'str' call is only to get around a bug on Python 2.x
@@ -557,7 +562,7 @@ def index_humansorted(seq, key=None, reverse=False, alg=0):

     It is preferred that you do this before importing `natsort`.
     If you use `PyICU <https://pypi.python.org/pypi/PyICU>`_ (see warning
-    above) then you should not need to do this.
+    above) then you should not need to explicitly set a locale.

     Examples
     --------
diff --git a/natsort/ns_enum.py b/natsort/ns_enum.py
index 35eee0a..8b9d794 100644
--- a/natsort/ns_enum.py
+++ b/natsort/ns_enum.py
@@ -20,7 +20,8 @@ class ns(object):
                      C library that Python's locale module uses is broken.
                      On these systems it is recommended that you install
                      `PyICU <https://pypi.python.org/pypi/PyICU>`_
-                     if you wish to use ``LOCALE``. If you are on one of
+                     if you wish to use ``LOCALE``, especially if you need
+                     to handle non-ASCII characters. If you are on one of
                      systems and get unexpected results, please try using
                      `PyICU <https://pypi.python.org/pypi/PyICU>`_
                      before filing a bug report to ``natsort``.
@@ -26,6 +26,7 @@ class PyTest(TestCommand):
                             '--flakes',
                             '--pep8',
                             # '--failed',
+                            # '-v',
                             ])
         err2 = pytest.main(['--doctest-modules', 'natsort'])
         err3 = pytest.main(['README.rst',
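
The central behavioral change in this commit is the switch of the default number handling from signed floats to unsigned integers. It can be illustrated without `natsort` itself. The sketch below is not `natsort`'s implementation: `int_key`, `float_key`, and both regular expressions are hypothetical helpers built on the standard `re` module, and they only approximate the two interpretations described in the README text of the diff.

```python
import re

# Hypothetical patterns for illustration; not taken from natsort's source.
UNSIGNED_INT = re.compile(r"(\d+)")
SIGNED_FLOAT = re.compile(r"([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def _split_key(pattern, convert, text):
    # re.split with a single capturing group alternates text and number
    # pieces, so the odd positions always hold the numeric matches.
    pieces = pattern.split(text)
    return tuple(convert(p) if i % 2 else p for i, p in enumerate(pieces))


def int_key(text):
    """Treat every run of digits as an unsigned integer."""
    return _split_key(UNSIGNED_INT, int, text)


def float_key(text):
    """Treat numbers as signed floats with optional exponents."""
    return _split_key(SIGNED_FLOAT, float, text)


versions = ['version-1.10', 'version-2.0', 'version-1.9', 'version-1.11']
as_ints = sorted(versions, key=int_key)
as_floats = sorted(versions, key=float_key)
print(as_ints)
print(as_floats)
```

With these inputs the integer key produces the order shown in the README example (`version-1.9`, `version-1.10`, `version-1.11`, `version-2.0`). The float key reads `-1.10` as the negative number -1.1 and therefore puts `version-2.0` first, which is the surprising ordering that motivated the new default.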
