.. default-domain:: py
.. currentmodule:: natsort

.. _shell:

Shell Script
============

The ``natsort`` shell script is automatically installed when you install
:mod:`natsort` with pip.

The usage message and some examples for the ``natsort`` shell script are given below.

Usage
-----

::

    usage: natsort [-h] [--version] [-p] [-f LOW HIGH] [-F LOW HIGH] [-e EXCLUDE]
                   [-r] [-t {digit,int,float,version,ver}] [--nosign] [--noexp]
                   [--locale]
                   [entries [entries ...]]

    Performs a natural sort on entries given on the command-line.
    A natural sort sorts numerically then alphabetically, and will sort
    by numbers in the middle of an entry.

    positional arguments:
      entries               The entries to sort. Taken from stdin if nothing is
                            given on the command line.

    optional arguments:
      -h, --help            show this help message and exit
      --version             show program's version number and exit
      -p, --paths           Interpret the input as file paths. This is not
                            strictly necessary to sort all file paths, but in
                            cases where there are OS-generated file paths like
                            "Folder/" and "Folder (1)/", this option is needed to
                            make the paths sorted in the order you expect
                            ("Folder/" before "Folder (1)/").
      -f LOW HIGH, --filter LOW HIGH
                            Used for keeping only the entries that have a number
                            falling in the given range.
      -F LOW HIGH, --reverse-filter LOW HIGH
                            Used for excluding the entries that have a number
                            falling in the given range.
      -e EXCLUDE, --exclude EXCLUDE
                            Used to exclude an entry that contains a specific
                            number.
      -r, --reverse         Returns in reversed order.
      -t {digit,int,float,version,ver}, --number-type {digit,int,float,version,ver}
                            Choose the type of number to search for. "float" will
                            search for floating-point numbers. "int" will only
                            search for integers. "digit", "version", and "ver" are
                            shortcuts for "int" with --nosign.
      --nosign              Do not consider "+" or "-" as part of a number, i.e.
                            do not take sign into consideration.
      --noexp               Do not consider an exponential as part of a number,
                            i.e. 1e4, would be considered as 1, "e", and 4, not as
                            10000. This only effects the --number-type=float.
      --locale, -l          Causes natsort to use locale-aware sorting. On some
                            systems, the underlying C library is broken, so if you
                            get results that you do not expect please install
                            PyICU and try again.

Description
-----------

``natsort`` was originally written to aid in computational chemistry
research so that it would be easy to analyze large sets of output files
named after the parameter used::

    $ ls *.out
    mode1000.35.out mode1243.34.out mode744.43.out mode943.54.out

(In practice there would be many more files.)  Notice that the shell sorts
in lexicographical order, so ``mode1000.35.out`` is listed before
``mode744.43.out``.  This is the behavior of programs like ``find`` as well
as ``ls``.  As a result, passing these files to an analysis program causes
them not to appear in numerical order, which can lead to a faulty analysis.
To remedy this, use ``natsort``::

    $ natsort *.out
    mode744.43.out
    mode943.54.out
    mode1000.35.out
    mode1243.34.out
    $ natsort *.out | xargs your_program

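The idea behind the ordering above can be sketched in a few lines of Python.
This is only an illustration of natural sorting in general, not the code that
``natsort`` itself runs; the helper name ``natural_key`` is made up for this
example, and it only handles unsigned integers.

```python
import re


def natural_key(text):
    """Split text into alternating string and integer chunks (hypothetical helper)."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so even positions are always str and odd positions are always int.
    chunks = re.split(r"(\d+)", text)
    return [int(chunk) if i % 2 else chunk for i, chunk in enumerate(chunks)]


files = ["mode1000.35.out", "mode1243.34.out", "mode744.43.out", "mode943.54.out"]

lexicographical = sorted(files)           # the order the shell produces
natural = sorted(files, key=natural_key)  # the order an analysis program wants

print(lexicographical)
print(natural)
```

Because the chunks strictly alternate between text and numbers, two keys are
always compared string-to-string or integer-to-integer, which is what lets
``744`` sort before ``1000``.
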
You can also place natsort in the middle of a pipe::

    $ find . -name "*.out" | natsort | xargs your_program

To sort version numbers, use the ``--number-type version`` option
(or ``-t ver`` for short)::

    $ ls *
    prog-1.10.zip prog-1.9.zip prog-2.0.zip
    $ natsort -t ver *
    prog-1.9.zip
    prog-1.10.zip
    prog-2.0.zip

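Why the number type matters for version strings can be seen with a small
Python sketch.  It is not the ``natsort`` implementation; the helpers
``chunk_key``, ``float_key``, and ``version_key`` are hypothetical names, and
signs and exponents are ignored for simplicity.

```python
import re


def chunk_key(text, pattern, convert):
    """Split text on a numeric pattern and convert the numeric chunks."""
    # The pattern must contain exactly one capturing group so that the
    # numeric chunks land on the odd positions of the split result.
    chunks = re.split(pattern, text)
    return [convert(chunk) if i % 2 else chunk for i, chunk in enumerate(chunks)]


def float_key(text):
    # Reads "1.10" as the single floating-point number 1.1.
    return chunk_key(text, r"(\d+\.\d+|\d+)", float)


def version_key(text):
    # Reads "1.10" as the integers 1 and 10 separated by ".".
    return chunk_key(text, r"(\d+)", int)


files = ["prog-1.10.zip", "prog-1.9.zip", "prog-2.0.zip"]

by_float = sorted(files, key=float_key)      # 1.10 lands before 1.9
by_version = sorted(files, key=version_key)  # 1.9 lands before 1.10

print(by_float)
print(by_version)
```

Read as a float, ``1.10`` equals ``1.1`` and is therefore smaller than
``1.9``; read as separate unsigned integers, ``10`` is larger than ``9``,
which is the ordering expected for version numbers.
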
In general, all ``natsort`` shell script options mirror the :func:`~natsorted` API,
with the notable exception of the ``--filter``, ``--reverse-filter``, and
``--exclude`` options.  These three options are used as follows::

    $ ls *.out
    mode1000.35.out mode1243.34.out mode744.43.out mode943.54.out
    $ natsort *.out -f 900 1100 # Keep only entries with a number between 900 and 1100
    mode943.54.out
    mode1000.35.out
    $ natsort *.out -F 900 1100 # Exclude entries with a number between 900 and 1100
    mode744.43.out
    mode1243.34.out
    $ natsort *.out -e 1000.35 # Exclude the entry containing the number 1000.35
    mode744.43.out
    mode943.54.out
    mode1243.34.out

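The selection logic of these three options can be approximated in Python as
follows.  This is a rough sketch under stated assumptions, not the script's
actual code: the function names are invented for illustration, the range
bounds are assumed to be inclusive, and an entry is assumed to match when any
one of its numbers matches.

```python
import re

# Unsigned decimal numbers such as "943.54" or "1100" (assumption: no signs).
NUMBER = re.compile(r"\d+\.\d+|\d+")


def numbers_in(entry):
    """Return every number found in the entry as a float."""
    return [float(match) for match in NUMBER.findall(entry)]


def keep_in_range(entries, low, high):
    # Rough analogue of --filter LOW HIGH.
    return [e for e in entries if any(low <= n <= high for n in numbers_in(e))]


def drop_in_range(entries, low, high):
    # Rough analogue of --reverse-filter LOW HIGH.
    return [e for e in entries if not any(low <= n <= high for n in numbers_in(e))]


def drop_number(entries, number):
    # Rough analogue of --exclude EXCLUDE.
    return [e for e in entries if number not in numbers_in(e)]


files = ["mode744.43.out", "mode943.54.out", "mode1000.35.out", "mode1243.34.out"]

print(keep_in_range(files, 900, 1100))
print(drop_in_range(files, 900, 1100))
print(drop_number(files, 1000.35))
```

The input list here is already in natural order; the filters only decide
which entries survive and leave the ordering untouched.
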
If you are sorting paths with OS-generated filenames, you may require the
``--paths``/``-p`` option::

    $ find . ! -path . -type f
    ./folder/file (1).txt
    ./folder/file.txt
    ./folder (1)/file.txt
    ./folder (10)/file.txt
    ./folder (2)/file.txt
    $ find . ! -path . -type f | natsort
    ./folder (1)/file.txt
    ./folder (2)/file.txt
    ./folder (10)/file.txt
    ./folder/file (1).txt
    ./folder/file.txt
    $ find . ! -path . -type f | natsort -p
    ./folder/file.txt
    ./folder/file (1).txt
    ./folder (1)/file.txt
    ./folder (2)/file.txt
    ./folder (10)/file.txt
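
One way to picture what path-aware sorting changes is the Python sketch
below.  It assumes that the path is split into its components and that the
file extension is separated from the file name before each piece is sorted
naturally; whether ``natsort`` does exactly this internally is not stated
here, and ``natural_key`` and ``path_key`` are hypothetical helpers.

```python
import re
from pathlib import PurePosixPath


def natural_key(text):
    """Split text into alternating string and integer chunks."""
    chunks = re.split(r"(\d+)", text)
    return [int(chunk) if i % 2 else chunk for i, chunk in enumerate(chunks)]


def path_key(text):
    """Build a key from each directory, then the file stem, then the suffix."""
    path = PurePosixPath(text)  # a leading "./" is collapsed by pathlib
    pieces = list(path.parts[:-1]) + [path.stem, path.suffix]
    return tuple(natural_key(piece) for piece in pieces)


paths = [
    "./folder/file (1).txt",
    "./folder/file.txt",
    "./folder (1)/file.txt",
    "./folder (10)/file.txt",
    "./folder (2)/file.txt",
]

plain = sorted(paths, key=natural_key)  # compares the whole string at once
aware = sorted(paths, key=path_key)     # compares component by component

print("\n".join(plain))
print("\n".join(aware))
```

When the whole string is compared at once, the space in ``folder (1)`` is
compared against the ``/`` that follows ``folder``, and the space sorts
first.  Comparing component by component lets ``folder`` win because it is a
prefix of ``folder (1)``, and the same reasoning puts ``file`` before
``file (1)`` once the extension is split off.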