| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
https://github.com/avian2/unidecode/issues/50
|
This adds:
- SQUARED LATIN CAPITAL LETTERs,
- NEGATIVE CIRCLED LATIN CAPITAL LETTERs,
- NEGATIVE SQUARED LATIN CAPITAL LETTERs,
- TORTOISE SHELL BRACKETED LATIN CAPITAL LETTERs and
- CIRCLED ITALIC LATIN CAPITAL LETTERs
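
A rough sketch of how such entries could be enumerated, using only the standard unicodedata module. The code point range (the Enclosed Alphanumeric Supplement block) and the choice of a bare ASCII letter as replacement are assumptions for illustration; the commit itself does not state the replacement strings.

```python
import unicodedata


def enclosed_latin_capitals(start=0x1F100, end=0x1F1FF):
    """Map enclosed capital letters in [start, end] to a plain ASCII letter."""
    table = {}
    for codepoint in range(start, end + 1):
        # Unassigned code points have no name; fall back to "".
        name = unicodedata.name(chr(codepoint), "")
        # Names look like "NEGATIVE SQUARED LATIN CAPITAL LETTER A".
        _, marker, letter = name.rpartition("LATIN CAPITAL LETTER ")
        if marker and len(letter) == 1:
            table[chr(codepoint)] = letter
    return table


# Hypothetical table; the real replacements may include brackets or differ.
TABLE = enclosed_latin_capitals()
```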
|
Allows running the following command to execute the CLI
$ python -m unidecode ...
https://docs.python.org/3/library/__main__.html
> For a package, the same effect can be achieved by including a
> __main__.py module, the contents of which will be executed when the
> module is run with -m.
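
The mechanism quoted above can be tried with a throwaway package. This is a minimal sketch: the package name demo_pkg and its contents are hypothetical and are not the project's actual __main__.py.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny package: demo_pkg/__init__.py and demo_pkg/__main__.py.
    package_dir = os.path.join(root, "demo_pkg")
    os.mkdir(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as init_file:
        init_file.write("")
    with open(os.path.join(package_dir, "__main__.py"), "w") as main_file:
        main_file.write("import sys\nprint('args:', ' '.join(sys.argv[1:]))\n")

    # "python -m demo_pkg hello" executes the contents of __main__.py.
    result = subprocess.run(
        [sys.executable, "-m", "demo_pkg", "hello"],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    OUTPUT = result.stdout.strip()
```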
|
- Use print() function, not statement
- Replace chr() with unichr()
- Convert tabs to spaces to follow PEP8
- Replace variable 'all' with 'total' to avoid shadowing the builtin
- Compile the regular expression once to avoid the overhead in the loop
- Use a context manager to always close the "NamesList.txt" file
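
Several of the points above (precompiled pattern, context manager, no shadowing of the builtin) might look roughly like this. The pattern is an assumption about the "NamesList.txt" layout (hex code point, tab, name), and count_named_lines is a hypothetical helper, not the script's real function.

```python
import os
import re
import tempfile

# Compiled once, outside the loop, so each line only pays for matching.
NAME_LINE = re.compile(r"^([0-9A-F]{4,6})\t(.+)$")


def count_named_lines(path):
    total = 0  # named 'total' because 'all' would shadow the builtin
    # The context manager closes the file even if an exception is raised.
    with open(path, encoding="utf-8") as names_file:
        for line in names_file:
            if NAME_LINE.match(line):
                total += 1
    return total


# Small demonstration with a made-up excerpt in the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "NamesList.txt")
    with open(sample_path, "w", encoding="utf-8") as sample:
        sample.write("@@\t0000\tC0 Controls\n")
        sample.write("0041\tLATIN CAPITAL LETTER A\n")
        sample.write("0042\tLATIN CAPITAL LETTER B\n")
    SAMPLE_TOTAL = count_named_lines(sample_path)
```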
|
Simpler and more forward compatible. The b prefix syntax is available on
all supported Pythons.
|
The Python project considers the optparse module as deprecated. See:
https://docs.python.org/3/library/optparse.html
> Deprecated since version 3.2: The optparse module is deprecated and
> will not be developed further; development will continue with the
> argparse module.
Replace the project's use with the newer argparse. The CLI is fully
equivalent and should not result in any backwards compatibility
concerns.
https://docs.python.org/3/library/argparse.html
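
For reference, an argparse-based parser has roughly the following shape. The option names and help texts here are illustrative assumptions; the commit message does not list the project's actual options.

```python
import argparse


def build_parser():
    # Hypothetical options, chosen only to show the argparse idiom.
    parser = argparse.ArgumentParser(prog="unidecode")
    parser.add_argument(
        "-e", "--encoding", default="utf-8",
        help="encoding of the input (assumed option)",
    )
    parser.add_argument(
        "-c", dest="text",
        help="transliterate this string instead of a file (assumed option)",
    )
    parser.add_argument(
        "path", nargs="?",
        help="file to transliterate (assumed positional argument)",
    )
    return parser


ARGS = build_parser().parse_args(["-e", "latin-1", "input.txt"])
```

With optparse the positional arguments came back as a separate list; with argparse they are attributes of the parsed namespace, as ARGS.path is here.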
|
Helps pip decide what version of the library to install. Include this
information in the README as well. These versions were already
documented as trove classifiers.
https://packaging.python.org/guides/distributing-packages-using-setuptools/#python-requires
> If your project only runs on certain Python versions, setting the
> python_requires argument to the appropriate PEP 440 version specifier
> string will prevent pip from installing the project on other Python
> versions.
https://setuptools.readthedocs.io/en/latest/setuptools.html#new-and-changed-setup-keywords
> python_requires
>
> A string corresponding to a version specifier (as defined in PEP 440)
> for the Python version, used to specify the Requires-Python defined in
> PEP 345.
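
A sketch of what the keyword could look like in a setup.py. The package name, version and the exact specifier string are assumptions for illustration; the commit message does not quote the value that was added.

```python
# Keyword arguments for a hypothetical setup.py. In a real file these would
# be passed on as: from setuptools import setup; setup(**SETUP_KWARGS)
SETUP_KWARGS = dict(
    name="example-package",  # placeholder name
    version="0.0.0",         # placeholder version
    # PEP 440 specifier: accept 2.7 and later, but exclude early 3.x series.
    python_requires=">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*",
)
```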
|
Both Python 2 & Python 3 support representing Unicode strings with a u
prefix. Can drop the _u() compat shim and let the interpreter handle it.
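
For illustration, a shim of this general kind next to the plain literal. The body of _u() below is a guess at a typical implementation, not the project's removed code.

```python
import sys


def _u(text):
    # Hypothetical compat shim: on Python 3 the literal is already text,
    # on Python 2 the escape sequences have to be decoded by hand.
    if sys.version_info[0] >= 3:
        return text
    return text.decode("unicode-escape")


WITH_SHIM = _u("\u5317\u4eb0")
# The u prefix gives the same text without any helper.
WITH_PREFIX = u"\u5317\u4eb0"
```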
|
Fixes the following warning when Python warnings are enabled during
tests:
setup.py:9: ResourceWarning: unclosed file <_io.TextIOWrapper name='README.rst' mode='r' encoding='UTF-8'>
return open(os.path.join(os.path.dirname(__file__), "README.rst")).read()
To enable warnings, pass the -Walways argument to Python:
https://docs.python.org/3/using/cmdline.html#cmdoption-w
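
The usual way to avoid this warning is to read the file inside a with block. A minimal sketch follows; read_text is a hypothetical helper name and the demonstration file is created on the fly.

```python
import os
import tempfile


def read_text(path):
    # The file is closed when the block exits, so no ResourceWarning
    # is emitted for an unclosed file object.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Demonstration with a temporary stand-in for README.rst.
with tempfile.TemporaryDirectory() as tmp:
    readme_path = os.path.join(tmp, "README.rst")
    with open(readme_path, "w", encoding="utf-8") as readme:
        readme.write("Example\n=======\n")
    LONG_DESCRIPTION = read_text(readme_path)
```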
|
Test Python 3.7 and PyPy in tox.ini and Travis CI. Python 3.7 was
released on 2018-06-27.
https://docs.python.org/3/whatsnew/3.7.html
Use 'dist: xenial' in Travis to allow using the latest Python 3.7.
Travis officially added support for Xenial on 2018-11-08.
https://blog.travis-ci.com/2018-11-08-xenial-release
|
- Trim extraneous trailing whitespace
- Use double backquotes for inline fixed-space literals (Syntax
described here:
http://docutils.sourceforge.net/docs/user/rst/quickstart.html#text-styles)
- Always end functions with `()` to make clear to the reader the symbol
is a function.
- Always capitalize the project's name.
|
If x.isspace() is True, then unidecode(x).isspace() should be True as well
(unless it is an empty string)
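
The property can be checked mechanically. In this sketch to_ascii is a toy stand-in for the real transliteration function, used only so the check runs without the library installed.

```python
def to_ascii(text):
    """Toy stand-in: keep ASCII, turn other whitespace into ' ', drop the rest."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch.isspace():
            out.append(" ")
    return "".join(out)


def violates_space_invariant(ch, transliterate):
    # Whitespace must stay whitespace, unless it maps to the empty string.
    result = transliterate(ch)
    return ch.isspace() and result != "" and not result.isspace()


# Check every code point in the Basic Multilingual Plane.
VIOLATIONS = [
    hex(codepoint)
    for codepoint in range(0x10000)
    if violates_space_invariant(chr(codepoint), to_ascii)
]
```

Passing the real function in place of to_ascii would turn this into a regression check for the rule stated above.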
|
Python 2.6 and 3.3 are end of life. They are no longer receiving bug
fixes, including for security issues. Python 2.6 went EOL on 2013-10-29
and 3.3 on 2017-09-29. For additional details on supported Python
versions, see:
https://devguide.python.org/#status-of-python-branches
Removing support for EOL Pythons will reduce testing and maintenance
resources.
Using pypinfo, here are the download statistics for Unidecode over the
last 30 days, showing minimal 3.3 & 2.6 installs.
$ pypinfo --percent unidecode pyversion
| python_version | percent | download_count |
| -------------- | ------- | -------------- |
| 2.7 | 66.07% | 244,263 |
| 3.6 | 19.69% | 72,777 |
| 3.5 | 8.27% | 30,585 |
| 3.4 | 5.59% | 20,663 |
| 3.7 | 0.29% | 1,062 |
| 2.6 | 0.06% | 210 |
| 3.8 | 0.02% | 72 |
| 3.3 | 0.01% | 44 |
| 3.2 | 0.00% | 6 |
| None | 0.00% | 2 |
|
The project requires a very simple tox file and so can remove a lot of
boilerplate and re-specifying of tox defaults. When using py36 notation,
the "basepython" configuration option has a correct default.
|
These codepoints are defined as "Greek small letter mu" and a Latin capital
letter, not with spelled-out unit names.
"u" is a common way of representing "micro" SI prefix in ASCII.
|
https://unicode-table.com/en/blocks/phonetic-extensions/
|
Convert double-letter transliterations to capital letters, because the
duplicates make the output very hard to interpret. For example:
kh - is it k and h, or kh?
tskh - is it t,s,kh or ts,k,h or ts,kh, etc.?
0xa2
Hebrew Bible punctuation mark, should be ignored.
0xc6
Opposite Nun, same as 'n'.
0xba
Hulam Haser, vowel as 'o'.
0xbf
Makaf Raphe, same as Makaf (0xbe).
0xc5
Hebrew Bible punctuation mark, should be ignored.
0xc7
Makaf katan, vowel as 'o'.
0xd0
Aleph, sounds like AHA; it must exist to make the string readable.
To distinguish it from '`', use capital A, which also distinguishes it
from the 'a' vowel.
0xf5
Split Vave, same as 'v'.
0xf6
Opposite Nun, same as 'n'.
0xf7
Small Kuf, same as 'q'.
Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
|