Commit message
# Conflicts:
# tests/test_shell.py
* all: remove "u" string prefix
* util: remove unirange
Since Python 3.3, all builds are wide-Unicode compatible.
* unistring: remove support for narrow-Unicode builds,
which stopped being relevant with Python 3.3
This fixes an empty token appearing in the Angular lexer (and
apparently also in the MSDOS lexer).
Remove support for single-quoted strings.
Update the fennelview example to the latest version of the library.
* Rename the "Javascript" tests to reflect that they are for CoffeeScript
This change also modifies the module docstring to reflect the file's purpose.
* Overhaul the Javascript numeric literal parsing
Fixes #307
This patch contains the following changes:
* Adds 50+ unit tests for Javascript numeric literals
* Forces ASCII digits for float literals (so `.୪` is now rejected)
* Adds support for Javascript's BigInt notation (`100n`)
* Adds support for leading-zero-only octal notation (`0777`)
* Adds support for scientific notation with no significand (`1e10`)
Numeric literal parsing is based on information at:
* https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Grammar_and_types
* https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures
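A quick way to spot-check the literal forms listed above is to lex them and print the resulting token types. A minimal sketch (sample values are taken from the list above; nothing is asserted about the exact token names):

```python
from pygments.lexers import JavascriptLexer

lexer = JavascriptLexer()
# BigInt, legacy leading-zero octal, exponent-only scientific notation, hex.
for sample in ("100n", "0777", "1e10", "0x1F"):
    tokens = [(tok, val) for tok, val in lexer.get_tokens(sample) if val.strip()]
    print(sample, "->", tokens)
```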
* Overhaul the MySQL lexer
Fixes #975, #1063, #1453
Changes include:
Documentation
-------------
* Note in the lexer docstring that Oracle MySQL is the target syntax.
MariaDB syntax is not a target (though there is significant overlap).
Unit tests
----------
* Add 140 unit tests for MySQL.
Literals
--------
* Hexadecimal/binary/date/time/timestamp literals are supported.
* Integer mantissas are supported for scientific notation.
* In-string escapes are now tokenized properly.
* Support the "unknown" constant.
Comments
--------
* Optimizer hints are now supported; hint keywords are
recognized and tokenized as preprocessor instructions.
* Remove handling of nested multi-line comments, which MySQL
no longer supports.
Variables
---------
* Support the '@' prefix for variable names.
* Lift restrictions on characters in unquoted variable names.
(MySQL does not impose a restriction on lead characters.)
* Support single/double/backtick-quoted variable names, including escapes.
* Support the '@@' prefix for system variable names.
* Support '?' as a variable so people can demonstrate prepared statements.
Keywords
--------
* Keyword / data type / function lists are now in a separate, auto-updating file.
* Support 25 additional data types (including spatial and JSON types).
* Support 460 additional MySQL keywords.
* Support 372 MySQL functions.
Explicit function support resolves a bug that caused non-function
items to be treated as functions simply because they were followed
by an opening parenthesis.
* Support exceptions for the 'SET' keyword, which is both a datatype and
a keyword depending on context.
Schema object names
-------------------
* Support Unicode in MySQL schema object names.
* Support parsing of backtick-quoted schema object name escapes.
(Escapes do not produce a distinct token type at this time.)
Operators
---------
* Remove non-operator characters from the list of operators.
* Remove non-punctuation characters from the list of punctuation.
* Clean up items based on feedback
* Remove an unnecessary optional newline lookahead for single-line comments
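A minimal usage sketch touching a few of the items above (the query text itself is illustrative, not taken from the test suite):

```python
from pygments import highlight
from pygments.formatters import TerminalFormatter
from pygments.lexers import MySqlLexer

# Exercises an optimizer hint, a backtick-quoted schema object name,
# an '@' user variable, and a hexadecimal literal.
sql = "SELECT /*+ MAX_EXECUTION_TIME(1000) */ `total`, @user_var, 0xFF FROM `orders`;"
print(highlight(sql, MySqlLexer(), TerminalFormatter()))
```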
This lexer is based on the PythonConsoleLexer and highlights console
input and output for PsySH, a developer console and REPL for PHP.
See https://psysh.org.
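A usage sketch; the class name PsyshConsoleLexer and its location in pygments.lexers.php are assumptions based on this commit, and the session transcript is illustrative:

```python
from pygments import highlight
from pygments.formatters import TerminalFormatter
from pygments.lexers.php import PsyshConsoleLexer  # name/location assumed

session = """>>> $greeting = 'Hello, World'
=> "Hello, World"
"""
print(highlight(session, PsyshConsoleLexer(), TerminalFormatter()))
```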
* More explicitly define escape sequences in JsonLexer (fixes #1065)
* Add test coverage for #1065
* Fixed guessing of CMake by header.
* Version numbers can have multiple digits.
* Tabs are handled as whitespace.
* Trailing comments are ignored.
* Cleaned up the regex that detects the CMake header.
* Add lexer for Pointless
* Fix lexer docstring formatting
* Add link to the languages doc
* Update AUTHORS
* Update version
* Add double string support
* Add the upval keyword
* Simplify the ptls example code
* Rename doubleString -> multiString
Improve HTML formatter output.
With the previous changes, we started to emit one <pre> per line for
line numbers. This breaks, for instance, the Sphinx-RTD-Theme, which
expects the line numbers to be formatted the same way as the normal
content. This commit makes the following changes:
* Emit a single <pre> inside the linenos div
* Wrap individual lines in <span> elements as needed
* Update all tests
* Don't yield empty <span> elements when no style is specified
This also makes the .html test files look correct when viewed in a
browser, as there is no extra whitespace in them that needs stripping.
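For reference, a minimal sketch to inspect the markup this change affects; the structure mentioned in the comment is taken from the bullet list above, not asserted by the code:

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

# With table-style line numbers, the linenos column should now be a single
# <pre> inside <div class="linenos">, mirroring the code column.
formatter = HtmlFormatter(linenos="table")
print(highlight("print('hi')\nprint('bye')\n", PythonLexer(), formatter))
```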
This is a manual merge: we don't want to pull in the documentation
change as part of this fix, to keep the history cleaner.
Includes tests and an example.promql file.
* Update for Csound 6.15.0
* Update comment
This test triggers a bug in which a spurious Token.Text('') appears at
the end. This seems to stem from the ('[^<&]+', Text) rule in the HTML
lexer, which matches the \n that is automatically added during lexing.
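The extra \n comes from the lexer front end, which appends a trailing newline by default (the ensurenl option). A quick, illustrative way to look at the tokens involved:

```python
from pygments.lexers import HtmlLexer

# get_tokens() appends a trailing "\n" by default (ensurenl=True); the
# '[^<&]+' Text rule then consumes it, which is where the spurious empty
# Text token was showing up before the fix.
tokens = list(HtmlLexer().get_tokens("<b>hi</b>"))
print(tokens[-3:])
```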
* Add support for Setext-style headings in Markdown
* Improve inline code detection in Markdown
* Add support for indented code blocks in Markdown
* Improve italics & bold detection in Markdown
* Simplify italics & bold regexes in Markdown
* Add warning about possible unrecognized internal tags in Markdown
* Improve strikethrough detection in Markdown
* Small bugfix in Markdown
* Small bugfix in Markdown
* Small refactoring in Markdown
* Add font and background colors to Style
* Move all styles to get_style_defs, add tests
* Remove hardcoded styles, add special lineno style
* Add styles for special line numbers in tables
* Update noclasses documentation
* Refactor linenos elements and styles, add tests
* Update AUTHORS
* Fix multiple CSS prefixes, add tests
* Add support for PowerShell Remoting sessions
* Add test case for PowerShell Remoting sessions
* Make whitespace after prompt optional
* Fix test case containing backslashes
* Add test case for local PowerShell sessions
* Add Arrow lexer
* Pass tests: raw string for regex
* Make requested changes
The class declaration looks like:
class class_identifier [#(param_decls)] [extends class_identifier #(params)];
...
endclass [: class_identifier]
Use the same convention as Java: Keyword.Declaration and Name.Class.
Add a test_systemverilog_classes unit test to test_hdl.
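A small sketch of how such a declaration might be inspected against the lexer; the token expectations in the comment follow the convention stated above and are assumptions, not output copied from the test:

```python
from pygments.lexers import SystemVerilogLexer

code = """class Packet extends BasePacket #(8);
endclass : Packet
"""
# Per the convention above, 'class'/'extends' should come out as
# Keyword.Declaration and 'Packet'/'BasePacket' as Name.Class.
for tok, val in SystemVerilogLexer().get_tokens(code):
    if val.strip():
        print(tok, repr(val))
```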
Co-authored-by: Bryton Hall <email@bryton.io>
* Move SystemVerilog type keywords
Put them next to the generic keywords list.
* Change a couple of SystemVerilog keywords to operators
The 'inside' and 'dist' keywords are described as operators in the
SystemVerilog standard, below unary increment/decrement and above
concatenation in precedence.
See IEEE 1800-2017 Tables 11-1 and 11-2 for a list of operators.
This matches the description of Pygments' Operator.Word token:
"For any operator that is a word (e.g. not)."
* Add a SystemVerilog operators unit test
Copy/paste the contents of IEEE 1800-2017 Table 11-2
and see what the SV lexer chops it up into.
I left lots of comments about potential improvements.
Some operators, such as '[' and '.', are labeled as punctuation.
Also, multi-character operators such as '<<<=' are split up
into multiple single-character tokens, e.g. '<' '<' '<' '='.
* Added GDScript lexer
* Fix regular expressions in GDScript lexer
* Update GDScript lexer with the current version from Godot docs
* Add tests for GDScript lexer
* Update authors
* Add an example file for GDScript
* Implement analyze_text for GAP and GDScript
* Fix example file name in tests
* Update license
Co-authored-by: Daniel J. Ramirez <djrmuv@gmail.com>
Most of the contents of these two unit tests are static.
Move things around so the entire test fits on a single page,
for better readability/maintainability.
Name the code part <TEST_NAME>_TEXT,
and the tokens part <TEST_NAME>_TOKENS.
"Text" was chosen because it is the parameter name of the
lexer.get_tokens(text) method.
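A sketch of the naming pattern using a hypothetical test case; the constant names follow the <TEST_NAME>_TEXT / <TEST_NAME>_TOKENS scheme, and the expected tokens are assumptions based on the class-declaration commit above, not values copied from the real tests:

```python
from pygments.lexers import SystemVerilogLexer
from pygments.token import Keyword, Name

# Static input for the test case ...
SYSTEMVERILOG_CLASS_TEXT = "class Foo;\nendclass\n"

# ... and the tokens we care about, kept right next to it so the whole
# test fits on one page (the full stream also contains whitespace and
# punctuation tokens).
SYSTEMVERILOG_CLASS_TOKENS = [
    (Keyword.Declaration, "class"),
    (Name.Class, "Foo"),
]

def test_systemverilog_class():
    tokens = list(SystemVerilogLexer().get_tokens(SYSTEMVERILOG_CLASS_TEXT))
    for expected in SYSTEMVERILOG_CLASS_TOKENS:
        assert expected in tokens
```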
* Add lexer for Devicetree language
Signed-off-by: Maxime Chretien <maxime.chretien@bootlin.com>
* Devicetree lexer: fix random input test error
Signed-off-by: Maxime Chretien <maxime.chretien@bootlin.com>
* Devicetree lexer: fix example file reference
Signed-off-by: Maxime Chretien <maxime.chretien@bootlin.com>
* Devicetree lexer: Reduce example file size
Also add some missing language elements
Signed-off-by: Maxime Chretien <maxime.chretien@bootlin.com>
The original implementation was missing some of the more arcane features,
such as underscores, the character 's' for signed/unsigned, support for
spaces before/after the base specifier, capital-letter base specifiers
(i.e. 'B 'D 'H), and the 4-state 'xXzZ?' characters.
For regular integers, the 'l' and 'L' suffixes are not valid.
That is, unlike C, in Verilog '42L' is not a valid integer literal.
Create a new test that exercises most of the interesting kinds of
SystemVerilog numbers.
This fixes a couple of minor issues with the type of number token the
lexer returns. For example, numbers like '42' used to be returned as
Integer.Hex but are now returned as Integer.Decimal.
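A quick way to eyeball the behaviour is to run a handful of literal forms through the lexer and print the token types; the sample literals below are illustrative and nothing is asserted:

```python
from pygments.lexers import SystemVerilogLexer

# A few of the forms mentioned above: underscores, a signed base specifier,
# spaces around the base, a capital base letter, and 4-state digits.
samples = ["42", "32'd42", "8'sb1010_1010", "16 'h dead", "8'B1010zzzz", "'x"]
lexer = SystemVerilogLexer()
for s in samples:
    tokens = [(tok, val) for tok, val in lexer.get_tokens(s) if val.strip()]
    print(s, "->", tokens)
```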
* add support for .tid files (TiddlyWiki5)
* add lexers/_mapping.py
* markup.py: change versionadded of TiddlyWiki5Lexer to 2.7
* markup.py, TiddlyWiki5Lexer: use non-greedy matcher for table headers, footers, captions and classes
* markup.py, TiddlyWiki5Lexer: make timestamps of type Number.Integer
* Fix a few SystemVerilog type keywords
First, add a few missing type keywords:
chandle, const, event, string, time, type, var, void
These are most of the 'variable' types listed in IEEE 1800-2017 6.8,
"Variable declarations".
Currently, the 'Keyword.Type' assignment does not take effect because the
lexer finds these keywords in the 'Keyword' list above.
Remove the double declaration so we get the more specific token type.
* Change signed/unsigned to Keyword.Type
This is what the C/C++ lexer does, so it seems reasonable.
Addresses the case where only a numeric HTTP status code is returned
(e.g. 200) and no textual reason phrase (e.g. OK). Strictly according to
RFC 7230, the whitespace just after the status code is NOT optional, and
in fact Tomcat 8.5 behaves this way, emitting status lines like
"HTTP/1.1 200 \n" (note the whitespace after the 200). (#1432)
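A minimal check of such a status line with the HTTP lexer; the response text is illustrative and the tokens are only printed, not asserted:

```python
from pygments.lexers import HttpLexer

# A response whose status line has a numeric code but no reason phrase,
# as emitted by Tomcat 8.5 ("HTTP/1.1 200 " followed by a newline).
response = "HTTP/1.1 200 \r\nContent-Type: text/plain\r\n\r\nok"
for tok, val in HttpLexer().get_tokens(response):
    if val.strip():
        print(tok, repr(val))
```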
* Add a basic SystemVerilog unit test
* Fix docstring
Calling it a "complete fragment" didn't make much sense.
* Add explicit line continuation for Matlab session
Matlab lines can be explicitly continued with the ... syntax at the end
of a line. In the Session lexer, this requires continuing to the next
line to add more text. Otherwise, the next line is marked as output.
* The ellipsis in Matlab should be a Keyword
The built-in Matlab syntax highlighter highlights ... with the same
formatting as a keyword. Everything after it on the line should be a
comment.
* Update Matlab functions and keywords from R2018a
* Fix many spaces in assignment being formatted as a string
In command mode, MATLAB allows multiple space-separated arguments to a
function, which are interpreted as char arrays and formatted as
strings. This check was also catching cases where there were multiple
spaces following an assignment or comparison operation, formatting
the rest of the line as a string. Now, if an = or operator is found, the
commandargs state is popped and control returns to the root state.
* Add tests for MATLAB formatting
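A sketch of the continuation behaviour with the session lexer; the transcript is illustrative and the tokens are only printed:

```python
from pygments.lexers import MatlabSessionLexer

# The prompt line is explicitly continued with '...'; per the commit above,
# the continuation line should be lexed as input rather than output.
session = """>> x = 1 + ...
2

x =

     3
"""
for tok, val in MatlabSessionLexer().get_tokens(session):
    if val.strip():
        print(tok, repr(val))
```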
* Add yang lexer for issue pygments/pygments#1407
* Fix copyright statement
* Adjust example file for yang
* Avoid duplicate code in the lexer
* Add more test cases for the yang lexer
* Simplify the yang lexer
* Simplify the default rule in the yang lexer
* Change the example yang file
* Add version to the yang lexer
* Add a filter for math symbols
This filter replaces math symbols from e.g. LaTeX and Isabelle with the
corresponding Unicode characters. It could be expanded to other
math-heavy languages.
* Add the "symbols" filter to the basic tests
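A usage sketch, assuming the filter is registered under the name "symbols" (as the test bullet suggests) and that Isabelle symbol syntax is one of the supported inputs:

```python
from pygments import highlight
from pygments.formatters import TerminalFormatter
from pygments.lexers import IsabelleLexer

lexer = IsabelleLexer()
# Attach the filter by its registered name; it should replace markup such
# as \<forall> and \<longrightarrow> with the corresponding Unicode symbols.
lexer.add_filter("symbols")
print(highlight(r'lemma "\<forall>x. x \<longrightarrow> x"', lexer, TerminalFormatter()))
```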
* Implement lexer for execline.
This commit introduces a lexer for Laurent Bercot's execline scripting language
(https://skarnet.org/software/execline) based on Pygments' existing bash lexer,
with some minor adaptations for execline's variable naming rules.
* Add versionadded note and website link to execline lexer.
* Add execline to languages.rst and example execline script
* Explicitly mark non-special characters in execline lexer as Text
* Correct execline lexer version added
Co-authored-by: Molly Miller <sysvinit@users.noreply.github.com>
* Add a lexer for F*, an ML dialect for program verification
* Fix treatment of infix applications
* Correct modifications
* Better lexing
* Add F* to the list of supported languages
* Add example file
* Bump versionadded field
* Add link to the language
Co-authored-by: Jonathan Protzenko <jonathan.protzenko@gmail.com>
* Add Typographic Number Theory lexer
Originally tried to use RegexLexer, but the structure of TNT is too
rigid for it to handle, so a direct parser was used instead.
Co-authored-by: lonetwin <steve@lonetwin.net>
From the fork at https://bitbucket.org/gebner/pygments-main/src/default/
This fixes three test failures on Windows:
* Two due to incorrect handling of ':' (on Windows, multiple ':'
characters can be part of a path)
* One due to newline differences
* Remove Python 2 compatibility
* Remove 2/3 shims in pygments.util
* Update setup.py metadata
* Remove unneeded object inheritance
* Remove unneeded __future__ imports
Python f-strings: highlight expressions in curly braces
Fixes #1228
- The walrus operator, also known as the assignment expression, was introduced in Python 3.8
- Moves the Token.Operator match in the root state above Token.Punctuation so the walrus operator takes precedence
- Includes a test to make sure this behavior doesn't regress, since it is sensitive to the order of the rules
- Fixes #1381
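A quick check of the new behaviour, assuming the rule ordering described above; the snippet being lexed is illustrative:

```python
from pygments.lexers import PythonLexer
from pygments.token import Operator

# With Operator matched before Punctuation in the root state, ':=' should
# come out as one Operator token rather than ':' followed by '='.
tokens = [(tok, val)
          for tok, val in PythonLexer().get_tokens("if (n := 10) > 5:\n    pass\n")
          if val.strip()]
assert (Operator, ":=") in tokens
print(tokens)
```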