Commit message | Author | Age | Files | Lines (-/+)
* Remove the alias for the RawTokenLexer. [branch: raw-alias] (Georg Brandl, 2020-12-24; 3 files, -2/+5)
    RawTokenLexer was broken until 2.7.4, so it seems pretty much unused, and
    it led to tracebacks when the "raw" alias was used from some markup that
    allows specifying a language alias. We'll still keep the class for special
    usage as intended.
* fix oversight (Georg Brandl, 2020-12-19; 1 file, -1/+1)
* fix inefficient regexes for guessing lexers (Georg Brandl, 2020-12-19; 3 files, -3/+2)
* scripts/debug_lexer: allow guessing from content (Georg Brandl, 2020-12-19; 1 file, -8/+23)
* Limit recursion with nesting Ruby heredocs (Georg Brandl, 2020-12-17; 3 files, -4/+11)
    Fixes #1638.
* Fix backtracking string regexes in JavascriptLexer and TypescriptLexer. (Georg Brandl, 2020-12-17; 3 files, -6/+28)
    Fixes #1637.
* doc/demo: add ability to highlight text given in the URL query; copy-link feature (Georg Brandl, 2020-12-16; 3 files, -1/+50)
* fixes #1625: infinite loop in SML lexer (Georg Brandl, 2020-12-10; 2 files, -6/+14)
    The reason was a lookahead-only pattern that was included in the state
    the lookahead transitioned to.
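    The class of bug fixed here is worth illustrating: a pattern consisting
    only of a lookahead matches zero characters, so if the state it pushes
    can hit the same pattern again, the lexer never advances. A minimal
    sketch with Python's re module (not Pygments' actual SML rules):

```python
import re

# A lookahead-only pattern produces a zero-width match: the match succeeds
# but consumes no input, so a lexer loop keyed on m.end() never advances.
lookahead_only = re.compile(r'(?=fun)')

m = lookahead_only.match('fun f x = x')
assert m is not None
assert m.end() - m.start() == 0  # zero characters consumed
```

    The fix is to ensure a zero-width rule never re-enters a state where it
    can fire again without consuming input.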
* Prepare 2.7.3 release. [tag: 2.7.3] (Matthäus G. Chajdas, 2020-12-06; 2 files, -2/+6)
* Update CHANGES. (Matthäus G. Chajdas, 2020-12-06; 1 file, -9/+10)
    Mention the Mason fix, and use past tense (mostly) for CHANGES. Mention
    the changes to CSS as well, as this might affect various themes.
* Increase timeout. (Matthäus G. Chajdas, 2020-12-05; 1 file, -4/+4)
    This should fix the tests failing on PyPy. Eventually we'll need a more
    robust solution for this.
* Update CHANGES. (Matthäus G. Chajdas, 2020-12-05; 1 file, -0/+2)
* Unclosed script/style tag handling; fixes #1614 (#1615) (Nick Gerner, 2020-12-05; 2 files, -0/+141)
    Explicitly handle unclosed <script> and <style> tags, which previously
    resulted in O(n^2) work to lex as Error tokens per character, up to the
    end of the line or end of file (whichever comes first).

    Now we try lexing the rest of the line as Javascript/CSS if there's no
    closing script/style tag. We recover on the next line in the root state
    if there is a newline, otherwise we just keep parsing as Javascript/CSS.
    This is similar to how the error handling in lexer.py works, except we
    get Javascript or CSS tokens instead of Error tokens, and we get to the
    end of the line much faster since we don't apply an O(n) regex for every
    character in the line.

    I added a new test suite for the HTML lexer (there wasn't one except for
    coverage in test_examplefiles.py), including a trivial happy-path case
    and several cases around <script> and <style> fragments, including
    regression coverage that fails on the old logic.
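    The recovery strategy can be sketched outside Pygments' lexer machinery.
    take_script_body below is a hypothetical helper, not the actual HtmlLexer
    code: if a closing tag exists, take the enclosed code; otherwise take the
    rest of the current line, so no character is ever re-scanned.

```python
import re

CLOSE = re.compile(r'</script\s*>', re.IGNORECASE)

def take_script_body(text, pos):
    # Hypothetical helper sketching the fix: return (consumed, new_pos).
    m = CLOSE.search(text, pos)
    if m:
        return text[pos:m.start()], m.end()   # normal, closed case
    nl = text.find('\n', pos)                 # unclosed: stop at EOL or EOF
    stop = len(text) if nl == -1 else nl + 1
    return text[pos:stop], stop

body, pos = take_script_body('<script>var x;</script>done', 8)
assert body == 'var x;'
```

    This keeps the cost linear: one search per fragment instead of one
    failing O(n) regex per character.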
* testing turtle prefix names where reference starts with number (#1590) (elf Pavlik, 2020-12-05; 3 files, -17/+99)
    * testing turtle prefix names where reference starts with number
    * remove case insensitive flag from Turtle lexer
    * use same end-of-string regex as in SPARQL and ShExC
    * make example.ttl valid turtle
* Update mapfiles and CHANGES. (Matthäus G. Chajdas, 2020-12-05; 2 files, -1/+2)
* Update jvm.py (#1587) (Boris Kheyfets, 2020-12-05; 1 file, -1/+2)
    Added support for Kotlin scripts.
* Update CHANGES. (Matthäus G. Chajdas, 2020-12-05; 1 file, -2/+16)
* ImgFormatter: Use the start position based on the length of text (#1611) (strawberry beach sandals, 2020-11-28; 1 file, -8/+20)
* llvm lexer: add poison keyword (#1612) (Nuno Lopes, 2020-11-28; 1 file, -1/+1)
* fix ecl doc reference (#1609) (Carlos Henrique Guardão Gandarez, 2020-11-25; 1 file, -1/+1)
* lean: Add missing keywords (Eric Wieser, 2020-11-19; 1 file, -0/+2)
* JuttleLexer: Fix duplicate 'juttle' occurrence in lexer aliases. (Sumanth V Rao, 2020-11-19; 2 files, -2/+2)
    The output from pygments.lexers.get_all_lexers() contains 'juttle' twice
    in the aliases section for the Juttle lexer entry. This could be
    reproduced using:

        >>> from pygments.lexers import get_all_lexers
        >>> lexers = get_all_lexers()
        >>> {alias[0]: alias[1] for alias in lexers}.get('Juttle')
        ('juttle', 'juttle')

    This patch fixes the duplicate entry and regenerates the associated
    _mapping.py file.
    Fixes: #1604
* Rust: update builtins/macros/keywords for 1.47 (Georg Brandl, 2020-11-19; 1 file, -27/+36)
* minor variable name fixup (Georg Brandl, 2020-11-19; 1 file, -5/+5)
* Rust lexer: changing rust macro type (K. Lux, 2020-11-19; 1 file, -1/+1)
    Rust macros seem to fit more into the "magic function" category than into
    the "builtin" one.
* Rust lexer: bug fix with regex lexer and '!' + r'\b' (K. Lux, 2020-11-19; 1 file, -1/+1)
    Rust macros end with a '!'. The word boundary (regex '\b') for such
    expressions is located before the '!' (e.g. "print\b!(...)"). The regex
    here used the suffix option, which appended r'\b' after each pattern
    (e.g. r'print!\b'); therefore the supplied regular expressions did not
    match Rust macros. To fix this problem, the suffix is removed. Since
    every macro ends with '!' (which implicitly includes a word boundary
    before it), the suffix is not necessary anyway.
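    The effect of the spurious suffix is easy to demonstrate with Python's
    re module (a sketch of the problem, not the lexer's actual word list):

```python
import re

text = 'print!(x)'

# Old behavior: the suffix option appended r'\b' after the '!'.
# '!' and '(' are both non-word characters, so there is no boundary there:
assert re.search(r'\bprint!\b', text) is None

# Without the suffix, the macro matches; the boundary before '!' is implied
# because 'print' ends in a word character:
assert re.search(r'\bprint!', text) is not None
```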
* Rust lexer: move keywords from funcs_macros to types (K. Lux, 2020-11-19; 1 file, -2/+1)
    'drop', 'Some', 'None', 'Ok' and 'Err' are types, not macros.
* Add Javascript 'async', 'await' keywords (#1605) (Chris Nevers, 2020-11-17; 1 file, -1/+1)
* fix changelog entry (Georg Brandl, 2020-11-12; 1 file, -2/+1)
* shell: improve docstrings for the "session" type lexers (Georg Brandl, 2020-11-11; 1 file, -5/+10)
    Fixes #1599.
* json: deprecate BareJsonObjectLexer (Georg Brandl, 2020-11-11; 3 files, -6/+13)
    Fixes #1600.
* Fix a catastrophic backtracking bug in JavaLexer (#1594) (Kurt McKee, 2020-11-09; 2 files, -2/+32)
    * JavaLexer: Demonstrate a catastrophic backtracking bug
    * JavaLexer: Fix a catastrophic backtracking bug
    Closes #1586.
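    The general shape of such bugs: nested quantifiers let the regex engine
    partition the same run of characters in exponentially many ways before
    giving up on a failing input. A generic sketch, not the actual JavaLexer
    pattern:

```python
import re

# Ambiguous: (?:\w+\s*)+ can split 'foobar' as one \w+ chunk or several,
# so a failing match can trigger exponential backtracking on long inputs.
ambiguous = re.compile(r'(?:\w+\s*)+:')

# Unambiguous rewrite: a single character class admits only one partition,
# so failure is detected in linear time. (It also accepts leading
# whitespace, a harmless widening for this sketch.)
linear = re.compile(r'[\w\s]+:')

ok = 'case foo bar:'
assert ambiguous.match(ok) and linear.match(ok)
```

    The usual fix, as here, is to rewrite the pattern so that every input
    character can be consumed in only one way.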
* Fix Mason regex. (Matthäus G. Chajdas, 2020-11-08; 2 files, -5/+3)
    Previously, the tag was cut off.
* Fix Mason regex. (Matthäus G. Chajdas, 2020-11-08; 2 files, -3/+66)
    Previously, something like <%class>text</%class> would not get matched
    correctly. This was due to the capturing group capturing the wrong part
    of the tag: instead of "class", it would capture the part after "class"
    and before ">". With this commit, the capturing group correctly matches
    the start/end tag. This commit also adds a unit test to verify this.
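    A minimal illustration of the capturing-group fix (a deliberately
    simplified pattern, not Mason's actual grammar): the group must capture
    the block name itself so the open and close tags can be paired.

```python
import re

# Simplified sketch: capture only the block name in <%name> / </%name>.
tag = re.compile(r'</?%(\w+)>')

assert tag.match('<%class>').group(1) == 'class'
assert tag.match('</%class>').group(1) == 'class'
```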
* fix closing tag for unnamed blocks on MasonLexer (#1592) (Carlos Henrique Guardão Gandarez, 2020-11-08; 2 files, -2/+17)
* test_templates: simplify and rename module (Georg Brandl, 2020-10-30; 1 file, -15/+3)
* added documentation (Sean McElwain, 2020-10-30; 1 file, -0/+4)
* removed '{* ... *}' as a django comment (Sean McElwain, 2020-10-30; 1 file, -1/+1)
* added test to track djangojavascript lexer fix (Sean McElwain, 2020-10-30; 1 file, -0/+37)
* Fix test. (Matthäus G. Chajdas, 2020-10-28; 1 file, -1/+1)
* Remove margin: 0 from <pre> styling. (Matthäus G. Chajdas, 2020-10-28; 34 files, -34/+34)
    This seems to break some themes which were not expecting Pygments to
    change margins, and it doesn't look like it makes a difference for
    standalone Pygments.
* MySQL: Tokenize quoted schema object names, and escape characters, uniquely (#1555) (Kurt McKee, 2020-10-27; 3 files, -18/+49)
    Changes in this patch:
    * Name.Quoted and Name.Quoted.Escape are introduced as non-standard
      tokens.
    * The HTML and LaTeX formatters were confirmed to provide default
      formatting if they encounter these two non-standard tokens. They also
      add style classes based on the token name, like "n-Quoted" (HTML) or
      "nQuoted" (LaTeX), so that users can add custom styles for these.
    * Removed "\`" and "\\" as schema object name escapes. These are relics
      of the previous regular expression for backtick-quoted names and are
      not treated as escape sequences. The behavior was confirmed in the
      MySQL documentation as well as by running queries in MySQL Workbench.
    * Prevent "123abc" from being treated as an integer followed by a schema
      object name. MySQL allows leading numbers in schema object names as
      long as 0-9 are not the only characters in the name.
    * Add ~10 more unit tests to validate behavior.
    Closes #1551.

    * Remove an end-of-line regex match that triggered a lint warning, and
      add tests that confirm correct behavior. No tests failed before or
      after removing the '$' match in the regex, but now regexlint isn't
      complaining. Removing the '$' match probably depends on the fact that
      Pygments adds a newline at the end of the input text, so there is
      always something after a bare integer literal.
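    The escape rule the commit settles on can be sketched with a single
    pattern: inside a backtick-quoted name, only a doubled backtick escapes
    a backtick, and backslash is an ordinary character. (A sketch of the
    rule, not the lexer's actual token rules:)

```python
import re

# MySQL rule: `` escapes a literal backtick inside a quoted name;
# backslash has no special meaning there.
quoted_name = re.compile(r'`(?:[^`]|``)*`')

assert quoted_name.match('`weird``name` rest').group() == '`weird``name`'
# A backslash is just a character, not an escape:
assert quoted_name.match(r'`a\` x').group() == r'`a\`'
```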
* Add 'some' Ada reserved word (#1581) (Léo Germond, 2020-10-27; 1 file, -3/+3)
    The 'some' reserved word is available since Ada 2012; it is used in the
    same contexts as the 'any' keyword. See RM 2.9 - Reserved Words
    (https://www.adaic.org/resources/add_content/standards/12rm/html/RM-2-9.html)
    for the list of keywords (with this inclusion, all are covered if I'm
    not mistaken), and RM 4.5.8 - Quantified Expressions
    (https://www.adaic.org/resources/add_content/standards/12rm/html/RM-4-5-8.html)
    for a usage example.
* Speed up JSON and reduce HTML formatter consumption (#1569) (Kurt McKee, 2020-10-26; 3 files, -158/+423)
    * Update the JSON-LD keyword list to match JSON-LD 1.1. Changes in this
      patch:
      * Update the JSON-LD URL to HTTPS
      * Update the list of JSON-LD keywords
      * Make the JSON-LD parser less dependent on the JSON lexer
        implementation
      * Add unit tests for the JSON-LD lexer
    * Add unit tests for the JSON parser. This includes:
      * Testing valid literals
      * Testing valid string escapes
      * Testing that object keys are tokenized differently from string values
    * Rewrite the JSON lexer. Related to #1425. Included in this change:
      * The JSON parser is rewritten
      * The JSON bare object parser no longer requires additional code
      * get_tokens_unprocessed() returns as much as it can to reduce yields
        (for example, side-by-side punctuation is not returned separately)
      * The unit tests were updated
      * Add unit tests based on Hypothesis test results
    * Reduce HTML formatter memory consumption by ~33% and speed it up.
      Related to #1425. Tested on a 118MB JSON file: memory consumption tops
      out at ~3GB before this patch and drops to only ~2GB with this patch.
      These were the command lines used:

          python -m pygments -l json -f html -o .\new-code-classes.html .\jc-output.txt
          python -m pygments -l json -f html -O "noclasses" -o .\new-code-styles.html .\jc-output.txt

    * Add an LRU cache to the HTML formatter's HTML-escaping and
      line-splitting. For a 118MB JSON input file, this reduces memory
      consumption by ~500MB and reduces formatting time by ~15 seconds.
    * JSON: Add a catastrophic backtracking test back to the test suite
    * JSON: Update the comment that documents the internal queue
    * JSON: Document in comments that ints/floats/constants are not validated
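    The LRU-cache idea is straightforward to sketch: token texts repeat
    heavily (punctuation, keys, keywords), so memoizing the escape step
    trades a small dictionary for millions of redundant string scans. A
    hypothetical stand-alone version, not the formatter's actual code:

```python
from functools import lru_cache
import html

@lru_cache(maxsize=2**16)  # bounded, so unusual inputs can't grow it forever
def escape_html_cached(text):
    # Repeated token texts hit the cache instead of being re-escaped.
    return html.escape(text)

assert escape_html_cached('"a" < "b" & c') == '&quot;a&quot; &lt; &quot;b&quot; &amp; c'
```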
* Prepare 2.7.2 release. [tag: 2.7.2] (Matthäus G. Chajdas, 2020-10-24; 2 files, -2/+14)
    Update CHANGES, bump version.
* Speculative fix for #1579. (#1583) (Matthäus G. Chajdas, 2020-10-24; 34 files, -104/+104)
    This removes the top/bottom padding changes and only keeps left/right
    padding, in the hope that this does not break all Sphinx themes.
* TNTLexer: Don't crash on unexpected EOL. (#1570) (Ken, 2020-10-14; 3 files, -54/+272)
    * TNTLexer: Don't crash on unexpected EOL. Catch IndexErrors in each
      line and error the rest of the line, leaving whatever tokens were
      found.
    * Write and pass tests for Typographic Number Theory.
      pygments/lexers/tnt.py:
      * Fix indentation on import
      * Fix: TNTLexer.cur is a class-level reference if not initialized in
        get_tokens_unprocessed, so init it in __init__ too
      * Fix: fantasy markers are not allowed as components of other
        formulas, so have a dedicated check for them in the body of
        get_tokens_unprocessed which disables the normal formula handling
        if present
      * Clarify the TNTLexer.lineno docstring
      * Attempt to discard tokens before an IndexError
      tests/test_tnt.py:
      * Test every method, and test both +ve and -ve matches for most
      * The lexer fixture is test-level, to reinitialize cur clean each time
      * Don't test the actual get_tokens_unprocessed method (besides for
        fantasy markers) because the text testing is left to examplefiles
      AUTHORS:
      * Add myself to credits :)
    * Add a TNT test just to make sure there are no crashes
* Add Python 3.9 to CI builds. (Matthäus G. Chajdas, 2020-10-06; 1 file, -1/+1)
* Add Python 3.9 as a supported version (#1554) (Kurt McKee, 2020-10-06; 2 files, -1/+2)
    Co-authored-by: Matthäus G. Chajdas <Anteru@users.noreply.github.com>
* llvm lexer: add freeze instruction and bfloat type (#1565) (Nuno Lopes, 2020-10-06; 1 file, -11/+12)