1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
|
=============================
What's New in Pyparsing 3.0.0
=============================
:author: Paul McGuire
:date: April, 2022
:abstract: This document summarizes the changes made
in the 3.0.0 release of pyparsing.
(Updated to reflect changes up to 3.0.8)
.. sectnum:: :depth: 4
.. contents:: :depth: 4
New Features
============
PEP-8 naming
------------
This release of pyparsing will (finally!) include PEP-8 compatible names and arguments.
Backward-compatibility is maintained by defining synonyms using the old camelCase names
pointing to the new snake_case names.
This code written using non-PEP8 names::
wd = pp.Word(pp.printables, excludeChars="$")
wd_list = pp.delimitedList(wd, delim="$")
print(wd_list.parseString("dkls$134lkjk$lsd$$").asList())
can now be written as::
wd = pp.Word(pp.printables, exclude_chars="$")
wd_list = pp.delimited_list(wd, delim="$")
print(wd_list.parse_string("dkls$134lkjk$lsd$$").as_list())
Pyparsing 3.0 will run both versions of this example.
New code should be written using the PEP-8 compatible names. The compatibility
synonyms will be removed in a future version of pyparsing.
Railroad diagramming
--------------------
An excellent new enhancement is the new railroad diagram
generator for documenting pyparsing parsers.::
import pyparsing as pp
# define a simple grammar for parsing street addresses such
# as "123 Main Street"
# number word...
number = pp.Word(pp.nums).set_name("number")
name = pp.Word(pp.alphas).set_name("word")[1, ...]
parser = number("house_number") + name("street")
parser.set_name("street address")
# construct railroad track diagram for this parser and
# save as HTML
parser.create_diagram('parser_rr_diag.html')
To use this new feature, install the supporting diagramming packages using::
pip install pyparsing[diagrams]
See more in the examples directory: ``make_diagram.py`` and ``railroad_diagram_demo.py``.
(Railroad diagram enhancement contributed by Michael Milton)
Support for left-recursive parsers
----------------------------------
Another significant enhancement in 3.0 is support for left-recursive (LR)
parsers. Previously, given a left-recursive parser, pyparsing would
recurse repeatedly until hitting the Python recursion limit. Following
the methods of the Python PEG parser, pyparsing uses a variation of
packrat parsing to detect and handle left-recursion during parsing.::
import pyparsing as pp
pp.ParserElement.enable_left_recursion()
# a common left-recursion definition
# define a list of items as 'list + item | item'
# BNF:
# item_list := item_list item | item
# item := word of alphas
item_list = pp.Forward()
item = pp.Word(pp.alphas)
item_list <<= item_list + item | item
item_list.run_tests("""\
To parse or not to parse that is the question
""")
Prints::
['To', 'parse', 'or', 'not', 'to', 'parse', 'that', 'is', 'the', 'question']
See more examples in ``left_recursion.py`` in the pyparsing examples directory.
(LR parsing support contributed by Max Fischer)
Packrat/memoization enable and disable methods
----------------------------------------------
As part of the implementation of left-recursion support, new methods have been added
to enable and disable packrat parsing.
====================== =======================================================
Name Description
---------------------- -------------------------------------------------------
enable_packrat Enable packrat parsing (with specified cache size)
enable_left_recursion Enable left-recursion cache
disable_memoization Disable all internal parsing caches
====================== =======================================================
Type annotations on all public methods
--------------------------------------
Python 3.6 and upward compatible type annotations have been added to most of the
public methods in pyparsing. This should facilitate developing pyparsing-based
applications using IDEs for development-time type checking.
New string constants ``identchars`` and ``identbodychars`` to help in defining identifier Word expressions
----------------------------------------------------------------------------------------------------------
Two new module-level strings have been added to help when defining identifiers,
``identchars`` and ``identbodychars``.
Instead of writing::
import pyparsing as pp
identifier = pp.Word(pp.alphas + "_", pp.alphanums + "_")
you will be able to write::
identifier = pp.Word(pp.identchars, pp.identbodychars)
Those constants have also been added to all the Unicode string classes::
import pyparsing as pp
ppu = pp.pyparsing_unicode
cjk_identifier = pp.Word(ppu.CJK.identchars, ppu.CJK.identbodychars)
greek_identifier = pp.Word(ppu.Greek.identchars, ppu.Greek.identbodychars)
Refactored/added diagnostic flags
---------------------------------
Expanded ``__diag__`` and ``__compat__`` to actual classes instead of
just namespaces, to add some helpful behavior:
- ``pyparsing.enable_diag()`` and ``pyparsing.disable_diag()`` methods to give extra
help when setting or clearing flags (detects invalid
flag names, detects when trying to set a ``__compat__`` flag
that is no longer settable). Use these methods now to
set or clear flags, instead of directly setting to ``True`` or
``False``::
import pyparsing as pp
pp.enable_diag(pp.Diagnostics.warn_multiple_tokens_in_named_alternation)
- ``pyparsing.enable_all_warnings()`` is another helper that sets
all "warn*" diagnostics to ``True``::
pp.enable_all_warnings()
- added support for calling ``enable_all_warnings()`` if warnings are enabled
using the Python ``-W`` switch, or setting a non-empty value to the environment
variable ``PYPARSINGENABLEALLWARNINGS``. (If using ``-Wd`` for testing, but
wishing to disable pyparsing warnings, add ``-Wi:::pyparsing``.)
- added new warning, ``warn_on_match_first_with_lshift_operator`` to
warn when using ``'<<'`` with a ``'|'`` ``MatchFirst`` operator,
which will
create an unintended expression due to precedence of operations.
Example: This statement will erroneously define the ``fwd`` expression
as just ``expr_a``, even though ``expr_a | expr_b`` was intended,
since ``'<<'`` operator has precedence over ``'|'``::
fwd << expr_a | expr_b
To correct this, use the ``'<<='`` operator (preferred) or parentheses
to override operator precedence::
fwd <<= expr_a | expr_b
or::
fwd << (expr_a | expr_b)
- ``warn_on_parse_using_empty_Forward`` - warns that a ``Forward``
has been included in a grammar, but no expression was
attached to it using ``'<<='`` or ``'<<'``
- ``warn_on_assignment_to_Forward`` - warns that a ``Forward`` has
been created, but was probably later overwritten by
erroneously using ``'='`` instead of ``'<<='`` (this is a common
mistake when using Forwards)
(**currently not working on PyPy**)
Support for yielding native Python ``list`` and ``dict`` types in place of ``ParseResults``
-------------------------------------------------------------------------------------------
To support parsers that are intended to generate native Python collection
types such as lists and dicts, the ``Group`` and ``Dict`` classes now accept an
additional boolean keyword argument ``aslist`` and ``asdict`` respectively. See
the ``jsonParser.py`` example in the ``pyparsing/examples`` source directory for
how to return types as ``ParseResults`` and as Python collection types, and the
distinctions in working with the different types.
In addition parse actions that must return a value of list type (which would
normally be converted internally to a ``ParseResults``) can override this default
behavior by returning their list wrapped in the new ``ParseResults.List`` class::
# this parse action tries to return a list, but pyparsing
# will convert to a ParseResults
def return_as_list_but_still_get_parse_results(tokens):
return tokens.asList()
# this parse action returns the tokens as a list, and pyparsing will
# maintain its list type in the final parsing results
def return_as_list(tokens):
return ParseResults.List(tokens.asList())
This is the mechanism used internally by the ``Group`` class when defined
using ``aslist=True``.
New Located class to replace ``locatedExpr`` helper method
----------------------------------------------------------
The new ``Located`` class will replace the current ``locatedExpr`` method for
marking parsed results with the start and end locations of the parsed data in
the input string. ``locatedExpr`` had several bugs, and returned its results
in a hard-to-use format (location data and results names were mixed in with
the located expression's parsed results, and wrapped in an unnecessary extra
nesting level).
For this code::
wd = Word(alphas)
for match in locatedExpr(wd).search_string("ljsdf123lksdjjf123lkkjj1222"):
print(match)
the docs for ``locatedExpr`` show this output::
[[0, 'ljsdf', 5]]
[[8, 'lksdjjf', 15]]
[[18, 'lkkjj', 23]]
The parsed values and the start and end locations are merged into a single
nested ``ParseResults`` (and any results names in the parsed values are also
merged in with the start and end location names).
Using ``Located``, the output is::
[0, ['ljsdf'], 5]
[8, ['lksdjjf'], 15]
[18, ['lkkjj'], 23]
With ``Located``, the parsed expression values and results names are kept
separate in the second parsed value, and there is no extra grouping level
on the whole result.
The existing ``locatedExpr`` is retained for backward-compatibility, but will be
deprecated in a future release.
New ``AtLineStart`` and ``AtStringStart`` classes
-------------------------------------------------
As part of fixing some matching behavior in ``LineStart`` and ``StringStart``, two new
classes have been added: ``AtLineStart`` and ``AtStringStart``.
``LineStart`` and ``StringStart`` can be treated as separate elements, including whitespace skipping.
``AtLineStart`` and ``AtStringStart`` enforce that an expression starts exactly at column 1, with no
leading whitespace.::
(LineStart() + Word(alphas)).parseString("ABC") # passes
(LineStart() + Word(alphas)).parseString(" ABC") # passes
AtLineStart(Word(alphas)).parseString(" ABC") # fails
[This is a fix to behavior that was added in 3.0.0, but was actually a regression from 2.4.x.]
New ``IndentedBlock`` class to replace ``indentedBlock`` helper method
----------------------------------------------------------------------
The new ``IndentedBlock`` class will replace the current ``indentedBlock`` method
for defining indented blocks of text, similar to Python source code. Using
``IndentedBlock``, the expression instance itself keeps track of the indent stack,
so a separate external ``indentStack`` variable is no longer required.
Here is a simple example of an expression containing an alphabetic key, followed
by an indented list of integers::
integer = pp.Word(pp.nums)
group = pp.Group(pp.Char(pp.alphas) + pp.IndentedBlock(integer))
parses::
A
100
101
B
200
201
as::
[['A', [100, 101]], ['B', [200, 201]]]
By default, the results returned from the ``IndentedBlock`` are grouped.
``IndentedBlock`` may also be used to define a recursive indented block (containing nested
indented blocks).
The existing ``indentedBlock`` is retained for backward-compatibility, but will be
deprecated in a future release.
Shortened tracebacks
--------------------
Cleaned up default tracebacks when getting a ``ParseException`` when calling
``parse_string``. Exception traces should now stop at the call in ``parse_string``,
and not include the internal pyparsing traceback frames. (If the full traceback
is desired, then set ``ParserElement.verbose_traceback`` to ``True``.)
Improved debug logging
----------------------
Debug logging has been improved by:
- Including ``try/match/fail`` logging when getting results from the
packrat cache (previously cache hits did not show debug logging).
Values returned from the packrat cache are marked with an '*'.
- Improved fail logging, showing the failed expression, text line, and marker where
the failure occurred.
- Adding ``with_line_numbers`` to ``pyparsing_testing``. Use ``with_line_numbers``
to visualize the data being parsed, with line and column numbers corresponding
to the values output when enabling ``set_debug()`` on an expression::
data = """\
A
100"""
expr = pp.Word(pp.alphanums).set_name("word").set_debug()
print(ppt.with_line_numbers(data))
expr[...].parseString(data)
prints::
. 1
1234567890
1: A
2: 100
Match word at loc 3(1,4)
A
^
Matched word -> ['A']
Match word at loc 11(2,7)
100
^
Matched word -> ['100']
New / improved examples
-----------------------
- ``number_words.py`` includes a parser/evaluator to parse ``"forty-two"``
and return ``42``. Also includes example code to generate a railroad
diagram for this parser.
- ``BigQueryViewParser.py`` added to examples directory, submitted
by Michael Smedberg.
- ``booleansearchparser.py`` added to examples directory, submitted
by xecgr. Builds on searchparser.py, adding support for '*'
wildcards and non-Western alphabets.
- Improvements in ``select_parser.py``, to include new SQL syntax
from SQLite, submitted by Robert Coup.
- Off-by-one bug found in the ``roman_numerals.py`` example, a bug
that has been there for about 14 years! Submitted by
Jay Pedersen.
- A simplified Lua parser has been added to the examples
(``lua_parser.py``).
- Demonstration of defining a custom Unicode set for cuneiform
symbols, as well as simple Cuneiform->Python conversion is included
in ``cuneiform_python.py``.
- Fixed bug in ``delta_time.py`` example, when using a quantity
of seconds/minutes/hours/days > 999.
Other new features
------------------
- ``url`` expression added to ``pyparsing_common``, with named fields for
common fields in URLs. See the updated ``urlExtractorNew.py`` file in the
``examples`` directory. Submitted by Wolfgang Fahl.
- ``delimited_list`` now supports an additional flag ``allow_trailing_delim``,
to optionally parse an additional delimiter at the end of the list.
Submitted by Kazantcev Andrey.
- Added global method ``autoname_elements()`` to call ``set_name()`` on all locally
defined ``ParserElements`` that haven't been explicitly named using ``set_name()``, using
their local variable name. Useful for setting names on multiple elements when
creating a railroad diagram::
a = pp.Literal("a")
b = pp.Literal("b").set_name("bbb")
pp.autoname_elements()
``a`` will get named "a", while ``b`` will keep its name "bbb".
- Enhanced default strings created for ``Word`` expressions, now showing
string ranges if possible. ``Word(alphas)`` would formerly
print as ``W:(ABCD...)``, now prints as ``W:(A-Za-z)``.
- Better exception messages to show full word where an exception occurred.::
Word(alphas)[...].parse_string("abc 123", parse_all=True)
Was::
pyparsing.ParseException: Expected end of text, found '1' (at char 4), (line:1, col:5)
Now::
pyparsing.exceptions.ParseException: Expected end of text, found '123' (at char 4), (line:1, col:5)
- Using ``...`` for ``SkipTo`` can now be wrapped in ``Suppress`` to suppress
the skipped text from the returned parse results.::
source = "lead in START relevant text END trailing text"
start_marker = Keyword("START")
end_marker = Keyword("END")
find_body = Suppress(...) + start_marker + ... + end_marker
print(find_body.parse_string(source).dump())
Prints::
['START', 'relevant text ', 'END']
- _skipped: ['relevant text ']
- Added ``ignore_whitespace(recurse:bool = True)`` and added a
``recurse`` argument to ``leave_whitespace``, both added to provide finer
control over pyparsing's whitespace skipping. Contributed by
Michael Milton.
- Added ``ParserElement.recurse()`` method to make it simpler for
grammar utilities to navigate through the tree of expressions in
a pyparsing grammar.
- The ``repr()`` string for ``ParseResults`` is now of the form::
ParseResults([tokens], {named_results})
The previous form omitted the leading ``ParseResults`` class name,
and was easily misinterpreted as a ``tuple`` containing a ``list`` and
a ``dict``.
- Minor reformatting of output from ``run_tests`` to make embedded
comments more visible.
- New ``pyparsing_test`` namespace, assert methods and classes added to support writing
unit tests.
- ``assertParseResultsEquals``
- ``assertParseAndCheckList``
- ``assertParseAndCheckDict``
- ``assertRunTestResults``
- ``assertRaisesParseException``
- ``reset_pyparsing_context`` context manager, to restore pyparsing
config settings
- Enhanced error messages and error locations when parsing fails on
the ``Keyword`` or ``CaselessKeyword`` classes due to the presence of a
preceding or trailing keyword character.
- Enhanced the ``Regex`` class to be compatible with re's compiled with the
re-equivalent ``regex`` module. Individual expressions can be built with
regex compiled expressions using::
import pyparsing as pp
import regex
# would use regex for this expression
integer_parser = pp.Regex(regex.compile(r'\d+'))
- Fixed handling of ``ParseSyntaxExceptions`` raised as part of ``Each``
expressions, when sub-expressions contain ``'-'`` backtrack
suppression.
- Potential performance enhancement when parsing ``Word``
expressions built from ``pyparsing_unicode`` character sets. ``Word`` now
internally converts ranges of consecutive characters to regex
character ranges (converting ``"0123456789"`` to ``"0-9"`` for instance).
- Added a caseless parameter to the ``CloseMatch`` class to allow for casing to be
ignored when checking for close matches. Contributed by Adrian Edwards.
API Changes
===========
- [Note added in pyparsing 3.0.7, reflecting a change in 3.0.0]
Fixed a bug in the ``ParseResults`` class implementation of ``__bool__``, which
would formerly return ``False`` if the ``ParseResults`` item list was empty, even if it
contained named results. Now ``ParseResults`` will return ``True`` if either the item
list is not empty *or* if the named results list is not empty::
# generate an empty ParseResults by parsing a blank string with a ZeroOrMore
result = Word(alphas)[...].parse_string("")
print(result.as_list())
print(result.as_dict())
print(bool(result))
# add a results name to the result
result["name"] = "empty result"
print(result.as_list())
print(result.as_dict())
print(bool(result))
Prints::
[]
{}
False
[]
{'name': 'empty result'}
True
In previous versions, the second call to ``bool()`` would return ``False``.
- [Note added in pyparsing 3.0.4, reflecting a change in 3.0.0]
The ``ParseResults`` class now uses ``__slots__`` to pre-define instance attributes. This
means that code written like this (which was allowed in pyparsing 2.4.7)::
result = Word(alphas).parseString("abc")
result.xyz = 100
now raises this Python exception::
AttributeError: 'ParseResults' object has no attribute 'xyz'
To add new attribute values to ParseResults object in 3.0.0 and later, you must
assign them using indexed notation::
result["xyz"] = 100
You will still be able to access this new value as an attribute or as an
indexed item.
- ``enable_diag()`` and ``disable_diag()`` methods to
enable specific diagnostic values (instead of setting them
to ``True`` or ``False``). ``enable_all_warnings()`` has
also been added.
- ``counted_array`` formerly returned its list of items nested
within another list, so that accessing the items required
indexing the 0'th element to get the actual list. This
extra nesting has been removed. In addition, if there are
other metadata fields parsed between the count and the
list items, they can be preserved in the resulting list
if given results names.
- ``ParseException.explain()`` is now an instance method of
``ParseException``::
expr = pp.Word(pp.nums) * 3
try:
expr.parse_string("123 456 A789")
except pp.ParseException as pe:
print(pe.explain(depth=0))
prints::
123 456 A789
^
ParseException: Expected W:(0-9), found 'A789' (at char 8), (line:1, col:9)
To run explain against other exceptions, use
``ParseException.explain_exception()``.
- Debug actions now take an added keyword argument ``cache_hit``.
Now that debug actions are called for expressions matched in the
packrat parsing cache, debug actions are now called with this extra
flag, set to ``True``. For custom debug actions, it is necessary to add
support for this new argument.
- ``ZeroOrMore`` expressions that have results names will now
include empty lists for their name if no matches are found.
Previously, no named result would be present. Code that tested
for the presence of any expressions using ``"if name in results:"``
will now always return ``True``. This code will need to change to
``"if name in results and results[name]:"`` or just
``"if results[name]:"``. Also, any parser unit tests that check the
``as_dict()`` contents will now see additional entries for parsers
having named ``ZeroOrMore`` expressions, whose values will be ``[]``.
- ``ParserElement.set_default_whitespace_chars`` will now update
whitespace characters on all built-in expressions defined
in the pyparsing module.
- ``camelCase`` names have been converted to PEP-8 ``snake_case`` names.
Method names and arguments that were camel case (such as ``parseString``)
have been replaced with PEP-8 snake case versions (``parse_string``).
Backward-compatibility synonyms for all names and arguments have
been included, to allow parsers written using the old names to run
without change. The synonyms will be removed in a future release.
New parser code should be written using the new PEP-8 snake case names.
============================== ================================
Name Previous name
------------------------------ --------------------------------
ParserElement
- parse_string parseString
- scan_string scanString
- search_string searchString
- transform_string transformString
- add_condition addCondition
- add_parse_action addParseAction
- can_parse_next canParseNext
- default_name defaultName
- enable_left_recursion enableLeftRecursion
- enable_packrat enablePackrat
- ignore_whitespace ignoreWhitespace
- inline_literals_using inlineLiteralsUsing
- parse_file parseFile
- leave_whitespace leaveWhitespace
- parse_string parseString
- parse_with_tabs parseWithTabs
- reset_cache resetCache
- run_tests runTests
- scan_string scanString
- search_string searchString
- set_break setBreak
- set_debug setDebug
- set_debug_actions setDebugActions
- set_default_whitespace_chars setDefaultWhitespaceChars
- set_fail_action setFailAction
- set_name setName
- set_parse_action setParseAction
- set_results_name setResultsName
- set_whitespace_chars setWhitespaceChars
- transform_string transformString
- try_parse tryParse
ParseResults
- as_list asList
- as_dict asDict
- get_name getName
ParseBaseException
- parser_element parserElement
any_open_tag anyOpenTag
any_close_tag anyCloseTag
c_style_comment cStyleComment
common_html_entity commonHTMLEntity
condition_as_parse_action conditionAsParseAction
counted_array countedArray
cpp_style_comment cppStyleComment
dbl_quoted_string dblQuotedString
dbl_slash_comment dblSlashComment
delimited_list delimitedList
dict_of dictOf
html_comment htmlComment
infix_notation infixNotation
java_style_comment javaStyleComment
line_end lineEnd
line_start lineStart
make_html_tags makeHTMLTags
make_xml_tags makeXMLTags
match_only_at_col matchOnlyAtCol
match_previous_expr matchPreviousExpr
match_previous_literal matchPreviousLiteral
nested_expr nestedExpr
null_debug_action nullDebugAction
one_of oneOf
OpAssoc opAssoc
original_text_for originalTextFor
python_style_comment pythonStyleComment
quoted_string quotedString
remove_quotes removeQuotes
replace_html_entity replaceHTMLEntity
replace_with replaceWith
rest_of_line restOfLine
sgl_quoted_string sglQuotedString
string_end stringEnd
string_start stringStart
token_map tokenMap
trace_parse_action traceParseAction
unicode_string unicodeString
with_attribute withAttribute
with_class withClass
============================== ================================
Discontinued Features
=====================
Python 2.x no longer supported
------------------------------
Removed Py2.x support and other deprecated features. Pyparsing
now requires Python 3.6.8 or later. If you are using an earlier
version of Python, you must use a Pyparsing 2.4.x version.
Other discontinued features
---------------------------
- ``ParseResults.asXML()`` - if used for debugging, switch
to using ``ParseResults.dump()``; if used for data transfer,
use ``ParseResults.as_dict()`` to convert to a nested Python
dict, which can then be converted to XML or JSON or
other transfer format
- ``operatorPrecedence`` synonym for ``infixNotation`` -
convert to calling ``infix_notation``
- ``commaSeparatedList`` - convert to using
``pyparsing_common.comma_separated_list``
- ``upcaseTokens`` and ``downcaseTokens`` - convert to using
``pyparsing_common.upcase_tokens`` and ``downcase_tokens``
- ``__compat__.collect_all_And_tokens`` will not be settable to
``False`` to revert to pre-2.3.1 results name behavior -
review use of names for ``MatchFirst`` and Or expressions
containing ``And`` expressions, as they will return the
complete list of parsed tokens, not just the first one.
Use ``pyparsing.enable_diag(pyparsing.Diagnostics.warn_multiple_tokens_in_named_alternation)``
to help identify those expressions in your parsers that
will have changed as a result.
- Removed support for running ``python setup.py test``. The setuptools
maintainers consider the ``test`` command deprecated (see
<https://github.com/pypa/setuptools/issues/1684>). To run the Pyparsing tests,
use the command ``tox``.
Fixed Bugs
==========
- [Reverted in 3.0.2]Fixed issue when ``LineStart()`` expressions would match input text that was not
necessarily at the beginning of a line.
[The previous behavior was the correct behavior, since it represents the ``LineStart`` as its own
matching expression. ``ParserElements`` that must start in column 1 can be wrapped in the new
``AtLineStart`` class.]
- Fixed bug in regex definitions for ``real`` and ``sci_real`` expressions in
``pyparsing_common``.
- Fixed ``FutureWarning`` raised beginning in Python 3.7 for ``Regex`` expressions
containing '[' within a regex set.
- Fixed bug in ``PrecededBy`` which caused infinite recursion.
- Fixed bug in ``CloseMatch`` where end location was incorrectly
computed; and updated ``partial_gene_match.py`` example.
- Fixed bug in ``indentedBlock`` with a parser using two different
types of nested indented blocks with different indent values,
but sharing the same indent stack.
- Fixed bug in ``Each`` when using ``Regex``, when ``Regex`` expression would
get parsed twice.
- Fixed bugs in ``Each`` when passed ``OneOrMore`` or ``ZeroOrMore`` expressions:
. first expression match could be enclosed in an extra nesting level
. out-of-order expressions now handled correctly if mixed with required expressions
. results names are maintained correctly for these expression
- Fixed ``FutureWarning`` that sometimes is raised when ``'['`` passed as a
character to ``Word``.
- Fixed debug logging to show failure location after whitespace skipping.
- Fixed ``ParseFatalExceptions`` failing to override normal exceptions or expression
matches in ``MatchFirst`` expressions.
- Fixed bug in which ``ParseResults`` replaces a collection type value with an invalid
type annotation (as a result of changed behavior in Python 3.9).
- Fixed bug in ``ParseResults`` when calling ``__getattr__`` for special double-underscored
methods. Now raises ``AttributeError`` for non-existent results when accessing a
name starting with '__'.
- Fixed bug in ``Located`` class when used with a results name.
- Fixed bug in ``QuotedString`` class when the escaped quote string is not a
repeated character.
Acknowledgments
===============
And finally, many thanks to those who helped in the restructuring
of the pyparsing code base as part of this release. Pyparsing now
has more standard package structure, more standard unit tests,
and more standard code formatting (using black). Special thanks
to jdufresne, klahnakoski, mattcarmody, ckeygusuz,
tmiguelt, and toonarmycaptain to name just a few.
Thanks also to Michael Milton and Max Fischer, who added some
significant new features to pyparsing.
|