bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370)

"Include/token.h", "Lib/token.py" (now containing some data moved from "Lib/tokenize.py") and the new files "Parser/token.c" (containing the code moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by "Tools/scripts/generate_token.py". The script overwrites files only if needed and can be used on a read-only source tree. "Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py" instead of being executable itself. Added new make targets "regen-token" and "regen-symbol", which are now dependencies of "regen-all". The documentation now contains the strings for operator and punctuation tokens.
author: Serhiy Storchaka <storchaka@gmail.com> 2018-12-22 11:18:40 +0200
committer: GitHub <noreply@github.com> 2018-12-22 11:18:40 +0200
commit: 8ac658114dec4964479baecfbc439fceb40eaa79 (patch)
tree: e66c4c3beda293a6fdf01763306697d15d0af157 /Grammar/Tokens
parent: c1b4b0f6160e1919394586f44b12538505fed300 (diff)
download: cpython-git-8ac658114dec4964479baecfbc439fceb40eaa79.tar.gz
1 files changed, 62 insertions, 0 deletions
diff --git a/Grammar/Tokens b/Grammar/Tokens
new file mode 100644
index 0000000000..9595673a5a
--- /dev/null
+++ b/Grammar/Tokens
@@ -0,0 +1,62 @@
+ENDMARKER
+NAME
+NUMBER
+STRING
+NEWLINE
+INDENT
+DEDENT
+
+LPAR                    '('
+RPAR                    ')'
+LSQB                    '['
+RSQB                    ']'
+COLON                   ':'
+COMMA                   ','
+SEMI                    ';'
+PLUS                    '+'
+MINUS                   '-'
+STAR                    '*'
+SLASH                   '/'
+VBAR                    '|'
+AMPER                   '&'
+LESS                    '<'
+GREATER                 '>'
+EQUAL                   '='
+DOT                     '.'
+PERCENT                 '%'
+LBRACE                  '{'
+RBRACE                  '}'
+EQEQUAL                 '=='
+NOTEQUAL                '!='
+LESSEQUAL               '<='
+GREATEREQUAL            '>='
+TILDE                   '~'
+CIRCUMFLEX              '^'
+LEFTSHIFT               '<<'
+RIGHTSHIFT              '>>'
+DOUBLESTAR              '**'
+PLUSEQUAL               '+='
+MINEQUAL                '-='
+STAREQUAL               '*='
+SLASHEQUAL              '/='
+PERCENTEQUAL            '%='
+AMPEREQUAL              '&='
+VBAREQUAL               '|='
+CIRCUMFLEXEQUAL         '^='
+LEFTSHIFTEQUAL          '<<='
+RIGHTSHIFTEQUAL         '>>='
+DOUBLESTAREQUAL         '**='
+DOUBLESLASH             '//'
+DOUBLESLASHEQUAL        '//='
+AT                      '@'
+ATEQUAL                 '@='
+RARROW                  '->'
+ELLIPSIS                '...'
+
+OP
+ERRORTOKEN
+
+# These aren't used by the C tokenizer but are needed for tokenize.py
+COMMENT
+NL
+ENCODING
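
The file format added above is simple enough to read with a few lines of Python. The following is a minimal sketch of how such a file could be parsed, not the actual code of "Tools/scripts/generate_token.py". The function name `load_tokens`, the `SAMPLE` excerpt, and the assumption that token numbers follow file order are all illustrative.

```python
import ast

# Hypothetical excerpt in the Grammar/Tokens format shown in the diff above.
SAMPLE = """\
ENDMARKER
NAME

LPAR                    '('
DOUBLESLASHEQUAL        '//='

OP
# These aren't used by the C tokenizer but are needed for tokenize.py
COMMENT
"""


def load_tokens(text):
    """Parse Grammar/Tokens-style text.

    Returns (names, exact), where names lists the token names in file
    order and exact maps each quoted operator string to the index of its
    token. Assumption: a token's number is its position in the file.
    """
    names = []
    exact = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith('#'):
            continue
        fields = line.split()
        if len(fields) > 1:
            # The second field is a quoted string such as '//='.
            exact[ast.literal_eval(fields[1])] = len(names)
        names.append(fields[0])
    return names, exact


names, exact = load_tokens(SAMPLE)
for index, name in enumerate(names):
    print(index, name)
```

Under these assumptions, `exact` gives a generator everything it needs to emit an operator-to-token lookup, and `names` gives the ordered list of token constants.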