path: root/TODO
Commit message (Author, Age, Files, Lines)
* style: minor fixes (Akim Demaille, 2020-05-09, 1 file, -2/+9)
|
|   * examples/c/README.md: here.
* todo: update (Akim Demaille, 2020-05-04, 1 file, -4/+7)
|
* java: demonstrate push parsers (Akim Demaille, 2020-05-03, 1 file, -3/+0)
|
|   * data/skeletons/lalr1.java (Location): Make it a static class.
|   (Lexer.yylex, Lexer.getLVal, Lexer.getStartPos, Lexer.getEndPos):
|   These are not needed in push parsers.
|   * examples/java/calc/Calc.y: Demonstrate push parsers in Java.
|   * doc/bison.texi: Push parsers have been supported for a long time,
|   remove incorrect statements stating the opposite.
* todo: more (Akim Demaille, 2020-05-02, 1 file, -0/+49)
|
* news: make it more consistent (Akim Demaille, 2020-05-01, 1 file, -0/+4)
|
|   * NEWS: Use the same pattern for titles.
* doc: document YYEOF, YYUNDEF and YYerror (Akim Demaille, 2020-04-29, 1 file, -26/+19)
|
|   * doc/bison.texi (Special Tokens): New.
|   * examples/c/bistromathic/parse.y: Formatting changes.
* yacc.c: install backward compatibility for YYERRCODE (Akim Demaille, 2020-04-28, 1 file, -4/+2)
|
|   Some people have been using that symbol. Some even have #defined it
|   themselves.
|   https://lists.gnu.org/r/bison-patches/2020-04/msg00138.html
|
|   Let's provide backward compatibility, having it point to YYUNDEF, so
|   that an error message is generated.
|
|   * data/skeletons/yacc.c (YYERRCODE): New, at the exact same location
|   it was defined before.
* style: c++: s/type/kind/ where appropriate (Akim Demaille, 2020-04-28, 1 file, -17/+1)
|
|   These are internal details. `type_get ()` is still there to ensure
|   backward compatibility, `kind ()` being the modern way.
|
|   * data/skeletons/c++.m4 (by_type, by_type::type): Rename as...
|   (by_kind, by_kind::kind_): this.
|   Adjust dependencies.
* java: clean up the definition of token kinds (Akim Demaille, 2020-04-28, 1 file, -0/+4)
|
|   From
|
|       public interface Lexer {
|         /* Token kinds. */
|         /** Token number, to be returned by the scanner. */
|         static final int YYEOF = 0;
|         /** Token number, to be returned by the scanner. */
|         static final int YYERRCODE = 256;
|         /** Token number, to be returned by the scanner. */
|         static final int YYUNDEF = 257;
|         /** Token number, to be returned by the scanner. */
|         static final int BANG = 258;
|         ...
|         /** Deprecated, use b4_symbol(0, id) instead. */
|         public static final int EOF = YYEOF;
|
|   to
|
|       public interface Lexer {
|         /* Token kinds. */
|         /** Token "end of file", to be returned by the scanner. */
|         static final int YYEOF = 0;
|         /** Token error, to be returned by the scanner. */
|         static final int YYerror = 256;
|         /** Token "invalid token", to be returned by the scanner. */
|         static final int YYUNDEF = 257;
|         /** Token "!", to be returned by the scanner. */
|         static final int BANG = 258;
|         ...
|         /** Deprecated, use YYEOF instead. */
|         public static final int EOF = YYEOF;
|
|   * data/skeletons/java.m4 (b4_token_enum): Display the symbol's tag in
|   comment.
|   * data/skeletons/lalr1.java: Address overquotation issue.
|   * examples/java/calc/Calc.y, examples/java/simple/Calc.y: Use YYEOF,
|   not EOF.
* error: rename the error token from YYERRCODE to YYerror (Akim Demaille, 2020-04-28, 1 file, -16/+4)
|
|   See https://lists.gnu.org/r/bison-patches/2020-04/msg00162.html.
|
|   * data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.cc,
|   * data/skeletons/lalr1.java, doc/bison.texi,
|   * examples/c/bistromathic/parse.y, src/scan-gram.l, src/symtab.c
|   (YYERRCODE): Rename as...
|   (YYerror): this.
|   Adjust dependencies.
* todo: update (Akim Demaille, 2020-04-26, 1 file, -100/+23)
|
* todo: update for YYERRCODE (Akim Demaille, 2020-04-24, 1 file, -0/+112)
|
* tokens: clean up the translation of special symbols (Akim Demaille, 2020-04-19, 1 file, -0/+2)
|
|   * src/output.c (prepare_symbol_names): Don't play tricks with the
|   symbols, it's quite too late.
|   (has_translations): Move to...
|   * src/symtab.c: here.
|   (symbols_pack): Use it to enable translation for special symbols.
* c++: give public access to the symbol kind (Akim Demaille, 2020-04-18, 1 file, -1/+0)
|
|   symbol_type::token () was removed: it returned the token kind of a
|   symbol. To do that, one needs to convert from the symbol kind to the
|   token kind, which requires a table.
|
|   This broke some users' unit tests for scanners, see
|   https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html
|   https://lists.gnu.org/r/bug-bison/2020-03/msg00020.html
|   https://lists.gnu.org/r/help-bison/2020-04/msg00005.html
|
|   Instead of making this possible again, let's check the symbol's kind
|   instead. So give proper access to a symbol's kind.
|
|   That feature existed, undocumented, as 'type_get()'. Let's rename
|   this as 'kind()'.
|
|   * data/skeletons/c++.m4, data/skeletons/glr.cc,
|   * data/skeletons/lalr1.cc (type_get): Rename as...
|   (kind): This.
|   (type_get): Install a backward compatibility alias.
|   * doc/bison.texi (Complete Symbols): Document symbol_type and
|   symbol_type::kind.
* doc: token_kind_type in C++ (Akim Demaille, 2020-04-17, 1 file, -4/+3)
|
|   * data/skeletons/c++.m4: Define the old names in terms of the new
|   ones, instead of the converse.
|   * doc/bison.texi (C++ Parser Interface): Be more extensive about
|   token_kind_type.
* doc: updates for 3.6 (Akim Demaille, 2020-04-16, 1 file, -5/+4)
|
|   * doc/bison.texi: More s/token type/token kind/.
|   * NEWS: Update.
* doc: spell check (Akim Demaille, 2020-04-13, 1 file, -2/+2)
|
|   * doc/bison.texi, NEWS, README-hacking.md: here.
|   And elsewhere.
* java: promote YYEOF rather than Lexer.EOF (Akim Demaille, 2020-04-13, 1 file, -3/+0)
|
|   * doc/bison.texi: here.
|   * data/skeletons/lalr1.java: Use YYEOF.
* doc: java: SymbolKind, etc. (Akim Demaille, 2020-04-13, 1 file, -10/+15)
|
|   Why didn't I think about this before??? symbolName should be a method
|   of SymbolKind.
|
|   * data/skeletons/lalr1.java (YYParser::yysymbolName): Move as...
|   * data/skeletons/java.m4 (SymbolKind::getName): this.
|   Make the table a static final table, not a local variable.
|   Adjust dependencies.
|   * doc/bison.texi (Java Parser Interface): Document i18n.
|   (Java Parser Context Interface): Document SymbolKind.
|   * examples/java/calc/Calc.y, tests/local.at: Adjust.
* d: put YYEMPTY in the TokenKind (Akim Demaille, 2020-04-13, 1 file, -1/+6)
|
|   * data/skeletons/d.m4, data/skeletons/lalr1.d (b4_token_enums): Rename
|   YYTokenType as TokenKind.
|   Define YYEMPTY.
|   * examples/d/calc.y, tests/calc.at, tests/scanner.at: Adjust.
* doc: use "code", not "number", for token (and symbol) kinds (Akim Demaille, 2020-04-12, 1 file, -7/+2)
|
|   "Number" is too much about arithmetic. "Code" conveys better the
|   "enum" nature of token kinds. And of symbol kinds.
|
|   * doc/bison.texi: Here.
* doc: promote yytoken_kind_t, not yytokentype (Akim Demaille, 2020-04-12, 1 file, -12/+0)
|
|   * data/skeletons/c.m4 (yytoken_kind_t): New.
|   * data/skeletons/c++.m4, data/skeletons/lalr1.cc (yysymbol_kind_type):
|   New.
|   * examples/c/lexcalc/parse.y, examples/c/reccalc/parse.y,
|   * tests/regression.at: Use them.
|   * doc/bison.texi: Replace "enum yytokentype" by "yytoken_kind_t".
|   (api.token.raw): Explain that it forces "yytoken_kind_t" to coincide
|   with "yysymbol_kind_t".
|   (Calling Convention): Mention YYEOF.
|   (Table of Symbols): Add entries for "yytoken_kind_t" and
|   "yysymbol_kind_t".
|   (Glossary): Add entries for "Kind", "Token kind" and "Symbol kind".
* skeletons: use "end of file" instead of "$end" (Akim Demaille, 2020-04-12, 1 file, -4/+1)
|
|   The name "$end" is nice in the report, in particular it avoids that
|   pointed-rules (aka items) be too long. It also helps keeping them
|   "standard".
|
|   But it is bad in error messages, we should report "end of file" (or
|   maybe "end of input", this is debatable).
|
|   So, unless the user already defined the alias for the error token
|   herself, make it "end of file". It should even be translated if the
|   user already translated some tokens, so that there is now no strong
|   reason to redefine the $end token.
|
|   * src/output.c (prepare_symbol_names): Issue "end of file" instead of
|   "$end".
|   * data/skeletons/lalr1.java (yytnamerr_): Remove the renaming hack.
|   * build-aux/update-test: Accept files with names containing a "+",
|   such as c++.at.
|   * tests/actions.at, tests/c++.at, tests/conflicts.at,
|   * tests/glr-regression.at, tests/regression.at, tests/skeletons.at:
|   Adjust.
* diagnostics: replace "user token number" by "token code" (Akim Demaille, 2020-04-12, 1 file, -15/+1)
|
|   Yet, don't change the structure identifier to avoid introducing
|   conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
|   conflicts).
|
|   * src/symtab.c: here.
|   * tests/diagnostics.at, tests/input.at: Adjust.
* c++: remove the yy prefix from some functions (Akim Demaille, 2020-04-12, 1 file, -10/+1)
|
|   yy::parser features a parse() function, not a yyparse() one.
|
|   * data/skeletons/lalr1.cc (yyreport_syntax_error)
|   (context::yyexpected_tokens): Rename as...
|   (report_syntax_error, context::expected_tokens): these.
* tokens: properly define the YYEOF token kind (Akim Demaille, 2020-04-12, 1 file, -1/+15)
|
|   Currently EOF is handled in an ad hoc way, with a #define YYEOF 0 in
|   the implementation file. As a result, the user has to define her own
|   EOF token if she wants to use it, which is a pity.
|
|   Give the $end token a visible kind name, YYEOF. Except that in C,
|   where enums are not scoped, we would have collisions between all the
|   definitions of YYEOFs in the header files, so in C, make it
|   <api.PREFIX>EOF.
|
|   * data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
|   Unless the user already gave it a different name.
|   * data/skeletons/glr.c (YYEOF): Remove.
|   Use ]b4_symbol(0, [id])[ instead.
|   Add support for "pre_epilogue", for glr.cc.
|   * data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
|   * data/skeletons/yacc.c
|   * src/parse-gram.c
|   * src/reader.c
|   * src/symtab.c
|   * tests/actions.at
|   * tests/input.at
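The collision argument in the commit above can be illustrated with a small C sketch. It is not code generated by Bison: the `CALC_` and `JSON_` prefixes, the enum names and the toy `calc_lex` scanner are all hypothetical, chosen only to show why unscoped C enumerators need a per-parser prefix on the end-of-file token.

```c
#include <assert.h>

/* Two parsers linked into the same program.  C enumerators all live in
   one scope, so both headers could not define a plain YYEOF: each one
   gets a (hypothetical) prefix instead. */
enum calc_token_kind { CALC_EOF = 0, CALC_NUM = 258 };
enum json_token_kind { JSON_EOF = 0, JSON_STRING = 258 };

/* Toy scanner for the "calc" parser: skips blanks, returns CALC_NUM
   for a run of digits, CALC_EOF at the end of the input, and the
   character code for anything else. */
static int
calc_lex (const char **cursor)
{
  assert (cursor && *cursor);
  while (**cursor == ' ')
    ++*cursor;
  if (**cursor == '\0')
    return CALC_EOF;
  if ('0' <= **cursor && **cursor <= '9')
    {
      while ('0' <= **cursor && **cursor <= '9')
        ++*cursor;
      return CALC_NUM;
    }
  return (unsigned char) *(*cursor)++;
}
```

With a named end-of-file token the scanner no longer returns a bare `0`, and a second parser in the same translation unit can still declare its own `JSON_EOF` with the same value.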
* tokens: properly define the "error" token kind (Akim Demaille, 2020-04-12, 1 file, -0/+1)
|
|   There are people out there that do use YYERRCODE (the token kind of
|   the error token). See for instance
|   https://github.com/borbolla-automation/SPC_Machines/blob/3812012bb782bfdfe7b325950a35cd337925fcad/unixODBC-2.3.2/Drivers/nn/yylex.c.
|
|   Currently, YYERRCODE is defined by yacc.c in an ad hoc way as a
|   #define in the *.c file only. It belongs with the other token kinds.
|
|   YYERRCODE is not a nice name, it does not fit in our naming scheme.
|   YYERROR would be more logical, but it collides with the YYERROR
|   macro. Shall we keep the same name in all the skeletons? Besides, to
|   avoid collisions in C, we need to apply the api prefix: YYERRCODE is
|   actually <PREFIX>ERRCODE. This is not needed in the other languages.
|
|   * data/skeletons/bison.m4 (b4_symbol_token_kind): New.
|   Map the error token to "YYERRCODE".
|   * data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled
|   by...
|   * src/output.c (prepare_symbol_definitions): this.
|   * tests/input.at (Redefining the error token): Check it.
* style: rename YYNOMEM as YYENOMEM (Akim Demaille, 2020-04-10, 1 file, -1/+1)
|
|   This is clearer.
|
|   * data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): Rename as...
|   (YYENOMEM): here.
* todo: update (Akim Demaille, 2020-04-10, 1 file, -2/+3)
|
* todo: update (Akim Demaille, 2020-04-06, 1 file, -40/+6)
|
|   * TODO (YYERRCODE): Remove, handled by YYSYMBOL_ERROR.
* skeletons: beware not to use yyarg when it's null (Akim Demaille, 2020-04-06, 1 file, -3/+0)
|
|   Reported by Adrian Vogelsgesang.
|
|   * data/skeletons/glr.c, data/skeletons/lalr1.cc,
|   * data/skeletons/lalr1.java, data/skeletons/yacc.c: Here.
* java: prefer null to YYSYMBOL_YYEMPTY (Akim Demaille, 2020-04-06, 1 file, -0/+3)
|
|   That's one nice benefit from using enums.
|
|   * data/skeletons/lalr1.java (YYSYMBOL_YYEMPTY): No longer define it.
|   Use 'null' instead.
|   * examples/java/calc/Calc.y, tests/local.at: Adjust.
* skeletons: use consistently "kind" instead of "type" in the code (Akim Demaille, 2020-04-05, 1 file, -0/+1)
|
|   * data/skeletons/bison.m4, data/skeletons/c++.m4, data/skeletons/c.m4,
|   * data/skeletons/glr.cc, data/skeletons/lalr1.cc,
|   * data/skeletons/lalr1.d, data/skeletons/lalr1.java: Refer to the
|   "kind" of a symbol, not its "type", where appropriate.
* todo: update (Akim Demaille, 2020-04-05, 1 file, -65/+4)
|
* java: use SymbolType (Akim Demaille, 2020-04-04, 1 file, -0/+3)
|
|   The Java enums are very different from the C model. As a consequence,
|   one cannot "build" an enum directly from an integer, we must retrieve
|   it. That's the purpose of the SymbolType.get class method.
|
|   * data/skeletons/java.m4 (b4_symbol_enum, b4_case_code_symbol)
|   (b4_declare_symbol_enum): New.
|   * data/skeletons/lalr1.java: Use SymbolType,
|   SymbolType.YYSYMBOL_YYEMPTY, etc.
|   * examples/java/calc/Calc.y, tests/local.at: Adjust.
* yacc.c: prefer YYSYMBOL_YYERROR to YYSYMBOL_error (Akim Demaille, 2020-04-01, 1 file, -16/+0)
|
|   * data/skeletons/bison.m4 (b4_symbol_sid): Map "error" to
|   YYSYMBOL_YYERROR.
|   * data/skeletons/yacc.c: Adjust.
* yacc.c: also define a symbol number for the empty token (Akim Demaille, 2020-04-01, 1 file, -0/+16)
|
|   This is not only cleaner, it also protects us from mixing signed
|   values (YYEMPTY is #defined as -2) with unsigned types (the
|   yysymbol_type_t enum is typically compiled as a small unsigned).
|   For instance GCC 9:
|
|       input.c: In function 'yyparse':
|       input.c:1107:7: error: conversion to 'unsigned int' from 'int' may change the sign of the result [-Werror=sign-conversion]
|        1107 |   yyn += yytoken;
|             |       ^~
|       input.c:1107:10: error: conversion to 'int' from 'unsigned int' may change the sign of the result [-Werror=sign-conversion]
|        1107 |   yyn += yytoken;
|             |          ^~~~~~~
|       input.c:1108:47: error: comparison of integer expressions of different signedness: 'yytype_int8' {aka 'const signed char'} and 'yysymbol_type_t' {aka 'enum yysymbol_type_t'} [-Werror=sign-compare]
|        1108 |   if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)
|             |                                               ^~
|       input.c:702:25: error: operand of ?: changes signedness from 'int' to 'unsigned int' due to unsignedness of other operand [-Werror=sign-compare]
|         702 | #define YYEMPTY         (-2)
|             |                         ^~~~
|       input.c:1220:33: note: in expansion of macro 'YYEMPTY'
|        1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
|             |                                 ^~~~~~~
|       input.c:1220:41: error: unsigned conversion from 'int' to 'unsigned int' changes value from '-2' to '4294967294' [-Werror=sign-conversion]
|        1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
|             |                                         ^
|
|   Eventually, it might be interesting to move away from -2 (which is
|   the only possible negative symbol number) and use the next available
|   number, to save bits. We could actually even simply use "0" and shift
|   the rest, which would allow to write "!yytoken" to mean really
|   "yytoken != YYEMPTY".
|
|   * data/skeletons/c.m4 (b4_declare_symbol_enum): Define
|   YYSYMBOL_YYEMPTY.
|   * data/skeletons/yacc.c: Use it.
|   * src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not
|   YYEMPTY, when dealing with a symbol.
|   * tests/regression.at: Adjust.
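A minimal sketch of the signedness point made above, in plain C. Only `YYSYMBOL_YYEMPTY` and the value -2 come from the commit message; the `symbol_t` type, the other enumerators, the `TOK_*` codes and the `translate` function are hypothetical and merely stand in for the generated tables.

```c
#include <assert.h>

/* External token codes, as a scanner would return them (made up). */
enum { TOK_EMPTY = -2, TOK_EOF = 0, TOK_NUM = 258 };

/* Internal symbol numbers.  Because one enumerator is negative, the
   compiler must pick a signed type for the enum, so "no symbol" no
   longer has to be smuggled in as an int macro next to an unsigned
   enum. */
typedef enum
{
  YYSYMBOL_YYEMPTY = -2,  /* no lookahead read yet */
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYUNDEF = 1,   /* invalid token */
  YYSYMBOL_NUM = 2
} symbol_t;

/* Map a token code to a symbol number.  Both branches of the empty
   check now have the same enum type, unlike the
   `yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar)` expression
   that GCC complained about. */
static symbol_t
translate (int yychar)
{
  if (yychar == TOK_EMPTY)
    return YYSYMBOL_YYEMPTY;
  switch (yychar)
    {
    case TOK_EOF: return YYSYMBOL_YYEOF;
    case TOK_NUM: return YYSYMBOL_NUM;
    default:      return YYSYMBOL_YYUNDEF;
    }
}
```

The switch is only a stand-in for a translation table; the part that matters is that the empty value is an enumerator of the same type as every other symbol.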
* yacc.c: introduce an enum that defines the symbol's number (Akim Demaille, 2020-04-01, 1 file, -0/+5)
|
|   There are a number of advantages in exposing the symbol (internal)
|   numbers:
|
|   - custom error messages can use them to decide how to represent a
|     given symbol, or a set of symbols.
|
|   - we need something similar in uses of yyexpected_tokens. For
|     instance, currently, bistromathic's completion() reads:
|
|       int ntokens = expected_tokens (line, tokens, YYNTOKENS);
|       [...]
|       for (int i = 0; i < ntokens; ++i)
|         if (tokens[i] == YYTRANSLATE (TOK_VAR))
|           [...]
|         else if (tokens[i] == YYTRANSLATE (TOK_FUN))
|           [...]
|         else
|           [...]
|
|   - now that it's a compile-time expression, we can easily build static
|     tables, switch, etc.
|
|   - some users depended on the ability to get the token number from a
|     symbol to write test cases for their scanners. But Bison 3.5
|     removed the table this feature depended upon (a reverse
|     yytranslate). Now they can check against the actual symbol number,
|     without having to pay (space and time) a conversion.
|
|     See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and
|     https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html.
|
|   - it helps us clearly separate the internal symbol numbers from the
|     external token numbers, whose difference is sometimes blurred in
|     the code when values coincide (e.g. "yychar = yytoken = YYEOF").
|
|   - it allows us to get rid of ugly macros with inconsistent names such
|     as YYUNDEFTOK and YYTERROR, and to group related definitions
|     together.
|
|   - similarly it provides a clean access to the $accept symbol (which
|     proves convenient in a current experimentation of mine with several
|     %start symbols).
|
|   Let's declare this type as a private type (in the *.c file, not the
|   *.h one). So it does not need to be influenced by the api prefix.
|
|   * data/skeletons/bison.m4 (b4_symbol_sid): New.
|   (b4_symbol): Use it.
|   * data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New.
|   * data/skeletons/yacc.c: Use b4_declare_symbol_enum.
|   (YYUNDEFTOK, YYTERROR): Remove.
|   Use the corresponding symbol enum instead.
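The "static tables, switch, etc." advantage listed above can be sketched as follows. This is an illustration under assumptions, not bistromathic's code: the `symbol_t` enum, its enumerators, `symbol_label` and `count_name_candidates` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol numbers for a tiny calculator grammar. */
typedef enum
{
  YYSYMBOL_YYEOF,
  YYSYMBOL_NUM,
  YYSYMBOL_VAR,
  YYSYMBOL_FUN,
  SYMBOL_COUNT   /* number of symbols, used to size tables */
} symbol_t;

/* Enumerators are compile-time constants, so they can index a static
   table directly. */
static const char *const symbol_label[SYMBOL_COUNT] = {
  [YYSYMBOL_YYEOF] = "end of file",
  [YYSYMBOL_NUM] = "number",
  [YYSYMBOL_VAR] = "variable",
  [YYSYMBOL_FUN] = "function",
};

/* Count the expected symbols that a completion function would expand
   to identifiers.  The switch replaces a chain of
   `tokens[i] == YYTRANSLATE (TOK_...)` comparisons. */
static size_t
count_name_candidates (const symbol_t *expected, size_t n)
{
  assert (expected != NULL || n == 0);
  size_t count = 0;
  for (size_t i = 0; i < n; ++i)
    switch (expected[i])
      {
      case YYSYMBOL_VAR:
      case YYSYMBOL_FUN:
        ++count;
        break;
      default:
        break;
      }
  return count;
}
```

Since the comparison is against the symbol number itself, no run-time token-to-symbol conversion is needed inside the loop.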
* java: move away from _ for internationalization (Akim Demaille, 2020-03-30, 1 file, -0/+1)
|
|   The "_" is becoming a keyword in Java, which causes tons of warnings
|   currently in our test suite. GNU Gettext is now using "i18n" instead
|   of "_"
|   (https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=e89fea36545f27487d9652a13e6a0adbea1117d0).
|
|   * data/skeletons/java.m4: Use "i18n", not "_".
|   * examples/java/calc/Calc.y, tests/calc.at: Adjust.
* c: use YYNOMEM instead of -2 (Akim Demaille, 2020-03-28, 1 file, -2/+1)
|
|   See 84b1972c96060866b4bd94a33b97711f8f7d0b6c.
|
|   * data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): New.
|   Use it.
* todo: update (Akim Demaille, 2020-03-28, 1 file, -19/+52)
|
|   * TODO (Token Number): We have to clean this.
|   (Naming conventions, Symbol numbers): New.
|   (Bad styling): Addressed in e21ff47f5d0b64da693a47b7dd200a1a44a5bbeb.
* merge branch 'maint' (Akim Demaille, 2020-03-08, 1 file, -7/+2)
|\
| |   * upstream/maint:
| |     maint: post-release administrivia
| |     version 3.5.3
| |     news: update for 3.5.3
| |     yacc.c: make sure we properly propagated the user's number for error
| |     diagnostics: don't crash because of repeated definitions of error
| |     style: initialize some struct members
| |     diagnostics: beware of zero-width characters
| |     diagnostics: be sure to close the styling when lines are too short
| |     muscles: fix incorrect decoding of $
| |     code: be robust to reference with invalid tags
| |     build: fix typo
| |     doc: update recommendation for libtextstyle
| |     style: comment changes
| |     examples: use consistently the GFDL header for readmes
| |     style: remove useless declarations
| |     typo: succesful -> successful
| |     README: point to tests/bison, and document --trace
| |     gnulib: update
| |     maint: post-release administrivia
| * yacc.c: make sure we properly propagated the user's number for error (Akim Demaille, 2020-03-08, 1 file, -7/+2)
| |
| |   * data/skeletons/yacc.c (YYERRCODE): Be truthful.
| |   * tests/input.at (Redefining the error token): Check that.
| * package: bump copyrights to 2020 (Akim Demaille, 2020-01-10, 1 file, -1/+1)
| |
| |   Run 'make update-copyright'.
* | yacc.c: yyerror_range does not need to be preserved across calls (Akim Demaille, 2020-03-05, 1 file, -3/+0)
| |
| |   * data/skeletons/yacc.c (b4_parse_state_variable_macros): Don't define
| |   yyerror_range.
| |   (yyparse): Add yyerror_range as local variable.
* | yacc.c: push: initialize the pstate variables in pstate_new (Akim Demaille, 2020-03-05, 1 file, -0/+3)
| |
| |   Currently pstate_new does not set up its variables; this task is left
| |   to yypush_parse. This was probably to share more code with usual pull
| |   parsers, where these (local) variables are indeed initialized by
| |   yyparse.
| |
| |   But as a consequence yyexpected_tokens crashes at the very beginning
| |   of the parse, since, for instance, the stacks are not even set up.
| |   See https://lists.gnu.org/r/bison-patches/2020-03/msg00001.html.
| |
| |   The fix could have been very simple, but the documentation actually
| |   makes it very clear that we can reuse a pstate for several parses:
| |
| |       After yypush_parse returns a status other than YYPUSH_MORE, the
| |       parser instance yyps may be reused for a new parse.
| |
| |   so we need to restore the parser to its pristine state so that (i) it
| |   is ready to run the next parse, (ii) it properly supports
| |   yyexpected_tokens for the next run.
| |
| |   * data/skeletons/yacc.c (b4_initialize_parser_state_variables): New,
| |   extracted from the top of yyparse/yypush_parse.
| |   (yypstate_clear): New.
| |   (yypstate_new): Use it when push parsers are enabled.
| |   Define after the yyps macros so that we can use the same code as the
| |   regular pull parsers.
| |   (yyparse): Use it when push parsers are _not_ enabled.
| |   * examples/c/bistromathic/bistromathic.test: Check the completion on
| |   the beginning of the line.
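The reset-on-creation idea described above can be sketched roughly as follows. This is only an illustration in plain C, not the skeleton's code: the `pstate` struct, its fields and the `pstate_*` helpers are hypothetical stand-ins for what the commit says yypstate_new and yypstate_clear do.

```c
#include <assert.h>
#include <stdlib.h>

enum { STACK_DEPTH = 16 };

/* Hypothetical push-parser instance: a state stack plus a counter. */
typedef struct
{
  int stack[STACK_DEPTH];
  int *top;    /* points at the current state */
  int nerrs;   /* number of errors seen in the current parse */
} pstate;

/* Put the instance back in its pristine state: initial state 0 on the
   stack, no errors.  Used at creation and between two parses. */
static void
pstate_clear (pstate *ps)
{
  ps->stack[0] = 0;
  ps->top = ps->stack;
  ps->nerrs = 0;
}

/* Allocate an instance and initialize it right away, so that queries
   made before the first token is pushed see a valid stack. */
static pstate *
pstate_new (void)
{
  pstate *ps = malloc (sizeof *ps);
  if (ps)
    pstate_clear (ps);
  return ps;
}

/* Push a state on the stack, as a shift would. */
static void
pstate_shift (pstate *ps, int state)
{
  assert (ps->top < ps->stack + STACK_DEPTH - 1);
  *++ps->top = state;
}

/* The kind of query that crashed when the stack was not set up. */
static int
pstate_current (const pstate *ps)
{
  return *ps->top;
}
```

Calling `pstate_clear` again after a parse has finished gives the same guarantees for the next run on the same instance, which is the reuse case the documentation quote refers to.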
* | m4: decommission function generating macro (Akim Demaille, 2020-03-02, 1 file, -0/+4)
| |
| |   These macros have been extremely useful when we had to support K&R C,
| |   which we dropped long ago. Now, they merely make the code uselessly
| |   hard to read.
| |
| |   * data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/glr.cc,
| |   * data/skeletons/yacc.c: Stop using b4_function_define.
* | todo: update (Akim Demaille, 2020-02-19, 1 file, -21/+14)
| |
* | diagnostics: modernize the display of submessages (Victor Morales Cayuela, 2020-02-15, 1 file, -46/+0)
| |
| |   Since Bison 2.7, output was indented four spaces for explanatory
| |   statements. For example:
| |
| |       input.y:2.7-13: error: %type redeclaration for exp
| |       input.y:1.7-11:     previous declaration
| |
| |   Since the introduction of caret-diagnostics, it became less clear.
| |   Remove the indentation and display submessages as in GCC:
| |
| |       input.y:2.7-13: error: %type redeclaration for exp
| |           2 | %type <float> exp
| |             |       ^~~~~~~
| |       input.y:1.7-11: note: previous declaration
| |           1 | %type <int> exp
| |             |       ^~~~~
| |
| |   * src/complain.h (SUB_INDENT): Remove.
| |   (warnings): Add "note" to the enum.
| |   * src/complain.h, src/complain.c (complain_indent): Replace by...
| |   (subcomplain): this.
| |   Adjust all dependencies.
| |   * tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
| |   * tests/input.at, tests/named-refs.at, tests/regression.at: Adjust
| |   expectations.
* | c++: simplify (Akim Demaille, 2020-02-12, 1 file, -11/+0)
| |
| |   * data/skeletons/stack.hh (ssize): Remove, same as size.