path: root/Zend/zend_language_scanner.h
Commit message (Author, Age, Files, Lines)
* Fix parsing of semi-reserved tokens at offset > 4 GB (Nikita Popov, 2021-01-25, 1 file, -1/+1)
| To avoid increasing the size of parser stack elements by storing size_t offset and length, this instead only stores the start offset (or rather pointer now) and determines the length of the identifier in zend_lex_tstring.
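A minimal sketch of the idea in the commit above, not the actual php-src code: the stack element keeps only a start pointer, and the identifier length is recomputed by scanning when it is needed. The names `demo_ident_ref`, `demo_is_label_char` and `demo_ident_length` are hypothetical, and the label character set is an assumption that approximates PHP's identifier rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser stack element: only the start
 * pointer is stored, so the element does not grow by a size_t pair. */
typedef struct {
    const char *ident_start;
} demo_ident_ref;

/* Assumed label character set: ASCII letters, digits, '_' and bytes >= 0x80. */
static int demo_is_label_char(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_' || c >= 0x80;
}

/* Recompute the identifier length by scanning forward from the stored
 * start pointer until the first non-label character or the buffer end. */
static size_t demo_ident_length(demo_ident_ref ref, const char *buf_end) {
    assert(ref.ident_start != NULL && ref.ident_start <= buf_end);
    const char *p = ref.ident_start;
    while (p < buf_end && demo_is_label_char((unsigned char)*p)) {
        p++;
    }
    return (size_t)(p - ref.ident_start);
}
```

The trade-off is a short rescan in the rare case where the identifier text is needed, in exchange for a smaller stack element in every case.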
* Accept zend_string in zend_prepare_string_for_scanning (Nikita Popov, 2021-01-21, 1 file, -1/+1)
|
* Replace zend_bool uses with bool (Nikita Popov, 2021-01-15, 1 file, -1/+1)
| We're starting to see a mix between uses of zend_bool and bool. Replace all usages with the standard bool type everywhere. Of course, zend_bool is retained as an alias.
* Improve type declarations for Zend APIs (George Peter Banyard, 2020-08-28, 1 file, -3/+3)
| Voidification of Zend API which always succeeded
| Use bool argument types instead of int for boolean arguments
| Use bool return type for functions which return true/false (1/0)
| Use zend_result return type for functions which return SUCCESS/FAILURE as they don't follow normal boolean semantics
|
| Closes GH-6002
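A small sketch of why a SUCCESS/FAILURE return type is kept apart from bool. The enum below is a hypothetical stand-in for zend_result, and the values 0 and -1 are assumptions made for this illustration; all `demo_` names are invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SUCCESS/FAILURE result type.
 * Assumed values: success is 0 and failure is -1, so success is
 * "falsy" and failure is "truthy" when treated as a boolean. */
typedef enum { DEMO_SUCCESS = 0, DEMO_FAILURE = -1 } demo_result;

/* Returns a result code: callers must compare against DEMO_SUCCESS. */
static demo_result demo_parse_digit(char c, int *out) {
    assert(out != NULL);
    if (c < '0' || c > '9') {
        return DEMO_FAILURE;
    }
    *out = c - '0';
    return DEMO_SUCCESS;
}

/* Returns a plain boolean: safe to use directly in a condition. */
static bool demo_is_digit(char c) {
    return c >= '0' && c <= '9';
}
```

Under these assumed values, writing `if (demo_parse_digit(c, &v))` would take the branch on failure, which is the kind of mistake a distinct return type makes easier to spot.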
* Forbid use of <?= as a semi-reserved identifier (Nikita Popov, 2020-06-19, 1 file, -1/+1)
| One of the weirdest pieces of PHP code I've ever seen. In terms of tokens, this gets internally translated to
|
|     use x as y; echo as my_echo;
|
| On master it crashes because this "echo" does not have attached identifier metadata. Make sure it is added and then reject the use of "<?=" as an identifier inside zend_lex_tstring.
|
| Fixes oss-fuzz #23547.
* Fix bug #77966: Cannot alias a method named "namespace" (Nikita Popov, 2020-06-08, 1 file, -2/+4)
| This is a bit tricky: In this case we have "namespace as", which means that we will only recognize "namespace" as an identifier when the lookahead token is already at the "as". This means that zend_lex_tstring picks up the wrong identifier.
|
| We solve this by actually assigning the identifier as the semantic value on the parser stack -- as in almost all cases we will not actually need the identifier, this is just an (offset, size) reference, not a copy of the string.
|
| Additionally, we need to teach the lexer feedback mechanism used by tokenizer TOKEN_PARSE mode to apply feedback to something other than the very last token. To that purpose we pass through the token text and check the tokens in reverse order to find the right one.
|
| Closes GH-5668.
* Syntax errors caused by unclosed {, [, ( mention specific location (Alex Dowad, 2020-04-14, 1 file, -0/+7)
| Aside from a few very specific syntax errors for which detailed exceptions are thrown, generally PHP just emits the default error messages generated by bison on syntax error. These messages are very uninformative; they just say "Unexpected ... at line ...".
|
| This is most problematic with constructs which can span an arbitrary number of lines, such as blocks of code delimited by { }, 'if' conditions delimited by ( ), and so on. If a closing delimiter is missed, the block will run for the entire remainder of the source file (which could be thousands of lines), and then at the end, a parse error will be thrown with the dreaded words: "Unexpected end of file".
|
| Therefore, track the positions of opening and closing delimiters and ensure that they match up correctly. If any mismatch or missing delimiter is detected, immediately throw a parse error which points the user to the offending line. This is best done in the *lexer* and not in the parser.
|
| Thanks to Nikita Popov and George Peter Banyard for suggesting improvements.
|
| Fixes bug #79368.
| Closes GH-5364.
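The delimiter tracking described above can be sketched with a small stack of (opener, line) records. This is an illustration only, not the php-src implementation: every `demo_` name is hypothetical, the depth limit is arbitrary, and the sketch ignores strings and comments, which a real lexer would have to skip.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_NEST 64  /* arbitrary limit for this sketch */

/* One record per currently open delimiter. */
typedef struct {
    char opener;
    int line;
} demo_nest_loc;

static char demo_closer_for(char opener) {
    switch (opener) {
        case '{': return '}';
        case '[': return ']';
        case '(': return ')';
        default:  return 0;
    }
}

/* Returns 0 when all delimiters match. Otherwise returns a line number:
 * the line of the unclosed or mismatched opener, or the line of a
 * closer that has no opener at all. */
static int demo_check_delimiters(const char *src) {
    demo_nest_loc stack[DEMO_MAX_NEST];
    size_t depth = 0;
    int line = 1;

    assert(src != NULL);
    for (const char *p = src; *p != '\0'; p++) {
        char c = *p;
        if (c == '\n') {
            line++;
        } else if (c == '{' || c == '[' || c == '(') {
            if (depth == DEMO_MAX_NEST) {
                return line;  /* nesting too deep for this sketch */
            }
            stack[depth].opener = c;
            stack[depth].line = line;
            depth++;
        } else if (c == '}' || c == ']' || c == ')') {
            if (depth == 0) {
                return line;  /* closer without any opener */
            }
            if (demo_closer_for(stack[depth - 1].opener) != c) {
                return stack[depth - 1].line;  /* opener closed by the wrong closer */
            }
            depth--;
        }
    }
    /* Anything still on the stack was never closed. */
    return depth > 0 ? stack[depth - 1].line : 0;
}
```

Reporting the opener's line, rather than the line where the scan gave up, is what lets the error point at the start of the unclosed block instead of the end of the file.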
* Constify some char* arguments or return values of ZEND_API (twosee, 2019-06-12, 1 file, -1/+1)
| Closes GH-4247.
* Remove local variables (Peter Kokot, 2019-02-03, 1 file, -10/+0)
| This patch removes the so called local variables defined per file basis for certain editors to properly show tab width, and similar settings. These are mainly used by Vim and Emacs editors yet with recent changes the once working definitions don't work anymore in Vim without custom plugins or additional configuration. Neither are these settings synced across the PHP code base. A simpler and better approach is EditorConfig and fixing code using some code style fixing tools in the future instead.
|
| This patch also removes the so called modelines for Vim. Modelines allow Vim editor specifically to set some editor configuration such as syntax highlighting, indentation style and tab width to be set in the first line or the last 5 lines per file basis. Since the php test files have syntax highlighting already set in most editors properly and EditorConfig takes care of the indentation settings, this patch removes these as well for the Vim 6.0 and newer versions.
|
| With the removal of local variables for certain editors such as Emacs and Vim, the footer is also probably not needed anymore when creating extensions using ext_skel.php script.
|
| Additionally, Vim modelines for setting php syntax and some editor settings has been removed from some *.phpt files. All these are mostly not relevant for phpt files neither work properly in the middle of the file.
* Adios, yearly copyright ranges (Zeev Suraski, 2019-01-30, 1 file, -1/+1)
|
* Update email addresses. We're still @Zend, but future proofing it... (Zeev Suraski, 2018-11-01, 1 file, -2/+2)
|
* Remove unused Git attributes ident (Peter Kokot, 2018-07-25, 1 file, -2/+0)
| The $Id$ keywords were used in Subversion where they can be substituted with filename, last revision number change, last changed date, and last user who changed it.
|
| In Git this functionality is different and can be done with Git attribute ident. These need to be defined manually for each file in the .gitattributes file and are afterwards replaced with 40-character hexadecimal blob object name which is based only on the particular file contents.
|
| This patch simplifies handling of $Id$ keywords by removing them since they are not used anymore.
* Implement flexible heredoc/nowdoc syntax (Thomas Punt, 2018-04-13, 1 file, -0/+2)
| RFC: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
|
| * The ending label no longer has to be followed by a semicolon or newline. Any non-label character is fine.
| * The ending label may be indented. The indentation will be stripped from all lines in the heredoc/nowdoc string.
|
| Lexing of heredoc strings performs a scan-ahead to determine the indentation of the ending label, so that the correct amount of indentation can be removed when calculating the semantic values for use by the parser. This makes the implementation quite a bit more complicated than we would like :/
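A rough sketch of the two steps the commit describes: scan ahead for the closing label to learn its indentation, then strip that much leading whitespace from every body line. This is a simplified illustration under stated assumptions, not the re2c scanner code. All `demo_` names are hypothetical, the input is assumed to be the text following the opening `<<<LABEL` line, and the sketch neither rejects under-indented body lines nor drops the final newline before the closing label.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed label character set: ASCII letters, digits, '_' and bytes >= 0x80. */
static int demo_is_label_char(unsigned char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_' || c >= 0x80;
}

/* Scan ahead for the first line that, after optional spaces or tabs,
 * starts with `label` followed by a non-label character. Returns the
 * indentation of that line and stores the offset where it begins in
 * *body_len; returns -1 if no closing label is found. */
static int demo_find_closing_indent(const char *text, const char *label,
                                    size_t *body_len) {
    assert(text != NULL && label != NULL && body_len != NULL);
    size_t label_len = strlen(label);
    const char *line = text;
    while (*line != '\0') {
        const char *p = line;
        while (*p == ' ' || *p == '\t') {
            p++;
        }
        if (strncmp(p, label, label_len) == 0 &&
            !demo_is_label_char((unsigned char)p[label_len])) {
            *body_len = (size_t)(line - text);
            return (int)(p - line);
        }
        const char *nl = strchr(line, '\n');
        if (nl == NULL) {
            break;
        }
        line = nl + 1;
    }
    return -1;
}

/* Copy the body into `out`, dropping up to `indent` leading spaces or
 * tabs from each line. `out` must hold at least body_len + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL. */
static size_t demo_strip_indent(const char *body, size_t body_len,
                                int indent, char *out) {
    size_t i = 0, w = 0;
    while (i < body_len) {
        int skipped = 0;
        while (i < body_len && skipped < indent &&
               (body[i] == ' ' || body[i] == '\t')) {
            i++;
            skipped++;
        }
        while (i < body_len && body[i] != '\n') {
            out[w++] = body[i++];
        }
        if (i < body_len) {
            out[w++] = body[i++];  /* keep the newline itself */
        }
    }
    out[w] = '\0';
    return w;
}
```

The scan-ahead has to happen before any body text is turned into a value, because the amount to strip is only known once the closing label has been seen.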
* year++ (Xinchen Hui, 2018-01-02, 1 file, -1/+1)
|
* further sync for vim mode lines (Anatol Belski, 2017-07-04, 1 file, -0/+2)
|
* Update copyright headers to 2017 (Sammy Kaye Powers, 2017-01-02, 1 file, -1/+1)
|
* further normalizations, uint vs uint32_t (Anatol Belski, 2016-11-26, 1 file, -1/+1)
| fix merge mistake
| yet one more replacement run
* Make sure TOKEN_PARSE mode is thread safe (Nikita Popov, 2016-07-23, 1 file, -1/+2)
| Introduce an on_event_context passed to the on_event hook. Use this context to pass along the token array. Previously this was stored in a non-tls global :/
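A minimal sketch of the pattern named in the commit above: the hook receives a caller-supplied context pointer, so per-call state such as the token array no longer has to live in a shared global. The types, enum values and function names here are hypothetical and do not reproduce the php-src signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds for this sketch. */
typedef enum { DEMO_ON_TOKEN, DEMO_ON_FEEDBACK, DEMO_ON_STOP } demo_event;

/* The hook gets the event, a token id, and an opaque context pointer. */
typedef void (*demo_on_event_fn)(demo_event event, int token, void *context);

/* Per-scanner state: the hook and the context travel together. */
typedef struct {
    demo_on_event_fn on_event;
    void *on_event_context;
} demo_scanner;

/* Per-call token storage, owned by the caller rather than by a global. */
typedef struct {
    int tokens[16];
    size_t count;
} demo_token_array;

/* Example hook: append token ids to the array passed as context. */
static void demo_collect_token(demo_event event, int token, void *context) {
    demo_token_array *arr = context;
    if (event == DEMO_ON_TOKEN && arr->count < 16) {
        arr->tokens[arr->count++] = token;
    }
}

/* Called by the scanner whenever something of interest happens. */
static void demo_emit(const demo_scanner *s, demo_event event, int token) {
    assert(s != NULL);
    if (s->on_event != NULL) {
        s->on_event(event, token, s->on_event_context);
    }
}
```

Because each scanner carries its own context, two scanners running on different threads write to different arrays and never touch shared state.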
* Drop dup declare with inconsistent linkage (Nikita Popov, 2016-05-02, 1 file, -1/+0)
| This is already declared in zend_stream.h as ZEND_API.
* bump year which is missed in rev 49493a2 (Xinchen Hui, 2016-01-02, 1 file, -1/+1)
|
* ext tokenizer port + cleanup unused lexer states (Márcio Almada, 2015-04-30, 1 file, -0/+4)
| we basically added a mechanism to store the token stream during parsing and exposed the entire parser stack on the tokenizer extension through an opt in flag: token_get_all($src, TOKEN_PARSE). this change allows easy future language enhancements regarding context aware parsing & scanning without further maintenance on the tokenizer extension while solving known inconsistencies the "parseless" tokenizer extension has when it handles `__halt_compiler()` presence.
* Fixed compiler reenterability (Dmitry Stogov, 2015-01-22, 1 file, -0/+3)
|
* bump year (Xinchen Hui, 2015-01-15, 1 file, -1/+1)
|
* trailing whitespace removal (Stanislav Malyshev, 2015-01-10, 1 file, -1/+1)
|
* first shot remove TSRMLS_* things (Anatol Belski, 2014-12-13, 1 file, -5/+5)
|
* Use better data structures (incomplete) (Dmitry Stogov, 2014-02-10, 1 file, -1/+1)
|
* Bump year (Xinchen Hui, 2014-01-03, 1 file, -1/+1)
|
* Merge branch 'PHP-5.4' into PHP-5.5 (Stanislav Malyshev, 2013-08-04, 1 file, -1/+1)
|\
| | * PHP-5.4:
| |   non living code related typo fixes
| |
| | Conflicts:
| |   Zend/zend_compile.c
| * non living code related typo fixes (Veres Lajos, 2013-08-04, 1 file, -1/+1)
| |
| * Happy New Year (Xinchen Hui, 2013-01-01, 1 file, -1/+1)
| |
| * - Year++ (Felipe Pena, 2012-01-01, 1 file, -1/+1)
| |
| * Fixed ZE specific compile warnings (Bug #55629) (Dmitry Stogov, 2011-09-13, 1 file, -1/+1)
| |
* | Happy New Year (Xinchen Hui, 2013-01-01, 1 file, -1/+1)
| |
* | Fix lexing of nested heredoc strings in token_get_all() (Nikita Popov, 2012-03-31, 1 file, -0/+5)
| | This fixes bug #60097.
| |
| | Before two global variables CG(heredoc) and CG(heredoc_len) were used to track the current heredoc label. In order to support nested heredoc strings the *previous* heredoc label was assigned as the token value of T_START_HEREDOC and the language_parser.y assigned that to CG(heredoc).
| |
| | This created a dependency of the lexer on the parser. Thus the token_get_all() function, which accesses the lexer directly without also running the parser, was not able to tokenize nested heredoc strings (and leaked memory). Same applies for the source-code highlighting functions.
| |
| | The new approach is to maintain a heredoc_label_stack in the lexer, which contains all active heredoc labels. As it is no longer required, T_START_HEREDOC and T_END_HEREDOC now don't carry a token value anymore.
| |
| | In order to make the work with zend_ptr_stack in this context more convenient I added a new function zend_ptr_stack_top(), which retrieves the top element of the stack (similar to zend_stack_top()).
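The label stack idea from the commit above can be sketched as a small pointer stack with push, pop and top operations. This is an illustration only: it does not use zend_ptr_stack, all `demo_` names are hypothetical, and the fixed depth limit is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HEREDOC_DEPTH 8  /* arbitrary limit for this sketch */

/* Stack of the labels of all heredocs that are currently open. */
typedef struct {
    const char *labels[DEMO_MAX_HEREDOC_DEPTH];
    size_t depth;
} demo_label_stack;

/* Push a label when a heredoc opens. Returns 0 on success, -1 if full. */
static int demo_label_push(demo_label_stack *s, const char *label) {
    assert(s != NULL && label != NULL);
    if (s->depth == DEMO_MAX_HEREDOC_DEPTH) {
        return -1;
    }
    s->labels[s->depth++] = label;
    return 0;
}

/* Peek at the innermost open label without removing it; NULL if empty. */
static const char *demo_label_top(const demo_label_stack *s) {
    assert(s != NULL);
    return s->depth > 0 ? s->labels[s->depth - 1] : NULL;
}

/* Pop the innermost label when its heredoc closes; NULL if empty. */
static const char *demo_label_pop(demo_label_stack *s) {
    assert(s != NULL);
    return s->depth > 0 ? s->labels[--s->depth] : NULL;
}
```

Keeping this stack inside the lexer means the closing label to look for is always available from the top of the stack, with no help from the parser.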
* | - Year++ (Felipe Pena, 2012-01-01, 1 file, -1/+1)
| |
* | Fixed ZE specific compile warnings (Bug #55629) (Dmitry Stogov, 2011-09-13, 1 file, -1/+1)
|/
* - Year++ (Felipe Pena, 2011-01-01, 1 file, -1/+1)
|
* - Avoid allocating extra buffers. This makes parsing with zend.multibyte enabled as fast as with it disabled. (Moriyoshi Koizumi, 2010-12-20, 1 file, -2/+0)
* * Refactor zend_multibyte facility. (Moriyoshi Koizumi, 2010-12-19, 1 file, -2/+5)
| Now mbstring.script_encoding is superseded by zend.script_encoding.
* Added multibyte support by default. Previously php had to be compiled with --enable-zend-multibyte. Now it can be enabled or disabled through zend.multibyte directive in php.ini (Dmitry Stogov, 2010-11-24, 1 file, -2/+0)
* sed -i "s#1998-2009#1998-2010#g" **/*.c **/*.h **/*.php (Sebastian Bergmann, 2010-01-05, 1 file, -1/+1)
|
* MFH: Bump copyright year, 3 of 3. (Sebastian Bergmann, 2008-12-31, 1 file, -1/+1)
|
* - Revived zend multibyte (Moriyoshi Koizumi, 2008-07-24, 1 file, -4/+4)
|
* implemented again zend-multibyte for PHP 5.3 (Rui Hirokawa, 2008-06-29, 1 file, -0/+16)
|
* - Rewrite scanner to be based on re2c instead of flex (Marcus Boerger, 2008-03-16, 1 file, -19/+9)
| The full patch is available as:
| http://php.net/~helly/php-re2c-5.3-20080316.diff.txt
| This is against php-re2c repository version 98
| An older patch against version 97 is available under:
| http://php.net/~helly/php-re2c-97-20080316.diff.txt
* MFH: Bump copyright year, 2 of 2. (Sebastian Bergmann, 2007-12-31, 1 file, -1/+1)
|
* MFH: Bump year. (Sebastian Bergmann, 2007-01-01, 1 file, -1/+1)
|
* - Update copyright notices to 2006 (Andi Gutmans, 2006-01-04, 1 file, -1/+1)
|
* Bump up the year (foobar, 2005-08-03, 1 file, -1/+1)
|
* Nuke compile warning by using the LANG_SCNG macro instead (foobar, 2004-01-17, 1 file, -1/+0)
|