author     Thomas Wouters <thomas@python.org>   2006-03-02 20:41:27 +0000
committer  Thomas Wouters <thomas@python.org>   2006-03-02 20:41:27 +0000
commit     12cddb15fb385277fcce6c46489a0ed7d129014e (patch)
tree       5b557089f90c6ff1108e363615a3a3b267c14857 /Parser
parent     76caa56ed4f309ec85ed408152fa85a1262e157a (diff)
download   cpython-12cddb15fb385277fcce6c46489a0ed7d129014e.tar.gz
Fix crashing bug in tokenizer, when tokenizing files with non-ASCII bytes
but without a specified encoding: decoding_fgets() (and decoding_feof()) can return NULL and fiddle with the 'tok' struct, making tok->buf NULL. This is okay in the other cases of calls to decoding_*(), it seems, but not in this one.

This should get a test added somewhere, but the test suite doesn't seem to test encoding anywhere (although plenty of tests use it). It seems to me that decoding errors in other places in the code (like at the start of a token, instead of in the middle of one) make the code end up adding small integers to NULL pointers, but happen to check for error states before using the calculated new pointers. I haven't been able to trigger any other crashes, in any case.

I would nominate this file for a complete rewrite for Py3k. The whole decoding trick is too bolted-on for my tastes.
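A minimal, self-contained C sketch of the failure pattern described above (this is not CPython code, and every name in it is invented): a helper reports an error by returning NULL while also freeing and NULLing the caller's buffer, so the caller has to check an error flag before touching that buffer again, which is what the patch below does with tok->decoding_erred.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins; the real fields live in struct tok_state in tokenizer.c. */
struct toy_tok_state {
    char *buf;          /* input buffer; freed and set to NULL on a decoding error */
    int decoding_erred; /* models tok->decoding_erred */
};

/* Models the behaviour described for decoding_fgets(): on a decoding error
   it frees the buffer, sets the error flag, and returns NULL. */
static char *toy_fgets(struct toy_tok_state *tok, int simulate_decode_error)
{
    if (simulate_decode_error) {
        free(tok->buf);
        tok->buf = NULL;
        tok->decoding_erred = 1;
        return NULL;
    }
    return tok->buf;    /* "read" succeeded; buffer untouched */
}

int main(void)
{
    struct toy_tok_state tok;
    tok.buf = malloc(16);
    tok.decoding_erred = 0;
    strcpy(tok.buf, "");

    if (toy_fgets(&tok, 1) == NULL) {
        if (tok.decoding_erred) {
            /* Without this check we would fall through and copy a fake
               "\n" into a NULL buffer -- the kind of crash the patch avoids. */
            puts("decoding error: stop before touching tok.buf");
            return 1;
        }
        strcpy(tok.buf, "\n");  /* only safe when buf is still valid */
    }
    free(tok.buf);
    return 0;
}

Compiled as-is, this prints the bail-out message and exits cleanly; removing the decoding_erred-style check would turn the strcpy() into a write through a NULL pointer.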
Diffstat (limited to 'Parser')
-rw-r--r--  Parser/tokenizer.c | 5 +++++
1 file changed, 5 insertions(+), 0 deletions(-)
diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c
index 4174e9cc1d..3c8258832e 100644
--- a/Parser/tokenizer.c
+++ b/Parser/tokenizer.c
@@ -873,6 +873,11 @@ tok_nextc(register struct tok_state *tok)
         if (decoding_fgets(tok->inp,
                            (int)(tok->end - tok->inp),
                            tok) == NULL) {
+                /* Break out early on decoding
+                   errors, as tok->buf will be NULL
+                */
+                if (tok->decoding_erred)
+                        return EOF;
                 /* Last line does not end in \n,
                    fake one */
                 strcpy(tok->inp, "\n");