delta/libxml2.git - gitlab.gnome.org: GNOME/libxml2.git

	Commit message	Author	Age	Files	Lines
*	parser: Deprecate more internal functions	Nick Wellnhofer	2023-04-26	1	-0/+8
*	parser: Fix regression in memory pull parser with encoding	Nick Wellnhofer	2023-04-19	1	-1/+11
	Revert another change from commit 98840d40. Decode the whole buffer when reading from memory and switching to the initial encoding. Add some comments about potential improvements.
*	parser: Fix regression when switching input encodings	Nick Wellnhofer	2023-04-13	1	-4/+12
	Revert some changes from commit 98840d40. WebKit/Chromium can actually switch from ISO-8859-1 to UTF-16 in the middle of parsing. This is a bad idea, but we have to keep supporting this use case.
*	parser: Don't grow push parser buffers	Nick Wellnhofer	2023-04-12	1	-0/+3
	This should fix a short-lived regression when push parsing with encodings.
*	parser: Halt parser if switching encodings fails	Nick Wellnhofer	2023-03-30	1	-0/+2
	Avoids buffer overread in htmlParseHTMLAttribute. Found by OSS-Fuzz.
*	parser: Fix buffer overread in xmlDetectEBCDIC	Nick Wellnhofer	2023-03-26	1	-1/+2
	Short-lived regression found by OSS-Fuzz.
*	parser: Grow input buffer earlier when reading characters	Nick Wellnhofer	2023-03-21	1	-2/+2
	Make more bytes available after invoking CUR_CHAR or NEXT.
*	parser: Rework EBCDIC code page detection	Nick Wellnhofer	2023-03-21	1	-108/+76
	To detect EBCDIC code pages, we used to switch the encoding twice and had to be very careful not to decode data after the XML declaration before the second switch. This relied on a hard-coded expected size of the XML declaration and was complicated and unreliable. Now we decode the first 200 bytes as EBCDIC-US and parse the encoding declaration manually.
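The "parse the encoding declaration manually" step can be pictured as a plain scan of the decoded prefix for the `encoding` pseudo-attribute. The following is a minimal sketch, not libxml2's code: `find_declared_encoding` and `skip_ws` are hypothetical helpers, and the sketch assumes the prefix has already been converted to an ASCII-compatible form.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: advance past spaces, tabs, and line breaks. */
static size_t
skip_ws(const char *s, size_t i, size_t len)
{
    while (i < len && (s[i] == ' ' || s[i] == '\t' ||
                       s[i] == '\r' || s[i] == '\n'))
        i++;
    return i;
}

/*
 * Hypothetical helper, not a libxml2 function. Scans an already decoded,
 * ASCII-compatible prefix for encoding="..." (or '...') and copies the
 * name to out. The scan stops at the first '>' so that attributes of the
 * root element are never mistaken for the declaration.
 * Returns 1 on success, 0 if no well-formed declaration is found.
 */
static int
find_declared_encoding(const char *prefix, size_t len,
                       char *out, size_t outsz)
{
    static const char key[] = "encoding";
    const size_t klen = sizeof(key) - 1;
    size_t i, start, n;
    char quote;

    for (i = 0; i + klen <= len; i++) {
        if (prefix[i] == '>')
            break;
        if (memcmp(prefix + i, key, klen) != 0)
            continue;

        i = skip_ws(prefix, i + klen, len);
        if (i >= len || prefix[i] != '=')
            return 0;
        i = skip_ws(prefix, i + 1, len);
        if (i >= len || (prefix[i] != '"' && prefix[i] != '\''))
            return 0;

        quote = prefix[i++];
        start = i;
        while (i < len && prefix[i] != quote)
            i++;
        if (i >= len)
            return 0;            /* unterminated value */

        n = i - start;
        if (n == 0 || n >= outsz)
            return 0;            /* empty name or output too small */
        memcpy(out, prefix + start, n);
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

Working on a fixed-size decoded copy means the real input buffer is untouched until the final encoding is known, which is the property the commit message is after.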
*	parser: Rework shrinking of input buffers	Nick Wellnhofer	2023-03-21	1	-14/+2
	Don't try to grow the input buffer in xmlParserShrink. This makes sure that no memory allocations are made and the function always succeeds. Remove unnecessary invocations of SHRINK. Invoke SHRINK at the end of DTD parsing loops. Shrink before growing.
*	parser: More fixes to xmlParserGrow	Nick Wellnhofer	2023-03-16	1	-20/+5
	xmlHaltParser must be called after reporting an error. Switch to xmlBufSetInputBaseCur.
*	malloc-fail: Fix buffer overread when reading from input	Nick Wellnhofer	2023-03-15	1	-36/+25
	Found by OSS-Fuzz, see #344.
*	parser: Fix short-lived regression causing infinite loops	Nick Wellnhofer	2023-03-14	1	-9/+40
	Fix 3eb6bf03. We really have to halt the parser, so the input buffer gets reset.
*	parser: Deprecate some parser input functions	Nick Wellnhofer	2023-03-13	1	-0/+2
*	parser: Stop calling xmlParserInputShrink	Nick Wellnhofer	2023-03-13	1	-0/+57
	Introduce xmlParserShrink which takes a parser context to simplify error handling.
*	malloc-fail: Fix null deref in xmlParserInputShrink	Nick Wellnhofer	2023-03-13	1	-0/+7
	Found by OSS-Fuzz.
*	parser: Stop calling xmlParserInputGrow	Nick Wellnhofer	2023-03-12	1	-10/+60
	Introduce xmlParserGrow which takes a parser context to simplify error handling.
*	malloc-fail: Fix null deref if growing input buffer fails	Nick Wellnhofer	2023-01-24	1	-0/+6
	Also add some error checks. Found with libFuzzer, see #344.
*	parser: Fix integer overflow of input ID	Nick Wellnhofer	2022-12-22	1	-1/+6
	Applies a patch from Chromium. Also stop incrementing input ID of subcontexts. This isn't necessary. Fixes #465.
*	entities: Stop counting entities	Nick Wellnhofer	2022-12-21	1	-1/+0
	This was only used in the old version of xmlParserEntityCheck.
*	entities: Rework entity amplification checks	Nick Wellnhofer	2022-12-21	1	-2/+6
	This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack.
	We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms.
	A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past.
	The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference.
	Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks.
	Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks.
	Fixes #294. Fixes #345.
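The accounting described in this message can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code from the commit: all function and macro names are hypothetical, the factor of 10 and the 10 MB allowance come from the message, the exact byte count behind "10 MB" is assumed, and the fixed per-reference cost of 20 is a placeholder.

```c
#include <limits.h>

#define MAX_AMPLIFICATION  10UL                 /* "10 times", from the message */
#define ALWAYS_ALLOWED     (10UL * 1000 * 1000) /* "10 MB"; byte count assumed */
#define ENTITY_FIXED_COST  20UL                 /* placeholder, not from the log */

/* Saturating addition: clamps at ULONG_MAX instead of wrapping around. */
static unsigned long
sat_add(unsigned long a, unsigned long b)
{
    return (b > ULONG_MAX - a) ? ULONG_MAX : a + b;
}

/* Saturating multiplication with the same clamping behavior. */
static unsigned long
sat_mul(unsigned long a, unsigned long b)
{
    if (a != 0 && b > ULONG_MAX / a)
        return ULONG_MAX;
    return a * b;
}

/* Charge one entity reference: its expanded size plus a fixed cost. */
static unsigned long
charge_entity(unsigned long expanded, unsigned long ent_size)
{
    return sat_add(expanded, sat_add(ent_size, ENTITY_FIXED_COST));
}

/* Returns 1 if the expanded size exceeds the allowed amplification. */
static int
amplification_exceeded(unsigned long input_size, unsigned long expanded)
{
    if (expanded <= ALWAYS_ALLOWED)
        return 0;   /* small totals are always accepted */
    return expanded > sat_mul(input_size, MAX_AMPLIFICATION);
}
```

The fixed cost is what lets a byte-based limit replace the old entity counter: a flood of empty references still adds up. Saturation matters on platforms where `unsigned long` is 32 bits, because a wrapped total would look small and pass the check.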
*	parser: Fix progress check when parsing character data	Nick Wellnhofer	2022-11-21	1	-1/+1
	Skip over zero bytes to guarantee progress. Short-lived regression.
*	parser: Fix 'consumed' accounting when switching encodings	Nick Wellnhofer	2022-11-20	1	-0/+1
*	io: Fix a few integer overflows in I/O statistics	Nick Wellnhofer	2022-11-20	1	-4/+12
	There are still many places where arithmetic on "consumed" stats isn't checked for overflow, affecting platforms with a 32-bit long type.
*	io: Rearrange code in xmlSwitchInputEncodingInt	Nick Wellnhofer	2022-11-20	1	-104/+96
	No functional change.
*	io: Remove xmlInputReadCallbackNop	Nick Wellnhofer	2022-11-20	1	-1/+2
	In some cases, for example when using encoders, the read callback was set to NULL, in other cases it was set to xmlInputReadCallbackNop. xmlGROW only tested for xmlInputReadCallbackNop, resulting in errors when parsing large encoded content from memory. Always use a NULL callback for memory buffers to avoid ambiguities. Fixes #262.
*	io: Check for memory buffer early in xmlParserInputGrow	Nick Wellnhofer	2022-11-13	1	-4/+4
*	Remove or annotate char casts	Nick Wellnhofer	2022-09-01	1	-4/+4
*	Remove explicit integer casts	Nick Wellnhofer	2022-09-01	1	-11/+11
	Remove explicit integer casts as final operation
	- in assignments
	- when passing arguments
	- when returning values
	Remove casts
	- to the same type
	- from certain range-bound values
	The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly.
	Document some explicit casts as truncating and add a few missing ones.
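The motivation is easiest to see side by side. The sketch below uses hypothetical function names and is not taken from the commit. It contrasts a cast that hides a narrowing conversion with the uncast form that clang's `-fsanitize=implicit-conversion` group is able to report, plus a range-checked variant for places where truncation must not happen.

```c
#include <limits.h>

/*
 * Explicit cast: the narrowing is treated as intentional, so the
 * sanitizer stays silent even when the value does not fit.
 */
static int
narrow_explicit(long long v)
{
    return (int) v;
}

/*
 * No cast: the same conversion happens implicitly, and a build with
 * clang's -fsanitize=implicit-conversion can flag it at run time when
 * the value changes.
 */
static int
narrow_implicit(long long v)
{
    return v;
}

/* Range check first, so the narrowing can never change the value. */
static int
narrow_checked(long long v, int *out)
{
    if (v < INT_MIN || v > INT_MAX)
        return -1;
    *out = v;
    return 0;
}
```

For values that fit, all three behave identically, which is why removing the casts does not change results and only restores the sanitizer's visibility.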
*	Make xmlNewSAXParserCtxt take a const sax handler	Nick Wellnhofer	2022-09-01	1	-4/+5
	Also improve documentation.
*	Consolidate private header files	Nick Wellnhofer	2022-08-26	1	-2/+5
	Private functions were previously declared
	- in header files in the root directory
	- in public headers guarded with IN_LIBXML
	- in libxml.h
	- redundantly in source files that used them.
	Consolidate all private header files in include/private.
*	Mark more functions setting globals as deprecated	Nick Wellnhofer	2022-08-24	1	-0/+4
*	Mark more parser functions as deprecated	Nick Wellnhofer	2022-08-24	1	-1/+16
	No compiler warnings generated yet.
*	Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt	Nick Wellnhofer	2022-08-24	1	-8/+55
	Add API functions to create a parser context with a custom SAX handler without having to mess with ctxt->sax manually.
*	Use xmlStrlen in xmlNewStringInputStream	Nick Wellnhofer	2022-08-20	1	-1/+1
	xmlStrlen handles buffers larger than INT_MAX more gracefully.
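The log does not say how xmlStrlen treats oversized buffers, so the following only sketches one way an `int`-returning length function can fail safely. `length_as_int` and `bounded_strlen` are hypothetical names, and returning 0 for an oversized string is an assumption made for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Map a size_t length to int, returning 0 when it would not fit. */
static int
length_as_int(size_t len)
{
    return (len > (size_t) INT_MAX) ? 0 : (int) len;
}

/* strlen variant with an int result that never truncates or goes negative. */
static int
bounded_strlen(const char *s)
{
    if (s == NULL)
        return 0;
    return length_as_int(strlen(s));
}
```

A plain `(int) strlen(s)` would instead produce a negative or wrapped length for strings past INT_MAX, which is the kind of result a parser cannot safely use as a buffer size.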
*	Create stream with buffer in xmlNewStringInputStream	Nick Wellnhofer	2022-08-20	1	-4/+11
	Create an input stream with a buffer in xmlNewStringInputStream. Otherwise, switching encodings won't work. See #34.
*	Clean up encoding switching code	Nick Wellnhofer	2022-04-02	1	-127/+23
	- Remove xmlSwitchToEncodingInt which was basically just a wrapper around xmlSwitchInputEncodingInt.
	- Simplify xmlSwitchEncoding.
	- Improve error handling in xmlSwitchInputEncodingInt.
	- Deprecate xmlSwitchInputEncoding.
*	Fix calls to deprecated init/cleanup functions	Nick Wellnhofer	2022-03-29	1	-1/+1
	Only use xmlInitParser/xmlCleanupParser.
*	Avoid arithmetic on freed pointers	Nick Wellnhofer	2022-03-06	1	-36/+9
*	Remove unneeded #includes	Nick Wellnhofer	2022-03-04	1	-13/+0
*	Don't check for standard C89 headers	Nick Wellnhofer	2022-03-02	1	-4/+1
	Don't check for
	- ctype.h
	- errno.h
	- float.h
	- limits.h
	- math.h
	- signal.h
	- stdarg.h
	- stdlib.h
	- string.h
	- time.h
	Stop including non-standard headers
	- malloc.h
	- strings.h
*	Remove useless __CYGWIN__ checks	Nick Wellnhofer	2022-02-28	1	-1/+1
	From what I can tell, some really early Cygwin versions from around 1998-2000 used to erroneously define _WIN32. This was eventually fixed, but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is unnecessary. Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether to use __declspec.
*	Remove elfgcchack.h	Nick Wellnhofer	2022-02-20	1	-2/+0
	The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
*	Rework validation context flags	Nick Wellnhofer	2022-02-20	1	-1/+1
	Use a bitmask instead of magic values to
	- keep track of whether the validation context is part of a parser context
	- keep track of whether xmlValidateDtdFinal was called
	This allows adding additional flags later.
	Note that this deliberately changes the name of a public struct member, assuming that this was always private data never to be used by client code.
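A bitmask of this kind usually looks like the sketch below. The struct, macro, and function names are hypothetical and chosen only to mirror the two states listed in the message; they are not claimed to match libxml2's definitions.

```c
/* Hypothetical flag bits, one per independent piece of state. */
#define VCTXT_DTD_VALIDATED  (1u << 0)  /* xmlValidateDtdFinal was called */
#define VCTXT_USE_PCTXT      (1u << 1)  /* context is part of a parser context */

/* Hypothetical stand-in for the validation context. */
struct vctxt_sketch {
    unsigned int flags;
};

/* Set one or more flag bits without disturbing the others. */
static void
vctxt_set(struct vctxt_sketch *v, unsigned int f)
{
    v->flags |= f;
}

/* Clear one or more flag bits without disturbing the others. */
static void
vctxt_clear(struct vctxt_sketch *v, unsigned int f)
{
    v->flags &= ~f;
}

/* Test whether any of the given flag bits is set. */
static int
vctxt_has(const struct vctxt_sketch *v, unsigned int f)
{
    return (v->flags & f) != 0;
}
```

Unlike a single field holding magic values, the two states can be set and tested independently, and a new flag only needs the next free bit.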
*	Fix memory leak in xmlNewInputFromFile	David King	2022-01-16	1	-1/+3
	Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
*	Fix slow parsing of HTML with encoding errors	Nick Wellnhofer	2021-02-20	1	-0/+5
	Under certain circumstances, the HTML parser would try to guess and switch input encodings multiple times, leading to slow processing of documents with encoding errors. The repeated scanning of the input buffer when guessing encodings could even lead to quadratic behavior.
	The code in htmlCurrentChar probably assumed that if there's an encoding handler, it is guaranteed to produce valid UTF-8. This holds true in general, but if the detected encoding was "UTF-8", the UTF8ToUTF8 encoding handler simply invoked memcpy without checking for invalid UTF-8. This still must be fixed, preferably by not using this handler at all.
	Also leave a note that switching encodings twice seems impossible to implement correctly.
	Add a check when handling UTF-8 encoding errors in htmlCurrentChar to avoid this situation, even if encoders produce invalid UTF-8.
	Found by OSS-Fuzz.
*	Stop counting nbChars in parser context	Nick Wellnhofer	2020-08-09	1	-6/+0
	The value was inaccurate and never used.
*	Fix typos	Nick Wellnhofer	2020-03-08	1	-3/+3
	Resolves #133.
*	Large batch of typo fixes	Jared Yanovich	2019-09-30	1	-4/+4
	Closes #109.
*	Fix memory leak in xmlSwitchInputEncodingInt error path	Nick Wellnhofer	2018-11-22	1	-0/+10
	Found by OSS-Fuzz.
*	Revert "Change calls to xmlCharEncInput to set flush false"	Nick Wellnhofer	2018-03-17	1	-1/+1
	This reverts commit 6e6ae5daa6cd9640c9a83c1070896273e9b30d14 which broke decoding of larger documents with ICU. See https://bugs.chromium.org/p/chromium/issues/detail?id=820163