path: root/xmlregexp.c
Commit message | Author | Age | Files | Lines
* regexp: Add sanity check in xmlRegCalloc2 | Nick Wellnhofer | 2023-02-21 | 1 | -1/+2
  These arguments should be non-zero, but add a sanity check to avoid division by zero.
  Fixes #450.
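The sanity check described above can be sketched as follows. This is a minimal illustration, not the libxml2 code: `calloc2_checked` and its parameter names are hypothetical, and it assumes the allocator takes two array dimensions plus an element size and uses division to detect multiplication overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed dim1 x dim2 array of elem_size
 * bytes per element, or return NULL if the request is invalid. */
static void *calloc2_checked(size_t dim1, size_t dim2, size_t elem_size) {
    size_t total;

    /* Reject zero sizes up front; dim2 and elem_size are divisors below. */
    if (dim1 == 0 || dim2 == 0 || elem_size == 0)
        return NULL;

    /* dim1 * dim2 * elem_size must fit in size_t. */
    if (dim1 > SIZE_MAX / dim2 / elem_size)
        return NULL;

    total = dim1 * dim2 * elem_size;
    return calloc(1, total);
}
```

Without the first check, a zero `dim2` or `elem_size` would reach the division used for the overflow test.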
* malloc-fail: Fix OOB read after xmlRegGetCounter | Nick Wellnhofer | 2023-02-17 | 1 | -0/+12
  Found with libFuzzer, see #344.
* malloc-fail: Fix memory leak in xmlFAParseCharProp | Nick Wellnhofer | 2023-02-17 | 1 | -10/+16
  Found with libFuzzer, see #344.
* malloc-fail: Fix leak of xmlRegAtom | Nick Wellnhofer | 2023-02-17 | 1 | -26/+55
  Found with libFuzzer, see #344.
* malloc-fail: Fix memory leak in xmlRegexpCompile | Nick Wellnhofer | 2023-02-17 | 1 | -10/+8
  Found with libFuzzer, see #344.
* malloc-fail: Fix memory leak after xmlRegNewState | Nick Wellnhofer | 2023-02-17 | 1 | -73/+71
  Invoke xmlRegNewState from xmlRegStatePush to simplify error handling.
  Found with libFuzzer, see #344.
* regexp: Simplify xmlRegAtomPush | Nick Wellnhofer | 2023-02-17 | 1 | -14/+5
* Consolidate private header files | Nick Wellnhofer | 2022-08-26 | 1 | -2/+3
  Private functions were previously declared
  - in header files in the root directory
  - in public headers guarded with IN_LIBXML
  - in libxml.h
  - redundantly in source files that used them.
  Consolidate all private header files in include/private.
* Fix parsing of subtracted regex character classes | Nick Wellnhofer | 2022-04-23 | 1 | -1/+1
  Fixes #370.
* Remove unneeded #includes | Nick Wellnhofer | 2022-03-04 | 1 | -7/+0
* Document support for the non-standard escape sequences. | Damjan Jovanovic | 2022-03-02 | 1 | -17/+64
  Support non-BMP code points in surrogate pairs of '\uXXXX\uXXXX'.
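For reference, a pair of '\uXXXX\uXXXX' escapes maps to a non-BMP code point through the standard UTF-16 surrogate formula. The sketch below shows only that arithmetic; `combine_surrogates` is a hypothetical name and does not reflect how xmlregexp.c structures its parser.

```c
#include <stdint.h>

/* Hypothetical helper: combine a UTF-16 high/low surrogate pair into a
 * Unicode code point. Returns 0 if the inputs are not a valid pair. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo) {
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return 0;
    return 0x10000u
         + (((uint32_t)hi - 0xD800u) << 10)
         + ((uint32_t)lo - 0xDC00u);
}
```

For example, the pair `\uD83D\uDE00` combines to U+1F600.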
* Use strtoul() instead of sscanf, and correct data types that break GCC. | Damjan Jovanovic | 2022-03-02 | 1 | -1/+1
* Add support for some non-standard escapes in regular expressions. | Damjan Jovanovic | 2022-03-02 | 1 | -1/+20
  This adds support for some non-standard escape sequences observed in Microsoft's MSXML DLLs and used by Windows apps, and thus needed by Wine. Some are also used in other XML implementations, e.g. Java's.
  This isn't intended to be final. We probably wish to toggle these non-standard escape sequences on and off somehow, as needed by the caller.
  Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
* Don't check for standard C89 headers | Nick Wellnhofer | 2022-03-02 | 1 | -2/+1
  Don't check for
  - ctype.h
  - errno.h
  - float.h
  - limits.h
  - math.h
  - signal.h
  - stdarg.h
  - stdlib.h
  - string.h
  - time.h
  Stop including non-standard headers
  - malloc.h
  - strings.h
* Fix certain combinations of regex range quantifiers | Nick Wellnhofer | 2022-02-28 | 1 | -5/+6
  Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter.
  Fixes #301 and the issue reported here:
  https://mail.gnome.org/archives/xml/2016-April/msg00017.html
* Fix range quantifier on subregex | Nick Wellnhofer | 2022-02-28 | 1 | -3/+3
  Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly.
  Fixes #65.
* Remove elfgcchack.h | Nick Wellnhofer | 2022-02-20 | 1 | -2/+1
  The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
* Patch to forbid epsilon-reduction of final states | Arne Becker | 2021-07-06 | 1 | -1/+8
  When building the internal representation of a regexp, it is possible that a lot of empty transitions are created. Therefore there is a step to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions.
  There is an error there for this case:
  - State 1 has a transition with an atom (in this case "a") to state 2.
  - State 2 is final and has an epsilon transition to state 1.
  After reduction it looked like:
  - State 1 has a transition with an atom (in this case "a") to itself and is final.
  In other words, the empty string is accepted when it shouldn't be.
  The attached patch skips the reduction step for final states. An alternative would be to insert or increment counters when reducing a final state, but this seemed error prone and unnecessary, since there aren't that many final states.
  Fixes #282
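The case described in this commit can be reproduced with a tiny NFA simulator. The sketch below is independent of libxml2's data structures; the `Nfa` type, `nfa_accepts`, and the two automata are hypothetical and only encode the two state diagrams from the message (state sets are bitmasks, so state numbers must stay below the bit width of `unsigned`).

```c
#include <stdbool.h>
#include <stddef.h>

#define EPS 0  /* label used for epsilon transitions */

typedef struct { int from; char label; int to; } Trans;

typedef struct {
    const Trans *trans;
    size_t ntrans;
    int start;
    unsigned final_mask;  /* bit i set means state i is final */
} Nfa;

/* Add every state reachable through epsilon transitions. */
static unsigned eps_closure(const Nfa *n, unsigned set) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n->ntrans; i++) {
            const Trans *t = &n->trans[i];
            if (t->label == EPS && (set & (1u << t->from)) &&
                !(set & (1u << t->to))) {
                set |= 1u << t->to;
                changed = true;
            }
        }
    }
    return set;
}

static bool nfa_accepts(const Nfa *n, const char *s) {
    unsigned cur = eps_closure(n, 1u << n->start);
    for (; *s != '\0'; s++) {
        unsigned next = 0;
        for (size_t i = 0; i < n->ntrans; i++) {
            const Trans *t = &n->trans[i];
            if (t->label == *s && (cur & (1u << t->from)))
                next |= 1u << t->to;
        }
        cur = eps_closure(n, next);
    }
    return (cur & n->final_mask) != 0;
}

/* Before reduction: 1 --a--> 2, 2 --eps--> 1, state 2 final. */
static const Trans original_trans[] = { {1, 'a', 2}, {2, EPS, 1} };
static const Nfa original = { original_trans, 2, 1, 1u << 2 };

/* After the faulty reduction: 1 --a--> 1, state 1 final. */
static const Trans reduced_trans[] = { {1, 'a', 1} };
static const Nfa badly_reduced = { reduced_trans, 1, 1, 1u << 1 };
```

Here `nfa_accepts(&original, "")` is false while `nfa_accepts(&badly_reduced, "")` is true, which is the wrongly accepted empty string the commit describes. Both automata still accept "a" and "aa".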
* Fix caret in regexp character group | Nick Wellnhofer | 2020-10-25 | 1 | -16/+13
  Apply Per Hedeland's patch from
  https://bugzilla.gnome.org/show_bug.cgi?id=779751
  Fixes #188.
* Fix exponential runtime in xmlFARecurseDeterminism | Nick Wellnhofer | 2020-07-31 | 1 | -1/+25
  In order to prevent visiting a state twice, states must be marked as visited for the whole duration of graph traversal because states might be reached by different paths. Otherwise state graphs like the following can lead to exponential runtime:

    ->O-->O-->O-->O-->O->
       \ / \ / \ / \ /
        O   O   O   O

  Reset the "visited" flag only after the graph was traversed.
  xmlFAComputesDeterminism still has massive performance problems when handling fuzzed input. By design, it has quadratic time complexity in the number of reachable states. Some issues might also stem from redundant epsilon transitions. With this fix, fuzzing regexes with a maximum length of 100 becomes feasible at least.
  Found with libFuzzer.
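The effect of where the "visited" flag is reset can be measured on a chain of diamonds like the one drawn above. This is a standalone sketch, not the xmlFARecurseDeterminism code: `walk`, `count_visits`, `DIAMONDS`, and the even/odd state numbering are all assumptions made for the illustration.

```c
#include <stddef.h>
#include <string.h>

#define DIAMONDS 10
#define NSTATES (2 * DIAMONDS + 1)

static unsigned char visited[NSTATES];
static unsigned long visits;

/* Even states form the main chain; odd state s sits between s-1 and s+1. */
static size_t successors(int s, int out[2]) {
    if (s == NSTATES - 1)
        return 0;                       /* last state of the chain */
    if (s % 2 == 0) {
        out[0] = s + 2;                 /* direct edge */
        out[1] = s + 1;                 /* detour through the side state */
        return 2;
    }
    out[0] = s + 1;
    return 1;
}

static void walk(int s, int reset_on_return) {
    int next[2];
    size_t n, i;

    if (visited[s])
        return;
    visited[s] = 1;
    visits++;
    n = successors(s, next);
    for (i = 0; i < n; i++)
        walk(next[i], reset_on_return);
    if (reset_on_return)
        visited[s] = 0;                 /* the old, exponential behaviour */
}

/* Count how many times a state is entered during one full traversal. */
static unsigned long count_visits(int reset_on_return) {
    memset(visited, 0, sizeof visited);
    visits = 0;
    walk(0, reset_on_return);
    memset(visited, 0, sizeof visited); /* reset only after the traversal */
    return visits;
}
```

With 10 diamonds, `count_visits(0)` enters each of the 21 states once, while `count_visits(1)` enters states 3070 times, roughly doubling with every added diamond.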
* Limit regexp nesting depth | Nick Wellnhofer | 2020-07-06 | 1 | -0/+8
  Enforce a maximum nesting depth of 50 for regular expressions. Avoids stack overflows with deeply nested regexes.
  Found by OSS-Fuzz.
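A depth limit of this kind is usually a counter carried through the recursive-descent parser. The sketch below assumes a toy grammar in which only parentheses matter; `Parser`, `parse_expr`, `nesting_ok`, and `MAX_DEPTH` are hypothetical names, and only the limit of 50 comes from the commit message.

```c
#define MAX_DEPTH 50

typedef struct {
    const char *cur;  /* current position in the pattern */
    int depth;        /* number of currently open groups */
    int error;        /* set on any parse error */
} Parser;

/* Parse until the end of input or an unmatched ')'. */
static void parse_expr(Parser *p) {
    while (*p->cur != '\0' && *p->cur != ')') {
        if (*p->cur == '(') {
            p->cur++;
            if (p->depth >= MAX_DEPTH) {   /* refuse to recurse deeper */
                p->error = 1;
                return;
            }
            p->depth++;
            parse_expr(p);
            p->depth--;
            if (p->error)
                return;
            if (*p->cur != ')') {          /* missing closing paren */
                p->error = 1;
                return;
            }
            p->cur++;
        } else {
            p->cur++;                      /* any other character */
        }
    }
}

/* Returns 1 if the pattern is balanced and nested at most MAX_DEPTH deep. */
static int nesting_ok(const char *re) {
    Parser p = { re, 0, 0 };
    parse_expr(&p);
    return !p.error && *p.cur == '\0';
}
```

Because the check happens before each recursive call, the C stack depth is bounded by the limit regardless of how deeply the input nests.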
* Report error for invalid regexp quantifiers | Nick Wellnhofer | 2020-07-02 | 1 | -0/+3
* Fix integer overflow in xmlFAParseQuantExact | Nick Wellnhofer | 2020-06-25 | 1 | -2/+13
  Found by OSS-Fuzz.
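An overflow of this kind typically arises when a decimal quantifier such as `{123}` is accumulated into an `int` without a bounds check. The following is a generic sketch of an overflow-safe accumulator, not the actual fix; `parse_quant_exact` and its return convention (-1 for "no digits" or "too large") are assumptions.

```c
#include <limits.h>

/* Hypothetical helper: parse a run of decimal digits at *cur and advance
 * past the digits consumed. Returns the value, or -1 if there are no
 * digits or the value would not fit in an int. */
static int parse_quant_exact(const char **cur) {
    int ret = 0;
    int seen_digit = 0;

    while (**cur >= '0' && **cur <= '9') {
        int digit = **cur - '0';

        /* ret * 10 + digit would exceed INT_MAX: stop before overflowing. */
        if (ret > (INT_MAX - digit) / 10)
            return -1;
        ret = ret * 10 + digit;
        seen_digit = 1;
        (*cur)++;
    }
    return seen_digit ? ret : -1;
}
```

The comparison is done before the multiplication, so the signed overflow never actually happens.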
* Fix typos | Nick Wellnhofer | 2020-03-08 | 1 | -3/+3
  Resolves #133.
* Check for overflow when allocating two-dimensional arrays | Nick Wellnhofer | 2020-01-02 | 1 | -9/+37
  Found by lgtm.com
* Remove useless comparisons | Nick Wellnhofer | 2020-01-02 | 1 | -2/+2
  Found by lgtm.com
* Large batch of typo fixes | Jared Yanovich | 2019-09-30 | 1 | -32/+32
  Closes #109.
* Fix Regextests | Nick Wellnhofer | 2019-09-25 | 1 | -1/+1
  - One of the bug316338 test cases is expected to succeed.
  - Memory leak in testRegexp.c.
  - Refcount handling in xmlExpHashGetEntry.
* Fix empty branch in regex | Nick Wellnhofer | 2019-09-25 | 1 | -7/+7
  Fixes bug 649244:
  https://bugzilla.gnome.org/show_bug.cgi?id=649244
  Closes #57.
* Fix Schema determinism check of ##other namespaces | Nick Wellnhofer | 2019-09-16 | 1 | -3/+12
  Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other).
  Closes #40.
* Fix memory leak in xmlRegEpxFromParse | zhouzhongyuan | 2019-09-13 | 1 | -0/+2
  Merge request !39
* Fix null deref in xmlregexp error path | Nick Wellnhofer | 2019-03-05 | 1 | -0/+2
  Thanks to Shaobo He for the report.
* Fix -Wimplicit-fallthrough warnings | J. Peter Mugaas | 2017-10-21 | 1 | -0/+5
  Add "falls through" comments to quench implicit-fallthrough warnings which are enabled by -Wextra under GCC 7.
* Heap-buffer-overflow read of size 1 in xmlFAParsePosCharGroup | David Kilzer | 2017-07-04 | 1 | -1/+1
  Credit to OSS-Fuzz.
  Add a check to xmlFAParseCharRange() for the end of the buffer to prevent reading past the end of it.
  This fixes Bug 784017.
* Fix NULL pointer deref in xmlFAParseCharClassEsc | Nick Wellnhofer | 2017-07-04 | 1 | -1/+2
  Found with libFuzzer.
* Fix undefined behavior in xmlRegExecPushStringInternal | Nick Wellnhofer | 2017-06-01 | 1 | -2/+3
  It's stupid, but the behavior of memcpy(NULL, NULL, 0) is undefined.
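The usual remedy for this class of undefined behavior is to skip the call when there is nothing to copy. A minimal sketch, with `copy_bytes` as a hypothetical wrapper name rather than anything from xmlregexp.c:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copy n bytes, tolerating NULL pointers when n is 0.
 * memcpy itself requires valid pointers even for a zero-length copy. */
static void copy_bytes(void *dst, const void *src, size_t n) {
    if (n > 0)
        memcpy(dst, src, n);
}
```

Calling `copy_bytes(NULL, NULL, 0)` is then a no-op instead of a call with invalid arguments.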
* Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup (CVE-2016-1840) | Pranjal Jumde | 2016-05-23 | 1 | -1/+2
  <https://bugzilla.gnome.org/show_bug.cgi?id=757711>
  - xmlregexp.c: (xmlFAParseCharRange): Only advance to the next character if there is no error. Advancing to the next character in case of an error while parsing regexp leads to an out of bounds access.
* Fix an error with regexp on nullable counted char transition | Daniel Veillard | 2016-05-09 | 1 | -4/+9
  This is the first of the two issues raised by Pete Cordell in
  https://mail.gnome.org/archives/xml/2016-April/msg00030.html
* Fix typos: dictio{ nn -> n }ar{y,ies} | Jan Pokorný | 2016-04-15 | 1 | -2/+2
  Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
* Avoid Double Null Check | Gaurav | 2014-05-09 | 1 | -2/+0
  Cleanup
  For https://bugzilla.gnome.org/show_bug.cgi?id=729851
* Fix potential NULL pointer dereferences in regexp code | Gaurav | 2013-09-11 | 1 | -3/+5
  https://bugzilla.gnome.org/show_bug.cgi?id=707749
  Fix 3 cases where we might dereference NULL
* Fix spelling of "length". | Michael Wood | 2012-10-30 | 1 | -3/+3
* Big space and tab cleanup | Daniel Veillard | 2012-09-11 | 1 | -116/+116
  Remove all space before tabs and space and tabs at end of lines.
* Avoid a potential infinite recursion | Daniel Veillard | 2012-08-27 | 1 | -0/+5
  Which can happen when eliminating epsilon transitions, as reported by Pavel Madr <pmadr@opentext.com>
* Fix a segfault on XSD validation on pattern error | Daniel Veillard | 2012-08-17 | 1 | -1/+7
  As reported by Sven <sven@e7o.de>:
  The following pattern will cause a segmentation fault in my Apache (using PHP5 to validate a XML against a XSD):
  <xs:pattern value="(.*)|"/>
  Fix a cascade of error handling failures which led to the crash in that scenario.
* undef ERROR if already defined | Patrick R. Gansterer | 2012-05-10 | 1 | -0/+3
* Fix broken escape behaviour in regexp ranges | Daniel Veillard | 2010-03-15 | 1 | -0/+11
* Fix a Relaxng bug raised by libvirt test suite | Daniel Veillard | 2009-09-23 | 1 | -5/+6
  - xmlregexp.c: other fixes in 2.7.4 raised this internal error when comparing ranges, this affects among others detection of the determinism
  - test/relaxng/libvirt* result/relaxng/libvirt*: add a test case based on libvirt schemas and tests
* Release of libxml2-2.7.4 (v2.7.4) | Daniel Veillard | 2009-09-10 | 1 | -1/+1
  - configure.in: new version
  - libxml.spec.in: cleanup
  - xmlregexp.c: fix a comment
  - doc/apibuild.py: update
  - doc/*: regenerate everything
* Chasing dead assignments reported by clang-scan | Daniel Veillard | 2009-09-07 | 1 | -1/+3
  - SAX2.c dict.c error.c hash.c nanohttp.c parser.c python/libxml.c relaxng.c runtest.c tree.c valid.c xinclude.c xmlregexp.c xmlsave.c xmlschemas.c xpath.c xpointer.c: mostly removing unneeded assignments, but this led to a few real bugs and some part not yet understood (relaxng/interleave)