diff options
author | Roman Lebedev <lebedev.ri@gmail.com> | 2019-10-10 09:25:02 +0000 |
---|---|---|
committer | Roman Lebedev <lebedev.ri@gmail.com> | 2019-10-10 09:25:02 +0000 |
commit | 414670be4ec1c8599fce3a40f5dd4914e9317f55 (patch) | |
tree | 97f1b677136f1dc208a47d2bd78611cac8baaa22 /docs | |
parent | 1b1167f58f9cf653947777493c0fc4fbbaa103c5 (diff) | |
download | clang-414670be4ec1c8599fce3a40f5dd4914e9317f55.tar.gz |
[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/ReleaseNotes.rst | 45 | ||||
-rw-r--r-- | docs/UndefinedBehaviorSanitizer.rst | 3 |
2 files changed, 46 insertions, 2 deletions
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst index d6c734a53a..8e9d298c95 100644 --- a/docs/ReleaseNotes.rst +++ b/docs/ReleaseNotes.rst @@ -60,6 +60,16 @@ Improvements to Clang's diagnostics Non-comprehensive list of changes in this release ------------------------------------------------- +* In both C and C++ (C17 ``6.5.6p8``, C++ ``[expr.add]``), pointer arithmetic is + only permitted within arrays. In particular, the behavior of a program is not + defined if it adds a non-zero offset (or in C, any offset) to a null pointer, + or if it forms a null pointer by subtracting an integer from a non-null + pointer, and the LLVM optimizer now uses those guarantees for transformations. + This may lead to unintended behavior in code that performs these operations. + The Undefined Behavior Sanitizer ``-fsanitize=pointer-overflow`` check has + been extended to detect these cases, so that code relying on them can be + detected and fixed. + - For X86 target, -march=skylake-avx512, -march=icelake-client, -march=icelake-server, -march=cascadelake, -march=cooperlake will default to not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics @@ -238,7 +248,40 @@ Static Analyzer Undefined Behavior Sanitizer (UBSan) ------------------------------------ -- ... +- * The ``pointer-overflow`` check was extended added to catch the cases where + a non-zero offset is applied to a null pointer, or the result of + applying the offset is a null pointer. + + .. code-block:: c++ + + #include <cstdint> // for intptr_t + + static char *getelementpointer_inbounds(char *base, unsigned long offset) { + // Potentially UB. + return base + offset; + } + + char *getelementpointer_unsafe(char *base, unsigned long offset) { + // Always apply offset. UB if base is ``nullptr`` and ``offset`` is not + // zero, or if ``base`` is non-``nullptr`` and ``offset`` is + // ``-reinterpret_cast<intptr_t>(base)``. + return getelementpointer_inbounds(base, offset); + } + + char *getelementpointer_safe(char *base, unsigned long offset) { + // Cast pointer to integer, perform usual arithmetic addition, + // and cast to pointer. This is legal. + char *computed = + reinterpret_cast<char *>(reinterpret_cast<intptr_t>(base) + offset); + // If either the pointer becomes non-``nullptr``, or becomes + // ``nullptr``, we must use ``computed`` result. + if (((base == nullptr) && (computed != nullptr)) || + ((base != nullptr) && (computed == nullptr))) + return computed; + // Else we can use ``getelementpointer_inbounds()``. + return getelementpointer_inbounds(base, offset); + } + Core Analysis Improvements ========================== diff --git a/docs/UndefinedBehaviorSanitizer.rst b/docs/UndefinedBehaviorSanitizer.rst index 06d533f493..0f6a42a211 100644 --- a/docs/UndefinedBehaviorSanitizer.rst +++ b/docs/UndefinedBehaviorSanitizer.rst @@ -130,7 +130,8 @@ Available checks are: ``__builtin_object_size``, and consequently may be able to detect more problems at higher optimization levels. - ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which - overflows. + overflows, or where either the old or new pointer value is a null pointer + (or in C, when they both are). - ``-fsanitize=return``: In C++, reaching the end of a value-returning function without returning a value. - ``-fsanitize=returns-nonnull-attribute``: Returning null pointer |