summaryrefslogtreecommitdiff
path: root/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [c++20] Add CXXRewrittenBinaryOperator to represent a comparisonRichard Smith2019-10-194-0/+13
| | | | | | | | operator that is rewritten as a call to multiple other operators. No functionality change yet: nothing creates these expressions. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375305 91177308-0d34-0410-b5e6-96231b3b80d8
* DebugInfo: Render the canonical name of a class template specialization, ↵David Blaikie2019-10-181-1/+3
| | | | | | | | even when nested in another class template specialization Differential Revision: https://reviews.llvm.org/D63031 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375304 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Add support for master taskloop simd.Alexey Bataev2019-10-185-0/+25
| | | | | | Added trsing/semantics/codegen for combined construct master taskloop simd. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375255 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP]Improve use of the global tid parameter.Alexey Bataev2019-10-171-9/+14
| | | | | | | | If we can determined, that the global tid parameter can be used in the function, better to use it rather than calling __kmpc_global_thread_num function. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375134 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP]Fix thread id passed to outlined region in sequential parallelAlexey Bataev2019-10-171-5/+3
| | | | | | | | | regions. The real global thread id must be passed to the outlined region instead of the zero thread id. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375119 91177308-0d34-0410-b5e6-96231b3b80d8
* [OpenCL] Preserve addrspace in CGClass (PR43145)Sven van Haastregt2019-10-171-2/+5
| | | | | | | | | | PR43145 revealed two places where Clang was attempting to create a bitcast without considering the address space of class types during C++ class code generation. Differential Revision: https://reviews.llvm.org/D68403 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375118 91177308-0d34-0410-b5e6-96231b3b80d8
* Reland: Dead Virtual Function EliminationOliver Stannard2019-10-174-42/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375094 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert Tag CFI-generated data structures with "#pragma clang section" ↵Dmitry Mikulin2019-10-173-30/+5
| | | | | | | | attributes. This reverts r375022 (git commit e2692b3bc0327606748b6d291b9009d2c845ced5) git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375069 91177308-0d34-0410-b5e6-96231b3b80d8
* Tag CFI-generated data structures with "#pragma clang section" attributes.Dmitry Mikulin2019-10-163-5/+30
| | | | | | Differential Revision: https://reviews.llvm.org/D68808 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375022 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP]Use different addresses for zeroed thread_id/bound_id.Alexey Bataev2019-10-162-19/+26
| | | | | | | | When the parallel region is called directly in the sequential region, the zeroed tid/bound id are used. But they must point to the different memory locations as the parameters are marked as noalias. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375017 91177308-0d34-0410-b5e6-96231b3b80d8
* [DWARF5] Added support for DW_AT_noreturn attribute to be emitted forAdrian Prantl2019-10-161-0/+2
| | | | | | | | | | C++ class member functions. Patch by Sourabh Singh Tomar! Differential Revision: https://reviews.llvm.org/D68697 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375012 91177308-0d34-0410-b5e6-96231b3b80d8
* CGDebugInfo - silence static analyzer dyn_cast<> null dereference warnings. ↵Simon Pilgrim2019-10-161-4/+5
| | | | | | | | NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374989 91177308-0d34-0410-b5e6-96231b3b80d8
* CGExprConstant - silence static analyzer getAs<> null dereference warning. NFCI.Simon Pilgrim2019-10-161-2/+2
| | | | | | The static analyzer is warning about a potential null dereference, but in these cases we should be able to use castAs<> directly and if not assert will fire for us. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374988 91177308-0d34-0410-b5e6-96231b3b80d8
* CGBuiltin - silence static analyzer getAs<> null dereference warnings. NFCI.Simon Pilgrim2019-10-161-5/+4
| | | | | | The static analyzer is warning about potential null dereferences, but in these cases we should be able to use castAs<> directly and if not assert will fire for us. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374987 91177308-0d34-0410-b5e6-96231b3b80d8
* [Clang][OpenMP Offload] Move offload registration code to the wrapperSergey Dmitriev2019-10-153-191/+7
| | | | | | | | | | The final list of OpenMP offload targets becomes known only at the link time and since offload registration code depends on the targets list it makes sense to delay offload registration code generation to the link time instead of adding it to the host part of every fat object. This patch moves offload registration code generation from clang to the offload wrapper tool. This is the last part of the OpenMP linker script elimination patch https://reviews.llvm.org/D64943 Differential Revision: https://reviews.llvm.org/D68746 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374937 91177308-0d34-0410-b5e6-96231b3b80d8
* Added support for "#pragma clang section relro=<name>"Dmitry Mikulin2019-10-152-0/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D68806 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374934 91177308-0d34-0410-b5e6-96231b3b80d8
* CFI: wrong type passed to llvm.type.test with multiple inheritance ↵Dmitry Mikulin2019-10-151-1/+1
| | | | | | | | devirtualization. Differential Revision: https://reviews.llvm.org/D67985 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374909 91177308-0d34-0410-b5e6-96231b3b80d8
* [Concepts] Concept Specialization ExpressionsSaar Raz2019-10-151-0/+4
| | | | | | | | | | | | Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is refe$ D41217 on Phabricator. (recommit after fixing failing Parser test on windows) git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374903 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert 374882 "[Concepts] Concept Specialization Expressions"Nico Weber2019-10-151-4/+0
| | | | | | | | | | This reverts commit ec87b003823d63f3342cf648f55a134c1522e612. The test fails on Windows, see e.g. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11533/steps/stage%201%20check/logs/stdio Also revert follow-up r374893. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374899 91177308-0d34-0410-b5e6-96231b3b80d8
* [Alignment] Migrate Attribute::getWith(Stack)AlignmentGuillaume Chatelet2019-10-151-2/+2
| | | | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Reviewed By: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68792 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374884 91177308-0d34-0410-b5e6-96231b3b80d8
* [Concepts] Concept Specialization ExpressionsSaar Raz2019-10-151-0/+4
| | | | | | | | | Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is referenced with arguments, and tests thereof. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374882 91177308-0d34-0410-b5e6-96231b3b80d8
* [WebAssembly] Trapping fptoint builtins and intrinsicsThomas Lively2019-10-151-0/+20
| | | | | | | | | | | | | | | | | | | | | Summary: The WebAssembly backend lowers fptoint instructions to a code sequence that checks for overflow to avoid traps because fptoint is supposed to be speculatable. These new builtins and intrinsics give users a way to depend on the trapping semantics of the underlying instructions and avoid the extra code generated normally. Patch by coffee and tlively. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68902 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374856 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Dead Virtual Function Elimination"Jorge Gorbe Moya2019-10-144-126/+42
| | | | | | This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374844 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Add support for 'parallel master taskloop' construct.Alexey Bataev2019-10-145-1/+33
| | | | | | | | | Added parsing/sema/codegen support for 'parallel master taskloop' constructs. Some of the clauses, like 'grainsize', 'num_tasks', 'final' and 'priority' are not supported in full, only constant expressions can be used currently in these clauses. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374791 91177308-0d34-0410-b5e6-96231b3b80d8
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-141-6/+8
| | | | | | | | Reapply r374240 with fix for Ocaml test, namely Bindings/OCaml/core.ml. Differential Revision: https://reviews.llvm.org/D61675 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374782 91177308-0d34-0410-b5e6-96231b3b80d8
* Improve __builtin_constant_p loweringJoerg Sonnenberger2019-10-131-4/+0
| | | | | | | | | | | | | | | | | __builtin_constant_p used to be short-cut evaluated to false when building with -O0. This is undesirable as it means that constant folding in the front-end can give different results than folding in the back-end. It can also create conditional branches on constant conditions that don't get folded away. With the pending improvements to the llvm.is.constant handling on the LLVM side, the short-cut is no longer useful. Adjust various codegen tests to not depend on the short-cut or the backend optimisations. Differential Revision: https://reviews.llvm.org/D67638 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374742 91177308-0d34-0410-b5e6-96231b3b80d8
* remove an useless allocation found by scan-build - the new Dead nested ↵Sylvestre Ledru2019-10-121-2/+2
| | | | | | assignment check git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374659 91177308-0d34-0410-b5e6-96231b3b80d8
* Reland r374450 with Richard Smith's comments and test fixed.Erich Keane2019-10-116-32/+14
| | | | | | | | | | The behavior from the original patch has changed, since we're no longer allowing LLVM to just ignore the alignment. Instead, we're just assuming the maximum possible alignment. Differential Revision: https://reviews.llvm.org/D68824 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374562 91177308-0d34-0410-b5e6-96231b3b80d8
* Dead Virtual Function EliminationOliver Stannard2019-10-114-42/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374539 91177308-0d34-0410-b5e6-96231b3b80d8
* Insert module constructors in a module passVitaly Buka2019-10-111-2/+11
| | | | | | | | | | | | | | | | | | | | | Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 llvm-svn: 374481 Signed-off-by: Vitaly Buka <vitalybuka@google.com> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374527 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert 374481 "[tsan,msan] Insert module constructors in a module pass"Nico Weber2019-10-111-11/+2
| | | | | | | CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374503 91177308-0d34-0410-b5e6-96231b3b80d8
* [tsan,msan] Insert module constructors in a module passVitaly Buka2019-10-101-2/+11
| | | | | | | | | | | | | | | | | | Summary: If we insert them from function pass some analysis may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374481 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert 374450 "Fix __builtin_assume_aligned with too large values."Nico Weber2019-10-106-10/+31
| | | | | | | | | | | | | The test fails on Windows, with error: 'warning' diagnostics expected but not seen: File builtin-assume-aligned.c Line 62: requested alignment must be 268435456 bytes or smaller; assumption ignored error: 'warning' diagnostics seen but not expected: File builtin-assume-aligned.c Line 62: requested alignment must be 8192 bytes or smaller; assumption ignored git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374456 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix __builtin_assume_aligned with too large values.Erich Keane2019-10-106-31/+10
| | | | | | | | | | | | | | Code to handle __builtin_assume_aligned was allowing larger values, but would convert this to unsigned along the way. This patch removes the EmitAssumeAligned overloads that take unsigned to do away with this problem. Additionally, it adds a warning that values greater than 1 <<29 are ignored by LLVM. Differential Revision: https://reviews.llvm.org/D68824 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374450 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Support for 'master taskloop' directive.Alexey Bataev2019-10-105-0/+23
| | | | | | Added full support for master taskloop directive. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374437 91177308-0d34-0410-b5e6-96231b3b80d8
* Re-land "Use -fdebug-compilation-dir to form absolute paths in coverage ↵Reid Kleckner2019-10-102-9/+24
| | | | | | | | | | | | mappings" This reverts r374324 (git commit 62808631acceaa8b78f8ab9b407eb6b943ff5f77) I changed the test to not rely on finding the sequence "clang, test, CoverageMapping" in the CWD used to run the test. Instead it makes its own internal directory hierarchy of foo/bar/baz and looks for that. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374403 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Support for declare variant directive for NVPTX target.Alexey Bataev2019-10-106-6/+92
| | | | | | | NVPTX does not support global aliases. Instead, we have to copy the full body of the variant function for the original function. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374387 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator"Dmitri Gribenko2019-10-101-8/+6
| | | | | | | This reverts commit r374240. It broke OCaml tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374354 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Use -fdebug-compilation-dir to form absolute paths in coverage mappings"Kadir Cetinkaya2019-10-102-24/+9
| | | | | | | | | | | | This reverts commit f6777964bde28c349d3e289ea37ecf5f5eeedbc4. Because the absolute path check relies on temporary path containing "clang", "test" and "CoverageMapping" as a subsequence, which is not necessarily true on all systems(breaks internal integrates). Wanted to fix it by checking for a leading "/" instead, but then noticed that it would break windows tests, so leaving it to the author instead. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374324 91177308-0d34-0410-b5e6-96231b3b80d8
* [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined ↵Roman Lebedev2019-10-101-50/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.) To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * https://github.com/google/filament/pull/1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374293 91177308-0d34-0410-b5e6-96231b3b80d8
* Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"Sjoerd Meijer2019-10-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | This was further discussed at the llvm dev list: http://lists.llvm.org/pipermail/llvm-dev/2019-October/135602.html I think the brief summary of that is that this change is an improvement, this is the behaviour that we expect and promise in ours docs, and also as a result there are cases where we now emit diagnostics whereas before pragmas were silently ignored. Two areas where we can improve: 1) the diagnostic message itself, and 2) and in some cases (e.g. -Os and -Oz) the vectoriser is (quite understandably) not triggering. Original commit message: Specifying the vectorization width was supposed to implicitly enable vectorization, except that it wasn't really doing this. It was only setting the vectorize.width metadata, but not vectorize.enable. This should fix PR27643. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374288 91177308-0d34-0410-b5e6-96231b3b80d8
* Use -fdebug-compilation-dir to form absolute paths in coverage mappingsReid Kleckner2019-10-102-9/+24
| | | | | | | | | | | | | This allows users to explicitly request relative paths with `-fdebug-compilation-dir .`. Fixes PR43614 Reviewers: vsk, arphaman Differential Revision: https://reviews.llvm.org/D68733 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374266 91177308-0d34-0410-b5e6-96231b3b80d8
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-091-6/+8
| | | | | | | | Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374240 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Fix scoring of contexts with and without user provided scores.Alexey Bataev2019-10-091-2/+2
| | | | | | | The context selector with user provided score must have higher score than the context selector without user provided score. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374224 91177308-0d34-0410-b5e6-96231b3b80d8
* [WebAssembly] Add builtin and intrinsic for v8x16.swizzleThomas Lively2019-10-091-0/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: This clang builtin and corresponding LLVM intrinsic are necessary to expose the exact semantics of the underlying WebAssembly instruction to users. LLVM produces a poison value if the dynamic swizzle indices are greater than the vector size, but the WebAssembly instruction sets the corresponding output lane to zero. Users who depend on this behavior can safely use this builtin. Depends on D68527. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68531 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374189 91177308-0d34-0410-b5e6-96231b3b80d8
* [DebugInfo] Enable call site debug info for ARM and AArch64Nikola Prica2019-10-091-2/+1
| | | | | | | | | | | | | | | | | ARM and AArch64 SelectionDAG support for tacking parameter forwarding register is implemented so we can allow clang invocations for those two targets. Beside that restrict debug entry value support to be emitted for LimitedDebugInfo info and FullDebugInfo. Other types of debug info do not have functions nor variables debug info. Reviewers: aprantl, probinson, dstenb, vsk Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D67004 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374153 91177308-0d34-0410-b5e6-96231b3b80d8
* [IRGen] Emit lifetime markers for temporary struct allocasFrancis Visoiu Mistrih2019-10-081-0/+22
| | | | | | | | | | | | | | When passing arguments using temporary allocas, we need to add the appropriate lifetime markers so that the stack coloring passes can re-use the stack space. This patch keeps track of all the lifetime.start calls emited before the codegened call, and adds the corresponding lifetime.end calls after the call. Differential Revision: https://reviews.llvm.org/D68611 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374126 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix crash or wrong code bug if a lifetime-extended temporary contains aRichard Smith2019-10-081-7/+6
| | | | | | | | | | | | | | "non-constant" value. If the constant evaluator evaluates part of a variable initializer, including the initializer for some lifetime-extended temporary, but fails to fully evaluate the initializer, it can leave behind wrong values for temporaries encountered in that initialization. Don't try to emit those from CodeGen! Instead, look at the values that constant evaluation produced if (and only if) it actually succeeds and we're emitting the lifetime-extending declaration's initializer as a constant. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374119 91177308-0d34-0410-b5e6-96231b3b80d8
* [OPENMP50]Multiple vendors in vendor context must be treated as logicalAlexey Bataev2019-10-081-1/+2
| | | | | | | | | | | | | and of vendors, not or. If several vendors are provided in the same vendor context trait, the context shall match only if all vendors are matching, not one of them. This is per OpenMP 5.0, 2.3.3 Matching and Scoring Context Selectors, all selectors in the construct, device, and implementation sets of the context selector appear in the corresponding trait set of the OpenMP context. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374107 91177308-0d34-0410-b5e6-96231b3b80d8
* [BPF] do compile-once run-everywhere relocation for bitfieldsYonghong Song2019-10-083-3/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. An example: struct s { int a; int b1:9; int b2:4; }; enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; void bpf_probe_read(void *, unsigned, const void *); int field_read(struct s *arg) { unsigned long long ull = 0; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); #ifdef USE_PROBE_READ bpf_probe_read(&ull, size, (const void *)arg + offset); unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ lshift = lshift + (size << 3) - 64; #endif #else switch(size) { case 1: ull = *(unsigned char *)((void *)arg + offset); break; case 2: ull = *(unsigned short *)((void *)arg + offset); break; case 4: ull = *(unsigned int *)((void *)arg + offset); break; case 8: ull = *(unsigned long long *)((void *)arg + offset); break; } unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #endif ull <<= lshift; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); } There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode: r3 = r1 r1 = 0 r1 = 4 <=== relocation (FIELD_BYTE_OFFSET) r3 += r1 r1 = r10 r1 += -8 r2 = 4 <=== relocation (FIELD_BYTE_SIZE) call bpf_probe_read r2 = 51 <=== relocation (FIELD_LSHIFT_U64) r1 = *(u64 *)(r10 - 8) r1 <<= r2 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) r0 = r1 r0 >>= r2 r3 = 1 <=== relocation (FIELD_SIGNEDNESS) if r3 == 0 goto LBB0_2 r1 s>>= r2 r0 = r1 LBB0_2: exit Compare to the above code between relocations FIELD_LSHIFT_U64 and FIELD_LSHIFT_U64, the code with big endian mode has four more instructions. r1 = 41 <=== relocation (FIELD_LSHIFT_U64) r6 += r1 r6 += -64 r6 <<= 32 r6 >>= 32 r1 = *(u64 *)(r10 - 8) r1 <<= r6 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) The code and relocation generated when using direct load. r2 = 0 r3 = 4 r4 = 4 if r4 s> 3 goto LBB0_3 if r4 == 1 goto LBB0_5 if r4 == 2 goto LBB0_6 goto LBB0_9 LBB0_6: # %sw.bb1 r1 += r3 r2 = *(u16 *)(r1 + 0) goto LBB0_9 LBB0_3: # %entry if r4 == 4 goto LBB0_7 if r4 == 8 goto LBB0_8 goto LBB0_9 LBB0_8: # %sw.bb9 r1 += r3 r2 = *(u64 *)(r1 + 0) goto LBB0_9 LBB0_5: # %sw.bb r1 += r3 r2 = *(u8 *)(r1 + 0) goto LBB0_9 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 r2 <<= r1 r1 = 60 r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit Considering verifier is able to do limited constant propogation following branches. The following is the code actually traversed. r2 = 0 r3 = 4 <=== relocation r4 = 4 <=== relocation if r4 s> 3 goto LBB0_3 LBB0_3: # %entry if r4 == 4 goto LBB0_7 LBB0_7: # %sw.bb5 r1 += r3 r2 = *(u32 *)(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 <=== relocation r2 <<= r1 r1 = 60 <=== relocation r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@374099 91177308-0d34-0410-b5e6-96231b3b80d8