[hip][cuda] Fix the extended lambda name mangling issue.

Summary: - HIP/CUDA host side needs to use device kernel symbol name to match the device side binaries. Without a consistent naming between host- and device-side compilations, it's risky that wrong device binaries are executed. Consistent naming is usually not an issue until unnamed types are used, especially the lambda. In this patch, the consistent name mangling is addressed for the extended lambdas, i.e. the lambdas annotated with `__device__`. - In [Itanium C++ ABI][1], the mangling of the lambda is generally unspecified unless, in certain cases, ODR rule is required to ensure consisent naming cross TUs. The extended lambda is such a case as its name may be part of a device kernel function, e.g., the extended lambda is used as a template argument and etc. Thus, we need to force ODR for extended lambdas as they are referenced in both device- and host-side TUs. Furthermore, if a extended lambda is nested in other (extended or not) lambdas, those lambdas are required to follow ODR naming as well. This patch revises the current lambda mangle numbering to force ODR from an extended lambda to all its parent lambdas. - On the other side, the aforementioned ODR naming should not change those lambdas' original linkages, i.e., we cannot replace the original `internal` with `linkonce_odr`; otherwise, we may violate ODR in general. This patch introduces a new field `HasKnownInternalLinkage` in lambda data to decouple the current linkage calculation based on mangling number assigned. [1]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html Reviewers: tra, rsmith, yaxunl, martong, shafik Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D68818 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@375309 91177308-0d34-0410-b5e6-96231b3b80d8
author: Michael Liao <michael.hliao@gmail.com> 2019-10-19 00:15:19 +0000
committer: Michael Liao <michael.hliao@gmail.com> 2019-10-19 00:15:19 +0000
commit: 3e0834ac1faf66710d37619919234f28a3e43ffe (patch)
tree: 115375ef0eb7b5ace3a9ed4b4f522a81a74e78a2 /test
parent: e0c6af7e0c294c0d322c6ccab6757dda0fc5db81 (diff)
download: clang-3e0834ac1faf66710d37619919234f28a3e43ffe.tar.gz
1 files changed, 39 insertions, 0 deletions
diff --git a/test/CodeGenCUDA/unnamed-types.cu b/test/CodeGenCUDA/unnamed-types.cu
new file mode 100644
index 0000000000..81557817e4
--- /dev/null
+++ b/test/CodeGenCUDA/unnamed-types.cu
@@ -0,0 +1,39 @@
+// RUN: %clang_cc1 -std=c++11 -x hip -triple x86_64-linux-gnu -aux-triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s --check-prefix=HOST
+// RUN: %clang_cc1 -std=c++11 -x hip -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm %s -o - | FileCheck %s --check-prefix=DEVICE
+
+#include "Inputs/cuda.h"
+
+// HOST: @0 = private unnamed_addr constant [43 x i8] c"_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_\00", align 1
+
+__device__ float d0(float x) {
+  return [](float x) { return x + 2.f; }(x);
+}
+
+__device__ float d1(float x) {
+  return [](float x) { return x * 2.f; }(x);
+}
+
+// DEVICE: amdgpu_kernel void @_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_(
+template <typename F>
+__global__ void k0(float *p, F f) {
+  p[0] = f(p[0]) + d0(p[1]) + d1(p[2]);
+}
+
+void f0(float *p) {
+  [](float *p) {
+    *p = 1.f;
+  }(p);
+}
+
+// The inner/outer lambdas are required to be mangled following ODR but their
+// linkages are still required to keep the original `internal` linkage.
+
+// HOST: define internal void @_ZZ2f1PfENKUlS_E_clES_(
+// DEVICE: define internal float @_ZZZ2f1PfENKUlS_E_clES_ENKUlfE_clEf(
+void f1(float *p) {
+  [](float *p) {
+    k0<<<1,1>>>(p, [] __device__ (float x) { return x + 1.f; });
+  }(p);
+}
+// HOST: @__hip_register_globals
+// HOST: __hipRegisterFunction{{.*}}@_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_{{.*}}@0
author	Michael Liao <michael.hliao@gmail.com>	2019-10-19 00:15:19 +0000
committer	Michael Liao <michael.hliao@gmail.com>	2019-10-19 00:15:19 +0000
commit	3e0834ac1faf66710d37619919234f28a3e43ffe (patch)
tree	115375ef0eb7b5ace3a9ed4b4f522a81a74e78a2 /test
parent	e0c6af7e0c294c0d322c6ccab6757dda0fc5db81 (diff)
download	clang-3e0834ac1faf66710d37619919234f28a3e43ffe.tar.gz