author | Artem Belevich <tra@google.com> | 2017-09-07 18:14:32 +0000 |
---|---|---|
committer | Artem Belevich <tra@google.com> | 2017-09-07 18:14:32 +0000 |
commit | 6d4cb407f117d080e04cf6b8ca200ee01c7b502f | |
tree | f1c5487cceedd6e0475d2fd7593fd5f05523fe35 /lib/Headers/__clang_cuda_runtime_wrapper.h | |
parent | 82bebd88990fdf6c635b85522c366f1a3c535e24 | |
[CUDA] Added rudimentary support for CUDA-9 and sm_70.
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On the LLVM side, NVPTX added the sm_70 GPU type, which bumps the required
PTX version to 6.0 but is otherwise equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@312734 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/Headers/__clang_cuda_runtime_wrapper.h')
-rw-r--r-- | lib/Headers/__clang_cuda_runtime_wrapper.h | 9 |
1 files changed, 8 insertions, 1 deletions
diff --git a/lib/Headers/__clang_cuda_runtime_wrapper.h b/lib/Headers/__clang_cuda_runtime_wrapper.h
index 931d44b696..b5b173cd0c 100644
--- a/lib/Headers/__clang_cuda_runtime_wrapper.h
+++ b/lib/Headers/__clang_cuda_runtime_wrapper.h
@@ -62,7 +62,7 @@
 #include "cuda.h"
 #if !defined(CUDA_VERSION)
 #error "cuda.h did not define CUDA_VERSION"
-#elif CUDA_VERSION < 7000 || CUDA_VERSION > 8000
+#elif CUDA_VERSION < 7000 || CUDA_VERSION > 9000
 #error "Unsupported CUDA version!"
 #endif
 
@@ -86,7 +86,11 @@
 #define __COMMON_FUNCTIONS_H__
 
 #undef __CUDACC__
+#if CUDA_VERSION < 9000
 #define __CUDABE__
+#else
+#define __CUDA_LIBDEVICE__
+#endif
 // Disables definitions of device-side runtime support stubs in
 // cuda_device_runtime_api.h
 #include "driver_types.h"
@@ -94,6 +98,7 @@
 #include "host_defines.h"
 
 #undef __CUDABE__
+#undef __CUDA_LIBDEVICE__
 #define __CUDACC__
 #include "cuda_runtime.h"
 
@@ -105,7 +110,9 @@
 #define __nvvm_memcpy(s, d, n, a) __builtin_memcpy(s, d, n)
 #define __nvvm_memset(d, c, n, a) __builtin_memset(d, c, n)
 
+#if CUDA_VERSION < 9000
 #include "crt/device_runtime.h"
+#endif
 #include "crt/host_runtime.h"
 // device_runtime.h defines __cxa_* macros that will conflict with
 // cxxabi.h.
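The version gating in this patch can be summarized with a small stand-alone sketch. It is not part of the commit: `WrapperConfig`, `wrapper_config`, and `check_examples` are hypothetical names introduced only for illustration. The function takes a `CUDA_VERSION`-style integer (1000 * major + 10 * minor, so 9000 means CUDA 9.0) and reports what the patched header would do for it, assuming the three preprocessor conditions shown in the diff are the only ones involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the wrapper header's decisions for one CUDA version.
struct WrapperConfig {
  bool supported;                // false where the header hits "#error Unsupported CUDA version!"
  const char *guard_macro;       // macro defined around driver_types.h / host_defines.h
  bool includes_device_runtime;  // whether "crt/device_runtime.h" is included
};

// Mirrors the three conditions in the patch:
//   CUDA_VERSION < 7000 || CUDA_VERSION > 9000  -> unsupported
//   CUDA_VERSION < 9000                         -> __CUDABE__, else __CUDA_LIBDEVICE__
//   CUDA_VERSION < 9000                         -> include crt/device_runtime.h
constexpr WrapperConfig wrapper_config(int cuda_version) {
  return WrapperConfig{
      cuda_version >= 7000 && cuda_version <= 9000,
      cuda_version < 9000 ? "__CUDABE__" : "__CUDA_LIBDEVICE__",
      cuda_version < 9000};
}

// Illustrative usage: CUDA 8.0 keeps the old behavior, CUDA 9.0 takes the new path.
inline void check_examples() {
  assert(wrapper_config(8000).supported);
  assert(std::string(wrapper_config(8000).guard_macro) == "__CUDABE__");
  assert(wrapper_config(9000).supported);
  assert(!wrapper_config(9000).includes_device_runtime);
}
```

One consequence visible in the sketch is that the upper bound is exactly 9000, so a later point release whose `CUDA_VERSION` exceeds that value would still be rejected by the range check as written.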