author | Andrew Brower <monofuel@japura.net> | 2024-07-08 05:17:04 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-07-08 11:17:04 +0200 |
commit | dc46350fa192faf97f5d6ea3a05e41358a10a069 (patch) | |
tree | a3484b79f4049d5f06c676261a09eb195b0574c0 /doc | |
parent | 3f5016f60e3ce7fd5c2883cf65dbbc9fbdbf9300 (diff) | |
download | Nim-dc46350fa192faf97f5d6ea3a05e41358a10a069.tar.gz |
Add support for nvcc & hipcc (cuda/rocm) (#23805)
I've been working on making some basic CUDA examples work, both with CUDA (nvcc) and with AMD HIP (hipcc): https://github.com/monofuel/hippo

- hipcc is just a drop-in replacement for clang and works out of the box with clang settings in Nim. hipcc is capable of compiling for AMD ROCm or to CUDA, depending on how HIP_PLATFORM is set.
- nvcc is a little quirky. We can use `-x cu` to tell it to handle Nim's `.cpp` files as if they were `.cu` files. nvcc expects all backend compiler flags to be wrapped with a special `-Xcompiler=""` flag when compiling and also when linking.

I manually tested on a Linux desktop with an AMD GPU and a laptop with an NVIDIA GPU.
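The flag wrapping described for nvcc can be illustrated with a small sketch. This is not the compiler's actual implementation: `wrapForNvcc` is a hypothetical helper, and the flags and the generated file name `hello_kernel.nim.cpp` are placeholders chosen for illustration.

```nim
import std/strutils

proc wrapForNvcc(hostFlags: openArray[string]): string =
  ## Hypothetical helper: packs host-compiler flags into one
  ## -Xcompiler="..." argument, the form the commit message says nvcc expects.
  if hostFlags.len == 0:
    return ""
  "-Xcompiler=\"" & hostFlags.join(" ") & "\""

# Assumed example flags; `-x cu` makes nvcc treat the .cpp input as CUDA source.
echo "nvcc -x cu ", wrapForNvcc(["-O2", "-fPIC"]), " -c hello_kernel.nim.cpp"
```

The same wrapping would apply to the link step, since the message notes that nvcc needs it both when compiling and when linking.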
Diffstat (limited to 'doc')
-rw-r--r-- | doc/nimc.md | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/doc/nimc.md b/doc/nimc.md
index 25acf31e8..38558454b 100644
--- a/doc/nimc.md
+++ b/doc/nimc.md
@@ -481,6 +481,28 @@ They are:
 5. nl_types. No headers for this.
 6. As mmap is not supported, the nimAllocPagesViaMalloc option has to be used.

+GPU Compilation
+===============
+
+Compiling for GPU computation can be achieved with `--cc:nvcc` for CUDA with nvcc, or with `--cc:hipcc` for AMD GPUs with HIP. Both compilers require building for C++ with `nim cpp`.
+
+Here's a very simple CUDA kernel example using emit, which can be compiled with `nim cpp --cc:nvcc --define:"useMalloc" hello_kernel.nim` assuming you have the CUDA toolkit installed.
+
+```nim
+{.emit: """
+__global__ void add(int a, int b) {
+    int c;
+    c = a + b;
+}
+""".}
+
+proc main() =
+  {.emit: """
+  add<<<1,1>>>(2,7);
+  """.}
+
+main()
+```

 DLL generation
 ==============
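The kernel in the documented example computes `a + b` but discards the result. A variant that copies the sum back to the host might look like the sketch below. It is untested here and rests on several assumptions: the CUDA toolkit and an NVIDIA GPU are available, the file is built with the documented command (`nim cpp --cc:nvcc --define:"useMalloc"`), and `addInto` and `addOnGpu` are hypothetical names that are not part of the commit.

```nim
{.emit: """
// Kernel: stores a + b in device memory so the host can read it back.
__global__ void addInto(int a, int b, int *out) {
  *out = a + b;
}

// Hypothetical host-side wrapper around the CUDA runtime calls.
// Returns -1 if the device allocation fails.
int addOnGpu(int a, int b) {
  int *devOut = nullptr;
  int hostOut = 0;
  if (cudaMalloc(&devOut, sizeof(int)) != cudaSuccess) return -1;
  addInto<<<1, 1>>>(a, b, devOut);
  // cudaMemcpy waits for the preceding kernel launch to finish.
  cudaMemcpy(&hostOut, devOut, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(devOut);
  return hostOut;
}
""".}

# `nodecl` stops Nim from generating its own prototype for the emitted function.
proc addOnGpu(a, b: cint): cint {.importc, nodecl.}

echo addOnGpu(2, 7)
```

Keeping the runtime calls inside the emitted C++ wrapper means the Nim side only sees an ordinary function, so no CUDA types need to be declared in Nim.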