[CUDA] Expand upon --cuda-gpu-arch flag in CompileCudaWithLLVM doc.

author Justin Lebar <jlebar@google.com>

Wed, 7 Sep 2016 20:09:46 +0000 (20:09 +0000)

committer Justin Lebar <jlebar@google.com>

Wed, 7 Sep 2016 20:09:46 +0000 (20:09 +0000)
diff --git a/llvm/docs/CompileCudaWithLLVM.rst b/llvm/docs/CompileCudaWithLLVM.rst

index f57839c..85aab5d 100644
--- a/llvm/docs/CompileCudaWithLLVM.rst
+++ b/llvm/docs/CompileCudaWithLLVM.rst
@@ -119,6 +119,13 @@ your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want
  to run your program on a GPU with compute capability of 3.5, you should specify
  ``--cuda-gpu-arch=sm_35``.
  
+Note: You cannot pass ``compute_XX`` as an argument to ``--cuda-gpu-arch``;
+only ``sm_XX`` is currently supported.  However, clang always includes PTX in
+its binaries, so e.g. a binary compiled with ``--cuda-gpu-arch=sm_30`` would be
+forwards-compatible with e.g. ``sm_35`` GPUs.
+
+You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs.
+
  Detecting clang vs NVCC
  =======================
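
The note added in this hunk can be illustrated with a minimal sketch that passes ``--cuda-gpu-arch`` once per target. The source file name ``axpy.cu``, the output name ``axpy``, and the arch list are placeholders assumed for illustration, not part of this commit. The script only prints the compile command rather than running it, and it leaves out the CUDA library link flags a real build would need.

```shell
# Hypothetical inputs: the file name and arch list are placeholders.
src="axpy.cu"
archs="sm_30 sm_35"

# Repeat --cuda-gpu-arch once per target. Per the note above, only
# sm_XX values are accepted; compute_XX is rejected here as well.
arch_flags=""
for arch in $archs; do
  case "$arch" in
    sm_*) arch_flags="$arch_flags --cuda-gpu-arch=$arch" ;;
    *) echo "unsupported arch: $arch (use sm_XX)" >&2; exit 1 ;;
  esac
done

# Print the command instead of running it; link flags are omitted.
cmd="clang++ $src -o axpy$arch_flags"
echo "$cmd"
```

Because clang also embeds PTX, a binary built only for ``sm_30`` should still run on an ``sm_35`` GPU; listing both archs additionally ships code compiled specifically for each target.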