From f190ed4355c246d1835bb22e3bc83ed24c77c83c Mon Sep 17 00:00:00 2001
From: Jingyue Wu
Date: Wed, 30 Mar 2016 05:05:40 +0000
Subject: [PATCH] [docs] Add gpucc publication and tutorial.

llvm-svn: 264839
---
 llvm/docs/CompileCudaWithLLVM.rst | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/llvm/docs/CompileCudaWithLLVM.rst b/llvm/docs/CompileCudaWithLLVM.rst
index 8e7ed0d..5ed3f14 100644
--- a/llvm/docs/CompileCudaWithLLVM.rst
+++ b/llvm/docs/CompileCudaWithLLVM.rst
@@ -53,7 +53,7 @@ How to Compile CUDA C/C++ with LLVM
 ===================================
 
 We assume you have installed the CUDA driver and runtime. Consult the `NVIDIA
-CUDA installation Guide
+CUDA installation guide
 `_ if you have not.
 
@@ -167,10 +167,9 @@ customizable target-independent optimization pipeline.
   straight-line scalar optimizations `_.
 
 * **Inferring memory spaces**. `This optimization
-  `_
+  `_
   infers the memory space of an address so that the backend can emit faster
-  special loads and stores from it. Details can be found in the `design
-  document for memory space inference `_.
+  special loads and stores from it.
 
 * **Aggressive loop unrooling and function inlining**. Loop unrolling and
   function inlining need to be more aggressive for GPUs than for CPUs because
@@ -201,6 +200,19 @@ customizable target-independent optimization pipeline.
   divides in our benchmarks have a divisor and dividend which fit in 32-bits at
   runtime. This optimization provides a fast path for this common case.
 
+Publication
+===========
+
+| `gpucc: An Open-Source GPGPU Compiler `_
+| Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt
+| *Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO 2016)*
+| `Slides for the CGO talk `_
+
+Tutorial
+========
+
+`CGO 2016 gpucc tutorial `_
+
 Obtaining Help
 ==============
 
-- 
2.7.4