review.tizen.org Git - platform/upstream/llvm.git/commit

author	Johannes Doerfert <johannes@jdoerfert.de>
	Sun, 16 Aug 2020 15:49:37 +0000 (10:49 -0500)
committer	Johannes Doerfert <johannes@jdoerfert.de>
	Sun, 16 Aug 2020 19:38:33 +0000 (14:38 -0500)
commit	aa27cfc1e7d7456325e951a4ba3ced405027f7d0
tree	d83bd80ef78294c169876b31b1f323e6cca6da5c
parent	95a25e4c3203f35e9f57f9fac620b4a21bffd6e1

[OpenMP][CUDA] Cache the maximal number of threads per block (per kernel)

Instead of calling `cuFuncGetAttribute` with
`CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` for every kernel invocation,
we can do it for the first one and cache the result as part of the
`KernelInfo` struct. The only functional change is that we now expect
`cuFuncGetAttribute` to succeed, and propagate the error if it does not.
Silently ignoring errors seems like a slippery slope...
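
A minimal sketch of the caching pattern described above, not the actual
patch. To keep it runnable without a GPU, the driver query is replaced by
a hypothetical stub, `fakeQueryMaxThreadsPerBlock`. The real code would call
`cuFuncGetAttribute` with `CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK`. The
field name `MaxThreadsPerBlock`, the `0` sentinel, the helper
`getMaxThreadsPerBlock`, and the `bool` error convention are likewise
assumptions made for illustration.

```cpp
#include <cassert>

// Counts how often the (stubbed) driver is queried, so the caching effect
// is observable. Illustration only.
static int NumDriverQueries = 0;

// Hypothetical stand-in for:
//   cuFuncGetAttribute(&Value, CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK, Func)
// Returns true on success. `Fail` simulates a driver error.
static bool fakeQueryMaxThreadsPerBlock(int *Value, bool Fail) {
  ++NumDriverQueries;
  if (Fail)
    return false;
  *Value = 1024; // made-up limit for the sketch
  return true;
}

// Per-kernel record. 0 is an assumed sentinel meaning "not queried yet".
struct KernelInfo {
  int MaxThreadsPerBlock = 0;
};

// Query the driver only on the first use of a kernel, then reuse the cached
// value. A failed query is propagated to the caller instead of being ignored.
static bool getMaxThreadsPerBlock(KernelInfo &KI, int &Out,
                                  bool Fail = false) {
  if (KI.MaxThreadsPerBlock == 0) {
    int Value = 0;
    if (!fakeQueryMaxThreadsPerBlock(&Value, Fail))
      return false; // propagate the error, leave the cache untouched
    KI.MaxThreadsPerBlock = Value;
  }
  Out = KI.MaxThreadsPerBlock;
  return true;
}
```

Calling `getMaxThreadsPerBlock` twice on the same `KernelInfo` reaches the
stub only once. A failing first query returns `false` and leaves the cache
empty, so a later launch would retry rather than reuse a bogus value.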

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D86038