libgomp: Fix hang when profiling OpenACC programs with CUDA 9.0 nvprof
author Kwok Cheung Yeung <kcy@codesourcery.com>
Tue, 14 Jul 2020 17:31:35 +0000 (10:31 -0700)
committer Kwok Cheung Yeung <kcy@codesourcery.com>
Tue, 14 Jul 2020 17:31:35 +0000 (10:31 -0700)
commit b52643ab9004ba8ecea06a399885fe1e04183eda
tree fe2f328c0be9fc9ee8eba66196c8202f8132102c
parent bae45b8be57b2a2c22bf45f3eeb1118c328ad028

The version of nvprof in CUDA 9.0 causes a hang when used to profile an
OpenACC program.  This is because nvprof calls acc_get_device_type from
a callback that is invoked during device initialization; the call then
attempts to acquire acc_device_lock while that lock is already held,
resulting in deadlock.  This patch works around the issue by returning
acc_device_none from acc_get_device_type, without attempting to acquire
the lock, when initialization has not completed yet.
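
The workaround can be illustrated with a small standalone sketch.  This is
not the libgomp code: the types, the device_lock mutex (standing in for
acc_device_lock), init_device, get_device_type and profiler_callback are
all hypothetical names chosen for illustration, and only the general idea
(record the initializing thread, and return "none" early when that same
thread asks for the device type) is taken from the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the libgomp definitions.  */
typedef enum { DEV_UNSET = -1, DEV_NONE = 0, DEV_GPU = 1 } device_t;
enum init_state { UNINITIALIZED, INITIALIZING, INITIALIZED };

/* Plays the role of acc_device_lock (non-recursive).  */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;
/* Protects init_state and init_thread.  */
static pthread_mutex_t init_state_lock = PTHREAD_MUTEX_INITIALIZER;
static enum init_state init_state = UNINITIALIZED;
static pthread_t init_thread;

static device_t current_device = DEV_NONE;
static device_t seen_in_callback = DEV_UNSET;

/* True if the calling thread is the one currently initializing.  */
static bool
self_initializing_p (void)
{
  pthread_mutex_lock (&init_state_lock);
  bool res = (init_state == INITIALIZING
              && pthread_equal (init_thread, pthread_self ()));
  pthread_mutex_unlock (&init_state_lock);
  return res;
}

device_t
get_device_type (void)
{
  /* Taking device_lock here would self-deadlock, so bail out early.  */
  if (self_initializing_p ())
    return DEV_NONE;

  pthread_mutex_lock (&device_lock);
  device_t res = current_device;
  pthread_mutex_unlock (&device_lock);
  return res;
}

/* Stands in for a profiler callback fired during initialization.  */
static void
profiler_callback (void)
{
  seen_in_callback = get_device_type ();
}

void
init_device (void)
{
  pthread_mutex_lock (&device_lock);

  pthread_mutex_lock (&init_state_lock);
  init_thread = pthread_self ();
  init_state = INITIALIZING;
  pthread_mutex_unlock (&init_state_lock);

  profiler_callback ();        /* Runs while device_lock is held.  */
  current_device = DEV_GPU;

  pthread_mutex_lock (&init_state_lock);
  init_state = INITIALIZED;
  pthread_mutex_unlock (&init_state_lock);

  pthread_mutex_unlock (&device_lock);
}

#ifdef SKETCH_MAIN
int
main (void)
{
  init_device ();
  assert (seen_in_callback == DEV_NONE);
  assert (get_device_type () == DEV_GPU);
  puts ("no deadlock");
  return 0;
}
#endif
```

Built with something like "cc -pthread -DSKETCH_MAIN sketch.c", the program
should finish rather than hang.  Removing the early return in
get_device_type makes the callback try to lock device_lock a second time
from the same thread, which is the situation the commit message describes.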

2020-07-14  Tom de Vries  <tom@codesourcery.com>
    Cesar Philippidis  <cesar@codesourcery.com>
    Thomas Schwinge  <thomas@codesourcery.com>
    Kwok Cheung Yeung  <kcy@codesourcery.com>

libgomp/
* oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread):
New variables.
(acc_init_1): Set acc_init_thread to pthread_self ().  Set
acc_init_state to initializing at the start, and to initialized at the
end.
(self_initializing_p): New function.
(acc_get_device_type): Return acc_device_none if called by thread that
is currently executing acc_init_1.
* libgomp.texi (acc_get_device_type): Update documentation.
(Implementation Status and Implementation-Defined Behavior): Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-2.c: New.
libgomp/libgomp.texi
libgomp/oacc-init.c
libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-init-2.c [new file with mode: 0644]