Fix implementation of OMP_THREAD_LIMIT
This change fixes the implementation of OMP_THREAD_LIMIT. The implementation of
this previously was not restricted to a contention group (but it should be,
according to the spec), and this is fixed here. A field is added to root thread
to store a counter of the threads in the contention group. An extra check is
added when reserving threads for a parallel region that checks this variable and
compares to threadlimit-var, which is implemented as a new global variable,
kmp_cg_max_nth. Associated settings changes were also made, and clean up of
comments that referred to OMP_THREAD_LIMIT, but should refer to the new
KMP_DEVICE_THREAD_LIMIT (added in an earlier patch).
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D35912
llvm-svn: 309319