elf: Fix data races in pthread_create and TLS access [BZ #19329]
DTV setup at thread creation (_dl_allocate_tls_init) is changed
to take the dlopen lock, GL(dl_load_lock). Avoiding data races
here without locks would require design changes: the map that is
accessed for static TLS initialization here may be concurrently
freed by dlclose. That use after free may be solved by only
locking around static TLS setup or by ensuring dlclose does not
free modules with static TLS, however currently every link map
with TLS has to be accessed at least to see if it needs static
TLS. And even if that's solved, still a lot of atomics would be
needed to synchronize DTV related globals without a lock. So fix
both bug 19329 and bug 27111 with a lock that prevents DTV setup
running concurrently with dlopen or dlclose.
_dl_update_slotinfo at TLS access still does not use any locks
so CONCURRENCY NOTES are added to explain the synchronization.
The early exit from the slotinfo walk when max_modid is reached
is not strictly necessary, but does not hurt either.
An incorrect acquire load was removed from _dl_resize_dtv: it
did not synchronize with any release store or fence and
synchronization is now handled separately at thread creation
and TLS access time.
There are still a number of racy read accesses to globals that
will be changed to relaxed MO atomics in a followup patch. This
should not introduce regressions compared to existing behaviour
and avoid cluttering the main part of the fix.
Not all TLS access related data races got fixed here: there are
additional races at lazy tlsdesc relocations see bug 27137.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>