[openmp] Use z_Linux_asm.S to provide __kmp_invoke_microtask with Clang for Windows/aarch64
When building for Windows aarch64, and not using the actual MSVC,
we can assemble gnu assembly files just fine, and the existing
correct implementation of __kmp_invoke_microtask is fully usable.
The C implementation of __kmp_invoke_microtask in
z_Windows_NT-586_util.cpp relies on unguaranteed assumptions about
the compiler behaviour - it does work currently on MSVC, but doesn't
necessarily on other compilers. That function uses an alloca to pass
parameters on the stack to the called functions.
There's no guarantee that the buffer allocated by alloca is exactly
at the bottom of the stack when doing the call; the compiler might
have left space for extra things to save on the stack there.
Additionally, when compiled with Clang with optimization, Clang
optimizes out the alloca and memcpy entirely. On the C language
level, they don't have any visible effect outside of the function
and thus can be omitted entirely.
This fixes calling microtasks with more than 6 parameters, in
builds for Windows/aarch64 with Clang.
Differential Revision: https://reviews.llvm.org/D137827