Make BitScanForward/BitScanForward64 PAL wrappers branchless. (dotnet/coreclr#20412)
The BitScanForward/BitScanForward64 wrapper functions from the PAL and
gcenv have been modified so they're faster (and branchless), while also
adhering more closely to the behavior of the MSVC intrinsics.
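As a rough illustration of what a branchless wrapper can look like, here is a
minimal sketch. It is not the code from this change. It assumes a GCC or Clang
toolchain (for the __builtin_ffs and __builtin_ffsll builtins), and the names
ScanForward32 and ScanForward64 are hypothetical. The index is written
unconditionally, and the return value depends only on whether the mask is zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper names; assumes GCC/Clang builtins are available.
inline unsigned char ScanForward32(std::uint32_t* index, std::uint32_t mask)
{
    assert(index != nullptr); // debug-only precondition; compiled out under NDEBUG

    // __builtin_ffs returns 1 + the position of the lowest set bit, or 0 when
    // the input is zero, so no zero check is needed before calling it.
    int pos = __builtin_ffs(static_cast<int>(mask));

    // Written unconditionally. For a zero mask this stores an all-ones value,
    // which callers must not rely on (MSVC documents the index as undefined
    // in that case).
    *index = static_cast<std::uint32_t>(pos - 1);

    // A comparison result, not a conditional jump.
    return mask != 0;
}

inline unsigned char ScanForward64(std::uint32_t* index, std::uint64_t mask)
{
    assert(index != nullptr); // debug-only precondition; compiled out under NDEBUG

    // Same idea using the 64-bit builtin.
    int pos = __builtin_ffsll(static_cast<long long>(mask));
    *index = static_cast<std::uint32_t>(pos - 1);
    return mask != 0;
}
```

Callers are expected to check the return value before reading the index, as
with the MSVC intrinsics.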
Use _BitScanForward64 when targeting 64-bit Windows.
The _WIN32 macro is always defined by MSVC, even when targeting 64-bit
versions of Windows. Use the _WIN64 macro instead to check whether the
build is targeting 64-bit Windows, and if so, use the _BitScanForward64
intrinsic for the BitScanForward64 wrapper instead of the 32-bit-based
fallback.
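The macro selection described above can be sketched as follows. This is an
illustrative outline only, not the code from this change: the wrapper name
ScanForward64Dispatch is hypothetical, and the non-Windows branch assumes a
GCC or Clang toolchain. The point is the ordering: _WIN64 is tested first,
because _WIN32 is also defined on 64-bit Windows and would otherwise select
the 32-bit-based fallback.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_WIN32)
#include <intrin.h> // _BitScanForward / _BitScanForward64
#endif

// Hypothetical wrapper name, used only for this sketch.
inline unsigned char ScanForward64Dispatch(unsigned long* index, std::uint64_t mask)
{
    assert(index != nullptr); // debug-only precondition; compiled out under NDEBUG

#if defined(_WIN64)
    // 64-bit Windows: the 64-bit intrinsic is available, so use it directly.
    return _BitScanForward64(index, mask);
#elif defined(_WIN32)
    // 32-bit Windows: scan the low half, then the high half.
    if (_BitScanForward(index, static_cast<unsigned long>(mask)))
        return 1;
    if (_BitScanForward(index, static_cast<unsigned long>(mask >> 32)))
    {
        *index += 32; // bit position is relative to the high half
        return 1;
    }
    return 0;
#else
    // Non-Windows (assumes GCC/Clang): ffsll returns 0 for a zero mask.
    *index = static_cast<unsigned long>(
        __builtin_ffsll(static_cast<long long>(mask)) - 1);
    return mask != 0;
#endif
}
```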
Commit migrated from https://github.com/dotnet/coreclr/commit/6fe7effad7fddf8d5dc0b3ac3d5be5ec80e158ff