Generate efficient code for rotation patterns.
This change adds code to recognize rotation idioms and generate efficient instructions for them.
Two new operators are added: GT_ROL and GT_ROR.
The following patterns are recognized:
(x << c1) | (x >>> c2) => x rol c1
(x >>> c1) | (x << c2) => x ror c1
where c1 and c2 are constant and c1 + c2 == bitsize(x)
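The constant-count case can be checked with a small standalone sketch. This is illustrative C++ rather than code from the change: the helper names are hypothetical, the operand is assumed to be 32 bits wide, and c1 == 5, c2 == 27 are arbitrary example constants. It shows that when the two counts sum to the bit size, the expression rotates left by the left-shift count, or equivalently rotates right by the right-shift count.

```cpp
#include <cassert>
#include <cstdint>

// Reference rotate-left, one bit at a time (illustrative helper, not JIT code).
inline std::uint32_t rol_ref(std::uint32_t x, unsigned n) {
    for (unsigned i = 0; i < n % 32u; ++i)
        x = (x << 1) | (x >> 31);  // the top bit wraps around to bit 0
    return x;
}

// The two constant-count idioms with c1 == 5 and c2 == 27 (5 + 27 == 32).
inline std::uint32_t shl5_or_shr27(std::uint32_t x) { return (x << 5) | (x >> 27); }
inline std::uint32_t shr5_or_shl27(std::uint32_t x) { return (x >> 5) | (x << 27); }

// Compares both idioms against the reference for one input value.
inline void check_constant_patterns(std::uint32_t x) {
    // Left shift by 5 paired with right shift by 27: rotate left by 5.
    assert(shl5_or_shr27(x) == rol_ref(x, 5));
    // Right shift by 5 paired with left shift by 27: rotate right by 5,
    // which is the same as rotate left by 27.
    assert(shr5_or_shl27(x) == rol_ref(x, 27));
}
```

Calling `check_constant_patterns` with a few values (for example `0x80000001u`) exercises both forms.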
(x << y) | (x >>> (N - y)) => x rol y
(x >>> y) | (x << (N - y)) => x ror y
where N == bitsize(x)
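The variable-count form relies on shift counts being reduced modulo the operand width, as C# does for 32-bit operands (only the low 5 bits of the count are used). A rough C++ model of that behavior, assuming N == 32 and using hypothetical helper names, suggests why the rewrite is safe even for y == 0 or y >= 32:

```cpp
#include <cassert>
#include <cstdint>

// Shifts that keep only the low 5 bits of the count, modeling C# 32-bit shifts.
// (A plain C++ shift by 32 or more would be undefined behavior.)
inline std::uint32_t shl(std::uint32_t x, std::uint32_t c) { return x << (c & 31u); }
inline std::uint32_t shr(std::uint32_t x, std::uint32_t c) { return x >> (c & 31u); }

// Variable-count idioms with N == 32.
inline std::uint32_t rol_idiom(std::uint32_t x, std::uint32_t y) {
    return shl(x, y) | shr(x, 32u - y);
}
inline std::uint32_t ror_idiom(std::uint32_t x, std::uint32_t y) {
    return shr(x, y) | shl(x, 32u - y);
}

// Bit-at-a-time reference rotations (illustrative helpers).
inline std::uint32_t rol_bits(std::uint32_t x, std::uint32_t n) {
    for (std::uint32_t i = 0; i < n % 32u; ++i) x = (x << 1) | (x >> 31);
    return x;
}
inline std::uint32_t ror_bits(std::uint32_t x, std::uint32_t n) {
    for (std::uint32_t i = 0; i < n % 32u; ++i) x = (x >> 1) | (x << 31);
    return x;
}

// Checks the idioms for every count in [0, 63], including 0 and 32.
inline void check_variable_patterns(std::uint32_t x) {
    for (std::uint32_t y = 0; y < 64u; ++y) {
        assert(rol_idiom(x, y) == rol_bits(x, y));
        assert(ror_idiom(x, y) == ror_bits(x, y));
    }
}
```

With y == 0 both shifts in the model degenerate to a shift by zero, so the expression evaluates to x, which matches a rotation by zero.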
(x << (y & M1)) | (x >>> ((N - y) & M2)) => x rol y
(x >>> (y & M1)) | (x << ((N - y) & M2)) => x ror y
where N == bitsize(x),
      (M1 & (N - 1)) == N - 1, and
      (M2 & (N - 1)) == N - 1
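The mask condition says that a mask may be ignored only if it preserves every bit of the count that the shift actually uses, i.e. the low log2(N) bits. The sketch below illustrates this under the same assumptions as before (N == 32, C#-style count reduction, hypothetical helper names, masks applied to the shift counts as in `x << (y & M1)`); it is not the JIT's implementation.

```cpp
#include <cassert>
#include <cstdint>

// True when mask m keeps every bit of a shift count in [0, n - 1].
// n is the operand width in bits and is assumed to be a power of two.
inline bool mask_keeps_count(std::uint32_t m, std::uint32_t n) {
    return (m & (n - 1u)) == n - 1u;
}

// 32-bit shifts that use only the low 5 bits of the count (C#-style model).
inline std::uint32_t shl(std::uint32_t x, std::uint32_t c) { return x << (c & 31u); }
inline std::uint32_t shr(std::uint32_t x, std::uint32_t c) { return x >> (c & 31u); }

// Masked rotate-left idiom for N == 32 with caller-chosen masks.
inline std::uint32_t masked_rol(std::uint32_t x, std::uint32_t y,
                                std::uint32_t m1, std::uint32_t m2) {
    return shl(x, y & m1) | shr(x, (32u - y) & m2);
}

inline void check_masked_pattern() {
    const std::uint32_t x = 0x12345678u;
    // 0x1F and 0xFF both cover the low 5 bits, so the result is a rotation by 16.
    assert(mask_keeps_count(0x1Fu, 32u) && mask_keeps_count(0xFFu, 32u));
    assert(masked_rol(x, 16u, 0x1Fu, 0xFFu) == 0x56781234u);
    // 0x0F drops bit 4 of the count: both shift counts become 0 and the
    // expression evaluates to x, which is not a rotation by 16.
    assert(!mask_keeps_count(0x0Fu, 32u));
    assert(masked_rol(x, 16u, 0x0Fu, 0x0Fu) == x);
}
```

A mask such as 0x0F fails the condition, and the example shows that treating it as a rotation would change the result.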
For a simple benchmark with 4 rotation patterns in a tight loop,
time goes from 7.324 to 2.600 (a 2.8x speedup).
Rotations found and optimized in mscorlib:
System.Security.Cryptography.SHA256Managed::RotateRight
System.Security.Cryptography.SHA384Managed::RotateRight
System.Security.Cryptography.SHA512Managed::RotateRight
System.Security.Cryptography.RIPEMD160Managed::MDTransform (320 instances!)
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol1
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol5
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol30
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Drain
(9 instances of Sha1ForNonSecretPurposes::Rol* inlined)
Closes #1619.
21 files changed: