Improve regex compiler / source generator for sets (#84370)
* Fix downlevel builds with a project reference to the regex generator
* Improve char class canonicalization for complete and almost empty sets
- Remove categories from a set whose ranges make it already complete (when there's no subtraction). We have code paths that explicitly recognize the Any char class, and these extra categories knock these sets off those fast paths.
- Remove categories from a set where a single char is missing from the ranges, by checking whether that char is contained in the categories. If the char is present, the set can be morphed into Any. If the char isn't present, the categories can be removed and the set becomes a standard NotOne form.
Neither kind of set is likely to be written explicitly by a developer; both result from analysis producing search sets, in particular when alternations or nullable loops are involved.
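
The two canonicalization rules above can be sketched roughly as follows. This is a simplified Python model for illustration only, not the actual RegexCharClass code: the names canonicalize and single_missing_char are hypothetical, a set is modeled as a sorted list of merged inclusive ranges plus a set of two-letter general category names, negation is ignored, and applying the no-subtraction guard to the second rule as well is an assumption.

```python
import unicodedata

LAST_CHAR = 0xFFFF  # .NET chars are UTF-16 code units, so the domain is 0..0xFFFF


def single_missing_char(ranges):
    """Return the one char absent from `ranges`, or None if not exactly one is missing.

    Assumes `ranges` is a sorted list of merged, inclusive (lo, hi) tuples.
    """
    if len(ranges) == 1:
        lo, hi = ranges[0]
        if lo == 1 and hi == LAST_CHAR:
            return 0
        if lo == 0 and hi == LAST_CHAR - 1:
            return LAST_CHAR
    elif len(ranges) == 2:
        (lo1, hi1), (lo2, hi2) = ranges
        if lo1 == 0 and hi2 == LAST_CHAR and lo2 - hi1 == 2:
            return hi1 + 1
    return None


def canonicalize(ranges, categories, has_subtraction=False):
    """Hypothetical model of the two rules; returns (ranges, categories)."""
    if has_subtraction or not categories:
        return ranges, categories

    # Rule 1: the ranges are already complete, so the categories add nothing.
    if ranges == [(0, LAST_CHAR)]:
        return ranges, set()

    # Rule 2: exactly one char is missing from the ranges.
    missing = single_missing_char(ranges)
    if missing is not None:
        if unicodedata.category(chr(missing)) in categories:
            # The categories supply the missing char: the set is Any.
            return [(0, LAST_CHAR)], set()
        # The categories match nothing new: drop them, leaving a NotOne form.
        return ranges, set()

    return ranges, categories
```

For example, a set covering everything except 'A' combined with the Lu (uppercase letter) category becomes Any, while the same ranges combined with Nd (decimal digit) become a plain "anything except 'A'" set.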
Also fixed the textual description of sets that both contain the last character (\uFFFF) and have categories. We were sometimes skipping the first category in this case. This is only relevant to the source generator, as these descriptions are output in comments.
* Avoid using an IndexOf for the Any set
We needn't search for anything, as everything matches.
* Improve regex source gen IndexOfAny naming for Unicode categories
When we're otherwise unable to come up with a good name for the custom IndexOfAny helper, if the set is just a handful of UnicodeCategory values, derive a name from those categories.
* Reduce RegexCompiler cost of using IndexOfAnyValues
With the source generator, each IndexOfAnyValues is stored in its own static readonly field. This makes it cheap to access and allows the JIT to devirtualize calls to it.
With RegexCompiler, we use a DynamicMethod and thus can't introduce new static fields, so instead we maintain an array of IndexOfAnyValues. That means that every time we need one, we're loading the object out of the array. This both incurs a bounds check and prevents the call from being devirtualized.
This commit changes the implementation to avoid the bounds check and to also enable devirtualization.