Fix NonBacktracking quadratic behavior with deep loops (#66038)
authorOlli Saarikivi <olsaarik@microsoft.com>
Tue, 8 Mar 2022 21:30:46 +0000 (13:30 -0800)
committerGitHub <noreply@github.com>
Tue, 8 Mar 2022 21:30:46 +0000 (13:30 -0800)
commit78e071de86f8d18e2401730de9220fde7de0d838
treeba19e6383daba3e69b305250d0ada683740b133f
parent906146aba64a9c14bcc77cffa5f540ead31cd4a7
Fix NonBacktracking quadratic behavior with deep loops (#66038)

Derivative construction now accumulates a list of concatenands and
builds concatenations making up a part of a derivative just once.
Repeated concatenation to the right side in an interaction of the
derivation rule for concatenations and loops was causing quadratic
rebuilding of concatenations.

To implement this fix TransitionRegex is no longer used and instead both
normal and effectful derivative variants use a monolithic base function
that implements the rules for regex derivatives. This also gets rid of
the additional inefficiency of allocating TransitionRegex trees.

Both kinds of derivatives now support simulating backtracking behavior
and always use OrderedOr nodes to maintain transition ordering. The
second reverse phase of matching, however, needs to not use backtracking
simulation. A new DisableBacktrackingSimulation is used to signal this.

Other notable fixes:
The eager concept in derivatives is renamed to simulateBacktracking.
Put TransitionRegex and SymbolicNFA behind DEBUG.
Fix ApplyEffects calls to not create closures.
Fix OrderedOr to always deduplicate: when the new element was on the
 left deduplication was skipped.
Improve CaptureStart/End ToString to not look like parentheses and
 include the group number.
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/DfaMatchingState.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicNFA.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexBuilder.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexKind.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexMatcher.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexNode.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/SymbolicRegexSet.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/TransitionRegex.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Symbolic/TransitionRegexKind.cs
src/libraries/System.Text.RegularExpressions/src/System/Threading/StackHelper.cs