Streamline Regex path to matching, and improve Replace/Split (#1950)
author Stephen Toub <stoub@microsoft.com>
Tue, 21 Jan 2020 20:14:00 +0000 (15:14 -0500)
committer GitHub <noreply@github.com>
Tue, 21 Jan 2020 20:14:00 +0000 (15:14 -0500)
* Add ThrowHelper, and clean up some style

Streamlines the main path to the engine: ensures helpers can be inlined, reduces boilerplate, etc.  Since ThrowHelper was already being used in some places, it is now also used in others where doing so didn't require adding additional methods.

Also cleaned style along the way.
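
For readers unfamiliar with the pattern, here is a minimal sketch of a throw helper. The enum members and method bodies are assumptions made for illustration. Only the call shapes `ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array)` and `ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.arrayIndex)` appear in the diff; `CopyDemo` is a hypothetical caller.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical sketch; the ThrowHelper.cs added by this commit may differ.
internal enum ExceptionArgument { array, arrayIndex }

internal static class ThrowHelper
{
    // Keeping the throw out of line keeps the caller small, which helps
    // the JIT decide to inline the caller.
    [DoesNotReturn]
    internal static void ThrowArgumentNullException(ExceptionArgument arg) =>
        throw new ArgumentNullException(arg.ToString());

    [DoesNotReturn]
    internal static void ThrowArgumentOutOfRangeException(ExceptionArgument arg) =>
        throw new ArgumentOutOfRangeException(arg.ToString());
}

// Hypothetical caller, modeled on the CopyTo checks in the diff.
internal static class CopyDemo
{
    internal static void CopyTo(int[] source, int[]? array, int arrayIndex)
    {
        if (array is null)
        {
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
        }
        // One unsigned comparison covers both arrayIndex < 0 and arrayIndex > array.Length.
        if ((uint)arrayIndex > (uint)array.Length)
        {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.arrayIndex);
        }
        if (array.Length - arrayIndex < source.Length)
        {
            throw new ArgumentException("Destination array is too small.");
        }

        Array.Copy(source, 0, array, arrayIndex, source.Length);
    }
}
```

The `[DoesNotReturn]` annotation lets nullable flow analysis treat `array` as non-null after the first check, so no null-forgiving operator is needed.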

* Streamline the Scan loop

The main costs remaining are the virtual calls to FindFirstChar/Go.

* Enumerate matches with a reusable match object in Regex.Replace/Split

The cost of re-entering the scan implementation and creating a new Match object for each NextMatch is measurable, but in these cases we can use an iterator to quickly get back to where we were and reuse the match object.  It adds a couple of interface calls per iteration, as well as an allocation for the enumerator, but it saves more than that in the common case.

* Use SegmentStringBuilder instead of ValueStringBuilder in Replace

A previous .NET Core release saw StringBuilder in Regex.Replace replaced by ValueStringBuilder.  This was done to avoid the allocations from the StringBuilder.  However, in some ways, for large input strings, it made things worse.  StringBuilder is implemented as a linked list of builders, whereas ValueStringBuilder is contiguous memory taken from the ArrayPool.  For large input strings, we start requesting buffers too large for the ArrayPool, and thus when we grow we generate large array allocations that become garbage.

We're better off using a simple ArrayPool-backed struct to store the segments that need to be concatenated, and then just creating the final string from those segments.  For the common case where there are far fewer replacements than the length of the string, this saves a lot of memory and avoids a layer of copying.
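
The idea can be sketched roughly as follows. This is an illustration, not the SegmentStringBuilder added by this commit: the type name `SegmentList`, its members, and the initial capacity of 16 are all assumptions.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch of an ArrayPool-backed segment list.
internal struct SegmentList
{
    private ReadOnlyMemory<char>[] _segments;
    private int _count;
    private int _totalLength;

    public static SegmentList Create() =>
        new SegmentList { _segments = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(16) };

    // Records a slice of existing text; no characters are copied here.
    public void Add(ReadOnlyMemory<char> segment)
    {
        if (_count == _segments.Length)
        {
            // Only the small array of segment descriptors grows, not a char buffer.
            ReadOnlyMemory<char>[] bigger = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(_segments.Length * 2);
            Array.Copy(_segments, bigger, _count);
            ArrayPool<ReadOnlyMemory<char>>.Shared.Return(_segments, clearArray: true);
            _segments = bigger;
        }

        _segments[_count++] = segment;
        _totalLength += segment.Length;
    }

    // Copies every segment once, directly into the final string, then returns the rented array.
    public string ToStringAndReturn()
    {
        ReadOnlyMemory<char>[] segments = _segments;
        int count = _count;

        string result = string.Create(_totalLength, (segments, count), (dest, state) =>
        {
            foreach (ReadOnlyMemory<char> segment in state.segments.AsSpan(0, state.count))
            {
                segment.Span.CopyTo(dest);
                dest = dest.Slice(segment.Length);
            }
        });

        ArrayPool<ReadOnlyMemory<char>>.Shared.Return(segments, clearArray: true);
        this = default;
        return result;
    }
}
```

Because each segment is a `ReadOnlyMemory<char>` slice, the only character copy happens when the final string is created. This is consistent with the diff changing `GetLeftSubstring` and `GetRightSubstring` to return `ReadOnlyMemory<char>` instead of `ReadOnlySpan<char>`, since a span cannot be stored in an array.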

* Replace previously added enumerator with a callback mechanism

The delegate invocation per match is faster than the two interface calls per match, plus we can avoid allocating the enumerator and just pass the relevant state through by ref.
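
As an illustration of the callback shape described above, here is a minimal sketch built on the public `Regex.Match`/`Match.NextMatch` API. The delegate `MatchCallback<TState>` and the helper `ForEachMatch` are hypothetical names. Unlike the internal implementation this commit describes, this sketch does not reuse a single Match object; it only shows one delegate call per match with state passed by ref.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical delegate: returns false to stop the enumeration early.
internal delegate bool MatchCallback<TState>(ref TState state, Match match);

internal static class CallbackDemo
{
    // One delegate invocation per match; no enumerator object is allocated,
    // and the caller's state travels by reference rather than in a closure.
    internal static void ForEachMatch<TState>(Regex regex, string input, ref TState state, MatchCallback<TState> callback)
    {
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            if (!callback(ref state, m))
            {
                break;
            }
        }
    }

    // Example consumer: sums every run of digits found in the input.
    internal static int SumNumbers(string input)
    {
        int total = 0;
        ForEachMatch(new Regex(@"\d+"), input, ref total, (ref int sum, Match m) =>
        {
            sum += int.Parse(m.Value);
            return true;
        });
        return total;
    }
}
```

Because the lambda captures nothing, the accumulated value lives in the caller's local and is updated through the `ref` parameter.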

* Remove unnecessary fields from MatchCollection

* More exception streamlining and style cleanup

* Address PR feedback, and fix merge with latest changes

25 files changed:
src/libraries/System.Text.RegularExpressions/src/System.Text.RegularExpressions.csproj
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Capture.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/CaptureCollection.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/CollectionDebuggerProxy.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Group.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/GroupCollection.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Match.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/MatchCollection.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.Cache.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.Match.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.Replace.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.Split.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.Timeout.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexBoyerMoore.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexCharClass.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexCompiler.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreter.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexReplacement.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexRunner.cs
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/ThrowHelper.cs [new file with mode: 0644]
src/libraries/System.Text.RegularExpressions/src/System/Text/SegmentStringBuilder.cs [new file with mode: 0644]
src/libraries/System.Text.RegularExpressions/src/System/Text/ValueStringBuilder.Reverse.cs [deleted file]
src/libraries/System.Text.RegularExpressions/tests/PrecompiledRegexScenarioTest.cs
src/libraries/System.Text.RegularExpressions/tests/Regex.Replace.Tests.cs

index 3241968439878b74305903e503fb9863c75277e8..457f414bd8c2482749da42a4541d18b4cd3c1c99 100644 (file)
@@ -1,4 +1,4 @@
-<Project Sdk="Microsoft.NET.Sdk">
+<Project Sdk="Microsoft.NET.Sdk">
   <PropertyGroup>
     <AssemblyName>System.Text.RegularExpressions</AssemblyName>
     <DefineConstants>$(DefineConstants);FEATURE_COMPILED</DefineConstants>
@@ -7,8 +7,9 @@
     <Nullable>enable</Nullable>
   </PropertyGroup>
   <ItemGroup>
-    <Compile Include="System\Collections\Generic\ValueListBuilder.Pop.cs" />
     <Compile Include="System\Collections\HashtableExtensions.cs" />
+    <Compile Include="System\Collections\Generic\ValueListBuilder.Pop.cs" />
+    <Compile Include="System\Text\SegmentStringBuilder.cs" />
     <Compile Include="System\Text\RegularExpressions\Capture.cs" />
     <Compile Include="System\Text\RegularExpressions\CaptureCollection.cs" />
     <Compile Include="System\Text\RegularExpressions\CollectionDebuggerProxy.cs" />
@@ -16,8 +17,8 @@
     <Compile Include="System\Text\RegularExpressions\GroupCollection.cs" />
     <Compile Include="System\Text\RegularExpressions\Match.cs" />
     <Compile Include="System\Text\RegularExpressions\MatchCollection.cs" />
-    <Compile Include="System\Text\RegularExpressions\Regex.Cache.cs" />
     <Compile Include="System\Text\RegularExpressions\Regex.cs" />
+    <Compile Include="System\Text\RegularExpressions\Regex.Cache.cs" />
     <Compile Include="System\Text\RegularExpressions\Regex.Match.cs" />
     <Compile Include="System\Text\RegularExpressions\Regex.Replace.cs" />
     <Compile Include="System\Text\RegularExpressions\Regex.Split.cs" />
@@ -41,6 +42,7 @@
     <Compile Include="System\Text\RegularExpressions\RegexRunnerFactory.cs" />
     <Compile Include="System\Text\RegularExpressions\RegexTree.cs" />
     <Compile Include="System\Text\RegularExpressions\RegexWriter.cs" />
+    <Compile Include="System\Text\RegularExpressions\ThrowHelper.cs" />
     <!-- Files that enable compiled feature -->
     <Compile Include="System\Text\RegularExpressions\CompiledRegexRunnerFactory.cs" />
     <Compile Include="System\Text\RegularExpressions\CompiledRegexRunner.cs" />
@@ -56,7 +58,6 @@
     <Compile Include="$(CommonPath)System\Text\ValueStringBuilder.cs">
       <Link>Common\System\Text\ValueStringBuilder.cs</Link>
     </Compile>
-    <Compile Include="System\Text\ValueStringBuilder.Reverse.cs" />
   </ItemGroup>
   <ItemGroup>
     <Reference Include="System.Buffers" />
index 6c5610b02c7088e026a14d238c5303b2125e813c..6cf13391110d914dbd148ba14bc439813a619128 100644 (file)
@@ -2,11 +2,6 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// Capture is just a location/length pair that indicates the
-// location of a regular expression match. A single regexp
-// search may return multiple Capture within each capturing
-// RegexGroup.
-
 namespace System.Text.RegularExpressions
 {
     /// <summary>
@@ -22,20 +17,13 @@ namespace System.Text.RegularExpressions
             Length = length;
         }
 
-        /// <summary>
-        /// Returns the position in the original string where the first character of
-        /// captured substring was found.
-        /// </summary>
+        /// <summary>Returns the position in the original string where the first character of captured substring was found.</summary>
         public int Index { get; private protected set; }
 
-        /// <summary>
-        /// Returns the length of the captured substring.
-        /// </summary>
+        /// <summary>Returns the length of the captured substring.</summary>
         public int Length { get; private protected set; }
 
-        /// <summary>
-        /// The original string
-        /// </summary>
+        /// <summary>The original string</summary>
         internal string Text { get; private protected set; }
 
         /// <summary>
@@ -43,19 +31,13 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Value => Text.Substring(Index, Length);
 
-        /// <summary>
-        /// Returns the substring that was matched.
-        /// </summary>
+        /// <summary>Returns the substring that was matched.</summary>
         public override string ToString() => Value;
 
-        /// <summary>
-        /// The substring to the left of the capture
-        /// </summary>
-        internal ReadOnlySpan<char> GetLeftSubstring() => Text.AsSpan(0, Index);
+        /// <summary>The substring to the left of the capture</summary>
+        internal ReadOnlyMemory<char> GetLeftSubstring() => Text.AsMemory(0, Index);
 
-        /// <summary>
-        /// The substring to the right of the capture
-        /// </summary>
-        internal ReadOnlySpan<char> GetRightSubstring() => Text.AsSpan(Index + Length, Text.Length - Index - Length);
+        /// <summary>The substring to the right of the capture</summary>
+        internal ReadOnlyMemory<char> GetRightSubstring() => Text.AsMemory(Index + Length, Text.Length - Index - Length);
     }
 }
index 6f4413bd88168aec8a25856272b6673c5ee87812..4f60548533e68408e79a1edfa42c5bbc6ce7e14a 100644 (file)
@@ -2,19 +2,12 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// The CaptureCollection lists the captured Capture numbers
-// contained in a compiled Regex.
-
 using System.Collections;
 using System.Collections.Generic;
 using System.Diagnostics;
 
 namespace System.Text.RegularExpressions
 {
-    // This collection returns the Captures for a group
-    // in the order in which they were matched (left to right
-    // or right to left). It is created by Group.Captures.
-
     /// <summary>
     /// Represents a sequence of capture substrings. The object is used
     /// to return the set of captures done by a single capturing group.
@@ -35,36 +28,32 @@ namespace System.Text.RegularExpressions
 
         public bool IsReadOnly => true;
 
-        /// <summary>
-        /// Returns the number of captures.
-        /// </summary>
+        /// <summary>Returns the number of captures.</summary>
         public int Count => _capcount;
 
-        /// <summary>
-        /// Returns a specific capture, by index, in this collection.
-        /// </summary>
+        /// <summary>Returns a specific capture, by index, in this collection.</summary>
         public Capture this[int i] => GetCapture(i);
 
-        /// <summary>
-        /// Provides an enumerator in the same order as Item[].
-        /// </summary>
+        /// <summary>Provides an enumerator in the same order as Item[].</summary>
         public IEnumerator GetEnumerator() => new Enumerator(this);
 
         IEnumerator<Capture> IEnumerable<Capture>.GetEnumerator() => new Enumerator(this);
 
-        /// <summary>
-        /// Returns the set of captures for the group
-        /// </summary>
+        /// <summary>Returns the set of captures for the group</summary>
         private Capture GetCapture(int i)
         {
-            if (i == _capcount - 1 && i >= 0)
+            if ((uint)i == _capcount - 1)
+            {
                 return _group;
+            }
 
             if (i >= _capcount || i < 0)
-                throw new ArgumentOutOfRangeException(nameof(i));
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.i);
+            }
 
             // first time a capture is accessed, compute them all
-            if (_captures == null)
+            if (_captures is null)
             {
                 ForceInitialized();
                 Debug.Assert(_captures != null);
@@ -91,8 +80,10 @@ namespace System.Text.RegularExpressions
 
         public void CopyTo(Array array, int arrayIndex)
         {
-            if (array == null)
-                throw new ArgumentNullException(nameof(array));
+            if (array is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
+            }
 
             for (int i = arrayIndex, j = 0; j < Count; i++, j++)
             {
@@ -102,12 +93,18 @@ namespace System.Text.RegularExpressions
 
         public void CopyTo(Capture[] array, int arrayIndex)
         {
-            if (array == null)
-                throw new ArgumentNullException(nameof(array));
-            if (arrayIndex < 0 || arrayIndex > array.Length)
-                throw new ArgumentOutOfRangeException(nameof(arrayIndex));
+            if (array is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
+            }
+            if ((uint)arrayIndex > (uint)array.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.arrayIndex);
+            }
             if (array.Length - arrayIndex < Count)
+            {
                 throw new ArgumentException(SR.Arg_ArrayPlusOffTooSmall);
+            }
 
             for (int i = arrayIndex, j = 0; j < Count; i++, j++)
             {
@@ -128,77 +125,57 @@ namespace System.Text.RegularExpressions
             return -1;
         }
 
-        void IList<Capture>.Insert(int index, Capture item)
-        {
+        void IList<Capture>.Insert(int index, Capture item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList<Capture>.RemoveAt(int index)
-        {
+        void IList<Capture>.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         Capture IList<Capture>.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
-        void ICollection<Capture>.Add(Capture item)
-        {
+        void ICollection<Capture>.Add(Capture item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void ICollection<Capture>.Clear()
-        {
+        void ICollection<Capture>.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool ICollection<Capture>.Contains(Capture item) =>
             ((IList<Capture>)this).IndexOf(item) >= 0;
 
-        bool ICollection<Capture>.Remove(Capture item)
-        {
+        bool ICollection<Capture>.Remove(Capture item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        int IList.Add(object? value)
-        {
+        int IList.Add(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.Clear()
-        {
+        void IList.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.Contains(object? value) =>
-            value is Capture && ((ICollection<Capture>)this).Contains((Capture)value);
+            value is Capture other && ((ICollection<Capture>)this).Contains(other);
 
         int IList.IndexOf(object? value) =>
-            value is Capture ? ((IList<Capture>)this).IndexOf((Capture)value) : -1;
+            value is Capture other ? ((IList<Capture>)this).IndexOf(other) : -1;
 
-        void IList.Insert(int index, object? value)
-        {
+        void IList.Insert(int index, object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.IsFixedSize => true;
 
-        void IList.Remove(object? value)
-        {
+        void IList.Remove(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.RemoveAt(int index)
-        {
+        void IList.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         object? IList.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
         private sealed class Enumerator : IEnumerator<Capture>
@@ -219,7 +196,9 @@ namespace System.Text.RegularExpressions
                 int size = _collection.Count;
 
                 if (_index >= size)
+                {
                     return false;
+                }
 
                 _index++;
 
@@ -231,7 +210,9 @@ namespace System.Text.RegularExpressions
                 get
                 {
                     if (_index < 0 || _index >= _collection.Count)
+                    {
                         throw new InvalidOperationException(SR.EnumNotStarted);
+                    }
 
                     return _collection[_index];
                 }
@@ -239,10 +220,7 @@ namespace System.Text.RegularExpressions
 
             object IEnumerator.Current => Current;
 
-            void IEnumerator.Reset()
-            {
-                _index = -1;
-            }
+            void IEnumerator.Reset() => _index = -1;
 
             void IDisposable.Dispose() { }
         }
index e8ef9dc8962f098c393d846f81c3595c8561e7d0..39c05a0d5e167f62c8a62ac307b2e736e5d2ea38 100644 (file)
@@ -11,10 +11,8 @@ namespace System.Text.RegularExpressions
     {
         private readonly ICollection<T> _collection;
 
-        public CollectionDebuggerProxy(ICollection<T> collection)
-        {
+        public CollectionDebuggerProxy(ICollection<T> collection) =>
             _collection = collection ?? throw new ArgumentNullException(nameof(collection));
-        }
 
         [DebuggerBrowsable(DebuggerBrowsableState.RootHidden)]
         public T[] Items
index 54d6f23f03828bf22f84f70f30aa2787c75c09b1..9245ac3a83f1979e499f2f99925e2b7d3a1fe9db 100644 (file)
@@ -2,10 +2,6 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// Group represents the substring or substrings that
-// are captured by a single capturing group after one
-// regular expression match.
-
 namespace System.Text.RegularExpressions
 {
     /// <summary>
@@ -22,17 +18,14 @@ namespace System.Text.RegularExpressions
         internal CaptureCollection? _capcoll;
 
         internal Group(string text, int[] caps, int capcount, string name)
-            : base(text, capcount == 0 ? 0 : caps[(capcount - 1) * 2],
-               capcount == 0 ? 0 : caps[(capcount * 2) - 1])
+            : base(text, capcount == 0 ? 0 : caps[(capcount - 1) * 2], capcount == 0 ? 0 : caps[(capcount * 2) - 1])
         {
             _caps = caps;
             _capcount = capcount;
             Name = name;
         }
 
-        /// <summary>
-        /// Indicates whether the match is successful.
-        /// </summary>
+        /// <summary>Indicates whether the match is successful.</summary>
         public bool Success => _capcount != 0;
 
         public string Name { get; }
@@ -45,13 +38,14 @@ namespace System.Text.RegularExpressions
         public CaptureCollection Captures => _capcoll ??= new CaptureCollection(this);
 
         /// <summary>
-        /// Returns a Group object equivalent to the one supplied that is safe to share between
-        /// multiple threads.
+        /// Returns a Group object equivalent to the one supplied that is safe to share between multiple threads.
         /// </summary>
         public static Group Synchronized(Group inner)
         {
             if (inner == null)
-                throw new ArgumentNullException(nameof(inner));
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.inner);
+            }
 
             // force Captures to be computed.
             CaptureCollection capcoll = inner.Captures;
index a3b4920c077a11a3bf3ba3ce4291ef2a59c4fdd8..71f797ca3acbdc63055d4917924a7fb6d3773eef 100644 (file)
@@ -2,9 +2,6 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// The GroupCollection lists the captured Capture numbers
-// contained in a compiled Regex.
-
 using System.Collections;
 using System.Collections.Generic;
 using System.Diagnostics;
@@ -23,7 +20,7 @@ namespace System.Text.RegularExpressions
         private readonly Match _match;
         private readonly Hashtable? _captureMap;
 
-        // cache of Group objects fed to the user
+        /// <summary>Cache of Group objects fed to the user.</summary>
         private Group[]? _groups;
 
         internal GroupCollection(Match match, Hashtable? caps)
@@ -32,22 +29,20 @@ namespace System.Text.RegularExpressions
             _captureMap = caps;
         }
 
+        internal void Reset() => _groups = null;
+
         public bool IsReadOnly => true;
 
-        /// <summary>
-        /// Returns the number of groups.
-        /// </summary>
+        /// <summary>Returns the number of groups.</summary>
         public int Count => _match._matchcount.Length;
 
         public Group this[int groupnum] => GetGroup(groupnum);
 
-        public Group this[string groupname] => _match._regex == null ?
+        public Group this[string groupname] => _match._regex is null ?
             Group.s_emptyGroup :
             GetGroup(_match._regex.GroupNumberFromName(groupname));
 
-        /// <summary>
-        /// Provides an enumerator in the same order as Item[].
-        /// </summary>
+        /// <summary>Provides an enumerator in the same order as Item[].</summary>
         public IEnumerator GetEnumerator() => new Enumerator(this);
 
         IEnumerator<Group> IEnumerable<Group>.GetEnumerator() => new Enumerator(this);
@@ -61,7 +56,7 @@ namespace System.Text.RegularExpressions
                     return GetGroupImpl(groupNumImpl);
                 }
             }
-            else if (groupnum < _match._matchcount.Length && groupnum >= 0)
+            else if ((uint)groupnum < _match._matchcount.Length)
             {
                 return GetGroupImpl(groupnum);
             }
@@ -75,11 +70,12 @@ namespace System.Text.RegularExpressions
         private Group GetGroupImpl(int groupnum)
         {
             if (groupnum == 0)
+            {
                 return _match;
+            }
 
             // Construct all the Group objects the first time GetGroup is called
-
-            if (_groups == null)
+            if (_groups is null)
             {
                 _groups = new Group[_match._matchcount.Length - 1];
                 for (int i = 0; i < _groups.Length; i++)
@@ -98,8 +94,10 @@ namespace System.Text.RegularExpressions
 
         public void CopyTo(Array array, int arrayIndex)
         {
-            if (array == null)
-                throw new ArgumentNullException(nameof(array));
+            if (array is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
+            }
 
             for (int i = arrayIndex, j = 0; j < Count; i++, j++)
             {
@@ -109,12 +107,18 @@ namespace System.Text.RegularExpressions
 
         public void CopyTo(Group[] array, int arrayIndex)
         {
-            if (array == null)
-                throw new ArgumentNullException(nameof(array));
+            if (array is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
+            }
             if (arrayIndex < 0 || arrayIndex > array.Length)
+            {
                 throw new ArgumentOutOfRangeException(nameof(arrayIndex));
+            }
             if (array.Length - arrayIndex < Count)
+            {
                 throw new ArgumentException(SR.Arg_ArrayPlusOffTooSmall);
+            }
 
             for (int i = arrayIndex, j = 0; j < Count; i++, j++)
             {
@@ -124,92 +128,72 @@ namespace System.Text.RegularExpressions
 
         int IList<Group>.IndexOf(Group item)
         {
-            var comparer = EqualityComparer<Group>.Default;
             for (int i = 0; i < Count; i++)
             {
-                if (comparer.Equals(this[i], item))
+                if (EqualityComparer<Group>.Default.Equals(this[i], item))
+                {
                     return i;
+                }
             }
+
             return -1;
         }
 
-        void IList<Group>.Insert(int index, Group item)
-        {
+        void IList<Group>.Insert(int index, Group item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList<Group>.RemoveAt(int index)
-        {
+        void IList<Group>.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         Group IList<Group>.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
-        void ICollection<Group>.Add(Group item)
-        {
+        void ICollection<Group>.Add(Group item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void ICollection<Group>.Clear()
-        {
+        void ICollection<Group>.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool ICollection<Group>.Contains(Group item) =>
             ((IList<Group>)this).IndexOf(item) >= 0;
 
-        bool ICollection<Group>.Remove(Group item)
-        {
+        bool ICollection<Group>.Remove(Group item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        int IList.Add(object? value)
-        {
+        int IList.Add(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.Clear()
-        {
+        void IList.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.Contains(object? value) =>
-            value is Group && ((ICollection<Group>)this).Contains((Group)value);
+            value is Group other && ((ICollection<Group>)this).Contains(other);
 
         int IList.IndexOf(object? value) =>
-            value is Group ? ((IList<Group>)this).IndexOf((Group)value) : -1;
+            value is Group other ? ((IList<Group>)this).IndexOf(other) : -1;
 
-        void IList.Insert(int index, object? value)
-        {
+        void IList.Insert(int index, object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.IsFixedSize => true;
 
-        void IList.Remove(object? value)
-        {
+        void IList.Remove(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.RemoveAt(int index)
-        {
+        void IList.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         object? IList.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
-        IEnumerator<KeyValuePair<string, Group>> IEnumerable<KeyValuePair<string, Group>>.GetEnumerator()
-        {
-            return new Enumerator(this);
-        }
+        IEnumerator<KeyValuePair<string, Group>> IEnumerable<KeyValuePair<string, Group>>.GetEnumerator() =>
+            new Enumerator(this);
 
 #pragma warning disable CS8614 // Nullability of reference types in type of parameter doesn't match implicitly implemented member.
         public bool TryGetValue(string key, [NotNullWhen(true)] out Group? value)
@@ -226,10 +210,7 @@ namespace System.Text.RegularExpressions
             return true;
         }
 
-        public bool ContainsKey(string key)
-        {
-            return _match._regex!.GroupNumberFromName(key) >= 0;
-        }
+        public bool ContainsKey(string key) => _match._regex!.GroupNumberFromName(key) >= 0;
 
         public IEnumerable<string> Keys
         {
@@ -271,10 +252,11 @@ namespace System.Text.RegularExpressions
                 int size = _collection.Count;
 
                 if (_index >= size)
+                {
                     return false;
+                }
 
                 _index++;
-
                 return _index < size;
             }
 
@@ -283,7 +265,9 @@ namespace System.Text.RegularExpressions
                 get
                 {
                     if (_index < 0 || _index >= _collection.Count)
+                    {
                         throw new InvalidOperationException(SR.EnumNotStarted);
+                    }
 
                     return _collection[_index];
                 }
@@ -293,11 +277,12 @@ namespace System.Text.RegularExpressions
             {
                 get
                 {
-                    if (_index < 0 || _index >= _collection.Count)
+                    if ((uint)_index >= _collection.Count)
+                    {
                         throw new InvalidOperationException(SR.EnumNotStarted);
+                    }
 
                     Group value = _collection[_index];
-
                     return new KeyValuePair<string, Group>(value.Name, value);
 
                 }
@@ -305,10 +290,7 @@ namespace System.Text.RegularExpressions
 
             object IEnumerator.Current => Current;
 
-            void IEnumerator.Reset()
-            {
-                _index = -1;
-            }
+            void IEnumerator.Reset() => _index = -1;
 
             void IDisposable.Dispose() { }
         }
index a16fd0ec90a1773f38daf4ad16c0f91e054bd54f..01c8ee2a081a170cf53b5385f3ccafd2f4f40319 100644 (file)
@@ -2,41 +2,40 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// Match is the result class for a regex search.
-// It returns the location, length, and substring for
-// the entire match as well as every captured group.
-
-// Match is also used during the search to keep track of each capture for each group.  This is
-// done using the "_matches" array.  _matches[x] represents an array of the captures for group x.
-// This array consists of start and length pairs, and may have empty entries at the end.  _matchcount[x]
-// stores how many captures a group has.  Note that _matchcount[x]*2 is the length of all the valid
-// values in _matches.  _matchcount[x]*2-2 is the Start of the last capture, and _matchcount[x]*2-1 is the
-// Length of the last capture
-//
-// For example, if group 2 has one capture starting at position 4 with length 6,
-// _matchcount[2] == 1
-// _matches[2][0] == 4
-// _matches[2][1] == 6
-//
-// Values in the _matches array can also be negative.  This happens when using the balanced match
-// construct, "(?<start-end>...)".  When the "end" group matches, a capture is added for both the "start"
-// and "end" groups.  The capture added for "start" receives the negative values, and these values point to
-// the next capture to be balanced.  They do NOT point to the capture that "end" just balanced out.  The negative
-// values are indices into the _matches array transformed by the formula -3-x.  This formula also untransforms.
-//
-
 using System.Collections;
+using System.Diagnostics;
 using System.Diagnostics.CodeAnalysis;
-using System.Globalization;
 
 namespace System.Text.RegularExpressions
 {
     /// <summary>
     /// Represents the results from a single regular expression match.
     /// </summary>
+    /// <remarks>
+    /// Match is the result class for a regex search.
+    /// It returns the location, length, and substring for
+    /// the entire match as well as every captured group.
+    ///
+    /// Match is also used during the search to keep track of each capture for each group.  This is
+    /// done using the "_matches" array.  _matches[x] represents an array of the captures for group x.
+    /// This array consists of start and length pairs, and may have empty entries at the end.  _matchcount[x]
+    /// stores how many captures a group has.  Note that _matchcount[x]*2 is the length of all the valid
+    /// values in _matches.  _matchcount[x]*2-2 is the Start of the last capture, and _matchcount[x]*2-1 is the
+    /// Length of the last capture
+    ///
+    /// For example, if group 2 has one capture starting at position 4 with length 6,
+    /// _matchcount[2] == 1
+    /// _matches[2][0] == 4
+    /// _matches[2][1] == 6
+    ///
+    /// Values in the _matches array can also be negative.  This happens when using the balanced match
+    /// construct, "(?&lt;start-end&gt;...)".  When the "end" group matches, a capture is added for both the "start"
+    /// and "end" groups.  The capture added for "start" receives the negative values, and these values point to
+    /// the next capture to be balanced.  They do NOT point to the capture that "end" just balanced out.  The negative
+    /// values are indices into the _matches array transformed by the formula -3-x.  This formula also untransforms.
+    /// </remarks>
     public class Match : Group
     {
-        private const int ReplaceBufferSize = 256;
         internal GroupCollection? _groupcoll;
 
         // input to the match
@@ -52,8 +51,8 @@ namespace System.Text.RegularExpressions
         internal bool _balancing;        // whether we've done any balancing with this match.  If we
                                          // have done balancing, we'll need to do extra work in Tidy().
 
-        internal Match(Regex? regex, int capcount, string text, int begpos, int len, int startpos)
-            : base(text, new int[2], 0, "0")
+        internal Match(Regex? regex, int capcount, string text, int begpos, int len, int startpos) :
+            base(text, new int[2], 0, "0")
         {
             _regex = regex;
             _matchcount = new int[capcount];
@@ -64,14 +63,11 @@ namespace System.Text.RegularExpressions
             _textstart = startpos;
             _balancing = false;
 
-            // No need for an exception here.  This is only called internally, so we'll use an Assert instead
-            System.Diagnostics.Debug.Assert(!(_textbeg < 0 || _textstart < _textbeg || _textend < _textstart || Text.Length < _textend),
-                                            "The parameters are out of range.");
+            Debug.Assert(!(_textbeg < 0 || _textstart < _textbeg || _textend < _textstart || Text.Length < _textend),
+                "The parameters are out of range.");
         }
 
-        /// <summary>
-        /// Returns an empty Match object.
-        /// </summary>
+        /// <summary>Returns an empty Match object.</summary>
         public static Match Empty { get; } = new Match(null, 1, string.Empty, 0, 0, 0);
 
         internal void Reset(Regex regex, string text, int textbeg, int textend, int textstart)
@@ -89,6 +85,7 @@ namespace System.Text.RegularExpressions
             }
 
             _balancing = false;
+            _groupcoll?.Reset();
         }
 
         public virtual GroupCollection Groups => _groupcoll ??= new GroupCollection(this, null);
@@ -100,10 +97,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public Match NextMatch()
         {
-            if (_regex == null)
-                return this;
-
-            return _regex.Run(false, Length, Text, _textbeg, _textend - _textbeg, _textpos)!;
+            Regex? r = _regex;
+            return r != null ?
+                r.Run(false, Length, Text, _textbeg, _textend - _textbeg, _textpos)! :
+                this;
         }
 
         /// <summary>
@@ -113,34 +110,38 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public virtual string Result(string replacement)
         {
-            if (replacement == null)
-                throw new ArgumentNullException(nameof(replacement));
+            if (replacement is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.replacement);
+            }
 
-            if (_regex == null)
+            Regex? regex = _regex;
+            if (regex is null)
+            {
                 throw new NotSupportedException(SR.NoResultOnFailed);
+            }
 
             // Gets the weakly cached replacement helper or creates one if there isn't one already.
-            RegexReplacement repl = RegexReplacement.GetOrCreate(_regex._replref!, replacement, _regex.caps!, _regex.capsize, _regex.capnames!, _regex.roptions);
-            var vsb = new ValueStringBuilder(stackalloc char[ReplaceBufferSize]);
-            repl.ReplacementImpl(ref vsb, this);
-            return vsb.ToString();
+            RegexReplacement repl = RegexReplacement.GetOrCreate(regex._replref!, replacement, regex.caps!, regex.capsize, regex.capnames!, regex.roptions);
+            var segments = new SegmentStringBuilder(256);
+            repl.ReplacementImpl(ref segments, this);
+            return segments.ToString();
         }
 
-        internal ReadOnlySpan<char> GroupToStringImpl(int groupnum)
+        internal ReadOnlyMemory<char> GroupToStringImpl(int groupnum)
         {
             int c = _matchcount[groupnum];
             if (c == 0)
-                return string.Empty;
+            {
+                return default;
+            }
 
             int[] matches = _matches[groupnum];
-
-            return Text.AsSpan(matches[(c - 1) * 2], matches[(c * 2) - 1]);
+            return Text.AsMemory(matches[(c - 1) * 2], matches[(c * 2) - 1]);
         }
 
-        internal ReadOnlySpan<char> LastGroupToStringImpl()
-        {
-            return GroupToStringImpl(_matchcount.Length - 1);
-        }
+        internal ReadOnlyMemory<char> LastGroupToStringImpl() =>
+            GroupToStringImpl(_matchcount.Length - 1);
 
         /// <summary>
         /// Returns a Match instance equivalent to the one supplied that is safe to share
@@ -148,27 +149,25 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public static Match Synchronized(Match inner)
         {
-            if (inner == null)
-                throw new ArgumentNullException(nameof(inner));
+            if (inner is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.inner);
+            }
 
             int numgroups = inner._matchcount.Length;
 
             // Populate all groups by looking at each one
             for (int i = 0; i < numgroups; i++)
             {
-                Group group = inner.Groups[i];
-
                 // Depends on the fact that Group.Synchronized just
                 // operates on and returns the same instance
-                Group.Synchronized(group);
+                Synchronized(inner.Groups[i]);
             }
 
             return inner;
         }
 
-        /// <summary>
-        /// Adds a capture to the group specified by "cap"
-        /// </summary>
+        /// <summary>Adds a capture to the group specified by "cap"</summary>
         internal void AddMatch(int cap, int start, int len)
         {
             _matches[cap] ??= new int[2];
@@ -182,7 +181,10 @@ namespace System.Text.RegularExpressions
                 int[] oldmatches = matches[cap];
                 int[] newmatches = new int[capcount * 8];
                 for (int j = 0; j < capcount * 2; j++)
+                {
                     newmatches[j] = oldmatches[j];
+                }
+
                 matches[cap] = newmatches;
             }
 
@@ -191,13 +193,12 @@ namespace System.Text.RegularExpressions
             matchcount[cap] = capcount + 1;
         }
 
-        /*
-         * Nonpublic builder: Add a capture to balance the specified group.  This is used by the
-                              balanced match construct. (?<foo-foo2>...)
-
-           If there were no such thing as backtracking, this would be as simple as calling RemoveMatch(cap).
-           However, since we have backtracking, we need to keep track of everything.
-         */
+        /// <summary>
+        /// Nonpublic builder: Add a capture to balance the specified group.  This is used by the
+        /// balanced match construct. (?&lt;foo-foo2&gt;...)
+        /// If there were no such thing as backtracking, this would be as simple as calling RemoveMatch(cap).
+        /// However, since we have backtracking, we need to keep track of everything.
+        /// </summary>
         internal void BalanceMatch(int cap)
         {
             _balancing = true;
@@ -210,33 +211,35 @@ namespace System.Text.RegularExpressions
             // capture group for balancing.  If it is, we'll reset target to point to that capture.
             int[][] matches = _matches;
             if (matches[cap][target] < 0)
+            {
                 target = -3 - matches[cap][target];
+            }
 
             // move back to the previous capture
             target -= 2;
 
             // if the previous capture is a reference, just copy that reference to the end.  Otherwise, point to it.
             if (target >= 0 && matches[cap][target] < 0)
+            {
                 AddMatch(cap, matches[cap][target], matches[cap][target + 1]);
+            }
             else
+            {
                 AddMatch(cap, -3 - target, -4 - target /* == -3 - (target + 1) */ );
+            }
         }
 
-        /// <summary>
-        /// Removes a group match by capnum
-        /// </summary>
-        internal void RemoveMatch(int cap)
-        {
-            _matchcount[cap]--;
-        }
+        /// <summary>Removes a group match by capnum</summary>
+        internal void RemoveMatch(int cap) => _matchcount[cap]--;
 
-        /// <summary>
-        /// Tells if a group was matched by capnum
-        /// </summary>
+        /// <summary>Tells if a group was matched by capnum</summary>
         internal bool IsMatched(int cap)
         {
             int[] matchcount = _matchcount;
-            return (uint)cap < (uint)matchcount.Length && matchcount[cap] > 0 && _matches[cap][matchcount[cap] * 2 - 1] != (-3 + 1);
+            return
+                (uint)cap < (uint)matchcount.Length &&
+                matchcount[cap] > 0 &&
+                _matches[cap][matchcount[cap] * 2 - 1] != (-3 + 1);
         }
 
         /// <summary>
@@ -247,10 +250,7 @@ namespace System.Text.RegularExpressions
             int[][] matches = _matches;
 
             int i = matches[cap][_matchcount[cap] * 2 - 2];
-            if (i >= 0)
-                return i;
-
-            return matches[cap][-3 - i];
+            return i >= 0 ? i : matches[cap][-3 - i];
         }
 
         /// <summary>
@@ -261,96 +261,99 @@ namespace System.Text.RegularExpressions
             int[][] matches = _matches;
 
             int i = matches[cap][_matchcount[cap] * 2 - 1];
-            if (i >= 0)
-                return i;
-
-            return matches[cap][-3 - i];
+            return i >= 0 ? i : matches[cap][-3 - i];
         }
 
-        /// <summary>
-        /// Tidy the match so that it can be used as an immutable result
-        /// </summary>
+        /// <summary>Tidy the match so that it can be used as an immutable result</summary>
         internal void Tidy(int textpos)
         {
-            int[][] matches = _matches;
-
-            int[] interval = matches[0];
+            _textpos = textpos;
+            _capcount = _matchcount[0];
+            int[] interval = _matches[0];
             Index = interval[0];
             Length = interval[1];
-            _textpos = textpos;
+            if (_balancing)
+            {
+                TidyBalancing();
+            }
+        }
 
+        private void TidyBalancing()
+        {
             int[] matchcount = _matchcount;
-            _capcount = matchcount[0];
+            int[][] matches = _matches;
 
-            if (_balancing)
+            // The idea here is that we want to compact all of our unbalanced captures.  To do that we
+            // use j basically as a count of how many unbalanced captures we have at any given time
+            // (really j is an index, but j/2 is the count).  First we skip past all of the real captures
+            // until we find a balance captures.  Then we check each subsequent entry.  If it's a balance
+            // capture (it's negative), we decrement j.  If it's a real capture, we increment j and copy
+            // it down to the last free position.
+            for (int cap = 0; cap < matchcount.Length; cap++)
             {
-                // The idea here is that we want to compact all of our unbalanced captures.  To do that we
-                // use j basically as a count of how many unbalanced captures we have at any given time
-                // (really j is an index, but j/2 is the count).  First we skip past all of the real captures
-                // until we find a balance captures.  Then we check each subsequent entry.  If it's a balance
-                // capture (it's negative), we decrement j.  If it's a real capture, we increment j and copy
-                // it down to the last free position.
-                for (int cap = 0; cap < matchcount.Length; cap++)
-                {
-                    int limit;
-                    int[] matcharray;
+                int limit;
+                int[] matcharray;
 
-                    limit = matchcount[cap] * 2;
-                    matcharray = matches[cap];
+                limit = matchcount[cap] * 2;
+                matcharray = matches[cap];
 
-                    int i = 0;
-                    int j;
+                int i;
+                int j;
 
-                    for (i = 0; i < limit; i++)
+                for (i = 0; i < limit; i++)
+                {
+                    if (matcharray[i] < 0)
                     {
-                        if (matcharray[i] < 0)
-                            break;
+                        break;
                     }
+                }
 
-                    for (j = i; i < limit; i++)
+                for (j = i; i < limit; i++)
+                {
+                    if (matcharray[i] < 0)
                     {
-                        if (matcharray[i] < 0)
-                        {
-                            // skip negative values
-                            j--;
-                        }
-                        else
+                        // skip negative values
+                        j--;
+                    }
+                    else
+                    {
+                        // but if we find something positive (an actual capture), copy it back to the last
+                        // unbalanced position.
+                        if (i != j)
                         {
-                            // but if we find something positive (an actual capture), copy it back to the last
-                            // unbalanced position.
-                            if (i != j)
-                                matcharray[j] = matcharray[i];
-                            j++;
+                            matcharray[j] = matcharray[i];
                         }
-                    }
 
-                    matchcount[cap] = j / 2;
+                        j++;
+                    }
                 }
 
-                _balancing = false;
+                matchcount[cap] = j / 2;
             }
+
+            _balancing = false;
         }
 
 #if DEBUG
         [ExcludeFromCodeCoverage]
-        internal bool Debug => _regex != null && _regex.Debug;
+        internal bool IsDebug => _regex != null && _regex.IsDebug;
 
         internal virtual void Dump()
         {
-            int i, j;
-
-            for (i = 0; i < _matchcount.Length; i++)
+            for (int i = 0; i < _matchcount.Length; i++)
             {
-                System.Diagnostics.Debug.WriteLine("Capnum " + i.ToString(CultureInfo.InvariantCulture) + ":");
+                Debug.WriteLine($"Capnum {i}:");
 
-                for (j = 0; j < _matchcount[i]; j++)
+                for (int j = 0; j < _matchcount[i]; j++)
                 {
                     string text = "";
 
                     if (_matches[i][j * 2] >= 0)
+                    {
                         text = Text.Substring(_matches[i][j * 2], _matches[i][j * 2 + 1]);
+                    }
 
-                    System.Diagnostics.Debug.WriteLine("  (" + _matches[i][j * 2].ToString(CultureInfo.InvariantCulture) + "," + _matches[i][j * 2 + 1].ToString(CultureInfo.InvariantCulture) + ") " + text);
+                    Debug.WriteLine($"  ({_matches[i][j * 2]},{_matches[i][j * 2 + 1]}) {text}");
                 }
             }
         }
@@ -360,13 +363,13 @@ namespace System.Text.RegularExpressions
     /// <summary>
     /// MatchSparse is for handling the case where slots are sparsely arranged (e.g., if somebody says use slot 100000)
     /// </summary>
-    internal class MatchSparse : Match
+    internal sealed class MatchSparse : Match
     {
         // the lookup hashtable
         internal new readonly Hashtable _caps;
 
-        internal MatchSparse(Regex regex, Hashtable caps, int capcount, string text, int begpos, int len, int startpos)
-            : base(regex, capcount, text, begpos, len, startpos)
+        internal MatchSparse(Regex regex, Hashtable caps, int capcount, string text, int begpos, int len, int startpos) :
+            base(regex, capcount, text, begpos, len, startpos)
         {
             _caps = caps;
         }
@@ -382,7 +385,7 @@ namespace System.Text.RegularExpressions
                 foreach (object? entry in _caps)
                 {
                     DictionaryEntry kvp = (DictionaryEntry)entry!;
-                    System.Diagnostics.Debug.WriteLine("Slot " + kvp.Key.ToString() + " -> " + kvp.Value!.ToString());
+                    Debug.WriteLine($"Slot {kvp.Key} -> {kvp.Value}");
                 }
             }
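The remarks on `Match` describe each group's captures as a flat array of start/length pairs in which negative entries are indices encoded as `-3 - x`. The sketch below restates that encoding and the lookup used by the index and length accessors in the diff, under the assumption that it operates on a single group's array; `CaptureLayoutSketch` and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only; it mirrors the layout described
// in the remarks on Match but is not the library's code.
static class CaptureLayoutSketch
{
    // -3 - x maps a non-negative slot index to a value <= -3, and applying
    // the same formula again recovers the original index.
    public static int Transform(int x) => -3 - x;

    // Start of the last capture: slot (count * 2 - 2), following a
    // balancing reference when the stored value is negative.
    public static int LastStart(int[] pairs, int count)
    {
        int i = pairs[count * 2 - 2];
        return i >= 0 ? i : pairs[-3 - i];
    }

    // Length of the last capture: slot (count * 2 - 1), resolved the same way.
    public static int LastLength(int[] pairs, int count)
    {
        int i = pairs[count * 2 - 1];
        return i >= 0 ? i : pairs[-3 - i];
    }
}
```

For the example in the remarks (one capture at position 4 with length 6), the array is `{ 4, 6 }` and both lookups read the stored values directly. With a trailing reference pair `{ -3, -4 }`, the lookups resolve to slots 0 and 1 of the same array.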
 
index 76017e82b13d151d567b99bd2c4790795d1ddf4d..ca0005da659abc361396f643ca18281196702d40 100644 (file)
@@ -2,20 +2,12 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// The MatchCollection lists the successful matches that
-// result when searching a string for a regular expression.
-
 using System.Collections;
 using System.Collections.Generic;
 using System.Diagnostics;
 
 namespace System.Text.RegularExpressions
 {
-    /*
-     * This collection returns a sequence of successful match results, either
-     * from GetMatchCollection() or GetExecuteCollection(). It stops when the
-     * first failure is encountered (it does not return the failed match).
-     */
     /// <summary>
     /// Represents the set of names appearing as capturing group
     /// names in a regular expression.
@@ -26,22 +18,20 @@ namespace System.Text.RegularExpressions
     {
         private readonly Regex _regex;
         private readonly List<Match> _matches;
-        private bool _done;
         private readonly string _input;
-        private readonly int _beginning;
-        private readonly int _length;
         private int _startat;
         private int _prevlen;
+        private bool _done;
 
-        internal MatchCollection(Regex regex, string input, int beginning, int length, int startat)
+        internal MatchCollection(Regex regex, string input, int startat)
         {
-            if (startat < 0 || startat > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(startat), SR.BeginIndexNotNegative);
+            if ((uint)startat > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startat, ExceptionResource.BeginIndexNotNegative);
+            }
 
             _regex = regex;
             _input = input;
-            _beginning = beginning;
-            _length = length;
             _startat = startat;
             _prevlen = -1;
             _matches = new List<Match>();
@@ -69,21 +59,16 @@ namespace System.Text.RegularExpressions
         {
             get
             {
-                if (i < 0)
-                    throw new ArgumentOutOfRangeException(nameof(i));
-
-                Match? match = GetMatch(i);
-
-                if (match == null)
-                    throw new ArgumentOutOfRangeException(nameof(i));
-
+                Match? match = null;
+                if (i < 0 || (match = GetMatch(i)) is null)
+                {
+                    ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.i);
+                }
                 return match;
             }
         }
 
-        /// <summary>
-        /// Provides an enumerator in the same order as Item[i].
-        /// </summary>
+        /// <summary>Provides an enumerator in the same order as Item[i].</summary>
         public IEnumerator GetEnumerator() => new Enumerator(this);
 
         IEnumerator<Match> IEnumerable<Match>.GetEnumerator() => new Enumerator(this);
@@ -93,17 +78,19 @@ namespace System.Text.RegularExpressions
             Debug.Assert(i >= 0, "i cannot be negative.");
 
             if (_matches.Count > i)
+            {
                 return _matches[i];
+            }
 
             if (_done)
+            {
                 return null;
+            }
 
             Match match;
-
             do
             {
-                match = _regex.Run(false, _prevlen, _input, _beginning, _length, _startat)!;
-
+                match = _regex.Run(false, _prevlen, _input, 0, _input.Length, _startat)!;
                 if (!match.Success)
                 {
                     _done = true;
@@ -111,7 +98,6 @@ namespace System.Text.RegularExpressions
                 }
 
                 _matches.Add(match);
-
                 _prevlen = match.Length;
                 _startat = match._textpos;
             } while (_matches.Count <= i);
@@ -149,31 +135,23 @@ namespace System.Text.RegularExpressions
             return _matches.IndexOf(item);
         }
 
-        void IList<Match>.Insert(int index, Match item)
-        {
+        void IList<Match>.Insert(int index, Match item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList<Match>.RemoveAt(int index)
-        {
+        void IList<Match>.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         Match IList<Match>.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
-        void ICollection<Match>.Add(Match item)
-        {
+        void ICollection<Match>.Add(Match item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void ICollection<Match>.Clear()
-        {
+        void ICollection<Match>.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool ICollection<Match>.Contains(Match item)
         {
@@ -181,48 +159,36 @@ namespace System.Text.RegularExpressions
             return _matches.Contains(item);
         }
 
-        bool ICollection<Match>.Remove(Match item)
-        {
+        bool ICollection<Match>.Remove(Match item) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        int IList.Add(object? value)
-        {
+        int IList.Add(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.Clear()
-        {
+        void IList.Clear() =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.Contains(object? value) =>
             value is Match && ((ICollection<Match>)this).Contains((Match)value);
 
         int IList.IndexOf(object? value) =>
-            value is Match ? ((IList<Match>)this).IndexOf((Match)value) : -1;
+            value is Match other ? ((IList<Match>)this).IndexOf(other) : -1;
 
-        void IList.Insert(int index, object? value)
-        {
+        void IList.Insert(int index, object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         bool IList.IsFixedSize => true;
 
-        void IList.Remove(object? value)
-        {
+        void IList.Remove(object? value) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
-        void IList.RemoveAt(int index)
-        {
+        void IList.RemoveAt(int index) =>
             throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
-        }
 
         object? IList.this[int index]
         {
-            get { return this[index]; }
-            set { throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection); }
+            get => this[index];
+            set => throw new NotSupportedException(SR.NotSupported_ReadOnlyCollection);
         }
 
         private sealed class Enumerator : IEnumerator<Match>
@@ -241,12 +207,14 @@ namespace System.Text.RegularExpressions
             public bool MoveNext()
             {
                 if (_index == -2)
+                {
                     return false;
+                }
 
                 _index++;
                 Match? match = _collection.GetMatch(_index);
 
-                if (match == null)
+                if (match is null)
                 {
                     _index = -2;
                     return false;
@@ -260,7 +228,9 @@ namespace System.Text.RegularExpressions
                 get
                 {
                     if (_index < 0)
+                    {
                         throw new InvalidOperationException(SR.EnumNotStarted);
+                    }
 
                     return _collection.GetMatch(_index)!;
                 }
@@ -268,10 +238,7 @@ namespace System.Text.RegularExpressions
 
             object IEnumerator.Current => Current;
 
-            void IEnumerator.Reset()
-            {
-                _index = -1;
-            }
+            void IEnumerator.Reset() => _index = -1;
 
             void IDisposable.Dispose() { }
         }
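`MatchCollection.GetMatch` above produces matches on demand and stops at the first failed match. A rough equivalent using only the public `Regex.Match`, `Match.Success`, and `Match.NextMatch` members might look like the sketch below; `LazyMatchesSketch` is a hypothetical name, and unlike the collection it does not cache the matches it has already found.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
static class LazyMatchesSketch
{
    // Yields successful matches one at a time; the failed match that ends
    // the sequence is never returned to the caller.
    public static IEnumerable<Match> Enumerate(Regex regex, string input)
    {
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            yield return m;
        }
    }
}
```

Because the method is an iterator, no matching work happens until the caller starts enumerating, and enumeration can stop early without scanning the rest of the input.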
index 0049aff187c2fb5160bf7101d24330aa36a263a4..b780608db3e5dc1ff6bb20836843238c3092211f 100644 (file)
@@ -20,7 +20,7 @@ namespace System.Text.RegularExpressions
             {
                 if (value < 0)
                 {
-                    throw new ArgumentOutOfRangeException(nameof(value));
+                    ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value);
                 }
 
                 RegexCache.MaxCacheSize = value;
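Several hunks in this commit replace inline `throw new ...` statements with calls into `ThrowHelper`. The general shape of that pattern can be sketched as follows. This is a simplified illustration, not the ThrowHelper added by the commit: `ThrowHelperSketch`, `Caller`, and the string-based parameter name are assumptions, whereas the diff passes `ExceptionArgument` enum values.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical helper, for illustration only.
static class ThrowHelperSketch
{
    // Keeping exception construction in a separate method keeps the caller's
    // body small; [DoesNotReturn] tells the compiler's flow analysis that
    // control never continues past the call.
    [DoesNotReturn]
    public static void ThrowArgumentNull(string paramName) =>
        throw new ArgumentNullException(paramName);
}

// Hypothetical caller showing the guard shape used throughout the diff.
static class Caller
{
    public static int LengthOf(string input)
    {
        if (input is null)
        {
            ThrowHelperSketch.ThrowArgumentNull(nameof(input));
        }

        return input.Length;
    }
}
```

Whether a given caller is actually inlined remains up to the JIT; the sketch only shows how the throw site is moved out of the hot path.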
index bba51ade96e871da8fd00cef84a9ac6f8a94e86a..8e13953ae276e554613fdd3179696c02fe21f704 100644 (file)
@@ -23,35 +23,32 @@ namespace System.Text.RegularExpressions
         public static bool IsMatch(string input, string pattern, RegexOptions options, TimeSpan matchTimeout) =>
             RegexCache.GetOrAdd(pattern, options, matchTimeout).IsMatch(input);
 
-        /*
-         * Returns true if the regex finds a match within the specified string
-         */
         /// <summary>
         /// Searches the input string for one or more matches using the previous pattern,
         /// options, and starting position.
         /// </summary>
         public bool IsMatch(string input)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return IsMatch(input, UseOptionR() ? input.Length : 0);
+            return Run(quick: true, -1, input, 0, input.Length, UseOptionR() ? input.Length : 0) is null;
         }
 
-        /*
-         * Returns true if the regex finds a match after the specified position
-         * (proceeding leftward if the regex is leftward and rightward otherwise)
-         */
         /// <summary>
         /// Searches the input string for one or more matches using the previous pattern and options,
         /// with a new starting position.
         /// </summary>
         public bool IsMatch(string input, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return (null == Run(true, -1, input, 0, input.Length, startat));
+            return Run(quick: true, -1, input, 0, input.Length, startat) is null;
         }
 
         /// <summary>
@@ -72,51 +69,45 @@ namespace System.Text.RegularExpressions
         public static Match Match(string input, string pattern, RegexOptions options, TimeSpan matchTimeout) =>
             RegexCache.GetOrAdd(pattern, options, matchTimeout).Match(input);
 
-        /*
-         * Finds the first match for the regular expression starting at the beginning
-         * of the string (or at the end of the string if the regex is leftward)
-         */
         /// <summary>
         /// Matches a regular expression with a string and returns
-        /// the precise result as a RegexMatch object.
+        /// the precise result as a Match object.
         /// </summary>
         public Match Match(string input)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Match(input, UseOptionR() ? input.Length : 0);
+            return Run(quick: false, -1, input, 0, input.Length, UseOptionR() ? input.Length : 0)!;
         }
 
-        /*
-         * Finds the first match, starting at the specified position
-         */
         /// <summary>
         /// Matches a regular expression with a string and returns
-        /// the precise result as a RegexMatch object.
+        /// the precise result as a Match object.
         /// </summary>
         public Match Match(string input, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Run(false, -1, input, 0, input.Length, startat)!;
+            return Run(quick: false, -1, input, 0, input.Length, startat)!;
         }
 
-        /*
-         * Finds the first match, restricting the search to the specified interval of
-         * the char array.
-         */
         /// <summary>
-        /// Matches a regular expression with a string and returns the precise result as a
-        /// RegexMatch object.
+        /// Matches a regular expression with a string and returns the precise result as a Match object.
         /// </summary>
         public Match Match(string input, int beginning, int length)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Run(false, -1, input, beginning, length, UseOptionR() ? beginning + length : beginning)!;
+            return Run(quick: false, -1, input, beginning, length, UseOptionR() ? beginning + length : beginning)!;
         }
 
         /// <summary>
@@ -134,33 +125,30 @@ namespace System.Text.RegularExpressions
         public static MatchCollection Matches(string input, string pattern, RegexOptions options, TimeSpan matchTimeout) =>
             RegexCache.GetOrAdd(pattern, options, matchTimeout).Matches(input);
 
-        /*
-         * Finds the first match for the regular expression starting at the beginning
-         * of the string Enumerator(or at the end of the string if the regex is leftward)
-         */
         /// <summary>
         /// Returns all the successful matches as if Match was called iteratively numerous times.
         /// </summary>
         public MatchCollection Matches(string input)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Matches(input, UseOptionR() ? input.Length : 0);
+            return new MatchCollection(this, input, UseOptionR() ? input.Length : 0);
         }
 
-        /*
-         * Finds the first match, starting at the specified position
-         */
         /// <summary>
         /// Returns all the successful matches as if Match was called iteratively numerous times.
         /// </summary>
         public MatchCollection Matches(string input, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return new MatchCollection(this, input, 0, input.Length, startat);
+            return new MatchCollection(this, input, startat);
         }
     }
 }
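
Note on the null checks above: every `throw new ArgumentNullException(nameof(...))` is replaced by a call into `ThrowHelper`. The helper itself is not shown in this part of the diff, so the following is only a minimal sketch of the pattern. The enum members, the `ToString()`-based name lookup, and the `Demo` class are assumptions for illustration, not the actual implementation.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical stand-in for the ExceptionArgument enum referenced in the diff.
internal enum ExceptionArgument { input, pattern }

// Hypothetical stand-in for ThrowHelper; the real helper may map names differently.
internal static class ThrowHelper
{
    // Keeping the throw in a separate method means each caller carries only a
    // small call instead of the exception construction, which keeps callers small.
    [DoesNotReturn]
    internal static void ThrowArgumentNullException(ExceptionArgument arg) =>
        throw new ArgumentNullException(arg.ToString());
}

internal static class Demo
{
    internal static int LengthOf(string? input)
    {
        if (input is null)
        {
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
        }

        // [DoesNotReturn] lets nullable analysis treat input as non-null here.
        return input.Length;
    }
}
```

The exception's `ParamName` stays the same as with `nameof(input)`, because the enum member is named after the parameter.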
index 5a8c16c249edf0577d8aa7e3206658e1293bfb80..2ddc07ad8627959bc1ac51a365d476eec45e5058 100644 (file)
@@ -2,19 +2,15 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-using System.Collections.Generic;
-using System.IO;
-using System.Text;
-
 namespace System.Text.RegularExpressions
 {
     // Callback class
     public delegate string MatchEvaluator(Match match);
 
+    internal delegate bool MatchCallback<TState>(ref TState state, Match match);
+
     public partial class Regex
     {
-        private const int ReplaceBufferSize = 256;
-
         /// <summary>
         /// Replaces all occurrences of the pattern with the <paramref name="replacement"/> pattern, starting at
         /// the first character in the input string.
@@ -40,8 +36,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, string replacement)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
             return Replace(input, replacement, -1, UseOptionR() ? input.Length : 0);
         }
@@ -53,8 +51,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, string replacement, int count)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
             return Replace(input, replacement, count, UseOptionR() ? input.Length : 0);
         }
@@ -66,16 +66,20 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, string replacement, int count, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
-
-            if (replacement == null)
-                throw new ArgumentNullException(nameof(replacement));
-
-            // Gets the weakly cached replacement helper or creates one if there isn't one already.
-            RegexReplacement repl = RegexReplacement.GetOrCreate(_replref!, replacement, caps!, capsize, capnames!, roptions);
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
+            if (replacement is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.replacement);
+            }
 
-            return repl.Replace(this, input, count, startat);
+            // Gets the weakly cached replacement helper or creates one if there isn't one already,
+            // then uses it to perform the replace.
+            return
+                RegexReplacement.GetOrCreate(_replref!, replacement, caps!, capsize, capnames!, roptions).
+                Replace(this, input, count, startat);
         }
 
         /// <summary>
@@ -101,10 +105,12 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, MatchEvaluator evaluator)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Replace(input, evaluator, -1, UseOptionR() ? input.Length : 0);
+            return Replace(evaluator, this, input, -1, UseOptionR() ? input.Length : 0);
         }
 
         /// <summary>
@@ -113,10 +119,12 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, MatchEvaluator evaluator, int count)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Replace(input, evaluator, count, UseOptionR() ? input.Length : 0);
+            return Replace(evaluator, this, input, count, UseOptionR() ? input.Length : 0);
         }
 
         /// <summary>
@@ -126,8 +134,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string Replace(string input, MatchEvaluator evaluator, int count, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
             return Replace(evaluator, this, input, count, startat);
         }
@@ -143,79 +153,65 @@ namespace System.Text.RegularExpressions
         /// </summary>
         private static string Replace(MatchEvaluator evaluator, Regex regex, string input, int count, int startat)
         {
-            if (evaluator == null)
-                throw new ArgumentNullException(nameof(evaluator));
+            if (evaluator is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.evaluator);
+            }
             if (count < -1)
-                throw new ArgumentOutOfRangeException(nameof(count), SR.CountTooSmall);
-            if (startat < 0 || startat > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(startat), SR.BeginIndexNotNegative);
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.count, ExceptionResource.CountTooSmall);
+            }
+            if ((uint)startat > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startat, ExceptionResource.BeginIndexNotNegative);
+            }
 
             if (count == 0)
+            {
                 return input;
+            }
 
-            Match match = regex.Match(input, startat);
+            var state = (segments: new SegmentStringBuilder(256), evaluator, prevat: 0, input, count);
 
-            if (!match.Success)
+            if (!regex.RightToLeft)
             {
-                return input;
+                regex.Run(input, startat, ref state, (ref (SegmentStringBuilder segments, MatchEvaluator evaluator, int prevat, string input, int count) state, Match match) =>
+                {
+                    state.segments.Add(state.input.AsMemory(state.prevat, match.Index - state.prevat));
+                    state.prevat = match.Index + match.Length;
+                    state.segments.Add(state.evaluator(match).AsMemory());
+                    return --state.count != 0;
+                });
+
+                if (state.segments.Count == 0)
+                {
+                    return input;
+                }
+
+                state.segments.Add(input.AsMemory(state.prevat, input.Length - state.prevat));
             }
             else
             {
-                var vsb = new ValueStringBuilder(stackalloc char[ReplaceBufferSize]);
+                state.prevat = input.Length;
 
-                if (!regex.RightToLeft)
+                regex.Run(input, startat, ref state, (ref (SegmentStringBuilder segments, MatchEvaluator evaluator, int prevat, string input, int count) state, Match match) =>
                 {
-                    int prevat = 0;
-
-                    do
-                    {
-                        if (match.Index != prevat)
-                            vsb.Append(input.AsSpan(prevat, match.Index - prevat));
-
-                        prevat = match.Index + match.Length;
-                        string result = evaluator(match);
-                        if (!string.IsNullOrEmpty(result))
-                            vsb.Append(result);
+                    state.segments.Add(state.input.AsMemory(match.Index + match.Length, state.prevat - match.Index - match.Length));
+                    state.prevat = match.Index;
+                    state.segments.Add(state.evaluator(match).AsMemory());
+                    return --state.count != 0;
+                });
 
-                        if (--count == 0)
-                            break;
-
-                        match = match.NextMatch();
-                    } while (match.Success);
-
-                    if (prevat < input.Length)
-                        vsb.Append(input.AsSpan(prevat, input.Length - prevat));
-                }
-                else
+                if (state.segments.Count == 0)
                 {
-                    // In right to left mode append all the inputs in reversed order to avoid an extra dynamic data structure
-                    // and to be able to work with Spans. A final reverse of the transformed reversed input string generates
-                    // the desired output. Similar to Tower of Hanoi.
-
-                    int prevat = input.Length;
-
-                    do
-                    {
-                        if (match.Index + match.Length != prevat)
-                            vsb.AppendReversed(input.AsSpan(match.Index + match.Length, prevat - match.Index - match.Length));
-
-                        prevat = match.Index;
-                        vsb.AppendReversed(evaluator(match));
-
-                        if (--count == 0)
-                            break;
-
-                        match = match.NextMatch();
-                    } while (match.Success);
-
-                    if (prevat > 0)
-                        vsb.AppendReversed(input.AsSpan(0, prevat));
-
-                    vsb.Reverse();
+                    return input;
                 }
 
-                return vsb.ToString();
+                state.segments.Add(input.AsMemory(0, state.prevat));
+                state.segments.AsSpan().Reverse();
             }
+
+            return state.segments.ToString();
         }
     }
 }
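
Note on the callback shape used by the new `Replace`: state travels in a tuple passed by `ref`, and the callback returns `false` to stop the scan. The internal `regex.Run(input, startat, ref state, callback)` overload is not public, so the sketch below approximates it with `Match`/`NextMatch()`. `ForEachMatch`, `MatchEnumeration`, and the `List<string>` used in place of `SegmentStringBuilder` are hypothetical names chosen for illustration. This version does not reuse a single `Match` object as the real code does, and it covers only the left-to-right case.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Same shape as the internal delegate added in the diff.
internal delegate bool MatchCallback<TState>(ref TState state, Match match);

internal static class MatchEnumeration
{
    // Hypothetical helper: invokes the callback for each match until it returns false.
    internal static void ForEachMatch<TState>(
        Regex regex, string input, ref TState state, MatchCallback<TState> callback)
    {
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            if (!callback(ref state, m))
            {
                break;
            }
        }
    }

    // Left-to-right replace assembled from input slices and evaluator results.
    internal static string Replace(Regex regex, string input, MatchEvaluator evaluator)
    {
        var state = (segments: new List<string>(), evaluator, prevat: 0, input);

        ForEachMatch(regex, input, ref state,
            (ref (List<string> segments, MatchEvaluator evaluator, int prevat, string input) s, Match m) =>
            {
                // Text between the previous match and this one, then the replacement.
                s.segments.Add(s.input.Substring(s.prevat, m.Index - s.prevat));
                s.prevat = m.Index + m.Length;
                s.segments.Add(s.evaluator(m));
                return true;
            });

        if (state.segments.Count == 0)
        {
            return input; // no match: return the original string unchanged
        }

        state.segments.Add(input.Substring(state.prevat));
        return string.Concat(state.segments);
    }
}
```

Because the lambda reads everything through its `ref` state parameter, it captures no locals and needs no closure allocation.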
index d0bb0e991e66bc596fe934b799b1ff2197c792dd..ee5a2a43a8cddcd66e7a7e3594d56455d657c27a 100644 (file)
@@ -30,10 +30,12 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string[] Split(string input)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
-            return Split(input, 0, UseOptionR() ? input.Length : 0);
+            return Split(this, input, 0, UseOptionR() ? input.Length : 0);
         }
 
         /// <summary>
@@ -42,8 +44,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string[] Split(string input, int count)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
             return Split(this, input, count, UseOptionR() ? input.Length : 0);
         }
@@ -53,8 +57,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public string[] Split(string input, int count, int startat)
         {
-            if (input == null)
-                throw new ArgumentNullException(nameof(input));
+            if (input is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.input);
+            }
 
             return Split(this, input, count, startat);
         }
@@ -66,94 +72,79 @@ namespace System.Text.RegularExpressions
         private static string[] Split(Regex regex, string input, int count, int startat)
         {
             if (count < 0)
-                throw new ArgumentOutOfRangeException(nameof(count), SR.CountTooSmall);
-            if (startat < 0 || startat > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(startat), SR.BeginIndexNotNegative);
-
-            string[] result;
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.count, ExceptionResource.CountTooSmall);
+            }
+            if ((uint)startat > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startat, ExceptionResource.BeginIndexNotNegative);
+            }
 
             if (count == 1)
             {
-                result = new string[1];
-                result[0] = input;
-                return result;
+                return new[] { input };
             }
 
-            count -= 1;
+            count--;
+            var state = (results: new List<string>(), prevat: 0, input, count);
 
-            Match match = regex.Match(input, startat);
-
-            if (!match.Success)
-            {
-                result = new string[1];
-                result[0] = input;
-                return result;
-            }
-            else
+            if (!regex.RightToLeft)
             {
-                List<string> al = new List<string>();
-
-                if (!regex.RightToLeft)
+                regex.Run(input, startat, ref state, (ref (List<string> results, int prevat, string input, int count) state, Match match) =>
                 {
-                    int prevat = 0;
+                    state.results.Add(state.input.Substring(state.prevat, match.Index - state.prevat));
+                    state.prevat = match.Index + match.Length;
 
-                    while (true)
+                    // add all matched capture groups to the list.
+                    for (int i = 1; i < match.Groups.Count; i++)
                     {
-                        al.Add(input.Substring(prevat, match.Index - prevat));
-
-                        prevat = match.Index + match.Length;
-
-                        // add all matched capture groups to the list.
-                        for (int i = 1; i < match.Groups.Count; i++)
+                        if (match.IsMatched(i))
                         {
-                            if (match.IsMatched(i))
-                                al.Add(match.Groups[i].ToString());
+                            state.results.Add(match.Groups[i].ToString());
                         }
+                    }
 
-                        if (--count == 0)
-                            break;
+                    return --state.count != 0;
+                });
 
-                        match = match.NextMatch();
+                if (state.results.Count == 0)
+                {
+                    return new[] { input };
+                }
 
-                        if (!match.Success)
-                            break;
-                    }
+                state.results.Add(input.Substring(state.prevat, input.Length - state.prevat));
+            }
+            else
+            {
+                state.prevat = input.Length;
 
-                    al.Add(input.Substring(prevat, input.Length - prevat));
-                }
-                else
+                regex.Run(input, startat, ref state, (ref (List<string> results, int prevat, string input, int count) state, Match match) =>
                 {
-                    int prevat = input.Length;
+                    state.results.Add(state.input.Substring(match.Index + match.Length, state.prevat - match.Index - match.Length));
+                    state.prevat = match.Index;
 
-                    while (true)
+                    // add all matched capture groups to the list.
+                    for (int i = 1; i < match.Groups.Count; i++)
                     {
-                        al.Add(input.Substring(match.Index + match.Length, prevat - match.Index - match.Length));
-
-                        prevat = match.Index;
-
-                        // add all matched capture groups to the list.
-                        for (int i = 1; i < match.Groups.Count; i++)
+                        if (match.IsMatched(i))
                         {
-                            if (match.IsMatched(i))
-                                al.Add(match.Groups[i].ToString());
+                            state.results.Add(match.Groups[i].ToString());
                         }
-
-                        if (--count == 0)
-                            break;
-
-                        match = match.NextMatch();
-
-                        if (!match.Success)
-                            break;
                     }
 
-                    al.Add(input.Substring(0, prevat));
+                    return --state.count != 0;
+                });
 
-                    al.Reverse(0, al.Count);
+                if (state.results.Count == 0)
+                {
+                    return new[] { input };
                 }
 
-                return al.ToArray();
+                state.results.Add(input.Substring(0, state.prevat));
+                state.results.Reverse(0, state.results.Count);
             }
+
+            return state.results.ToArray();
         }
     }
 }
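
Note on the `startat` validation in `Replace` and `Split`: the two-sided test `startat < 0 || startat > input.Length` becomes a single unsigned comparison. A small sketch of why the two forms agree whenever the length is non-negative follows. The `RangeChecks` class and its method names are hypothetical.

```csharp
internal static class RangeChecks
{
    // Two-comparison form: rejects negative values and values above length.
    internal static bool OutOfRangeSigned(int startat, int length) =>
        startat < 0 || startat > length;

    // Single-comparison form: a negative int reinterpreted as uint is larger than
    // int.MaxValue, so it always exceeds any non-negative length.
    internal static bool OutOfRangeUnsigned(int startat, int length) =>
        unchecked((uint)startat > (uint)length);
}
```

The `unchecked` keyword is there only so the sketch behaves the same if it is compiled with overflow checking enabled.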
index f82231b72b9bac86e585dbf3109f5fc5805bf9c7..8d60f3302c216e105c4f3ce70db76073901e5fb7 100644 (file)
@@ -10,11 +10,14 @@ namespace System.Text.RegularExpressions
     {
         // We need this because time is queried using Environment.TickCount for performance reasons
         // (Environment.TickCount returns milliseconds as an int and cycles):
-        private static readonly TimeSpan s_maximumMatchTimeout = TimeSpan.FromMilliseconds(int.MaxValue - 1);
+        private const ulong MaximumMatchTimeoutTicks = 10_000UL * (int.MaxValue - 1); // TimeSpan.FromMilliseconds(int.MaxValue - 1).Ticks;
 
         // During static initialisation of Regex we check
         private const string DefaultMatchTimeout_ConfigKeyName = "REGEX_DEFAULT_MATCH_TIMEOUT";
 
+        // Number of ticks represented by InfiniteMatchTimeout
+        private const long InfiniteMatchTimeoutTicks = -10_000; // InfiniteMatchTimeout.Ticks
+
         // InfiniteMatchTimeout specifies that match timeout is switched OFF. It allows for faster code paths
         // compared to simply having a very large timeout.
         // We do not want to ask users to use System.Threading.Timeout.InfiniteTimeSpan as a parameter because:
@@ -52,7 +55,7 @@ namespace System.Text.RegularExpressions
             object? defaultMatchTimeoutObj = ad.GetData(DefaultMatchTimeout_ConfigKeyName);
 
             // If no default is specified, use fallback
-            if (defaultMatchTimeoutObj == null)
+            if (defaultMatchTimeoutObj is null)
             {
                 return InfiniteMatchTimeout;
             }
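
Note on the new tick constants: they let `ValidateMatchTimeout` work on raw `long` ticks and fold the range `0 < ticks <= max` into one unsigned comparison. The sketch below reproduces that check as a boolean predicate. The `TimeoutValidation` class and `IsValid` method are hypothetical names; the constants are copied from the diff.

```csharp
using System;

internal static class TimeoutValidation
{
    // TimeSpan.FromMilliseconds(int.MaxValue - 1).Ticks
    private const ulong MaximumMatchTimeoutTicks = 10_000UL * (int.MaxValue - 1);

    // Ticks of the infinite timeout (-1 millisecond)
    private const long InfiniteMatchTimeoutTicks = -10_000;

    internal static bool IsValid(TimeSpan matchTimeout)
    {
        long ticks = matchTimeout.Ticks;

        // Subtracting 1 maps the valid range 1..max onto 0..max-1. Zero and negative
        // values wrap to very large unsigned numbers and fail the comparison.
        return ticks == InfiniteMatchTimeoutTicks ||
               unchecked((ulong)(ticks - 1)) < MaximumMatchTimeoutTicks;
    }
}
```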
index c25170723ec1c3e1c7d40c17a85986c273b11743..05116c6861b7f1578f1876142fc8388cc295fe93 100644 (file)
@@ -3,6 +3,7 @@
 // See the LICENSE file in the project root for more information.
 
 using System.Collections;
+using System.Diagnostics;
 using System.Diagnostics.CodeAnalysis;
 using System.Globalization;
 using System.Reflection;
@@ -102,14 +103,18 @@ namespace System.Text.RegularExpressions
             internalMatchTimeout = matchTimeout;
 
 #if DEBUG
-            if (Debug)
+            if (IsDebug)
             {
-                System.Diagnostics.Debug.Write($"Pattern:     {pattern}");
+                Debug.Write($"Pattern:     {pattern}");
                 RegexOptions displayOptions = options & ~RegexOptions.Debug;
                 if (displayOptions != RegexOptions.None)
-                    System.Diagnostics.Debug.Write($"Options:     {displayOptions}");
-                if (matchTimeout != Regex.InfiniteMatchTimeout)
-                    System.Diagnostics.Debug.Write($"Timeout:     {matchTimeout}");
+                {
+                    Debug.Write($"Options:     {displayOptions}");
+                }
+                if (matchTimeout != InfiniteMatchTimeout)
+                {
+                    Debug.Write($"Timeout:     {matchTimeout}");
+                }
             }
 #endif
 
@@ -130,28 +135,21 @@ namespace System.Text.RegularExpressions
         {
             if (pattern is null)
             {
-                throw new ArgumentNullException(nameof(pattern));
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.pattern);
             }
         }
 
         internal static void ValidateOptions(RegexOptions options)
         {
-            if (options < RegexOptions.None || (((int)options) >> MaxOptionShift) != 0)
-            {
-                throw new ArgumentOutOfRangeException(nameof(options));
-            }
-
-            if ((options & RegexOptions.ECMAScript) != 0 &&
-                (options & ~(RegexOptions.ECMAScript |
-                             RegexOptions.IgnoreCase |
-                             RegexOptions.Multiline |
-                             RegexOptions.Compiled |
+            if (((((uint)options) >> MaxOptionShift) != 0) ||
+                ((options & RegexOptions.ECMAScript) != 0 &&
+                 (options & ~(RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled |
 #if DEBUG
                              RegexOptions.Debug |
 #endif
-                             RegexOptions.CultureInvariant)) != 0)
+                             RegexOptions.CultureInvariant)) != 0))
             {
-                throw new ArgumentOutOfRangeException(nameof(options));
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.options);
             }
         }
 
@@ -160,18 +158,15 @@ namespace System.Text.RegularExpressions
         /// The valid range is <code>TimeSpan.Zero &lt; matchTimeout &lt;= Regex.MaximumMatchTimeout</code>.
         /// </summary>
         /// <param name="matchTimeout">The timeout value to validate.</param>
-        /// <exception cref="ArgumentOutOfRangeException">If the specified timeout is not within a valid range.
-        /// </exception>
+        /// <exception cref="ArgumentOutOfRangeException">If the specified timeout is not within a valid range.</exception>
         protected internal static void ValidateMatchTimeout(TimeSpan matchTimeout)
         {
-            if (InfiniteMatchTimeout == matchTimeout)
-                return;
-
-            // make sure timeout is not longer then Environment.Ticks cycle length:
-            if (TimeSpan.Zero < matchTimeout && matchTimeout <= s_maximumMatchTimeout)
-                return;
-
-            throw new ArgumentOutOfRangeException(nameof(matchTimeout));
+            // Make sure the timeout is positive but not longer than the Environment.TickCount cycle length.
+            long matchTimeoutTicks = matchTimeout.Ticks;
+            if (matchTimeoutTicks != InfiniteMatchTimeoutTicks && ((ulong)(matchTimeoutTicks - 1) >= MaximumMatchTimeoutTicks))
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.matchTimeout);
+            }
         }
 
         protected Regex(SerializationInfo info, StreamingContext context) =>
@@ -186,8 +181,10 @@ namespace System.Text.RegularExpressions
             get => caps;
             set
             {
-                if (value == null)
-                    throw new ArgumentNullException(nameof(value));
+                if (value is null)
+                {
+                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
+                }
 
                 caps = value as Hashtable ?? new Hashtable(value);
             }
@@ -199,8 +196,10 @@ namespace System.Text.RegularExpressions
             get => capnames;
             set
             {
-                if (value == null)
-                    throw new ArgumentNullException(nameof(value));
+                if (value is null)
+                {
+                    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
+                }
 
                 capnames = value as Hashtable ?? new Hashtable(value);
             }
@@ -227,12 +226,12 @@ namespace System.Text.RegularExpressions
         {
             if (assemblyname is null)
             {
-                throw new ArgumentNullException(nameof(assemblyname));
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.assemblyname);
             }
 
             if (regexinfos is null)
             {
-                throw new ArgumentNullException(nameof(regexinfos));
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.regexinfos);
             }
 
 #if DEBUG // until it can be fully implemented
@@ -253,8 +252,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public static string Escape(string str)
         {
-            if (str == null)
-                throw new ArgumentNullException(nameof(str));
+            if (str is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.str);
+            }
 
             return RegexParser.Escape(str);
         }
@@ -264,8 +265,10 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public static string Unescape(string str)
         {
-            if (str == null)
-                throw new ArgumentNullException(nameof(str));
+            if (str is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.str);
+            }
 
             return RegexParser.Unescape(str);
         }
@@ -285,12 +288,6 @@ namespace System.Text.RegularExpressions
         /// </summary>
         public override string ToString() => pattern!;
 
-        /*
-         * Returns an array of the group names that are used to capture groups
-         * in the regular expression. Only needed if the regex is not known until
-         * runtime, and one wants to extract captured groups. (Probably unusual,
-         * but supplied for completeness.)
-         */
         /// <summary>
         /// Returns the GroupNameCollection for the regular expression. This collection contains the
         /// set of strings used to name capturing groups in the expression.
@@ -299,12 +296,12 @@ namespace System.Text.RegularExpressions
         {
             string[] result;
 
-            if (capslist == null)
+            if (capslist is null)
             {
                 result = new string[capsize];
                 for (int i = 0; i < result.Length; i++)
                 {
-                    result[i] = i.ToString();
+                    result[i] = ((uint)i).ToString();
                 }
             }
             else
@@ -315,12 +312,6 @@ namespace System.Text.RegularExpressions
             return result;
         }
 
-        /*
-         * Returns an array of the group numbers that are used to capture groups
-         * in the regular expression. Only needed if the regex is not known until
-         * runtime, and one wants to extract captured groups. (Probably unusual,
-         * but supplied for completeness.)
-         */
         /// <summary>
         /// Returns the integer group number corresponding to a group name.
         /// </summary>
@@ -328,11 +319,9 @@ namespace System.Text.RegularExpressions
         {
             int[] result;
 
-            if (caps == null)
+            if (caps is null)
             {
-                int max = capsize;
-                result = new int[max];
-
+                result = new int[capsize];
                 for (int i = 0; i < result.Length; i++)
                 {
                     result[i] = i;
@@ -340,9 +329,8 @@ namespace System.Text.RegularExpressions
             }
             else
             {
-                result = new int[caps.Count];
-
                 // Manual use of IDictionaryEnumerator instead of foreach to avoid DictionaryEntry box allocations.
+                result = new int[caps.Count];
                 IDictionaryEnumerator de = caps.GetEnumerator();
                 while (de.MoveNext())
                 {
@@ -353,134 +341,123 @@ namespace System.Text.RegularExpressions
             return result;
         }
 
-        /*
-         * Given a group number, maps it to a group name. Note that numbered
-         * groups automatically get a group name that is the decimal string
-         * equivalent of its number.
-         *
-         * Returns null if the number is not a recognized group number.
-         */
         /// <summary>
         /// Retrieves a group name that corresponds to a group number.
         /// </summary>
         public string GroupNameFromNumber(int i)
         {
-            if (capslist == null)
+            if (capslist is null)
             {
-                if (i >= 0 && i < capsize)
-                    return i.ToString();
-
-                return string.Empty;
+                return (uint)i < (uint)capsize ?
+                    ((uint)i).ToString() :
+                    string.Empty;
             }
             else
             {
-                if (caps != null)
-                {
-                    if (!caps.TryGetValue(i, out i))
-                        return string.Empty;
-                }
-
-                if (i >= 0 && i < capslist.Length)
-                    return capslist[i];
-
-                return string.Empty;
+                return caps != null && !caps.TryGetValue(i, out i) ? string.Empty :
+                    (uint)i < (uint)capslist.Length ? capslist[i] :
+                    string.Empty;
             }
         }
 
-        /*
-         * Given a group name, maps it to a group number. Note that numbered
-         * groups automatically get a group name that is the decimal string
-         * equivalent of its number.
-         *
-         * Returns -1 if the name is not a recognized group name.
-         */
         /// <summary>
-        /// Returns a group number that corresponds to a group name.
+        /// Returns a group number that corresponds to a group name, or -1 if the name is not a recognized group name.
         /// </summary>
         public int GroupNumberFromName(string name)
         {
-            if (name == null)
-                throw new ArgumentNullException(nameof(name));
-
-            int result;
+            if (name is null)
+            {
+                ThrowHelper.ThrowArgumentNullException(ExceptionArgument.name);
+            }
 
-            // look up name if we have a hashtable of names
             if (capnames != null)
             {
-                return capnames.TryGetValue(name, out result) ? result : -1;
+                // Look up name if we have a hashtable of names.
+                return capnames.TryGetValue(name, out int result) ? result : -1;
             }
-
-            // convert to an int if it looks like a number
-            result = 0;
-            for (int i = 0; i < name.Length; i++)
+            else
             {
-                uint digit = (uint)(name[i] - '0');
-                if (digit > 9)
-                {
-                    return -1;
-                }
-
-                result = (result * 10) + (int)digit;
+                // Otherwise, try to parse it as a number.
+                return uint.TryParse(name, NumberStyles.None, provider: null, out uint result) && result < capsize ? (int)result : -1;
             }
-
-            // return int if it's in range
-            return result >= 0 && result < capsize ? result : -1;
         }
 
         protected void InitializeReferences()
         {
             if (_refsInitialized)
-                throw new NotSupportedException(SR.OnlyAllowedOnce);
+            {
+                ThrowHelper.ThrowNotSupportedException(ExceptionResource.OnlyAllowedOnce);
+            }
 
-            _refsInitialized = true;
             _replref = new WeakReference<RegexReplacement?>(null);
+            _refsInitialized = true;
         }
 
-        /// <summary>
-        /// Internal worker called by all the public APIs
-        /// </summary>
-        /// <returns></returns>
+        /// <summary>Internal worker called by the public APIs</summary>
         internal Match? Run(bool quick, int prevlen, string input, int beginning, int length, int startat)
         {
-            if (startat < 0 || startat > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(startat), SR.BeginIndexNotNegative);
-
-            if (length < 0 || length > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(length), SR.LengthNotNegative);
+            if ((uint)startat > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startat, ExceptionResource.BeginIndexNotNegative);
+            }
+            if ((uint)length > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.length, ExceptionResource.LengthNotNegative);
+            }
 
-            RegexRunner runner =
-                Interlocked.Exchange(ref _runner, null) ?? // use a cached runner if there is one
-                (factory != null ? factory.CreateInstance() : // use the compiled RegexRunner factory if there is one
-                 new RegexInterpreter(_code!, UseOptionInvariant() ? CultureInfo.InvariantCulture : CultureInfo.CurrentCulture));
+            RegexRunner runner = RentRunner();
             try
             {
                 // Do the scan starting at the requested position
                 Match? match = runner.Scan(this, input, beginning, beginning + length, startat, prevlen, quick, internalMatchTimeout);
 #if DEBUG
-                if (Debug) match?.Dump();
+                if (IsDebug) match?.Dump();
 #endif
                 return match;
             }
             finally
             {
-                // Release the runner back to the cache
-                _runner = runner;
+                ReturnRunner(runner);
             }
         }
 
+        internal void Run<TState>(string input, int startat, ref TState state, MatchCallback<TState> callback)
+        {
+            Debug.Assert((uint)startat <= (uint)input.Length);
+            RegexRunner runner = RentRunner();
+            try
+            {
+                runner.Scan(this, input, startat, ref state, callback, internalMatchTimeout);
+            }
+            finally
+            {
+                ReturnRunner(runner);
+            }
+        }
+
+        /// <summary>Gets a runner from the cache, or creates a new one.</summary>
+        [MethodImpl(MethodImplOptions.AggressiveInlining)] // factored out to be used by only two call sites
+        private RegexRunner RentRunner() =>
+            Interlocked.Exchange(ref _runner, null) ?? // use a cached runner if there is one
+            (factory != null ? factory.CreateInstance() : // use the compiled RegexRunner factory if there is one
+            new RegexInterpreter(_code!, UseOptionInvariant() ? CultureInfo.InvariantCulture : CultureInfo.CurrentCulture));
+
+        /// <summary>Releases the runner back to the cache.</summary>
+        internal void ReturnRunner(RegexRunner runner) => _runner = runner;
+
+        /// <summary>True if the <see cref="RegexOptions.Compiled"/> option was set.</summary>
         protected bool UseOptionC() => (roptions & RegexOptions.Compiled) != 0;
 
-        /// <summary>True if the L option was set</summary>
+        /// <summary>True if the <see cref="RegexOptions.RightToLeft"/> option was set.</summary>
         protected internal bool UseOptionR() => (roptions & RegexOptions.RightToLeft) != 0;
 
+        /// <summary>True if the <see cref="RegexOptions.CultureInvariant"/> option was set.</summary>
         internal bool UseOptionInvariant() => (roptions & RegexOptions.CultureInvariant) != 0;
 
 #if DEBUG
-        /// <summary>
-        /// True if the regex has debugging enabled
-        /// </summary>
+        /// <summary>True if the regex has debugging enabled.</summary>
         [ExcludeFromCodeCoverage]
-        internal bool Debug => (roptions & RegexOptions.Debug) != 0;
+        internal bool IsDebug => (roptions & RegexOptions.Debug) != 0;
 #endif
     }
 }
index 2980488a290367f80a9b4ce0d8919c58a0910057..9897fae9333267325225e5540a7c6af9d3306b42 100644
@@ -108,8 +108,6 @@ namespace System.Text.RegularExpressions
                         if (Positive[match] == 0)
                             Positive[match] = match - scan;
 
-                        // System.Diagnostics.Debug.WriteLine("Set positive[" + match + "] to " + (match - scan));
-
                         break;
                     }
 
index 5a7c16f39a26099777afe59396a09b7a2bb256ce..76288af6a890bca527e294c7ca91ea94cd672757 100644
@@ -408,15 +408,12 @@ namespace System.Text.RegularExpressions
             // Make sure the initial capacity for s_definedCategories is correct
             Debug.Assert(
                 s_definedCategories.Count == DefinedCategoriesCapacity,
-                "RegexCharClass s_definedCategories's initial capacity (DefinedCategoriesCapacity) is incorrect.",
-                "Expected (s_definedCategories.Count): {0}, Actual (DefinedCategoriesCapacity): {1}",
-                s_definedCategories.Count,
-                DefinedCategoriesCapacity);
+                $"Expected (s_definedCategories.Count): {s_definedCategories.Count}, Actual (DefinedCategoriesCapacity): {DefinedCategoriesCapacity}");
 
             // Make sure the s_propTable is correctly ordered
             int len = s_propTable.Length;
             for (int i = 0; i < len - 1; i++)
-                Debug.Assert(string.Compare(s_propTable[i][0], s_propTable[i + 1][0], StringComparison.Ordinal) < 0, "RegexCharClass s_propTable is out of order at (" + s_propTable[i][0] + ", " + s_propTable[i + 1][0] + ")");
+                Debug.Assert(string.Compare(s_propTable[i][0], s_propTable[i + 1][0], StringComparison.Ordinal) < 0, $"RegexCharClass s_propTable is out of order at ({s_propTable[i][0]}, {s_propTable[i + 1][0]})");
         }
 #endif
 
index a95a9d170e1d08cec1cfc901ef7a15c9245ee258..e3710de9d0211075fd463ff08d909747b6257e99 100644
@@ -2838,7 +2838,7 @@ namespace System.Text.RegularExpressions
                 Call(s_spanGetLengthMethod);
                 BgeUnFar(skipUpdatesLabel);
 
-                // if (textSpan[i] != ch) goto skipUpdatesLabel;
+                // if (textSpan[textSpanPos] != ch) goto skipUpdatesLabel;
                 Ldloca(textSpanLocal);
                 Ldc(textSpanPos);
                 Call(s_spanGetItemMethod);
index b8ebca81d7d47d7443ed18375e3be7b42344cceb..df31aef4597b8091c4cf3aba28646efa58d964ea 100644
@@ -146,7 +146,7 @@ namespace System.Text.RegularExpressions
         {
             int newpos = runtrack![runtrackpos++];
 #if DEBUG
-            if (runmatch!.Debug)
+            if (runmatch!.IsDebug)
             {
                 if (newpos < 0)
                     Debug.WriteLine("       Backtracking (back2) to code position " + (-newpos));
@@ -621,7 +621,7 @@ namespace System.Text.RegularExpressions
                     advance = -1;
                 }
 #if DEBUG
-                if (runmatch!.Debug)
+                if (runmatch!.IsDebug)
                 {
                     DumpState();
                 }
index 88013dd6335b501db6bfc11fe35374f1177d0528..a75138419a7f1348509887a9504980895497d7f5 100644
@@ -2,15 +2,16 @@
 // The .NET Foundation licenses this file to you under the MIT license.
 // See the LICENSE file in the project root for more information.
 
-// The RegexReplacement class represents a substitution string for
-// use when using regexes to search/replace, etc. It's logically
-// a sequence intermixed (1) constant strings and (2) group numbers.
-
 using System.Collections;
 using System.Collections.Generic;
 
 namespace System.Text.RegularExpressions
 {
+    /// <summary>
+    /// The RegexReplacement class represents a substitution string for
+    /// use when using regexes to search/replace, etc. It's logically
+    /// a sequence of intermixed (1) constant strings and (2) group numbers.
+    /// </summary>
     internal sealed class RegexReplacement
     {
         // Constants for special insertion patterns
@@ -21,7 +22,7 @@ namespace System.Text.RegularExpressions
         public const int WholeString = -4;
 
         private readonly List<string> _strings; // table of string constants
-        private readonly List<int> _rules;      // negative -> group #, positive -> string #
+        private readonly int[] _rules;          // negative -> group #, positive -> string #
 
         /// <summary>
         /// Since RegexReplacement shares the same parser as Regex,
@@ -31,14 +32,17 @@ namespace System.Text.RegularExpressions
         public RegexReplacement(string rep, RegexNode concat, Hashtable _caps)
         {
             if (concat.Type != RegexNode.Concatenate)
-                throw new ArgumentException(SR.ReplacementError);
+            {
+                throw ThrowHelper.CreateArgumentException(ExceptionResource.ReplacementError);
+            }
 
             Span<char> vsbStack = stackalloc char[256];
             var vsb = new ValueStringBuilder(vsbStack);
             var strings = new List<string>();
-            var rules = new List<int>();
+            var rules = new ValueListBuilder<int>(stackalloc int[64]);
 
-            for (int i = 0; i < concat.ChildCount(); i++)
+            int childCount = concat.ChildCount();
+            for (int i = 0; i < childCount; i++)
             {
                 RegexNode child = concat.Child(i);
 
@@ -55,32 +59,36 @@ namespace System.Text.RegularExpressions
                     case RegexNode.Ref:
                         if (vsb.Length > 0)
                         {
-                            rules.Add(strings.Count);
+                            rules.Append(strings.Count);
                             strings.Add(vsb.ToString());
                             vsb = new ValueStringBuilder(vsbStack);
                         }
                         int slot = child.M;
 
                         if (_caps != null && slot >= 0)
+                        {
                             slot = (int)_caps[slot]!;
+                        }
 
-                        rules.Add(-Specials - 1 - slot);
+                        rules.Append(-Specials - 1 - slot);
                         break;
 
                     default:
-                        throw new ArgumentException(SR.ReplacementError);
+                        throw ThrowHelper.CreateArgumentException(ExceptionResource.ReplacementError);
                 }
             }
 
             if (vsb.Length > 0)
             {
-                rules.Add(strings.Count);
+                rules.Append(strings.Count);
                 strings.Add(vsb.ToString());
             }
 
             Pattern = rep;
             _strings = strings;
-            _rules = rules;
+            _rules = rules.AsSpan().ToArray();
+
+            rules.Dispose();
         }
 
         /// <summary>
@@ -101,39 +109,43 @@ namespace System.Text.RegularExpressions
             return repl;
         }
 
-        /// <summary>
-        /// The original pattern string
-        /// </summary>
+        /// <summary>The original pattern string</summary>
         public string Pattern { get; }
 
         /// <summary>
         /// Given a Match, emits into the StringBuilder the evaluated
         /// substitution pattern.
         /// </summary>
-        public void ReplacementImpl(ref ValueStringBuilder vsb, Match match)
+        public void ReplacementImpl(ref SegmentStringBuilder segments, Match match)
         {
-            for (int i = 0; i < _rules.Count; i++)
+            foreach (int r in _rules)
             {
-                int r = _rules[i];
-                if (r >= 0)   // string lookup
-                    vsb.Append(_strings[r]);
-                else if (r < -Specials) // group lookup
-                    vsb.Append(match.GroupToStringImpl(-Specials - 1 - r));
+                if (r >= 0)
+                {
+                    // string lookup
+                    segments.Add(_strings[r].AsMemory());
+                }
+                else if (r < -Specials)
+                {
+                    // group lookup
+                    segments.Add(match.GroupToStringImpl(-Specials - 1 - r));
+                }
                 else
                 {
+                    // special insertion patterns
                     switch (-Specials - 1 - r)
-                    { // special insertion patterns
+                    {
                         case LeftPortion:
-                            vsb.Append(match.GetLeftSubstring());
+                            segments.Add(match.GetLeftSubstring());
                             break;
                         case RightPortion:
-                            vsb.Append(match.GetRightSubstring());
+                            segments.Add(match.GetRightSubstring());
                             break;
                         case LastGroup:
-                            vsb.Append(match.LastGroupToStringImpl());
+                            segments.Add(match.LastGroupToStringImpl());
                             break;
                         case WholeString:
-                            vsb.Append(match.Text);
+                            segments.Add(match.Text.AsMemory());
                             break;
                     }
                 }
@@ -141,42 +153,46 @@ namespace System.Text.RegularExpressions
         }
 
         /// <summary>
-        /// Given a Match, emits into the ValueStringBuilder the evaluated
+        /// Given a Match, emits into the builder the evaluated
         /// Right-to-Left substitution pattern.
         /// </summary>
-        public void ReplacementImplRTL(ref ValueStringBuilder vsb, Match match)
+        public void ReplacementImplRTL(ref SegmentStringBuilder segments, Match match)
         {
-            for (int i = _rules.Count - 1; i >= 0; i--)
+            for (int i = _rules.Length - 1; i >= 0; i--)
             {
                 int r = _rules[i];
-                if (r >= 0)  // string lookup
-                    vsb.AppendReversed(_strings[r]);
-                else if (r < -Specials) // group lookup
-                    vsb.AppendReversed(match.GroupToStringImpl(-Specials - 1 - r));
+                if (r >= 0)
+                {
+                    // string lookup
+                    segments.Add(_strings[r].AsMemory());
+                }
+                else if (r < -Specials)
+                {
+                    // group lookup
+                    segments.Add(match.GroupToStringImpl(-Specials - 1 - r));
+                }
                 else
                 {
+                    // special insertion patterns
                     switch (-Specials - 1 - r)
-                    { // special insertion patterns
+                    {
                         case LeftPortion:
-                            vsb.AppendReversed(match.GetLeftSubstring());
+                            segments.Add(match.GetLeftSubstring());
                             break;
                         case RightPortion:
-                            vsb.AppendReversed(match.GetRightSubstring());
+                            segments.Add(match.GetRightSubstring());
                             break;
                         case LastGroup:
-                            vsb.AppendReversed(match.LastGroupToStringImpl());
+                            segments.Add(match.LastGroupToStringImpl());
                             break;
                         case WholeString:
-                            vsb.AppendReversed(match.Text);
+                            segments.Add(match.Text.AsMemory());
                             break;
                     }
                 }
             }
         }
 
-        // Three very similar algorithms appear below: replace (pattern),
-        // replace (evaluator), and split.
-
         /// <summary>
         /// Replaces all occurrences of the regex in the string with the
         /// replacement pattern.
@@ -189,71 +205,60 @@ namespace System.Text.RegularExpressions
         public string Replace(Regex regex, string input, int count, int startat)
         {
             if (count < -1)
-                throw new ArgumentOutOfRangeException(nameof(count), SR.CountTooSmall);
-            if (startat < 0 || startat > input.Length)
-                throw new ArgumentOutOfRangeException(nameof(startat), SR.BeginIndexNotNegative);
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.count, ExceptionResource.CountTooSmall);
+            }
+            if ((uint)startat > (uint)input.Length)
+            {
+                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startat, ExceptionResource.BeginIndexNotNegative);
+            }
 
             if (count == 0)
-                return input;
-
-            Match match = regex.Match(input, startat);
-            if (!match.Success)
             {
                 return input;
             }
-            else
-            {
-                var vsb = new ValueStringBuilder(stackalloc char[256]);
-
-                if (!regex.RightToLeft)
-                {
-                    int prevat = 0;
-
-                    do
-                    {
-                        if (match.Index != prevat)
-                            vsb.Append(input.AsSpan(prevat, match.Index - prevat));
-
-                        prevat = match.Index + match.Length;
-                        ReplacementImpl(ref vsb, match);
-                        if (--count == 0)
-                            break;
 
-                        match = match.NextMatch();
-                    } while (match.Success);
+            var state = (replacement: this, segments: new SegmentStringBuilder(256), inputMemory: input.AsMemory(), prevat: 0, count);
 
-                    if (prevat < input.Length)
-                        vsb.Append(input.AsSpan(prevat, input.Length - prevat));
-                }
-                else
+            if (!regex.RightToLeft)
+            {
+                regex.Run(input, startat, ref state, (ref (RegexReplacement thisRef, SegmentStringBuilder segments, ReadOnlyMemory<char> inputMemory, int prevat, int count) state, Match match) =>
                 {
-                    // In right to left mode append all the inputs in reversed order to avoid an extra dynamic data structure
-                    // and to be able to work with Spans. A final reverse of the transformed reversed input string generates
-                    // the desired output. Similar to Tower of Hanoi.
+                    state.segments.Add(state.inputMemory.Slice(state.prevat, match.Index - state.prevat));
+                    state.prevat = match.Index + match.Length;
+                    state.thisRef.ReplacementImpl(ref state.segments, match);
+                    return --state.count != 0;
+                });
 
-                    int prevat = input.Length;
-
-                    do
-                    {
-                        if (match.Index + match.Length != prevat)
-                            vsb.AppendReversed(input.AsSpan(match.Index + match.Length, prevat - match.Index - match.Length));
-
-                        prevat = match.Index;
-                        ReplacementImplRTL(ref vsb, match);
-                        if (--count == 0)
-                            break;
+                if (state.segments.Count == 0)
+                {
+                    return input;
+                }
 
-                        match = match.NextMatch();
-                    } while (match.Success);
+                state.segments.Add(state.inputMemory.Slice(state.prevat, input.Length - state.prevat));
+            }
+            else
+            {
+                state.prevat = input.Length;
 
-                    if (prevat > 0)
-                        vsb.AppendReversed(input.AsSpan(0, prevat));
+                regex.Run(input, startat, ref state, (ref (RegexReplacement thisRef, SegmentStringBuilder segments, ReadOnlyMemory<char> inputMemory, int prevat, int count) state, Match match) =>
+                {
+                    state.segments.Add(state.inputMemory.Slice(match.Index + match.Length, state.prevat - match.Index - match.Length));
+                    state.prevat = match.Index;
+                    state.thisRef.ReplacementImplRTL(ref state.segments, match);
+                    return --state.count != 0;
+                });
 
-                    vsb.Reverse();
+                if (state.segments.Count == 0)
+                {
+                    return input;
                 }
 
-                return vsb.ToString();
+                state.segments.Add(state.inputMemory.Slice(0, state.prevat));
+                state.segments.AsSpan().Reverse();
             }
+
+            return state.segments.ToString();
         }
     }
 }
index 6f2bb75f97b69eaa576a958931ba71e309c81eb4..acaf1e21258448001ebcd84b5e4d37cd5d4b5328 100644
@@ -14,6 +14,7 @@
 // methods to push new subpattern match results into (or remove
 // backtracked results from) the Match instance.
 
+using System.Collections.Generic;
 using System.Diagnostics;
 using System.Diagnostics.CodeAnalysis;
 using System.Globalization;
@@ -63,7 +64,6 @@ namespace System.Text.RegularExpressions
         private bool _ignoreTimeout;
         private int _timeoutOccursAt;
 
-
         // We have determined this value in a series of experiments where x86 retail
         // builds (ono-lab-optimized) were run on different pattern/input pairs. Larger values
         // of TimeoutCheckFrequency did not tend to increase performance; smaller values
@@ -85,118 +85,251 @@ namespace System.Text.RegularExpressions
         /// and we could use a separate method Skip() that will quickly scan past
         /// any characters that we know can't match.
         /// </summary>
-        protected internal Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick)
-        {
-            return Scan(regex, text, textbeg, textend, textstart, prevlen, quick, regex.MatchTimeout);
-        }
+        protected internal Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick) =>
+            Scan(regex, text, textbeg, textend, textstart, prevlen, quick, regex.MatchTimeout);
 
         protected internal Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick, TimeSpan timeout)
         {
-            int bump;
-            int stoppos;
-            bool initted = false;
-
-            // We need to re-validate timeout here because Scan is historically protected and
-            // thus there is a possibility it is called from user code:
-            Regex.ValidateMatchTimeout(timeout);
-
-            _ignoreTimeout = (Regex.InfiniteMatchTimeout == timeout);
-            _timeout = _ignoreTimeout
-                                    ? (int)Regex.InfiniteMatchTimeout.TotalMilliseconds
-                                    : (int)(timeout.TotalMilliseconds + 0.5); // Round
-
+            // Store arguments into fields for derived runner to examine
             runregex = regex;
             runtext = text;
             runtextbeg = textbeg;
             runtextend = textend;
-            runtextstart = textstart;
+            runtextpos = runtextstart = textstart;
 
-            bump = runregex.RightToLeft ? -1 : 1;
-            stoppos = runregex.RightToLeft ? runtextbeg : runtextend;
-
-            runtextpos = textstart;
+            // Handle timeout argument
+            _timeout = -1; // (int)Regex.InfiniteMatchTimeout.TotalMilliseconds
+            bool ignoreTimeout = _ignoreTimeout = Regex.InfiniteMatchTimeout == timeout;
+            if (!ignoreTimeout)
+            {
+                // We are using Environment.TickCount and not Stopwatch for performance reasons.
+                // Environment.TickCount is an int that cycles. We intentionally let _timeoutOccursAt
+                // overflow; it will still stay ahead of Environment.TickCount for comparisons made
+                // in DoCheckTimeout().
+                Regex.ValidateMatchTimeout(timeout); // validate timeout as this could be called from user code due to being protected
+                _timeout = (int)(timeout.TotalMilliseconds + 0.5); // Round to the nearest millisecond
+                _timeoutOccursAt = Environment.TickCount + _timeout;
+                _timeoutChecksToSkip = TimeoutCheckFrequency;
+            }
 
-            // If previous match was empty or failed, advance by one before matching
+            // Configure the additional value to "bump" the position along each time we loop around
+            // to call FindFirstChar again, as well as the stopping position for the loop.  We generally
+            // bump by 1 and stop at runtextend, but if we're examining right-to-left, we instead bump
+            // by -1 and stop at runtextbeg.
+            int bump = 1, stoppos = runtextend;
+            if (runregex.RightToLeft)
+            {
+                bump = -1;
+                stoppos = runtextbeg;
+            }
 
+            // If previous match was empty or failed, advance by one before matching.
             if (prevlen == 0)
             {
                 if (runtextpos == stoppos)
+                {
                     return Match.Empty;
+                }
 
                 runtextpos += bump;
             }
 
-            StartTimeoutWatch();
-
+            // Main loop: FindFirstChar/Go + bump until the ending position.
+            bool initialized = false;
             while (true)
             {
 #if DEBUG
-                if (runregex.Debug)
+                if (runregex.IsDebug)
                 {
                     Debug.WriteLine("");
-                    Debug.WriteLine("Search range: from " + runtextbeg.ToString(CultureInfo.InvariantCulture) + " to " + runtextend.ToString(CultureInfo.InvariantCulture));
-                    Debug.WriteLine("Firstchar search starting at " + runtextpos.ToString(CultureInfo.InvariantCulture) + " stopping at " + stoppos.ToString(CultureInfo.InvariantCulture));
+                    Debug.WriteLine($"Search range: from {runtextbeg} to {runtextend}");
+                    Debug.WriteLine($"Firstchar search starting at {runtextpos} stopping at {stoppos}");
                 }
 #endif
+
+                // Find the next potential location for a match in the input.
                 if (FindFirstChar())
                 {
-                    CheckTimeout();
+                    if (!ignoreTimeout)
+                    {
+                        DoCheckTimeout();
+                    }
 
-                    if (!initted)
+                    // Ensure that the runner is initialized.  This includes initializing all of the state in the runner
+                    // that Go might use, such as the backtracking stack, as well as a Match object for it to populate.
+                    if (!initialized)
                     {
-                        InitMatch();
-                        initted = true;
+                        InitializeForGo();
+                        initialized = true;
                     }
+
 #if DEBUG
-                    if (runregex.Debug)
+                    if (runregex.IsDebug)
                     {
-                        Debug.WriteLine("Executing engine starting at " + runtextpos.ToString(CultureInfo.InvariantCulture));
+                        Debug.WriteLine($"Executing engine starting at {runtextpos}");
                         Debug.WriteLine("");
                     }
 #endif
+
+                    // See if there's a match at this position.
                     Go();
 
-                    if (runmatch!._matchcount[0] > 0)
+                    // If we got a match, we're done.
+                    Match match = runmatch!;
+                    if (match._matchcount[0] > 0)
                     {
-                        // We'll return a match even if it touches a previous empty match
-                        return TidyMatch(quick);
+                        if (quick)
+                        {
+                            return null;
+                        }
+
+                        // Return the match in its canonical form.
+                        runmatch = null;
+                        match.Tidy(runtextpos);
+                        return match;
                     }
 
-                    // reset state for another go
+                    // Reset state for another iteration.
                     runtrackpos = runtrack!.Length;
                     runstackpos = runstack!.Length;
                     runcrawlpos = runcrawl!.Length;
                 }
 
-                // failure!
-
+                // We failed to match at this position.  If we're at the stopping point, we're done.
                 if (runtextpos == stoppos)
                 {
-                    TidyMatch(true);
                     return Match.Empty;
                 }
 
-                // Recognize leading []* and various anchors, and bump on failure accordingly
-
-                // Bump by one and start again
-
+                // Bump by one (in whichever direction is appropriate) and loop to go again.
                 runtextpos += bump;
             }
-            // We never get here
         }
 
-        private void StartTimeoutWatch()
+        /// <summary>Enumerates all of the matches with the specified regex, invoking the callback for each.</summary>
+        /// <remarks>
+        /// This repeatedly hands out the same Match instance, updated with new information.
+        /// </remarks>
+        internal void Scan<TState>(Regex regex, string text, int textstart, ref TState state, MatchCallback<TState> callback, TimeSpan timeout)
         {
-            if (_ignoreTimeout)
-                return;
+            // Store arguments into fields for derived runner to examine
+            runregex = regex;
+            runtext = text;
+            runtextbeg = 0;
+            runtextend = text.Length;
+            runtextpos = runtextstart = textstart;
+
+            // Handle timeout argument
+            _timeout = -1; // (int)Regex.InfiniteMatchTimeout.TotalMilliseconds
+            bool ignoreTimeout = _ignoreTimeout = Regex.InfiniteMatchTimeout == timeout;
+            if (!ignoreTimeout)
+            {
+                // We are using Environment.TickCount and not Stopwatch for performance reasons.
+                // Environment.TickCount is an int that cycles. We intentionally let _timeoutOccursAt
+                // overflow; it will still stay ahead of Environment.TickCount for comparisons made
+                // in DoCheckTimeout().
+                _timeout = (int)(timeout.TotalMilliseconds + 0.5); // Round to the nearest millisecond.
+                _timeoutOccursAt = Environment.TickCount + _timeout;
+                _timeoutChecksToSkip = TimeoutCheckFrequency;
+            }
 
-            _timeoutChecksToSkip = TimeoutCheckFrequency;
+            // Configure the additional value to "bump" the position along each time we loop around
+            // to call FindFirstChar again, as well as the stopping position for the loop.  We generally
+            // bump by 1 and stop at runtextend, but if we're examining right-to-left, we instead bump
+            // by -1 and stop at runtextbeg.
+            int bump = 1, stoppos = runtextend;
+            if (runregex.RightToLeft)
+            {
+                bump = -1;
+                stoppos = runtextbeg;
+            }
+
+            // Main loop: FindFirstChar/Go + bump until the ending position.
+            bool initialized = false;
+            while (true)
+            {
+#if DEBUG
+                if (runregex.IsDebug)
+                {
+                    Debug.WriteLine("");
+                    Debug.WriteLine($"Search range: from {runtextbeg} to {runtextend}");
+                    Debug.WriteLine($"Firstchar search starting at {runtextpos} stopping at {stoppos}");
+                }
+#endif
 
-            // We are using Environment.TickCount and not Timewatch for performance reasons.
-            // Environment.TickCount is an int that cycles. We intentionally let timeoutOccursAt
-            // overflow it will still stay ahead of Environment.TickCount for comparisons made
-            // in DoCheckTimeout():
-            _timeoutOccursAt = Environment.TickCount + _timeout;
+                // Find the next potential location for a match in the input.
+                if (FindFirstChar())
+                {
+                    if (!ignoreTimeout)
+                    {
+                        DoCheckTimeout();
+                    }
+
+                    // Ensure that the runner is initialized.  This includes initializing all of the state in the runner
+                    // that Go might use, such as the backtracking stack, as well as a Match object for it to populate.
+                    if (!initialized)
+                    {
+                        InitializeForGo();
+                        initialized = true;
+                    }
+
+#if DEBUG
+                    if (runregex.IsDebug)
+                    {
+                        Debug.WriteLine($"Executing engine starting at {runtextpos}");
+                        Debug.WriteLine("");
+                    }
+#endif
+
+                    // See if there's a match at this position.
+                    Go();
+
+                    // See if we have a match.
+                    Match match = runmatch!;
+                    if (match._matchcount[0] > 0)
+                    {
+                        // Hand it out to the callback in canonical form.
+                        match.Tidy(runtextpos);
+                        initialized = false;
+                        if (!callback(ref state, match))
+                        {
+                            // If the callback returns false, we're done.
+                            return;
+                        }
+
+                        // Reset state for another iteration.
+                        runtrackpos = runtrack!.Length;
+                        runstackpos = runstack!.Length;
+                        runcrawlpos = runcrawl!.Length;
+                        if (match.Length == 0)
+                        {
+                            if (runtextpos == stoppos)
+                            {
+                                return;
+                            }
+
+                            runtextpos += bump;
+                        }
+
+                        // Loop around to perform next match from where we left off.
+                        continue;
+                    }
+
+                    // Ran Go but it didn't find a match. Reset state for another iteration.
+                    runtrackpos = runtrack!.Length;
+                    runstackpos = runstack!.Length;
+                    runcrawlpos = runcrawl!.Length;
+                }
+
+                // We failed to match at this position.  If we're at the stopping point, we're done.
+                if (runtextpos == stoppos)
+                {
+                    return;
+                }
+
+                // Bump by one (in whichever direction is appropriate) and loop to go again.
+                runtextpos += bump;
+            }
         }
 
         protected void CheckTimeout()
@@ -226,14 +359,14 @@ namespace System.Text.RegularExpressions
                 return;
 
 #if DEBUG
-            if (runregex!.Debug)
+            if (runregex!.IsDebug)
             {
                 Debug.WriteLine("");
                 Debug.WriteLine("RegEx match timeout occurred!");
-                Debug.WriteLine("Specified timeout:       " + TimeSpan.FromMilliseconds(_timeout).ToString());
-                Debug.WriteLine("Timeout check frequency: " + TimeoutCheckFrequency);
-                Debug.WriteLine("Search pattern:          " + runregex.pattern);
-                Debug.WriteLine("Input:                   " + runtext);
+                Debug.WriteLine($"Specified timeout:       {TimeSpan.FromMilliseconds(_timeout)}");
+                Debug.WriteLine($"Timeout check frequency: {TimeoutCheckFrequency}");
+                Debug.WriteLine($"Search pattern:          {runregex.pattern}");
+                Debug.WriteLine($"Input:                   {runtext}");
                 Debug.WriteLine("About to throw RegexMatchTimeoutException.");
             }
 #endif
@@ -266,27 +399,24 @@ namespace System.Text.RegularExpressions
         /// <summary>
         /// Initializes all the data members that are used by Go()
         /// </summary>
-        private void InitMatch()
+        private void InitializeForGo()
         {
-            // Use a hashtabled Match object if the capture numbers are sparse
-
-            if (runmatch == null)
+            if (runmatch is null)
             {
-                if (runregex!.caps != null)
-                    runmatch = new MatchSparse(runregex, runregex.caps, runregex.capsize, runtext!, runtextbeg, runtextend - runtextbeg, runtextstart);
-                else
-                    runmatch = new Match(runregex, runregex.capsize, runtext!, runtextbeg, runtextend - runtextbeg, runtextstart);
+                // Use a hashtabled Match object if the capture numbers are sparse
+                runmatch = runregex!.caps is null ?
+                    new Match(runregex, runregex.capsize, runtext!, runtextbeg, runtextend - runtextbeg, runtextstart) :
+                    new MatchSparse(runregex, runregex.caps, runregex.capsize, runtext!, runtextbeg, runtextend - runtextbeg, runtextstart);
             }
             else
             {
                 runmatch.Reset(runregex!, runtext!, runtextbeg, runtextend, runtextstart);
             }
 
-            // note we test runcrawl, because it is the last one to be allocated
+            // Note we test runcrawl, because it is the last one to be allocated
             // If there is an alloc failure in the middle of the three allocations,
             // we may still return to reuse this instance, and we want to behave
-            // as if the allocations didn't occur. (we used to test _trackcount != 0)
-
+            // as if the allocations didn't occur.
             if (runcrawl != null)
             {
                 runtrackpos = runtrack!.Length;
@@ -295,15 +425,22 @@ namespace System.Text.RegularExpressions
                 return;
             }
 
+            // Everything above runs once per match.
+            // Everything below runs once per runner.
+
             InitTrackCount();
 
-            int tracksize = runtrackcount * 8;
-            int stacksize = runtrackcount * 8;
+            int stacksize;
+            int tracksize = stacksize = runtrackcount * 8;
 
             if (tracksize < 32)
+            {
                 tracksize = 32;
+            }
             if (stacksize < 16)
+            {
                 stacksize = 16;
+            }
 
             runtrack = new int[tracksize];
             runtrackpos = tracksize;
@@ -315,29 +452,6 @@ namespace System.Text.RegularExpressions
             runcrawlpos = 32;
         }
 
-        /// <summary>
-        /// Put match in its canonical form before returning it.
-        /// </summary>
-        private Match? TidyMatch(bool quick)
-        {
-            if (!quick)
-            {
-                Match match = runmatch!;
-
-                runmatch = null;
-
-                match.Tidy(runtextpos);
-                return match;
-            }
-            else
-            {
-                // in quick mode, a successful match returns null, and
-                // the allocated match object is left in the cache
-
-                return null;
-            }
-        }
-
         /// <summary>
         /// Called by the implementation of Go() to increase the size of storage
         /// </summary>
@@ -551,9 +665,9 @@ namespace System.Text.RegularExpressions
         [ExcludeFromCodeCoverage]
         internal virtual void DumpState()
         {
-            Debug.WriteLine("Text:  " + TextposDescription());
-            Debug.WriteLine("Track: " + StackDescription(runtrack!, runtrackpos));
-            Debug.WriteLine("Stack: " + StackDescription(runstack!, runstackpos));
+            Debug.WriteLine($"Text:  {TextposDescription()}");
+            Debug.WriteLine($"Track: {StackDescription(runtrack!, runtrackpos)}");
+            Debug.WriteLine($"Stack: {StackDescription(runstack!, runstackpos)}");
         }
 
         [ExcludeFromCodeCoverage]
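
Note on the callback-based Scan overload added above: the idea is that one result object is updated in place and handed to a callback for every match, with caller state threaded through by ref, so enumerating matches does not allocate a new object per match. The following is only a minimal sketch of that shape, not the runner's actual code. The names Hit, HitCallback and ScanSketch are hypothetical, and runs of a single character stand in for real regex matching.

```csharp
using System;

// Hypothetical stand-in for Match: a single instance that is overwritten for each hit.
public sealed class Hit
{
    public int Index;
    public int Length;
}

// Same shape as the callback invoked in the diff: ref state plus the reused result object.
public delegate bool HitCallback<TState>(ref TState state, Hit hit);

public static class ScanSketch
{
    // Reports every maximal run of 'target' in 'text' to the callback.
    public static void Scan<TState>(string text, char target, ref TState state, HitCallback<TState> callback)
    {
        var hit = new Hit(); // allocated once, reused for every callback invocation
        int pos = 0;
        while (pos < text.Length)
        {
            int start = text.IndexOf(target, pos);
            if (start < 0)
            {
                return; // no further candidates
            }

            int end = start + 1;
            while (end < text.Length && text[end] == target)
            {
                end++;
            }

            hit.Index = start;
            hit.Length = end - start;
            if (!callback(ref state, hit))
            {
                return; // the callback asked to stop early
            }

            pos = end; // resume where the previous hit ended
        }
    }
}
```

A caller that only counts hits could pass an int as the state, for example `int count = 0; ScanSketch.Scan("aa-b-aaa", 'a', ref count, (ref int c, Hit h) => { c++; return true; });`. Because the same Hit instance is reused, a callback that needs to keep a result must copy the values out rather than store the reference.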
diff --git a/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/ThrowHelper.cs b/src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/ThrowHelper.cs
new file mode 100644 (file)
index 0000000..7c22974
--- /dev/null
@@ -0,0 +1,97 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+using System.Diagnostics.CodeAnalysis;
+
+namespace System.Text.RegularExpressions
+{
+    internal static class ThrowHelper
+    {
+        [DoesNotReturn]
+        internal static Exception CreateArgumentException(ExceptionResource resource) =>
+            throw new ArgumentException(GetStringForExceptionResource(resource));
+
+        [DoesNotReturn]
+        internal static void ThrowArgumentNullException(ExceptionArgument arg) =>
+            throw new ArgumentNullException(GetStringForExceptionArgument(arg));
+
+        [DoesNotReturn]
+        internal static void ThrowArgumentOutOfRangeException(ExceptionArgument arg) =>
+            throw new ArgumentOutOfRangeException(GetStringForExceptionArgument(arg));
+
+        [DoesNotReturn]
+        internal static void ThrowArgumentOutOfRangeException(ExceptionArgument arg, ExceptionResource resource) =>
+            throw new ArgumentOutOfRangeException(GetStringForExceptionArgument(arg), GetStringForExceptionResource(resource));
+
+        [DoesNotReturn]
+        internal static void ThrowNotSupportedException(ExceptionResource resource) =>
+            throw new NotSupportedException(GetStringForExceptionResource(resource));
+
+        private static string? GetStringForExceptionArgument(ExceptionArgument arg) =>
+            arg switch
+            {
+                ExceptionArgument.assemblyname => nameof(ExceptionArgument.assemblyname),
+                ExceptionArgument.array => nameof(ExceptionArgument.array),
+                ExceptionArgument.arrayIndex => nameof(ExceptionArgument.arrayIndex),
+                ExceptionArgument.count => nameof(ExceptionArgument.count),
+                ExceptionArgument.evaluator => nameof(ExceptionArgument.evaluator),
+                ExceptionArgument.i => nameof(ExceptionArgument.i),
+                ExceptionArgument.inner => nameof(ExceptionArgument.inner),
+                ExceptionArgument.input => nameof(ExceptionArgument.input),
+                ExceptionArgument.length => nameof(ExceptionArgument.length),
+                ExceptionArgument.matchTimeout => nameof(ExceptionArgument.matchTimeout),
+                ExceptionArgument.name => nameof(ExceptionArgument.name),
+                ExceptionArgument.options => nameof(ExceptionArgument.options),
+                ExceptionArgument.pattern => nameof(ExceptionArgument.pattern),
+                ExceptionArgument.regexinfos => nameof(ExceptionArgument.regexinfos),
+                ExceptionArgument.replacement => nameof(ExceptionArgument.replacement),
+                ExceptionArgument.startat => nameof(ExceptionArgument.startat),
+                ExceptionArgument.str => nameof(ExceptionArgument.str),
+                ExceptionArgument.value => nameof(ExceptionArgument.value),
+                _ => null
+            };
+
+        private static string? GetStringForExceptionResource(ExceptionResource resource) =>
+            resource switch
+            {
+                ExceptionResource.BeginIndexNotNegative => SR.BeginIndexNotNegative,
+                ExceptionResource.CountTooSmall => SR.CountTooSmall,
+                ExceptionResource.LengthNotNegative => SR.LengthNotNegative,
+                ExceptionResource.OnlyAllowedOnce => SR.OnlyAllowedOnce,
+                ExceptionResource.ReplacementError => SR.ReplacementError,
+                _ => null
+            };
+    }
+
+    internal enum ExceptionArgument
+    {
+        assemblyname,
+        array,
+        arrayIndex,
+        count,
+        evaluator,
+        i,
+        inner,
+        input,
+        length,
+        matchTimeout,
+        name,
+        options,
+        pattern,
+        regexinfos,
+        replacement,
+        startat,
+        str,
+        value,
+    }
+
+    internal enum ExceptionResource
+    {
+        BeginIndexNotNegative,
+        CountTooSmall,
+        LengthNotNegative,
+        OnlyAllowedOnce,
+        ReplacementError,
+    }
+}
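
Note on the throw-helper pattern used by the new ThrowHelper.cs: the commit message says the goal is to keep helpers on the main path inlinable. A common way to do that is to move the throw statement, and the construction of the exception, into a separate method marked [DoesNotReturn], so the calling method stays small. The sketch below shows the general shape only. Guard, CheckedLength and ThrowArgumentNull are hypothetical names, not part of this change.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

public static class Guard
{
    // The hot-path method contains only a null check and a call; the throw lives elsewhere.
    public static int CheckedLength(string input)
    {
        if (input is null)
        {
            ThrowArgumentNull(nameof(input));
        }

        return input.Length;
    }

    // [DoesNotReturn] tells the compiler's flow analysis that control never comes back from here.
    [DoesNotReturn]
    private static void ThrowArgumentNull(string paramName) =>
        throw new ArgumentNullException(paramName);
}
```

The enum-based argument and resource lookups in ThrowHelper.cs take the same idea one step further: the call site passes a small enum value, and the string is only looked up on the failure path.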
diff --git a/src/libraries/System.Text.RegularExpressions/src/System/Text/SegmentStringBuilder.cs b/src/libraries/System.Text.RegularExpressions/src/System/Text/SegmentStringBuilder.cs
new file mode 100644 (file)
index 0000000..71f5839
--- /dev/null
@@ -0,0 +1,94 @@
+// Licensed to the .NET Foundation under one or more agreements.
+// The .NET Foundation licenses this file to you under the MIT license.
+// See the LICENSE file in the project root for more information.
+
+using System.Buffers;
+using System.Diagnostics;
+using System.Runtime.CompilerServices;
+
+namespace System.Text
+{
+    /// <summary>Provides a value type string builder composed of individual segments represented as <see cref="ReadOnlyMemory{T}"/> instances.</summary>
+    [DebuggerDisplay("Count = {_count}")]
+    internal struct SegmentStringBuilder
+    {
+        /// <summary>The array backing the builder, obtained from <see cref="ArrayPool{T}.Shared"/>.</summary>
+        private ReadOnlyMemory<char>[] _array;
+        /// <summary>The number of items in <see cref="_array"/>, and thus also the next position in the array to be filled.</summary>
+        private int _count;
+
+        /// <summary>Initializes the builder.</summary>
+        /// <param name="capacity">The initial capacity of the builder.</param>
+        public SegmentStringBuilder(int capacity)
+        {
+            Debug.Assert(capacity > 0);
+            _array = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(capacity);
+            _count = 0;
+        }
+
+        /// <summary>Gets the number of segments added to the builder.</summary>
+        public int Count => _count;
+
+        /// <summary>Adds a segment to the builder.</summary>
+        /// <param name="segment">The segment.</param>
+        [MethodImpl(MethodImplOptions.AggressiveInlining)]
+        public void Add(ReadOnlyMemory<char> segment)
+        {
+            ReadOnlyMemory<char>[] array = _array;
+            int pos = _count;
+            if ((uint)pos < (uint)array.Length)
+            {
+                array[pos] = segment;
+                _count = pos + 1;
+            }
+            else
+            {
+                GrowAndAdd(segment);
+            }
+        }
+
+        /// <summary>Grows the builder to accommodate another segment.</summary>
+        /// <param name="segment">The segment to add.</param>
+        [MethodImpl(MethodImplOptions.NoInlining)]
+        private void GrowAndAdd(ReadOnlyMemory<char> segment)
+        {
+            ReadOnlyMemory<char>[] array = _array;
+            Debug.Assert(array.Length == _count);
+
+            ReadOnlyMemory<char>[] newArray = _array = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(array.Length * 2);
+            Array.Copy(array, newArray, _count);
+            ArrayPool<ReadOnlyMemory<char>>.Shared.Return(array, clearArray: true);
+            newArray[_count++] = segment;
+        }
+
+        /// <summary>Gets a span of all segments in the builder.</summary>
+        /// <returns>A span over the segments added so far.</returns>
+        public Span<ReadOnlyMemory<char>> AsSpan() => new Span<ReadOnlyMemory<char>>(_array, 0, _count);
+
+        /// <summary>Creates a string from all the segments in the builder and then disposes of the builder.</summary>
+        public override string ToString()
+        {
+            int length = 0;
+            foreach (ReadOnlyMemory<char> segment in AsSpan())
+            {
+                length += segment.Length;
+            }
+
+            string result = string.Create(length, this, (dest, builder) =>
+            {
+                foreach (ReadOnlyMemory<char> segment in builder.AsSpan())
+                {
+                    segment.Span.CopyTo(dest);
+                    dest = dest.Slice(segment.Length);
+                }
+            });
+
+            ReadOnlyMemory<char>[] array = _array;
+            AsSpan().Clear(); // clear just what's been filled
+            this = default;
+            ArrayPool<ReadOnlyMemory<char>>.Shared.Return(array);
+
+            return result;
+        }
+    }
+}
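
Note on the segment approach used by SegmentStringBuilder: instead of copying characters into a growing buffer, the builder records slices of the input and of the replacement text, then sizes and fills the final string once. The sketch below illustrates that technique under simplifying assumptions: it uses a List<ReadOnlyMemory<char>> rather than an ArrayPool-backed array, and it replaces a single character rather than regex matches. SegmentConcatSketch and ReplaceChar are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentConcatSketch
{
    // Replaces every occurrence of 'oldChar' in 'input' with 'replacement'.
    public static string ReplaceChar(string input, char oldChar, string replacement)
    {
        var segments = new List<ReadOnlyMemory<char>>();
        ReadOnlyMemory<char> inputMemory = input.AsMemory();
        ReadOnlyMemory<char> replacementMemory = replacement.AsMemory();

        // Record slices only; no characters are copied in this loop.
        int start = 0;
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == oldChar)
            {
                if (i > start)
                {
                    segments.Add(inputMemory.Slice(start, i - start));
                }

                segments.Add(replacementMemory);
                start = i + 1;
            }
        }

        if (start < input.Length)
        {
            segments.Add(inputMemory.Slice(start));
        }

        // Compute the exact final length, then copy each segment once into the new string.
        int length = 0;
        foreach (ReadOnlyMemory<char> segment in segments)
        {
            length += segment.Length;
        }

        return string.Create(length, segments, (dest, segs) =>
        {
            foreach (ReadOnlyMemory<char> segment in segs)
            {
                segment.Span.CopyTo(dest);
                dest = dest.Slice(segment.Length);
            }
        });
    }
}
```

When replacements are sparse relative to the input length, the segment list stays short and the only large allocation is the result string itself, which is the trade-off the commit message describes.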
diff --git a/src/libraries/System.Text.RegularExpressions/src/System/Text/ValueStringBuilder.Reverse.cs b/src/libraries/System.Text.RegularExpressions/src/System/Text/ValueStringBuilder.Reverse.cs
deleted file mode 100644 (file)
index 76e2144..0000000
+++ /dev/null
@@ -1,23 +0,0 @@
-// Licensed to the .NET Foundation under one or more agreements.
-// The .NET Foundation licenses this file to you under the MIT license.
-// See the LICENSE file in the project root for more information.
-
-namespace System.Text
-{
-    internal ref partial struct ValueStringBuilder
-    {
-        public void AppendReversed(ReadOnlySpan<char> value)
-        {
-            Span<char> span = AppendSpan(value.Length);
-            for (int i = 0; i < span.Length; i++)
-            {
-                span[i] = value[value.Length - i - 1];
-            }
-        }
-
-        public void Reverse()
-        {
-            _chars.Slice(0, _pos).Reverse();
-        }
-    }
-}
index 8b65fee14690dcd02e334313479b028e2e4b81e2..09a569f0573d17e7bc17aa935564c3e622b42b7f 100644 (file)
@@ -7,7 +7,9 @@ using System.Collections;
 using System.Text.RegularExpressions;
 using RegexTestNamespace;
 using Xunit;
-using System.Collections.Generic;
+
+// NOTE: Be very thoughtful when editing this test file.  It's decompiled from an assembly generated
+// by CompileToAssembly on .NET Framework, and is used to help validate compatibility with such assemblies.
 
 namespace System.Text.RegularExpressionsTests
 {
index cef5a825e2bad51105cdf6f72e2094f95f86b468..a97ec741df4d1a885f23ded1a2fec6f2f30b3aba 100644 (file)
@@ -26,9 +26,9 @@ namespace System.Text.RegularExpressions.Tests
             // Stress
             string pattern = string.Concat(Enumerable.Repeat("([a-z]", 1000).Concat(Enumerable.Repeat(")", 1000)));
             string input = string.Concat(Enumerable.Repeat("abcde", 200));
-
             yield return new object[] { pattern, input, "$1000", RegexOptions.None, input.Length, 0, "e" };
             yield return new object[] { pattern, input, "$1", RegexOptions.None, input.Length, 0, input };
+            yield return new object[] { ".", new string('a', 1000), "b", RegexOptions.None, 1000, 0, new string('b', 1000) };
 
             // Undefined group
             yield return new object[] { "([a_z])(.+)", "abc", "$3", RegexOptions.None, 3, 0, "$3" };