The code that decides which substring of a pattern target to copy for
the sake of $1, $& etc. would, in the absence of $&, copy only the
minimum range needed to cover $1, $2, ..., which might be a shorter
range than what $& covers. This is fine most of the time, but when
calculating $+[0] on a Unicode string, it needs a copy of the whole part
of the string covered by $&, since it needs to convert the byte offset
into a char offset.
So to fix this, always copy, as a minimum, the $& range.
I suppose we could be more clever about this (detect the presence
of @+ in the code, only do it for UTF8, etc.), but this is simple
and non-fragile.
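
As a rough illustration of why the whole $& range has to be present in
the copy, here is a minimal sketch of a byte-to-char offset conversion
over UTF-8. The function name byte_to_char_offset is hypothetical and
this is not the routine the core uses; it assumes well-formed UTF-8 and
only shows that every byte up to the offset must be readable in order
to count characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a perl core function): return the number of
 * characters contained in the first byte_off bytes of a well-formed
 * UTF-8 buffer of length len. */
size_t
byte_to_char_offset(const unsigned char *buf, size_t len, size_t byte_off)
{
    size_t chars = 0;
    size_t i;

    assert(byte_off <= len);
    for (i = 0; i < byte_off; i++) {
        /* Continuation bytes have the form 10xxxxxx; any other byte
         * starts a new character. */
        if ((buf[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the string d\x{100}\x{1000} used in the new test, the UTF-8 bytes
are 64 C4 80 E1 80 80, so the byte offset 6 at the end of the match
corresponds to char offset 3. If the copy stopped short of the end of
$&, the loop above would have nothing valid to read for the tail bytes.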
&& !(RX_EXTFLAGS(rx) & RXf_PMf_KEEPCOPY) /* //p */
&& !(PL_sawampersand & SAWAMPERSAND_RIGHT)
) { /* don't copy $' part of string */
- U32 n = (PL_sawampersand & SAWAMPERSAND_MIDDLE) ? 0 : 1;
+ U32 n = 0;
max = -1;
/* calculate the right-most part of the string covered
* by a capture. Due to look-ahead, this may be to
&& !(RX_EXTFLAGS(rx) & RXf_PMf_KEEPCOPY) /* //p */
&& !(PL_sawampersand & SAWAMPERSAND_LEFT)
) { /* don't copy $` part of string */
- U32 n = (PL_sawampersand & SAWAMPERSAND_MIDDLE) ? 0 : 1;
+ U32 n = 0;
min = max;
/* calculate the left-most part of the string covered
* by a capture. Due to look-behind, this may be to
()ef def y $&-$1 ef-
()ef def y $-[0] 1
()ef def y $+[0] 3
+()\x{100}\x{1000} d\x{100}\x{1000} y $+[0] 3
()ef def y $-[1] 1
()ef def y $+[1] 1
*a - c - Quantifier follows nothing