From: David Mitchell Date: Sat, 1 Feb 2014 00:33:11 +0000 (+0000) Subject: re_intuit_start(): harmonise other_last++ X-Git-Tag: upstream/5.20.0~464^2~37 X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=7d4ad9f90baf4256bba3cc273851b52721d4074c;p=platform%2Fupstream%2Fperl.git re_intuit_start(): harmonise other_last++ In the other=anchored branch, at the end on failure or success, we set other_last to HOP3(last, 1) or HOP3(s, 1) respectively, indicating the minimum point we should start matching if we ever have to try again. Clearly for failure, we know the substring can't be found at any position up to, or including last, so next time we should try at last+1. For success, if we return later it means that some other constraint failed, and we already know that the substr wasn't found at positions up to s-1, and that if we tried position s again we'd just repeat the previous failure. So in both cases set to N+1. In the other=float branch however, other_last is set to last or s on failure or success, with a big "XXX is this right?" against the "other_last = s" code. It turns out that "other_last = s" *is* right, for the special reasons explained in the code comments added by this commit; while "other_last = last" is changed to be "other_last = HOP3(last,1)". --- diff --git a/regexec.c b/regexec.c index 9ad2dca..87c7d9a 100644 --- a/regexec.c +++ b/regexec.c @@ -1086,17 +1086,23 @@ Perl_re_intuit_start(pTHX_ DEBUG_EXECUTE_r(PerlIO_printf(Perl_debug_log, ", trying anchored starting at offset %ld...\n", (long)(saved_s + 1 - i_strpos))); - other_last = last; /* highest failure */ + other_last = HOP3c(last, 1, strend); /* highest failure */ rx_origin = HOP3c(rx_origin, 1, strend); goto restart; } else { DEBUG_EXECUTE_r(PerlIO_printf(Perl_debug_log, " at offset %ld...\n", (long)(s - i_strpos))); - other_last = s; /* Fix this later. --Hugo */ - /* XXX the commit that introduced the "Fix this" comment - * above, changed the other_last assignment from s+1 to s. - * I've no idea what's right. DAPM */ + + /* other_last is set to s, not s+1, since its possible for + * a floating substr to fail first time, then succeed + * second time at the same floating position; e.g.: + * "-AB--AABZ" =~ /\wAB\d*Z/ + * The first time round, anchored and float match at + * "-(AB)--AAB(Z)" then fail on the initial \w character + * class. Second time round, they match at "-AB--A(AB)(Z)". + */ + other_last = s; s = saved_s; if (rx_origin == strpos) goto try_at_start;