From: Father Chrysostomos
Date: Tue, 19 Nov 2013 05:53:43 +0000 (-0800)
Subject: Move <-- HERE arrow for ‘Switch condition not recognized’
X-Git-Tag: upstream/5.20.0~1179
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=311cc1adfb2eac3d98a549ed5f912313fc528cea;p=platform%2Fupstream%2Fperl.git

Move <-- HERE arrow for ‘Switch condition not recognized’

$ ./perl -Ilib -e '/(?(1(?#...)))/'
Switch condition not recognized in regex; marked by <-- HERE in m/(?(1( <-- HERE ?#...)))/ at -e line 1.
$ ./perl -Ilib -e '/(?(1x(?#...)))/'
Switch condition not recognized in regex; marked by <-- HERE in m/(?(1x(?#...) <-- HERE ))/ at -e line 1.

With the first one-liner, the arrow in the error message is pointing
to the first offending character.  With the second one-liner, the
arrow points to the comment following the offending character.

The logic for positioning the character is a little odd.  The idea is
supposed to be something like:

    if current_character++ is not ')'
        croak with the arrow right before current_character

But nextchar() is used instead of ++, and nextchar() skips trailing
whitespace and comments after incrementing the current parse position.

We already have code right here to revert back to the previous parse
position and then increment it by one character, for the sake of UTF8.
Indeed, it behaves differently if you add a non-ASCII character under
‘use utf8’:

$ ./perl -Ilib -e 'use utf8; /é(?(1x(?#...)))/'
Switch condition not recognized in regex; marked by <-- HERE in m/?(?(1x <-- HERE (?#...)))/ at -e line 1.

So what this commit does is extend that backtrack logic to happen all
the time, not just with UTF8.
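The arrow placement described above can be reproduced with a small standalone model. This is only a sketch under stated assumptions, not the code in regcomp.c: toy_state, toy_nextchar() and toy_arrow_offset() are hypothetical names invented for illustration, the model skips only (?#...) comments (not whitespace), and it handles ASCII patterns only, so "one character" is one byte where the real code uses UTF8SKIP under UTF.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the regex compiler's parse state. */
typedef struct {
    const char *parse;  /* current parse position */
    const char *end;    /* one past the last byte of the pattern */
} toy_state;

/* Toy model of nextchar(): step over one byte, then skip any (?#...)
 * comments that follow.  Returns the position before the step. */
const char *toy_nextchar(toy_state *st)
{
    const char *prev = st->parse;

    st->parse++;
    while (st->end - st->parse >= 3 && strncmp(st->parse, "(?#", 3) == 0) {
        while (st->parse < st->end && *st->parse != ')')
            st->parse++;
        if (st->parse < st->end)
            st->parse++;            /* step over the comment's ')' */
    }
    return prev;
}

/* Byte offset at which the <-- HERE arrow would be placed when the
 * character at pat[start] should have been ')' but is not.
 * backtrack == 0 models the old non-UTF behaviour (arrow wherever
 * toy_nextchar() stopped); backtrack != 0 models the patched behaviour
 * (go back, then advance exactly one character).
 * Returns -1 if pat[start] is ')' and no error would be raised. */
long toy_arrow_offset(const char *pat, size_t start, int backtrack)
{
    toy_state st;
    const char *tmp;

    assert(start < strlen(pat));
    st.parse = pat + start;
    st.end = pat + strlen(pat);

    tmp = toy_nextchar(&st);
    if (*tmp == ')')
        return -1;
    if (backtrack)
        st.parse = tmp + 1;         /* ASCII-only: one byte per character */
    return (long)(st.parse - pat);
}
```

For the pattern (?(1x(?#...))) with start 4 (the 'x'), the model yields offset 12 without backtracking, which is after the comment, and offset 5 with backtracking, which is right after the 'x'. For (?(1(?#...))) both modes yield 5, matching the first one-liner above.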
---

diff --git a/regcomp.c b/regcomp.c
index c9464ef..e78d2fc 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -9573,14 +9573,11 @@ S_reg(pTHX_ RExC_state_t *pRExC_state, I32 paren, I32 *flagp,U32 depth)
                   insert_if_check_paren:
                     if (*(tmp = nextchar(pRExC_state)) != ')') {
-                        if ( UTF ) {
-                        /* Like the name implies, nextchar deals in chars,
-                         * not characters, so if under UTF, undo its work
+                        /* nextchar also skips comments, so undo its work
                          * and skip over the the next character.
                          */
-                            RExC_parse = tmp;
-                            RExC_parse += UTF8SKIP(RExC_parse);
-                        }
+                        RExC_parse = tmp;
+                        RExC_parse += UTF ? UTF8SKIP(RExC_parse) : 1;
                         vFAIL("Switch condition not recognized");
                     }
                   insert_if:
diff --git a/t/re/reg_mesg.t b/t/re/reg_mesg.t
index 70c0b01..b235338 100644
--- a/t/re/reg_mesg.t
+++ b/t/re/reg_mesg.t
@@ -91,6 +91,7 @@ my @death =
  '/(?{ 1/' => 'Missing right curly or square bracket',
  '/(?(1x))/' => 'Switch condition not recognized {#} m/(?(1x{#}))/',
+ '/(?(1x(?#)))/'=> 'Switch condition not recognized {#} m/(?(1x{#}(?#)))/',
  '/(?(1)x|y|z)/' => 'Switch (?(condition)... contains too many branches {#} m/(?(1)x|y|{#}z)/',