From f580a93dfe4c980e326a70e8ab035562b0b6dbf7 Mon Sep 17 00:00:00 2001
From: Karl Williamson
Date: Sat, 9 Apr 2011 11:03:37 -0600
Subject: [PATCH] PATCH: [perl #87812] BBC breaks Pod::Coverage::TrustPod

This patch completes the fixing of this problem.

The problem is that the failing .t set @INC to exclude lib, and hence
couldn't find utf8.pm, which 5.14 was requiring in places where it
previously didn't.

This patch finishes the job of not requiring utf8.pm in so many places
as were inadvertently added in 5.14.  Commit
3ad98780b4bded02c371c83a668dc8f323e57718 started the job.

This patch changes regcomp.c to not set ANYOF_NONBITMAP_NON_UTF8 where
it inappropriately was.  I don't know what I was thinking when I
originally did what this changes.  In order to match outside the bitmap,
these characters all must match something that requires utf8, such as a
LIGATURE FI.
---
 regcomp.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/regcomp.c b/regcomp.c
index e626489..ade999c 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -9426,21 +9426,17 @@ S_set_regclass_bit_fold(pTHX_ RExC_state_t *pRExC_state, regnode* node, const U8
             case 'I': case 'i':
             case 'L': case 'l':
             case 'T': case 't':
-                /* These all are targets of multi-character folds, which can
-                 * occur with only non-Latin1 characters in the fold, so they
-                 * can match if the target string isn't UTF-8 */
-                ANYOF_FLAGS(node) |= ANYOF_NONBITMAP_NON_UTF8;
-                break;
             case 'A': case 'a':
             case 'H': case 'h':
             case 'J': case 'j':
             case 'N': case 'n':
             case 'W': case 'w':
             case 'Y': case 'y':
-                /* These all are targets of multi-character folds, which occur
-                 * only with a non-Latin1 character as part of the fold, so
-                 * they can't match unless the target string is in UTF-8, so no
-                 * action here is necessary */
+                /* These all are targets of multi-character folds from code
+                 * points that require UTF8 to express, so they can't match
+                 * unless the target string is in UTF-8, so no action here is
+                 * necessary, as regexec.c properly handles the general case
+                 * for UTF-8 matching */
                 break;
             default:
                 /* Use deprecated warning to increase the chances of this
-- 
2.7.4