From 10fdd3268bfedb0d10912f2f0ba6be13995de3fe Mon Sep 17 00:00:00 2001
From: Jarkko Hietaniemi
Date: Mon, 26 Nov 2007 06:55:03 +0200
Subject: [PATCH] pod/perlrebackslash.pod: small Unicode additions

Message-Id: <200711260255.lAQ2t37n188664@kosh.hut.fi>

p4raw-id: //depot/perl@32493
---
 pod/perlrebackslash.pod | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod
index d8cfb6a..ac95ace 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -500,7 +500,9 @@ C<< (?>\x0D\x0A)|\v) >>. Since C<\R> can match a more than one character, it
 cannot be put inside a bracketed character class; C<[\R]> is an error.
 C<\R> is introduced in perl 5.10.
 
-Mnemonic: none really. C<\R> was picked because PCRE already uses C<\R>.
+Mnemonic: none really. C<\R> was picked because PCRE already uses C<\R>,
+and more importantly because Unicode recommends such a regular expression
+metacharacter, and suggests C<\R> as the notation.
 
 =item \X
 
@@ -512,6 +514,11 @@ mark character followed by zero or more mark characters. Mark characters
 include (but are not restricted to) I<combining characters> and
 I<vowel signs>.
 
+C<\X> matches quite well what normal (non-Unicode-programmer) usage
+would consider a single character: for example a base character
+(the C<\PM> above), for example a letter, followed by zero or more
+diacritics, which are I<combining characters> (the C<\pM*> above).
+
 Mnemonic: eI<X>tended Unicode character.
 
 =back
-- 
2.7.4