Banish "use utf8".

author Jarkko Hietaniemi <jhi@iki.fi>

Sat, 17 Nov 2001 22:22:47 +0000 (22:22 +0000)

committer Jarkko Hietaniemi <jhi@iki.fi>

Sat, 17 Nov 2001 22:22:47 +0000 (22:22 +0000)
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 6c687495cb0cd973421a74829eb80d7738f9c026..5c7e76b5ad6b06a55f2de294cbc3aa0d9cc8ef9f 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -184,7 +184,9 @@ In addition, Perl defines the following:
      \PP        Match non-P
      \X Match eXtended Unicode "combining character sequence",
          equivalent to C<(?:\PM\pM*)>
-    \C Match a single C char (octet) even under utf8.
+    \C Match a single C char (octet) even under Unicode.
+       B<NOTE:> breaks up characters into their UTF-8 bytes,
+       so you may end up with malformed pieces of UTF-8.
  
  A C<\w> matches a single alphanumeric character or C<_>, not a whole word.
  Use C<\w+> to match a string of Perl-identifier characters (which isn't 
@@ -193,7 +195,7 @@ list of alphabetic characters generated by C<\w> is taken from the
  current locale.  See L<perllocale>.  You may use C<\w>, C<\W>, C<\s>, C<\S>,
  C<\d>, and C<\D> within character classes, but if you try to use them
  as endpoints of a range, that's not a range, the "-" is understood literally.
-See L<utf8> for details about C<\pP>, C<\PP>, and C<\X>.
+See L<perlunicode> for details about C<\pP>, C<\PP>, and C<\X>.
  
  The POSIX character class syntax
  
@@ -230,9 +232,10 @@ whole character class.  For example:
  
  matches zero, one, any alphabetic character, and the percentage sign.
  
-If the C<utf8> pragma is used, the following equivalences to Unicode
-\p{} constructs and equivalent backslash character classes (if available),
-will hold:
+The following equivalences to Unicode \p{} constructs and equivalent
+backslash character classes (if available), will hold:
+
+    [:...:]    \p{...}         backslash
  
      alpha       IsAlpha
      alnum       IsAlnum
@@ -291,7 +294,7 @@ work just fine) it is included for completeness.
  You can negate the [::] character classes by prefixing the class name
  with a '^'. This is a Perl extension.  For example:
  
-    POSIX      trad. Perl  utf8 Perl
+    POSIX      traditional Unicode
  
      [:^digit:]      \D      \P{IsDigit}
      [:^space:]     \S      \P{IsSpace}
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index f4e9bb64405d98c186de608f24c17b56ed9110da..bb2423b8af0e10dc08090ff2459f98d7b6dc2b5d 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1653,12 +1653,11 @@ Unicode characters in the range of 128-255 use two hexadecimal digits
  with braces: C<\x{ab}>.  Note that this is different than C<\xab>,
  which is just a hexadecimal byte with no Unicode significance.
  
-B<NOTE>: in perl 5.6.0 it used to be that one needed to say C<use utf8>
-to use any Unicode features.  This is no more the case: for almost all
-Unicode processing, the explicit C<utf8> pragma is not needed.
-(The only case where it matters is if your Perl script is in Unicode,
-that is, encoded in UTF-8/UTF-16/UTF-EBCDIC: then an explicit C<use utf8>
-is needed.)
+B<NOTE>: in Perl 5.6.0 it used to be that one needed to say C<use
+utf8> to use any Unicode features.  This is no more the case: for
+almost all Unicode processing, the explicit C<utf8> pragma is not
+needed.  (The only case where it matters is if your Perl script is in
+Unicode and encoded in UTF-8, then an explicit C<use utf8> is needed.)
  
  Figuring out the hexadecimal sequence of a Unicode character you want
  or deciphering someone else's hexadecimal Unicode regexp is about as
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 2fca71454a7f9c7b696c3178ef21174c7925748d..64116bcae10856ff0b2a952719213290a2a01e72 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -782,7 +782,7 @@ for more discussion of the issues.
  
  =head1 SEE ALSO
  
-L<encoding>, L<Encode>, L<open>, L<bytes>, L<utf8>, L<perlretut>,
-L<perlvar/"${^WIDE_SYSTEM_CALLS}">
+L<perluniintro>, L<encoding>, L<Encode>, L<open>, L<utf8>, L<bytes>,
+L<perlretut>, L<perlvar/"${^WIDE_SYSTEM_CALLS}">
  
  =cut