NOTE: breaks up characters into their UTF-8 bytes,
so you may end up with malformed pieces of UTF-8.
-A C<\w> matches a single alphanumeric character or C<_>, not a whole word.
-Use C<\w+> to match a string of Perl-identifier characters (which isn't
-the same as matching an English word). If C<use locale> is in effect, the
-list of alphabetic characters generated by C<\w> is taken from the
-current locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
+A C<\w> matches a single alphanumeric character (an alphabetic
+character, or a decimal digit) or C<_>, not a whole word. Use C<\w+>
+to match a string of Perl-identifier characters (which isn't the same
+as matching an English word). If C<use locale> is in effect, the list
+of alphabetic characters generated by C<\w> is taken from the current
+locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
C<\d>, and C<\D> within character classes, but if you try to use them
-as endpoints of a range, that's not a range, the "-" is understood literally.
-See L<perlunicode> for details about C<\pP>, C<\PP>, and C<\X>.
+as endpoints of a range, that's not a range, the "-" is understood
+literally. If Unicode is in effect, C<\s> also matches "\x{85}",
+"\x{2028}", and "\x{2029}". See L<perlunicode> for more details about
+C<\pP>, C<\PP>, and C<\X>, and L<perluniintro> about Unicode in
+general.
The POSIX character class syntax
=item [2]
Not exactly equivalent to C<\s> since the C<[[:space:]]> includes
-also the (very rare) `vertical tabulator', \ck", chr(11).
+also the (very rare) `vertical tabulator', "\ck", chr(11).
=item [3]
-A Perl extension.
+A Perl extension; see above.
=back