platform/upstream/perl.git
12 years ago[perl #112944] perldelta: typo
Shirakata Kentaro [Tue, 15 May 2012 20:02:50 +0000 (13:02 -0700)]
[perl #112944] perldelta: typo

12 years agoAdd Shirakata Kentaro to AUTHORS
Father Chrysostomos [Tue, 15 May 2012 19:58:42 +0000 (12:58 -0700)]
Add Shirakata Kentaro to AUTHORS

12 years agoadd 5.16.0-RC0 and -RC1 to perlhist
Ricardo Signes [Tue, 15 May 2012 02:59:38 +0000 (22:59 -0400)]
add 5.16.0-RC0 and -RC1 to perlhist

12 years agominor grammar correction
Ricardo Signes [Tue, 15 May 2012 01:52:47 +0000 (21:52 -0400)]
minor grammar correction

thanks, Jim Keenan!

12 years agoadd Daniel Kahn Gillmor to AUTHORS
Ricardo Signes [Tue, 15 May 2012 01:49:01 +0000 (21:49 -0400)]
add Daniel Kahn Gillmor to AUTHORS

12 years agodocument the as-yet-unexplained Win32 test hanging
Ricardo Signes [Tue, 15 May 2012 01:22:06 +0000 (21:22 -0400)]
document the as-yet-unexplained Win32 test hanging

We will ship with this unfixed unless someone comes up with the
cure in the next week.

12 years agoperldelta: fix a noun/verb number agreement
Ricardo Signes [Tue, 15 May 2012 00:53:50 +0000 (20:53 -0400)]
perldelta: fix a noun/verb number agreement

reported by mauke

12 years agoskip t/win32/runenv.t unless -DPERL_IMPLICIT_SYS
Ricardo Signes [Tue, 15 May 2012 00:15:59 +0000 (20:15 -0400)]
skip t/win32/runenv.t unless -DPERL_IMPLICIT_SYS

this test fails without PERL_IMPLICIT_SYS, as reported by Steve
Hay in <CADED=K4EqXkJa2uC13wVYY_=uGDCx=uQ_rXu3Me4+3FvVM8D+g@mail.gmail.com>

12 years agoRevert fixes for [rt.cpan.org #61577]
Ricardo Signes [Mon, 14 May 2012 19:49:27 +0000 (15:49 -0400)]
Revert fixes for [rt.cpan.org #61577]

These changes introduced some test failures on AIX and other platforms,
and rather than dig around for more failing platforms during the RCx
period, we will revert this to reapply later when it is more tested.

This reverts commit 01b71c89216c9f447494638a5d108e13c45c3863.

This reverts commit b6903614db213f07401367249dc84c896eb099b7.

This reverts commit 271d04eee1933df0971f54f7bf9a5ca3575e7e6a.

12 years agonext release will be RC1
Ricardo Signes [Mon, 14 May 2012 16:26:36 +0000 (12:26 -0400)]
next release will be RC1

12 years agoperldelta: fix version named in acknowledgements
Ricardo Signes [Mon, 14 May 2012 16:26:24 +0000 (12:26 -0400)]
perldelta: fix version named in acknowledgements

12 years agoIn the Linux hints, invoke gcc with LANG and LC_ALL set to "C".
Nicholas Clark [Mon, 14 May 2012 09:17:06 +0000 (10:17 +0100)]
In the Linux hints, invoke gcc with LANG and LC_ALL set to "C".

The output of gcc -print-search-dirs is subject to localisation, which means
that the literal text "libraries" will not be present if the user has a
non-English locale, and we won't determine the correct path for libraries
such as -lm, breaking the build. Problem diagnosed by Alexander Hartmaier.
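
The failure mode can be modelled in a few lines of C. This is only a sketch of the parsing problem, not the hints file itself (which is shell): the function name is hypothetical, and the "localised" output used when exercising it is a made-up stand-in rather than real gcc output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the path list on the "libraries: =" line of
 * `gcc -print-search-dirs` style output, or NULL if no line starts
 * with that literal English key. */
static const char *find_libraries_paths(const char *output)
{
    static const char key[] = "libraries: =";
    const char *line = output;

    assert(output != NULL);
    while (*line != '\0') {
        if (strncmp(line, key, sizeof key - 1) == 0)
            return line + (sizeof key - 1);
        line = strchr(line, '\n');   /* find the end of this line */
        if (line == NULL)
            break;                   /* last line had no newline */
        line++;                      /* step past the newline */
    }
    return NULL;
}
```

If the key has been translated, the lookup returns NULL and no library path is found, which is why forcing LANG and LC_ALL to "C" before invoking gcc keeps the key stable.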

12 years agoDon't test that errno is still 0 after POSIX::f?pathconf
Paul Johnson [Mon, 14 May 2012 08:45:10 +0000 (09:45 +0100)]
Don't test that errno is still 0 after POSIX::f?pathconf

I think the best we can do with respect to the f?pathconf tests is to
make sure that the perl call doesn't die, and that the system call
doesn't fail.  And it's arguable we should only be testing the former.
But since we've been testing more than this anyway, it's probably safe
to test both.

With respect to the sysconf call, I think we shouldn't test more than
that perl doesn't die.  Any further testing would require different
tests based on the argument being passed in.  Before doing that, it's
probably worth considering the purpose of the tests.  I don't think we
really want to test that POSIX has been implemented correctly, only that
our layer over it is correctly implemented.

This fixes RT #112866.
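
For reference, a minimal C sketch of why errno cannot simply be expected to stay 0. It assumes the usual POSIX convention that pathconf() returns -1 with errno untouched when a limit is indeterminate, and -1 with errno set on a real error; the enum and function names are hypothetical. On success errno is deliberately not inspected.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

enum limit_status { LIMIT_VALUE, LIMIT_NONE, LIMIT_ERROR };

/* Query one pathconf() limit and classify the outcome. */
static enum limit_status query_path_limit(const char *path, int name,
                                          long *value)
{
    long result;

    assert(path != NULL && value != NULL);
    errno = 0;                        /* so -1 can be disambiguated */
    result = pathconf(path, name);
    if (result != -1) {
        *value = result;              /* success: errno is not examined */
        return LIMIT_VALUE;
    }
    /* -1 with errno still 0 means "no limit"; otherwise a real error */
    return errno == 0 ? LIMIT_NONE : LIMIT_ERROR;
}
```

A test built on this shape checks only that the call did not fail, which matches the reduced expectations described above.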

12 years agoperldelta: Remove duplicate paragraph
Karl Williamson [Mon, 14 May 2012 15:47:36 +0000 (09:47 -0600)]
perldelta: Remove duplicate paragraph

12 years agostudy as no-op is a bugfix, not performance enhancement
Ricardo Signes [Fri, 11 May 2012 22:00:03 +0000 (18:00 -0400)]
study as no-op is a bugfix, not performance enhancement

12 years agoperldelta: Add ‘(5.14.2)’ markers
Father Chrysostomos [Fri, 11 May 2012 16:55:09 +0000 (09:55 -0700)]
perldelta: Add ‘(5.14.2)’ markers

12 years agoperldelta: Explain the ‘(5.14.1)’ markers
Father Chrysostomos [Fri, 11 May 2012 16:50:20 +0000 (09:50 -0700)]
perldelta: Explain the ‘(5.14.1)’ markers

12 years agoperldelta: Use single quotes in C<>
Father Chrysostomos [Fri, 11 May 2012 16:48:49 +0000 (09:48 -0700)]
perldelta: Use single quotes in C<>

C<> renders as "..." in nroff, so C<... "..." ...> ends up looking weird.

12 years agoperldelta: Use L<> to link to changed module pods
Karl Williamson [Fri, 11 May 2012 16:44:10 +0000 (10:44 -0600)]
perldelta: Use L<> to link to changed module pods

Spotted by Vincent Pit

12 years agoperldelta: Reorder to avoid pronoun confusion
Karl Williamson [Fri, 11 May 2012 16:35:13 +0000 (10:35 -0600)]
perldelta: Reorder to avoid pronoun confusion

Spotted by Zsbán Ambrus

12 years agoperldelta: typo
Karl Williamson [Fri, 11 May 2012 16:29:31 +0000 (10:29 -0600)]
perldelta: typo

Spotted by Zsbán Ambrus

12 years agoperldelta: Add future deprecation text about \Q
Karl Williamson [Fri, 11 May 2012 16:25:15 +0000 (10:25 -0600)]
perldelta: Add future deprecation text about \Q

12 years agoperldelta: misuse of commas
Father Chrysostomos [Fri, 11 May 2012 16:30:25 +0000 (09:30 -0700)]
perldelta: misuse of commas

12 years agoperldelta: typo
Father Chrysostomos [Fri, 11 May 2012 16:27:08 +0000 (09:27 -0700)]
perldelta: typo

12 years agoperldelta: [rt.cpan.org #0], not RT 0
Father Chrysostomos [Fri, 11 May 2012 16:26:43 +0000 (09:26 -0700)]
perldelta: [rt.cpan.org #0], not RT 0

12 years agoRmv second ‘version’ in upgrade notices
Father Chrysostomos [Fri, 11 May 2012 16:24:14 +0000 (09:24 -0700)]
Rmv second ‘version’ in upgrade notices

Some of these were like this:

...from version 123 to version 456.

and some like this:

...from version 123 to 456.

Since the former is wordy, I’ve used the latter throughout.

12 years agoperldelta: Consistent fullstops for ‘upgraded from x to x’
Father Chrysostomos [Fri, 11 May 2012 16:20:46 +0000 (09:20 -0700)]
perldelta: Consistent fullstops for ‘upgraded from x to x’

12 years agoperldelta: consistent spaces after dots
Father Chrysostomos [Fri, 11 May 2012 16:18:44 +0000 (09:18 -0700)]
perldelta: consistent spaces after dots

12 years agoperldelta: consistent semicolons in CGI example
Father Chrysostomos [Fri, 11 May 2012 16:17:10 +0000 (09:17 -0700)]
perldelta: consistent semicolons in CGI example

12 years agoperldelta: grammar
Father Chrysostomos [Fri, 11 May 2012 16:16:43 +0000 (09:16 -0700)]
perldelta: grammar

12 years agoperldelta: fix capitalisation
Father Chrysostomos [Fri, 11 May 2012 16:16:11 +0000 (09:16 -0700)]
perldelta: fix capitalisation

12 years agoperldelta: Mention 5.14.0, not 5.13.6
Karl Williamson [Fri, 11 May 2012 15:40:20 +0000 (09:40 -0600)]
perldelta: Mention 5.14.0, not 5.13.6

12 years agoperldelta: Correct statement
Karl Williamson [Fri, 11 May 2012 15:37:37 +0000 (09:37 -0600)]
perldelta: Correct statement

After I wrote this text for an earlier perldelta (the one this is
extracted from), it was pointed out to me that running out of memory is
extremely unlikely; I had not bothered to really do the math.

12 years agoperldelta: correct statement
Karl Williamson [Fri, 11 May 2012 15:36:45 +0000 (09:36 -0600)]
perldelta: correct statement

12 years agoperldelta: grammar
Karl Williamson [Fri, 11 May 2012 15:33:04 +0000 (09:33 -0600)]
perldelta: grammar

12 years agoperldelta: slightly expand and clarify policy note
Ricardo Signes [Fri, 11 May 2012 14:06:39 +0000 (10:06 -0400)]
perldelta: slightly expand and clarify policy note

12 years agoperldelta: break Pod:: deprecations onto two items
Ricardo Signes [Fri, 11 May 2012 12:18:05 +0000 (08:18 -0400)]
perldelta: break Pod:: deprecations onto two items

12 years agoRevert "perl5160delta: The coreargs opcode is undeserving of mention"
Ricardo Signes [Fri, 11 May 2012 12:07:25 +0000 (08:07 -0400)]
Revert "perl5160delta: The coreargs opcode is undeserving of mention"

This reverts commit 1061b56a7b2cc84a8ac96a405e5b8c185936605c.

This is a reversion of a reversion.  The reversion in 1061b56a7b2cc was
a bizarre mistake made during merging some blead/release conflicts, and
rjbs sincerely apologizes!

12 years agoadd long-form keys for newer versions in CoreList
Ricardo Signes [Fri, 11 May 2012 12:04:55 +0000 (08:04 -0400)]
add long-form keys for newer versions in CoreList

12 years agoVarious small grammar fixes in perldelta
Dave Rolsky [Fri, 11 May 2012 07:28:33 +0000 (02:28 -0500)]
Various small grammar fixes in perldelta

12 years agoperldelta: update "Updated Modules" with highlights
Ricardo Signes [Fri, 11 May 2012 02:36:37 +0000 (22:36 -0400)]
perldelta: update "Updated Modules" with highlights

12 years agobump the CoreList version in CoreList for 5.16
Ricardo Signes [Fri, 11 May 2012 01:24:47 +0000 (21:24 -0400)]
bump the CoreList version in CoreList for 5.16

12 years agoskip the porting/utils.t unless in a git checkout
Ricardo Signes [Thu, 10 May 2012 20:56:40 +0000 (16:56 -0400)]
skip the porting/utils.t unless in a git checkout

Today I tried to build 5.16.0-RC0 on my Linode and I got this:

  ok 78 # skip utils/cpanp-run-perl executes code in a BEGIN block which fails for empty @ARGV
  not ok 79 - utils/cpanp compiles
  # Failed test 79 - utils/cpanp compiles at porting/utils.t line 81
  #      got "defined(%hash) is deprecated at /usr/local/lib/perl5/site_perl/5.10.0/Locale/Maketext/Lexicon.pm line 307.\n\t(Maybe you should just omit the defined()?)\nutils/cpanp syntax OK\n"
  # expected "utils/cpanp syntax OK\n"

Ugh.  We really don't want this to happen to somebody else, because this
test is "do not let the developer break stuff" not "make sure the install
works."

12 years agoadd a changes_between function in Module::CoreList
Ricardo Signes [Thu, 10 May 2012 19:08:21 +0000 (15:08 -0400)]
add a changes_between function in Module::CoreList

12 years agopoint out "corelist --diff" in perldelta
Ricardo Signes [Thu, 10 May 2012 18:47:18 +0000 (14:47 -0400)]
point out "corelist --diff" in perldelta

12 years agoadd the --diff option to corelist
Ricardo Signes [Tue, 1 May 2012 22:28:43 +0000 (18:28 -0400)]
add the --diff option to corelist

12 years agoupdate Module::CoreList for 5.16.0
Ricardo Signes [Thu, 10 May 2012 18:37:22 +0000 (14:37 -0400)]
update Module::CoreList for 5.16.0

12 years agoallow for .tgz dists in the CoreList updater
Ricardo Signes [Thu, 10 May 2012 18:34:27 +0000 (14:34 -0400)]
allow for .tgz dists in the CoreList updater

12 years agoperldelta: the acknowledgements section!
Ricardo Signes [Thu, 10 May 2012 18:17:04 +0000 (14:17 -0400)]
perldelta: the acknowledgements section!

12 years agoperl5160delta: The coreargs opcode is undeserving of mention
Father Chrysostomos [Wed, 25 Apr 2012 05:23:34 +0000 (22:23 -0700)]
perl5160delta: The coreargs opcode is undeserving of mention

12 years agoimport perldelta from eb83ed8 into release branch
Ricardo Signes [Thu, 10 May 2012 13:38:02 +0000 (09:38 -0400)]
import perldelta from eb83ed8 into release branch

12 years agoperldelta: Explain stdio/sfio future deprecation.
Leon Timmermans [Thu, 3 May 2012 12:19:31 +0000 (14:19 +0200)]
perldelta: Explain stdio/sfio future deprecation.

12 years agookay the links to CPAN modules in the perldelta
Ricardo Signes [Wed, 2 May 2012 12:27:24 +0000 (08:27 -0400)]
okay the links to CPAN modules in the perldelta

12 years agoupdate .gitignore: we generate 5160delta now
Ricardo Signes [Wed, 2 May 2012 12:27:06 +0000 (08:27 -0400)]
update .gitignore: we generate 5160delta now

12 years agoregenerate uconfig.h
Ricardo Signes [Wed, 2 May 2012 12:17:38 +0000 (08:17 -0400)]
regenerate uconfig.h

12 years agoremove perl515*delta, add perl5160delta
Ricardo Signes [Wed, 2 May 2012 02:33:19 +0000 (22:33 -0400)]
remove perl515*delta, add perl5160delta

12 years agobump version to 5.16.0 RC0
Ricardo Signes [Wed, 2 May 2012 01:18:37 +0000 (21:18 -0400)]
bump version to 5.16.0 RC0

Done with:

  ./perl -Ilib Porting/bump-perl-version -i 5.15.9 5.16.0

...followed by a small edit to INSTALL and patchlevel.h.

12 years agoadd Test::More as a prereq to Makefile.PL
Dominic Hargreaves [Wed, 9 May 2012 18:09:18 +0000 (19:09 +0100)]
add Test::More as a prereq to Makefile.PL

12 years agosometimes fork() isn't available
Tony Cook [Wed, 9 May 2012 18:04:28 +0000 (19:04 +0100)]
sometimes fork() isn't available

This was amended from the original that Tony prepared in a parallel branch.

12 years ago[rt.cpan.org #61577] sockdomain and socktype undef on newly accepted sockets
Daniel Kahn Gillmor [Fri, 17 Feb 2012 22:29:14 +0000 (14:29 -0800)]
[rt.cpan.org #61577] sockdomain and socktype undef on newly accepted sockets

There appears to be a flaw in IO::Socket where some IO::Socket objects
are unable to properly report their socktype, sockdomain, or protocol
(they return undef, even when the underlying socket is sufficiently
initialized to have these properties).

The attached patch should cover IO::Socket objects created via accept(),
new_from_fd(), new(), and anywhere else whose details haven't been
properly cached.

No new code should be executed on IO::Socket objects whose details are
already cached and present.
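
At the system-call level, details of this kind can be recovered from the descriptor itself. The sketch below is an assumption about the general technique, not a description of what the patch does inside IO::Socket; both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for a socket's type (SOCK_STREAM, SOCK_DGRAM, ...).
 * Returns 0 on success, -1 on failure. */
static int query_socktype(int fd, int *type)
{
    socklen_t len = sizeof *type;

    assert(type != NULL);
    return getsockopt(fd, SOL_SOCKET, SO_TYPE, type, &len);
}

/* Derive a socket's domain (address family) from its local address.
 * Returns 0 on success, -1 on failure. */
static int query_sockdomain(int fd, int *domain)
{
    struct sockaddr_storage addr;
    socklen_t len = sizeof addr;

    assert(domain != NULL);
    memset(&addr, 0, sizeof addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return -1;
    *domain = addr.ss_family;
    return 0;
}
```

Because both queries work on any open descriptor, they apply equally to sockets obtained from accept() or wrapped around an existing fd, where nothing was recorded at creation time.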

12 years agoSkip Carp tests on VMS.
Craig A. Berry [Wed, 9 May 2012 23:41:05 +0000 (18:41 -0500)]
Skip Carp tests on VMS.

They want IPC::Open3::open3, which is not currently working.

12 years agoperl5160delta: tweaks
Father Chrysostomos [Wed, 9 May 2012 19:49:49 +0000 (12:49 -0700)]
perl5160delta: tweaks

sdio -> stdio
two spaces after dots

12 years agoperldelta: Explain stdio/sfio future deprecation.
Leon Timmermans [Thu, 3 May 2012 12:19:31 +0000 (14:19 +0200)]
perldelta: Explain stdio/sfio future deprecation.

12 years agoadd a missing blank line above =item to s2p.PL
Ricardo Signes [Wed, 9 May 2012 01:11:12 +0000 (21:11 -0400)]
add a missing blank line above =item to s2p.PL

12 years agoFix test failure
Father Chrysostomos [Tue, 8 May 2012 15:26:54 +0000 (08:26 -0700)]
Fix test failure

Lesson learnt: After switching from threaded to unthreaded and fixing
the test, switch back again and re-run the test. :-)

12 years ago[perl #112780] Don’t set cloned in-memory handles to ""
Father Chrysostomos [Tue, 8 May 2012 03:43:18 +0000 (20:43 -0700)]
[perl #112780] Don’t set cloned in-memory handles to ""

PerlIO::scalar’s dup function (PerlIOScalar_dup) calls the base
implementation (PerlIOBase_dup), which pushes the scalar layer on to the
new file handle.

When the scalar layer is pushed, if the mode is ">" then
PerlIOScalar_pushed sets the scalar to the empty string.  If it is
already a string, it does this simply by setting SvCUR to 0, without
touching the string buffer.

The upshot of this is that newly-cloned in-memory handles turn into
the empty string, as in this example:

  use threads;
  my $str = '';
  open my $fh, ">", \$str;
  $str = 'a';
  async {
      warn $str;  # something's wrong
  }->join;

This has probably always been this way.

The test suite for MSCHWERN/Test-Simple-1.005000_005.tar.gz does
something similar to this:

  use threads;
  my $str = '';
  open my $fh, ">", \$str;
  print $fh "a";
  async {
      print $fh "b";
      warn $str;  # "ab" expected, but 5.15.7-9 gives "\0b"
  }->join;

What was happening before commit b6597275 was that two bugs were
cancelling each other out: $str would be "" when the new thread started,
but with a string buffer containing "a" beyond the end of the string
and $fh remembering 1 as its position.  The bug fixed by b6597275 was
that writing past the end of a string through a filehandle was leaving
junk (whatever was in memory already) in the intervening space between
the old end of string and the beginning of what was being written to
the string.  This allowed "" to turn magically into "ab" when "b" was
written one character past the end of the string.  Commit b6597275
started zeroing out the intervening space in that case, causing the
cloning bug to rear its head.

This commit solves the problem by hiding the scalar temporarily
in PerlIOScalar_dup so that PerlIOScalar_pushed won’t be able to
modify it.

Should PerlIOScalar_pushed stop clobbering the string and should
PerlIOScalar_open do it instead?  Perhaps.  But that would be a bigger
change, and we are supposed to be in code freeze right now.
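
The interaction of the two bugs can be reproduced with a toy model in C. Nothing here is PerlIO code: struct toy_str, its cur field (standing in for SvCUR) and toy_write are hypothetical names, and the zero_gap flag switches between the behaviour before and after commit b6597275.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a scalar's string: cur is the logical length;
 * bytes at and beyond cur may still hold stale data. */
struct toy_str {
    char   buf[16];
    size_t cur;
};

/* Write n bytes at offset pos.  With zero_gap set, the bytes between
 * the old end of string and pos are cleared first; without it they
 * keep whatever was already in the buffer. */
static void toy_write(struct toy_str *s, size_t pos,
                      const char *src, size_t n, int zero_gap)
{
    assert(s != NULL && src != NULL && pos + n <= sizeof s->buf);
    if (zero_gap && pos > s->cur)
        memset(s->buf + s->cur, 0, pos - s->cur);
    memcpy(s->buf + pos, src, n);
    if (pos + n > s->cur)
        s->cur = pos + n;             /* string grows to cover the write */
}
```

Start with buf holding "a", set cur to 0 (the clone's clobbering), then write "b" at position 1: without zeroing the result reads "ab" by accident, and with zeroing it reads "\0b", matching the two outcomes described above.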

12 years agoIncrease $PerlIO::scalar::VERSION to 0.14
Father Chrysostomos [Mon, 7 May 2012 21:53:20 +0000 (14:53 -0700)]
Increase $PerlIO::scalar::VERSION to 0.14

12 years agowith 5.16.0, 5.12.x is security-only
Ricardo Signes [Mon, 7 May 2012 15:03:37 +0000 (11:03 -0400)]
with 5.16.0, 5.12.x is security-only

12 years agocheck for PA* in both branches of case
H.Merijn Brand [Sun, 6 May 2012 11:11:03 +0000 (13:11 +0200)]
check for PA* in both branches of case

(thanks ilmari for spotting)

12 years agoDisable optimizer for 32bit PA-RISC builds on HP-UX
H.Merijn Brand [Sun, 6 May 2012 11:03:08 +0000 (13:03 +0200)]
Disable optimizer for 32bit PA-RISC builds on HP-UX

The (ANSI) C compiler fails to compile precompiled (.i) files when both
-g and -O (all +O1 and above) are given. When -g is requested, -O, +O,
and +Onolimit are removed from the optimize flags.

This failure does not occur with the newer aCC compiler B3910B, which is
also used on HP-UX on Itanium.

The check/modification has to be done as late as possible, as the other
options, like -Duse64bitall and -DDEBUGGING, will modify the variables
that need to be checked after hints/hpux.sh has been dealt with.

12 years agoAdd --libpods back as a non-functional option to pod2html.
Steve Peters [Fri, 4 May 2012 15:51:06 +0000 (10:51 -0500)]
Add --libpods back as a non-functional option to pod2html.

When --libpods was removed, this broke backward compatibility with
existing uses.  This change adds back the option, but warns that
--libpods is no longer supported.

12 years agodelete PERL_YAML_BACKEND and PERL_JSON_BACKEND in t/TEST
David Golden [Fri, 4 May 2012 15:02:26 +0000 (11:02 -0400)]
delete PERL_YAML_BACKEND and PERL_JSON_BACKEND in t/TEST

If these are set, Parse-CPAN-Meta and other things that depend
on it may fail.

12 years agoCorrect variable name in example.
Paul Johnson [Sun, 29 Apr 2012 18:27:37 +0000 (20:27 +0200)]
Correct variable name in example.

As noticed by Lawrence Statton <lawrence@cluon.com>

12 years agoBump the version of perl5db since the porting scripts care
Jesse Vincent [Tue, 24 Apr 2012 23:02:34 +0000 (19:02 -0400)]
Bump the version of perl5db since the porting scripts care

12 years agowe no longer have in-file changelogs, since we have a version control system
Jesse Vincent [Tue, 24 Apr 2012 19:35:39 +0000 (15:35 -0400)]
we no longer have in-file changelogs, since we have a version control system

12 years agoWe now have version control and no longer need a changelog in perl5db
Jesse Vincent [Tue, 24 Apr 2012 19:05:55 +0000 (15:05 -0400)]
We now have version control and no longer need a changelog in perl5db

12 years agoutf8n_to_uvuni(): Fix broken malformation interactions
Karl Williamson [Fri, 27 Apr 2012 17:09:14 +0000 (11:09 -0600)]
utf8n_to_uvuni(): Fix broken malformation interactions

All code points whose UTF-8 representations start with a byte containing
either \xFE or \xFF are considered problematic because they are not
portable.  There are many such code points that are too large to
represent on a 32 or even a 64 bit platform.  Commit
eb83ed87110e41de6a4cd4463f75df60798a9243 failed to properly catch
overflow when the input flags to this function say to warn on, but
otherwise accept FE and FF sequences.  Now overflow is checked for
unconditionally.

12 years agoReally increase $File::DosGlob::VERSION to 1.07
Father Chrysostomos [Fri, 27 Apr 2012 20:31:20 +0000 (13:31 -0700)]
Really increase $File::DosGlob::VERSION to 1.07

I honestly thought I had run the tests, but I suppose not.

12 years agoIncrease $version::VERSION to 0.99
Father Chrysostomos [Fri, 27 Apr 2012 16:43:07 +0000 (09:43 -0700)]
Increase $version::VERSION to 0.99

What we have in blead right now matches that CPAN release, so this
version bump *must* happen before 5.16.

12 years agodisable codes_in_verbatim for Pod::Html
Ricardo Signes [Fri, 27 Apr 2012 01:39:33 +0000 (21:39 -0400)]
disable codes_in_verbatim for Pod::Html

...otherwise all our verbatim blocks will change radically!

12 years agois_utf8_char_slow(): Avoid accepting overlongs
Karl Williamson [Thu, 19 Apr 2012 04:14:15 +0000 (22:14 -0600)]
is_utf8_char_slow(): Avoid accepting overlongs

There are possible overlong sequences that this function blindly
accepts.  Instead of developing the code to figure this out, turn this
function into a wrapper for utf8n_to_uvuni() which already has this
check.
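
As a rough illustration of what "overlong" means, here is a sketch in C restricted to standard UTF-8 of up to four bytes. It is not the check inside utf8n_to_uvuni(); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shortest UTF-8 encoding length for code point cp (1..4 bytes). */
static int utf8_min_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* A sequence is overlong when it used more bytes than the decoded
 * code point needs. */
static int utf8_is_overlong(uint32_t cp, int seq_len)
{
    assert(seq_len >= 1 && seq_len <= 4);
    return seq_len > utf8_min_len(cp);
}
```

For example, the two bytes C0 AF decode to 0x2F ("/"), which fits in one byte, so the pair is overlong and must not be accepted as a valid character.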

12 years agoperlapi: Update for changes in utf8 decoding
Karl Williamson [Thu, 19 Apr 2012 00:32:57 +0000 (18:32 -0600)]
perlapi: Update for changes in utf8 decoding

12 years agoutf8.c: White-space only
Karl Williamson [Mon, 23 Apr 2012 19:28:32 +0000 (13:28 -0600)]
utf8.c: White-space only

This outdents to account for the removal of a surrounding block.

12 years agoutf8.c: refactor utf8n_to_uvuni()
Karl Williamson [Wed, 18 Apr 2012 23:36:01 +0000 (17:36 -0600)]
utf8.c: refactor utf8n_to_uvuni()

The prior version had a number of issues, some of which have been taken
care of in previous commits.

The goal when presented with malformed input is to consume as few bytes
as possible, so as to position the input for the next try to the first
possible byte that could be the beginning of a character.  We don't want
to consume too few bytes, so that the next call has us thinking that
what is the middle of a character is really the beginning; nor do we
want to consume too many, so as to skip valid input characters.  (This
is forbidden by the Unicode standard because of security
considerations.)  The previous code could do both of these under various
circumstances.

In some cases it took as a given that the first byte in a character is
correct, and skipped looking at the rest of the bytes in the sequence.
This is wrong when just that first byte is garbled.  We have to look at
all bytes in the expected sequence to make sure it hasn't been
prematurely terminated from what we were led to expect by that first
byte.

Likewise when we get an overflow: we have to keep looking at each byte
in the sequence.  It may be that the initial byte was garbled, so that
it appeared that there was going to be overflow, but in reality, the
input was supposed to be a shorter sequence that doesn't overflow.  We
want to have an error on that shorter sequence, and advance the pointer
to just beyond it, which is the first position where a valid character
could start.

This fixes a long-standing TODO from an externally supplied utf8 decode
test suite.

And, the old algorithm for finding overflow failed to detect it on some
inputs.  This was spotted by Hugo van der Sanden, who suggested the new
algorithm that this commit uses, and which should work in all instances.
For example, on a 32-bit machine, any string beginning with "\xFE" and
having the next byte be either "\x86" or "\x87" overflows, but this was
missed by the old algorithm.
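
The message does not spell the new algorithm out. One check with the stated property, shown here purely as an assumption, is to test before each 6-bit shift whether any of the top six bits are already set. The sketch uses a 32-bit accumulator and assumes that "\xFE" introduces a seven-byte sequence carrying no payload bits of its own, as in the extended UTF-8 that Perl accepts; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold the 6-bit payloads of cont[0..n) onto start_bits.  Returns 1 if
 * the value does not fit in 32 bits, else 0.  Every byte is still
 * examined after overflow has been detected. */
static int toy_accumulate32(uint32_t start_bits, const unsigned char *cont,
                            size_t n, uint32_t *out)
{
    uint32_t uv = start_bits;
    int overflowed = 0;
    size_t i;

    assert(cont != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        /* If any of the top six bits are set, the shift would lose them. */
        if (uv & 0xFC000000u)
            overflowed = 1;
        uv = (uv << 6) | (cont[i] & 0x3Fu);
    }
    *out = uv;                        /* meaningless when overflowed */
    return overflowed;
}
```

With six continuation bytes after "\xFE", a first continuation byte of "\x86" or "\x87" trips the check on the last shift, while "\x83" yields 0xC0000000, which still fits.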

Another bug was that the code was careless about what happens when a
malformation occurs that the input flags allow. For example, a sequence
should not start with a continuation byte.  If that malformation is
allowed, the code pretended it was a start byte and extracted the "length"
of the sequence from it.  But pretending it is a start byte is not the
same thing as it actually being a start byte, and so there is no
extractable length in it, so the number that this code thought was
"length" was bogus.

Yet another bug fixed is that if only the warning subcategories of the
utf8 category were turned on, and not the entire utf8 category itself,
warnings were not raised that should have been.

And yet another change is that given malformed input with warnings
turned off, this function used to return whatever it had computed so
far, which is incomplete or erroneous garbage.  This commit changes it
to return the REPLACEMENT CHARACTER instead.
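
The consumption rule described earlier in this message can be sketched in C. This is a simplified model, not utf8n_to_uvuni(): it handles only standard UTF-8 start bytes, omits overlong, surrogate and range checks, and all names are hypothetical. On malformed input it returns U+FFFD and reports only the bytes up to the first one that could begin another character; it never reads past the bytes available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REPLACEMENT 0xFFFDu       /* U+FFFD REPLACEMENT CHARACTER */

/* Sequence length implied by a start byte; 0 if b cannot start one. */
static size_t toy_expected_len(unsigned char b)
{
    if (b < 0x80)               return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

/* Decode one character from s[0..avail); *used gets the bytes consumed. */
static uint32_t toy_decode(const unsigned char *s, size_t avail, size_t *used)
{
    static const unsigned char lead_mask[5] = { 0, 0x7F, 0x1F, 0x0F, 0x07 };
    size_t expected, i;
    uint32_t cp;

    assert(s != NULL && used != NULL && avail > 0);
    expected = toy_expected_len(s[0]);
    if (expected == 0) {              /* stray continuation or invalid byte */
        *used = 1;
        return TOY_REPLACEMENT;
    }
    cp = s[0] & lead_mask[expected];
    for (i = 1; i < expected; i++) {
        /* Stop at the buffer end or at the first non-continuation byte. */
        if (i >= avail || (s[i] & 0xC0) != 0x80) {
            *used = i;
            return TOY_REPLACEMENT;
        }
        cp = (cp << 6) | (s[i] & 0x3Fu);
    }
    *used = expected;
    return cp;
}
```

Given E2 82 followed by "A", the sketch consumes two bytes and leaves the pointer on "A", so the next call starts at a byte that can begin a character.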

Thanks to Hugo van der Sanden for reviewing and finding problems with an
earlier version of these commits.

12 years agoutf8n_to_uvuni: Avoid reading outside of buffer
Karl Williamson [Wed, 18 Apr 2012 22:48:29 +0000 (16:48 -0600)]
utf8n_to_uvuni: Avoid reading outside of buffer

Prior to this patch, if the first byte of a UTF-8 sequence indicated
that the sequence occupied n bytes, but the input parameters indicated
that fewer were available, the function still attempted to read all n.

12 years agoutf8.c: Clarify and correct pod
Karl Williamson [Wed, 18 Apr 2012 22:35:39 +0000 (16:35 -0600)]
utf8.c: Clarify and correct pod

Some of these were spotted by Hugo van der Sanden

12 years agoutf8.c: Use macros instead of if..else.. sequence
Karl Williamson [Wed, 18 Apr 2012 22:20:22 +0000 (16:20 -0600)]
utf8.c: Use macros instead of if..else.. sequence

There are two existing macros that do the job that this longish sequence
does.  One, UTF8SKIP(), does an array lookup and is very likely to be in
the machine's cache as it is used ubiquitously when processing UTF-8.
The other is a simple test and shift.  These simplify the code and
should speed things up as well.
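
To illustrate the trade-off, here is a hypothetical C sketch that sets a range-test chain next to a table lookup in the spirit of UTF8SKIP(). The lengths in the chain are illustrative assumptions (0xFE and 0xFF are both simplified to 7), and none of these names come from utf8.h.

```c
#include <assert.h>

static unsigned char toy_skip_table[256];
static int toy_skip_ready = 0;

/* The long form: a chain of range tests on the first byte. */
static unsigned toy_skip_chain(unsigned char b)
{
    if (b < 0xC0)      return 1;  /* ASCII; continuation bytes count as 1 */
    else if (b < 0xE0) return 2;
    else if (b < 0xF0) return 3;
    else if (b < 0xF8) return 4;
    else if (b < 0xFC) return 5;
    else if (b < 0xFE) return 6;
    else               return 7;  /* simplification for 0xFE and 0xFF */
}

/* The short form: one array lookup, with the table filled on first use. */
static unsigned toy_skip_lookup(unsigned char b)
{
    if (!toy_skip_ready) {
        unsigned i;
        for (i = 0; i < 256; i++)
            toy_skip_table[i] = (unsigned char)toy_skip_chain((unsigned char)i);
        toy_skip_ready = 1;
    }
    assert(toy_skip_table[b] >= 1);
    return toy_skip_table[b];
}
```

Both forms give the same answer for every byte; the lookup replaces up to six comparisons with a single indexed load from a small table that is likely to be cached already.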

12 years agoutf8.h: Use correct definition of start byte
Karl Williamson [Wed, 18 Apr 2012 21:25:28 +0000 (15:25 -0600)]
utf8.h: Use correct definition of start byte

The previous definition allowed for (illegal) overlongs.  The uses of
this macro in the core assume that it is accurate.  The inaccuracy can
cause such code to fail.

12 years agoutf8.h: Use correct UTF-8 downgradeable definition
Christian Hansen [Wed, 18 Apr 2012 20:32:16 +0000 (14:32 -0600)]
utf8.h: Use correct UTF-8 downgradeable definition

Previously, the macro changed by this commit would accept overlong
sequences.

This patch was changed by the committer to include EBCDIC changes;
and in the non-EBCDIC case, to save a test, by using a mask instead, in
keeping with the prior version of the code.
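
A sketch of what such definitions can look like on an ASCII platform, written as plain C functions with hypothetical names rather than the actual utf8.h macros. It relies on the fact that C0 and C1 can only begin overlong encodings of ASCII, and that only C2 and C3 begin two-byte sequences for code points 0x80 to 0xFF, so a single mask covers the downgradeable case.

```c
#include <assert.h>

/* C0 and C1 could only start overlongs, so real start bytes begin at C2. */
static int toy_is_start(unsigned char b)
{
    return b >= 0xC2;
}

/* C2 and C3 differ only in the lowest bit, so one mask-and-compare
 * replaces a two-sided range test. */
static int toy_is_downgradeable_start(unsigned char b)
{
    return (b & 0xFE) == 0xC2;
}

/* Combine a downgradeable start byte and its continuation into the
 * single byte 0x80..0xFF that they encode. */
static unsigned char toy_downgrade(unsigned char start, unsigned char cont)
{
    assert(toy_is_downgradeable_start(start) && (cont & 0xC0) == 0x80);
    return (unsigned char)(((start & 0x03) << 6) | (cont & 0x3F));
}
```

For instance, C3 A9 downgrades to the single byte 0xE9, while C1 BF is rejected because C1 is not a valid start byte.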

12 years agoMake unicode label tests use unicode_eval.
Brian Fraser [Sat, 21 Apr 2012 01:09:56 +0000 (22:09 -0300)]
Make unicode label tests use unicode_eval.

A recent change exposed a faulty test in t/uni/labels.t.
Previously, a downgraded label passed to eval under 'use utf8;'
would've been erroneously considered UTF-8 and the tests
would pass. Now it's correctly reported as illegal UTF-8
unless unicode_eval is in effect.

12 years agoperlsub: Fix new typo
Father Chrysostomos [Wed, 25 Apr 2012 05:27:04 +0000 (22:27 -0700)]
perlsub: Fix new typo

Since the typo was added during code freeze, I hope I’m not going too
far in fixing it during code freeze. :-)

12 years agoperl5160delta: The coreargs opcode is undeserving of mention
Father Chrysostomos [Wed, 25 Apr 2012 05:23:34 +0000 (22:23 -0700)]
perl5160delta: The coreargs opcode is undeserving of mention

12 years agoperl5160delta: 2 spaces after dots, please
Father Chrysostomos [Wed, 25 Apr 2012 05:22:38 +0000 (22:22 -0700)]
perl5160delta: 2 spaces after dots, please

12 years agoperl5160delta: Document wrap_op_checker
Father Chrysostomos [Wed, 25 Apr 2012 05:19:47 +0000 (22:19 -0700)]
perl5160delta: Document wrap_op_checker

This was added in e8570548af4, but was somehow missed from
perl5158delta.

12 years agoperl5160delta: Tweak punctuation
Father Chrysostomos [Wed, 25 Apr 2012 05:18:17 +0000 (22:18 -0700)]
perl5160delta: Tweak punctuation

12 years agoperl5160delta: More details for C<utf8_to_uv*_buf>
Father Chrysostomos [Wed, 25 Apr 2012 05:15:13 +0000 (22:15 -0700)]
perl5160delta: More details for C<utf8_to_uv*_buf>

I had a note to myself to make sure these were added.  I don’t
remember why they were not already there.  Perhaps the changes
were after 5.15.9.

12 years agoperl5160delta: Add missing C<>
Father Chrysostomos [Wed, 25 Apr 2012 05:08:16 +0000 (22:08 -0700)]
perl5160delta: Add missing C<>

12 years agoC<> is not L<> and does not take two |-delim parts
Ricardo Signes [Wed, 25 Apr 2012 02:51:06 +0000 (22:51 -0400)]
C<> is not L<> and does not take two |-delim parts

reported by Tom Christiansen

12 years agominor unicode doc tweaks
Ricardo Signes [Wed, 25 Apr 2012 02:44:23 +0000 (22:44 -0400)]
minor unicode doc tweaks

reported by Tom Christiansen

12 years agofix some typos in perlsyn.pod
Ricardo Signes [Wed, 25 Apr 2012 02:36:03 +0000 (22:36 -0400)]
fix some typos in perlsyn.pod

(both reported by Tom Christiansen)