Steve Hay [Thu, 23 Aug 2012 08:05:38 +0000 (09:05 +0100)]
RMG - CPAN /src and /src/README.html are the same
Steve Hay [Thu, 23 Aug 2012 07:52:00 +0000 (08:52 +0100)]
RMG - corelist.pl uses HTTP::Tiny, not wget or curl
It also fetches files remotely even when using a local CPAN mirror if
the files are missing.
Nicholas Clark [Thu, 23 Aug 2012 17:48:14 +0000 (19:48 +0200)]
Record the story behind the pack format specifiers H, h, B and b.
Father Chrysostomos [Thu, 23 Aug 2012 16:32:03 +0000 (09:32 -0700)]
Increase $Module::CoreList::VERSION to 2.73
Even though cmp_version.t doesn’t mind 2.72, we need a version bump,
as 2.72 is already on CPAN.
David Leadbeater [Wed, 22 Aug 2012 15:03:43 +0000 (17:03 +0200)]
Clean up data for ExtUtils::Miniperl in Module::CoreList
Some corelist data was constructed without ExtUtils::Miniperl being
present, presumably because perl wasn't fully built at the time.
David Leadbeater [Wed, 22 Aug 2012 14:50:13 +0000 (16:50 +0200)]
Clean up data for Pod::Perldoc::ToTk in Module::CoreList
It was alternating between 'undef' and undef.
David Leadbeater [Wed, 22 Aug 2012 14:41:16 +0000 (16:41 +0200)]
Clean up data for Carp::Heavy in Module::CoreList
It was lagging behind by about one release -- presumably due to it being
based on $Carp::VERSION.
David Leadbeater [Wed, 22 Aug 2012 14:27:29 +0000 (16:27 +0200)]
Fix the version of Scalar::Util in corelist for 5.7.3
Father Chrysostomos [Thu, 23 Aug 2012 07:19:55 +0000 (00:19 -0700)]
pad.h: PadnameSTATE
Father Chrysostomos [Thu, 23 Aug 2012 04:48:56 +0000 (21:48 -0700)]
Use FooBAR convention for new pad macros
After a while, I realised that it can be confusing for PAD_ARRAY and
PAD_MAX to take a pad argument, but for PAD_SV to take a number and
PAD_SET_CUR a padlist.
I was copying the HEK_KEY convention, which was probably a bad idea.
This is what we use elsewhere:
TypeMACRO
----=====
AvMAX
CopFILE
PmopSTASH
StashHANDLER
OpslabREFCNT_dec
Furthermore, heks are not part of the API, so what convention they use
is not so important.
So these:
PADNAMELIST_*
PADLIST_*
PADNAME_*
PAD_*
are now:
Padnamelist*
Padlist*
Padname*
Pad*
Father Chrysostomos [Thu, 23 Aug 2012 01:15:34 +0000 (18:15 -0700)]
Increase $B::Deparse::VERSION to 1.17
Father Chrysostomos [Thu, 23 Aug 2012 01:15:11 +0000 (18:15 -0700)]
B::Deparse: Suppress trailing ; in formats
While it doesn’t change the behaviour, nobody writes formats that way,
and this makes the output match 5.17.2 and earlier.
Father Chrysostomos [Thu, 23 Aug 2012 01:11:33 +0000 (18:11 -0700)]
pad.h: Let PADNAME_PV return null
Father Chrysostomos [Wed, 22 Aug 2012 23:48:45 +0000 (16:48 -0700)]
pad.h: typos in macro definitions
It would help to define these macros properly.
Father Chrysostomos [Wed, 22 Aug 2012 23:33:06 +0000 (16:33 -0700)]
pad.h: PADNAME_SV
If CPAN modules should not assume that pad names are SVs, we need
to provide a better way than newSVpvn(PADNAME_PV(pn),PADNAME_LEN(pn))
to get an SV out of it, as, knowing that pad names are just SVs, the
core can do it more efficiently by simply returning the pad name
itself.
Father Chrysostomos [Wed, 22 Aug 2012 23:24:37 +0000 (16:24 -0700)]
pad.[ch]: PADNAME_OUTER
I think this is the last bit of pad-as-sv stuff that was not
abstracted away in pad-specific macros.
Father Chrysostomos [Wed, 22 Aug 2012 22:59:23 +0000 (15:59 -0700)]
toke.c: Extreme paranoia
Karl Williamson [Wed, 22 Aug 2012 20:50:43 +0000 (14:50 -0600)]
PATCH: Devel::Peek doesn't compile under C++
Commit c9795579db61c900bacee2790bdceb7bad3dd45d introduced an error in
C++: it's missing a cast.
Father Chrysostomos [Wed, 22 Aug 2012 21:07:44 +0000 (14:07 -0700)]
[perl #114040] Fix here-docs in multiline re-evals
Commit 5097bf9b8 only partially fixed this, or, rather, did the
groundwork for fixing it.
If we have a pattern like this:
/(?{<<foo . baz
bar
foo
})/
Then PL_linestr contains this while we are parsing the block:
"(?{<<foo . baz\nbar\nfoo\n})"
The code for parsing a here-doc in a multiline PL_linestr buffer
(which applies to here-docs in string evals or in quote-like
operators) likes to modify PL_linestr to contain everything after the
<<heredoc marker except the here-doc body, which has been stolen (but
it oddly includes the last character of the marker, which does not
matter, as PL_bufptr is set to PL_linestart+1):
"o . baz\n})"
The regexp block parsing code expects to be able to extract the entire
block (as a string) from PL_linestr after parsing it. So it is not
helpful for S_scan_heredoc to go and modify it like that.
Before modifying PL_linestr, we can set aside a copy of the source
code (in PL_sublex_info.re_eval_str) from the beginning of the regexp
block to the end of PL_linestr, so that the regexp block code can
retrieve the original source from there.
We also adjust PL_sublex_info.re_eval_start so that at the end of the
regexp block PL_bufptr - PL_sublex_info.re_eval_start is the length of
the block.
Instead of clobbering PL_linestr, we can copy everything after the
here-doc to where the body begins. And this for two reasons: it
requires less allocation (I would have made that change in the end
anyway, for efficiency), and it makes it easier to calculate how much
to subtract from re_eval_start.
This fix does not apply to here-docs in quotes in multiline string
evals, which crash and always have.
Father Chrysostomos [Wed, 22 Aug 2012 19:52:15 +0000 (12:52 -0700)]
Peek.t: Test that DeadCode doesn’t crash
I broke it, but Karl Williamson’s commit (the previous) with my tweaks
fixes it. This function was not at all exercised by the test suite.
Karl Williamson [Wed, 22 Aug 2012 17:16:55 +0000 (11:16 -0600)]
Devel::Peek: Fix so compiles under C++
Commit 86b9d29366aea0e71ad75b61d04f56f1fe5b0d4d created a new PADLIST
type. However, this broke the compilation of Devel::Peek with C++.
This commit gets it to compile again, and pass our regression test
suite.
[Modified by the committer to use the correct PADLIST_ macros;
otherwise it will crash.]
Father Chrysostomos [Wed, 22 Aug 2012 16:46:28 +0000 (09:46 -0700)]
toke.c: -DT should report forced tokens under -Dmad
I was wondering why the -DT output was missing things out.
This is why:
#ifdef PERL_MAD
/* FIXME - can these be merged? */
return next_type;
#else
return REPORT(next_type);
#endif
Father Chrysostomos [Wed, 22 Aug 2012 15:43:40 +0000 (08:43 -0700)]
heredoc.t: Add a CRLF test
I nearly broke this in recent bug fixes
Father Chrysostomos [Wed, 22 Aug 2012 01:02:39 +0000 (18:02 -0700)]
[Merge] New PADLIST type
To fix a bug (db4cf31d1d) and to facilitate the lexical subs I’m
working on, I needed to be able to add extra fields to a padlist. But
padlists are AVs, making that nontrivial.
There is no reason they need to be AVs, and they take less memory when
they are not, so I made a new padlist struct.
This is going to break CPAN modules that manipulate padlists.
To avoid having to patch those modules again later if we change pads
from AVs into their own types, I have added APIs for accessing the
contents of pads.
There is also a new PADNAMELIST type (currently equivalent to AV), in
case the pad holding the names needs to be a different type from a pad
some time in the future.
Father Chrysostomos [Wed, 22 Aug 2012 01:02:10 +0000 (18:02 -0700)]
pad.c: fix pod link
Father Chrysostomos [Tue, 21 Aug 2012 23:52:15 +0000 (16:52 -0700)]
Increase $XS::APItest::VERSION to 0.43
Father Chrysostomos [Tue, 21 Aug 2012 23:51:48 +0000 (16:51 -0700)]
Increase $B::VERSION to 1.38
Father Chrysostomos [Sat, 18 Aug 2012 19:12:36 +0000 (12:12 -0700)]
pad.c: CvPADLIST docs: one more thing
Father Chrysostomos [Sat, 18 Aug 2012 18:46:40 +0000 (11:46 -0700)]
pad.c: Use PAD_ARRAY rather than AvARRAY in curpad docs
Father Chrysostomos [Sat, 18 Aug 2012 18:38:50 +0000 (11:38 -0700)]
Use new types for comppad and comppad_name
I know that a few times I’ve looked at perl source files to find out
what type to use in ‘<type> foo = PL_whatever’. So I am changing
intrpvar.h as well as the api docs.
Father Chrysostomos [Sat, 18 Aug 2012 18:36:32 +0000 (11:36 -0700)]
pad.c: CvPADLIST doc update
Father Chrysostomos [Fri, 17 Aug 2012 21:21:37 +0000 (14:21 -0700)]
More PAD APIs
If we are making padlists their own type, and no longer AVs, it makes
sense to add APIs for pads, too, so that CPAN code that needs to
change now will only have to change once if we ever stop pads
themselves from being AVs.
There is no reason pad names have to be SVs, so I am adding separate
APIs for pad names, too. The AV containing pad names is
now officially a PADNAMELIST, which is accessed, not via
*PADLIST_ARRAY(padlist), but via PADLIST_NAMES(padlist).
Future optimisations may even merge the padlist with its name list so
I have also added macros to access the parts of the name list directly
from the padlist.
Father Chrysostomos [Fri, 17 Aug 2012 20:01:49 +0000 (13:01 -0700)]
Fix format closure bug with redefined outer sub
CVs close over their outer CVs. So, when you write:
my $x = 52;
sub foo {
sub bar {
sub baz {
$x
}
}
}
baz’s CvOUTSIDE pointer points to bar, bar’s CvOUTSIDE points to foo,
and foo’s to the main cv.
When the inner reference to $x is looked up, the CvOUTSIDE chain is
followed, and each sub’s pad is looked at to see if it has an $x.
(This happens at compile time.)
It can happen that bar is undefined and then redefined:
undef &bar;
eval 'sub bar { my $x = 34 }';
After this, baz will still refer to the main cv’s $x (52), but, if baz
had ‘eval '$x'’ instead of just $x, it would see the new bar’s $x.
(It’s not really a new bar, as its refaddr is the same, but it has a
new body.)
This particular case is harmless, and is obscure enough that we could
define it any way we want, and it could still be considered correct.
The real problem happens when CVs are cloned.
When a CV is cloned, its name pad already contains the offsets into
the parent pad where the values are to be found. If the outer CV
has been undefined and redefined, those pad offsets can be
completely bogus.
Normally, a CV cannot be cloned except when its outer CV is running.
And the outer CV cannot have been undefined without also throwing
away the op that would have cloned the prototype.
But formats can be cloned when the outer CV is not running. So it
is possible for cloned formats to close over bogus entries in a new
parent pad.
In this example, \$x gives us an array ref. It shows ARRAY(0xbaff1ed)
instead of SCALAR(0xdeafbee):
sub foo {
my $x;
format =
@
($x,warn \$x)[0]
.
}
undef &foo;
eval 'sub foo { my @x; write }';
foo
__END__
And if the offset that the format’s pad closes over is beyond the end
of the parent’s new pad, we can even get a crash, as in this case:
eval
'sub foo {' .
'{my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l,$m,$n,$o,$p,$q,$r,$s,$t,$u)}'x999
. q|
my $x;
format =
@
($x,warn \$x)[0]
.
}
|;
undef &foo;
eval 'sub foo { my @x; my $x = 34; write }';
foo();
__END__
So now, instead of using CvROOT to identify clones of
CvOUTSIDE(format), we use the padlist ID instead. Padlists don’t
actually have an ID, so we give them one. Any time a sub is cloned,
the new padlist gets the same ID as the old. The format needs to
remember what its outer sub’s padlist ID was, so we put that in the
padlist struct, too.
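
The ID scheme described above can be sketched roughly as follows. This is a simplified model, not the real struct from pad.h: the names padlist_sketch, padlist_init, padlist_clone and is_valid_outer are hypothetical, and the global counter used to hand out IDs is an assumption, since the message does not say how the core generates them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified padlist; the real struct has other members. */
struct padlist_sketch {
    unsigned long id;     /* shared by a sub and every clone of it */
    unsigned long outid;  /* id of the outer sub's padlist when compiled */
};

/* Assumed ID source: a counter. 0 is reserved for "no outer padlist". */
static unsigned long next_id = 1;

/* A freshly compiled sub (or format) gets a brand new id and records
 * the id of the padlist it was compiled inside, if any. */
static void padlist_init(struct padlist_sketch *pl,
                         const struct padlist_sketch *outer)
{
    assert(pl != NULL);
    pl->id = next_id++;
    pl->outid = outer ? outer->id : 0;
}

/* Cloning copies both fields, so the clone keeps the original's id. */
static void padlist_clone(struct padlist_sketch *dst,
                          const struct padlist_sketch *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* May 'inner' (e.g. a format) close over variables in 'candidate'?
 * Only if candidate is the sub it was compiled in, or a clone of it. */
static int is_valid_outer(const struct padlist_sketch *inner,
                          const struct padlist_sketch *candidate)
{
    assert(inner != NULL && candidate != NULL);
    return inner->outid == candidate->id;
}
```

In this model a redefined outer sub gets a new ID from padlist_init, so a format compiled inside the old body no longer matches it, whereas a clone of the old body still does.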
Father Chrysostomos [Thu, 16 Aug 2012 23:47:38 +0000 (16:47 -0700)]
Increase $B::Xref::VERSION from 1.03 to 1.04
Father Chrysostomos [Thu, 16 Aug 2012 23:46:20 +0000 (16:46 -0700)]
Stop padlists from being AVs
In order to fix a bug, I need to add new fields to padlists. But I
cannot easily do that as long as they are AVs.
So I have created a new padlist struct.
This not only allows me to extend the padlist struct with new members
as necessary, but also saves memory, as we now have a three-pointer
struct where before we had a whole SV head (3-4 pointers) + XPVAV (5
pointers).
This will unfortunately break half of CPAN, but the pad API docs
clearly say this:
NOTE: this function is experimental and may change or be
removed without notice.
This would have broken B::Debug, but a patch sent upstream has already
been integrated into blead with commit 9d2d23d981.
Father Chrysostomos [Thu, 16 Aug 2012 05:27:54 +0000 (22:27 -0700)]
Use PADLIST in more places
Much code relies on the fact that PADLIST is typedeffed as AV.
PADLIST should be treated as a distinct type.
Father Chrysostomos [Thu, 16 Aug 2012 05:11:46 +0000 (22:11 -0700)]
Move PAD(LIST) typedefs to perl.h
otherwise they can only be used in some header files.
Father Chrysostomos [Tue, 21 Aug 2012 23:39:10 +0000 (16:39 -0700)]
[Merge] Enter inline.h
This is a home for static inline functions that cannot go in other
headers because they depend on proto.h or struct definitions.
This allows us to avoid repeating macros with GCC and non-GCC
versions. It also makes it easier to avoid evaluating macro
arguments twice.
I’ve moved just enough things into it to offset the additional lines
added by the comments at the top. The ‘net code removal’ of this
branch is 4 lines.
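
As a rough illustration of the double-evaluation point (this is not code from inline.h; struct thing, REFCNT_INC_MACRO and refcnt_inc are made-up names), compare a macro that mentions its argument several times with a static inline function that can also carry an assertion:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV head; only the refcount matters. */
struct thing { unsigned refcnt; };

/* Macro version: 'p' appears three times in the expansion, so an
 * argument with side effects (such as arr[i++]) is evaluated more
 * than once. */
#define REFCNT_INC_MACRO(p) ((p) ? (++(p)->refcnt, (p)) : NULL)

/* Static inline version: the argument is evaluated exactly once, at
 * the call site, and an assertion fits naturally in the body. */
static inline struct thing *refcnt_inc(struct thing *p)
{
    assert(p == NULL || p->refcnt > 0);  /* live objects have a count */
    if (p)
        p->refcnt++;
    return p;
}
```

Calling refcnt_inc(arr[i++]) advances i by one, while REFCNT_INC_MACRO(arr[i++]) advances it by three and bumps the wrong element.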
Father Chrysostomos [Sat, 18 Aug 2012 20:16:31 +0000 (13:16 -0700)]
Move S_CvDEPTHp from cv.h to inline.h; shrink macros
This allows us to use assert() inside S_CvDEPTHp, so we no longer need
GCC and non-GCC variants of the macro that calls it.
Father Chrysostomos [Sat, 18 Aug 2012 19:58:38 +0000 (12:58 -0700)]
Static inline functions for SvPADTMP and SvPADSTALE
This allows non-GCC compilers to have assertions and avoids
repeating the macros.
Father Chrysostomos [Sat, 18 Aug 2012 19:39:40 +0000 (12:39 -0700)]
Use fast SvREFCNT_dec for non-GCC
Father Chrysostomos [Sat, 18 Aug 2012 19:34:33 +0000 (12:34 -0700)]
Use static inline functions for SvREFCNT_inc
This avoids the need to repeat the macros in GCC and non-GCC versions.
For non-GCC compilers capable of inlining, this should speed things up
slightly, too, as PL_Sv is no longer needed.
Father Chrysostomos [Fri, 17 Aug 2012 04:54:53 +0000 (21:54 -0700)]
[perl #113718] Add inline.h
We can put static inline functions here, and they can depend on
function prototypes and struct definitions from other header
files.
Chris 'BinGOs' Williams [Tue, 21 Aug 2012 22:55:41 +0000 (23:55 +0100)]
Sync Module-CoreList in Maintainers.pl for CPAN release
Chris 'BinGOs' Williams [Tue, 21 Aug 2012 22:46:13 +0000 (23:46 +0100)]
Update Changes for Module-CoreList and bump to version 2.72
Father Chrysostomos [Tue, 21 Aug 2012 21:13:02 +0000 (14:13 -0700)]
[Merge] Here-doc parsing
I was waiting for 5.17.3 to be released, before merging my work on
padlists (which is blocking lexical subs), since I thought it would be
mean to inflict it on blead at the last minute before a release.
So, in the mean time, I decided to fix a small here-doc parsing bug
that prevented here-docs from occurring inside regexp code blocks.
As often happens, it turned out to be more involved than that....
I ended up writing a history of here-doc parsing, which you can find
in the commit message for 5097bf9b8d, which shows that the way they
have interacted with other quote-like operators (or other here-docs)
has changed over time in interesting ways.
While I was fixing those, I started to find other bugs. Since I was
modifying the code, I decided to try applying David Nicol’s patch that
allows a here-doc terminator with no newline after it, to avoid
creating more conflicts through my changes. The patch didn’t work. And
while I was resolving what conflicts there were, I figured out a
simpler approach. So, instead of trying to investigate why the
patch didn’t work, I just wrote my own version, which used less code.
Instead of working back on error to try to see whether we could have
accepted a terminator without a newline, we can just tack a newline on
the string buffer at EOF and let the rest of the code handle it the
usual way.
I continued to find more bugs as I went, till my ‘Yay, another bug!’
started to become ‘What? *Another* bug?’.
In the end:
• I fixed here-doc parsing, such that the body starts on the line
following the <<foo marker, regardless of whether it is inside quotes,
string evals, or what have you (but see remaining bugs below). This
was contrary to the documentation, but the documentation was
actually wrong half the time, so I corrected it.
• Here-doc terminators no longer require a final newline at EOF.
• You no longer get crashes with edge cases.
• Nulls in comments no longer confuse the here-doc parser.
And, finally, one bug that I fixed was not related to here-docs per
se, but got in the way. It deserves its own JAPH:
s/${s|||, \""}Just another Perl hacker,
/anything/;
print
There are still two bugs remaining:
• Here-docs whose markers occur in single-line s/// patterns where the
replacement part is multi-line or starts on a subsequent line are
still screwed.
• CR and CR LF line terminators are treated inconsistently inside and
outside of string evals.
I’ve decided to set those aside for later and merge what I’ve
done so far.
Father Chrysostomos [Tue, 21 Aug 2012 21:09:51 +0000 (14:09 -0700)]
perlop.pod: Update here-doc-in-quotes parsing rules
Father Chrysostomos [Tue, 21 Aug 2012 08:11:34 +0000 (01:11 -0700)]
smoke-me diag
nt,hun
Father Chrysostomos [Tue, 21 Aug 2012 08:45:15 +0000 (01:45 -0700)]
toke.c:scan_heredoc: Use PL_tokenbuf less
When scanning for a heredoc terminator in a string eval or quote-like
operator, the first character we are looking for is always a newline.
So instead of setting term to *PL_tokenbuf in those two code paths,
we can just hard-code '\n'.
Father Chrysostomos [Tue, 21 Aug 2012 06:58:59 +0000 (23:58 -0700)]
Fix substitution in substitution pattern
Guess what this prints:
s/${s|||, \""}Just another Perl hacker,
/anything/;
print
And look at this:
$ perl5.6.2 -e 's/${s|||;\""}/foo\n/; print;'
$ perl5.16.0 -e 's/${s|||;\""}/foo\n/; print;'
$ perl5.17.2 -e 's/${s|||;\""}/foo\n/; print;'
Bus error
$ ./miniperl -e 's/${s|||;\""}/foo\n/; print;'
Bus error
The first two gave no output, though they should have shown "foo".
And bleadperl now crashes.
When the lexer parses a quote-like operator, it begins by extracting
what is between the quotes. It puts it in an SV stored in the
variable PL_lex_stuff. Then, if it is y/// or s///, it scans the
replacement part and puts it in an SV in PL_lex_repl. When it
finishes with it, it sets PL_lex_repl to NULL.
Now, if you put s/// in the pattern part of s/// (or y in s), the
inner s/// will clobber PL_lex_repl with its own replacement string.
So, when the outer s/// finishes parsing its pattern and wants its
replacement string, it is not there, so it assumes it has already
parsed it (whether PL_lex_repl is set is how it remembers which half
of s/// it is parsing), and proceeds to feed bad code to the parser,
resulting in a bad op tree.
PL_lex_repl needs to be localised when a quote-like operator is
parsed. Since localisation for quote-like operators happens in a
separate yylex call (yylex calls sublex_push, which does it) after the
string delimiters are found, at which point PL_lex_repl has already
been set (clobbering the previous value), we change the
delimiter-scanning code (scan_{str,trans,subst}) to use the new
PL_sublex_info.repl, which sublex_push now copies into PL_lex_repl
after localising the latter.
Father Chrysostomos [Tue, 21 Aug 2012 02:08:57 +0000 (19:08 -0700)]
Fix here-docs in nested quote-like operators
When the lexer encounters a quote-like operator, it extracts the
contents of the quotes and starts an inner lexing scope.
To handle eval "s//<<FOO/e\n...", the here-doc parser peeks into the
outer lexing scope’s PL_linestr (current line buffer, which inside an
eval contains the entire string of code being parsed; for quote-like
operators, that is where the contents of the quote are stored). It
only does this inside a string eval. When parsing a file, the input
comes in one line at a time. So the here-doc parser steals lines from
the input stream for s//<<FOO/e outside an eval.
This approach fails in this case, as the peekee is the linestr for
s///, not for the eval:
eval ' s//"${\<<END}"/e; print
Just another Perl hacker,
END
'or die $@
__END__
Can't find string terminator "END" anywhere before EOF at (eval 1) line 1.
We also need to do this peeking stuff outside of a string eval, to
solve this:
s//"${\<<END}"
Just another Perl hacker,
END
/e; print
__END__
Can't find string terminator "END" anywhere before EOF at - line 1.
In the first example above, we need to look not in the parent lexing
scope’s linestr, but in that of the grandparent.
To solve the second example, we need to check whether the outer lexing
scope is a quote-like operator when we are not in an eval.
For parsing here-docs in quotes in eval, we currently store two
things, the former buffer pointer and the former linestr, in
PL_sublex_info.super_{bufp,lines}tr. The values for upper scopes are
stashed away on the savestack somewhere.
We need to be able to iterate through the outer lexer scopes till we
find one with multiple lines. Retrieving the information from the
savestack would be too complex and error-prone.
Since PL_linestr is an SV, we can abuse a couple of fields in it.
Upgrading it to PVNV gives it both IVX and NVX fields, which are big
enough to store pointers.
IVX is already used to hold an op number. So for the innermost quoted
scope we still need to use PL_sublex_info.super_bufptr. When entering
a new lexing scope (in sublex_push), we can localise the IVX field of
the outer PL_linestr SV and set it to what PL_sublex_info.super_bufptr
was in that scope. SvIVX(linestr) is only used for an op number when
that linestr’s lexing scope is the innermost one.
PL_sublex_info.super_linestr can be eliminated and replaced with
SvNVX(PL_linestr).
Father Chrysostomos [Tue, 21 Aug 2012 01:06:41 +0000 (18:06 -0700)]
Don’t use strchr when scanning for newline after <<foo
The code that uses this is specifically for parsing <<foo inside a
quote-like operator inside a string eval.
This prints bar:
eval "s//<<foo/e
bar
foo
";
print $_ || $@;
This prints Can't find string terminator blah blah blah:
eval "s//<<foo/e #\0
bar
foo
";
print $_ || $@;
Nulls in comments are allowed elsewhere. This prints bar:
eval "\$_ = <<foo #\0
bar
foo
";
print $_ || $@;
The problem with strchr is that it is specifically for scanning
null-terminated strings. If embedded nulls are permitted (and should be in
this case), memchr should be used.
This code was added by 0244c3a403.
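
A minimal standalone sketch of the difference, assuming a buffer whose length is known separately (find_newline is a hypothetical helper, not a function in toke.c):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the first newline in a buffer of known length that may contain
 * embedded NUL bytes. Returns NULL if there is no newline in range.
 * memchr scans exactly 'len' bytes; strchr would stop at the first NUL. */
static const char *find_newline(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\n', len);
}
```

With the buffer "s//<<foo/e #\0\nbar\nfoo\n", strchr(buf, '\n') returns NULL because it stops at the NUL in the comment, while find_newline locates the newline that follows it.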
David Nicol [Mon, 20 Aug 2012 23:22:15 +0000 (16:22 -0700)]
[perl #65838] perlop: remove caveat here-doc without newline
Father Chrysostomos [Mon, 20 Aug 2012 21:55:09 +0000 (14:55 -0700)]
here-doc in quotes in multiline s//.../e in eval
When <<END occurs on the last line of a quote-like operator inside a
string eval ("${\<<END}"), it peeks into the linestr buffer of the
parent lexing scope (quote-like operators start a new lexing scope
with the linestr buffer containing what is between the quotes) to find
the body of the here-doc. It modifies that buffer, stealing however
much it needs.
It was not leaving things in the consistent state that s///e checks
for when it finishes parsing the replacement (to make sure s//}+{/
doesn’t ‘work’). Specifically, it was not shrinking the parent
buffer, so when PL_bufend was reset in sublex_done to the end of the
parent buffer, it was pointing to the wrong spot.
Father Chrysostomos [Mon, 20 Aug 2012 19:57:29 +0000 (12:57 -0700)]
heredoc after "" in s/// in eval
This works fine:
eval ' s//<<END.""/e; print
Just another Perl hacker,
END
'or die $@
__END__
Just another Perl hacker,
But this doesn’t:
eval ' s//"$1".<<END/e; print
Just another Perl hacker,
END
'or die $@
__END__
Can't find string terminator "END" anywhere before EOF at (eval 1) line 1.
It fails because PL_sublex_info.super_buf*, added by commit
0244c3a403, are not localised, so, after the "", s/// sees its own
buffer pointers in those variables, instead of its parent string eval.
This used to happen only with s///e inside s///e, but that was because
here-docs would peek inside the parent linestr buffer only inside
s///e, and not other quote-like operators. That was fixed in
recent commits.
Simply moving the assignment of super_buf* into sublex_push does solve
the bug for a simple "", as "" does sublex_start, but not sublex_push.
We do need to localise those variables for "${\''}", however.
David Nicol [Mon, 20 Aug 2012 06:05:40 +0000 (23:05 -0700)]
toke.c:S_scan_heredoc: Add comment about <<\FOO
Father Chrysostomos [Mon, 20 Aug 2012 06:05:06 +0000 (23:05 -0700)]
[perl #65838] Allow here-doc with no final newline
When reading a line of input while scanning a here-doc, if the line
does not end in \n, then we know we have reached the end of input. By
simply tacking a \n on to the buffer, we can meet the expectations of
the rest of the here-doc parsing code. If it turns out the delimiter
is not found on that line, it does not matter that we modified it, as
we will croak anyway.
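
The idea can be sketched in isolation like this. It is an illustration only: ensure_trailing_newline is a hypothetical helper working on a plain char buffer with a fixed capacity, whereas the lexer works on PL_linestr, an SV that can grow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append '\n' to the line in buf if it does not already end with one.
 * 'len' is the current string length and 'cap' the buffer capacity,
 * including room for the terminating NUL. Returns the new length, or
 * 'len' unchanged if the line already ends in '\n' or there is no room. */
static size_t ensure_trailing_newline(char *buf, size_t len, size_t cap)
{
    assert(buf != NULL && len < cap);
    if (len > 0 && buf[len - 1] == '\n')
        return len;          /* ordinary line: nothing to do */
    if (len + 1 >= cap)
        return len;          /* no room; a real caller would grow buf */
    buf[len] = '\n';         /* last line of input lacked a newline */
    buf[len + 1] = '\0';
    return len + 1;
}
```

After this step the terminator search can treat the last line of input exactly like any other line.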
I had to add a new flag to lex_next_chunk. Before commit f0e67a1d2,
S_scan_heredoc would read from the stream itself, without closing any
handles. So the next time through yylex, the eof code would supply
the final implicit semicolon.
Since f0e67a1d2, S_scan_heredoc has been calling lex_next_chunk, which
takes care of reading from the stream and supplying any final ; at eof.
The here-doc parser will just get confused as a result (<<';' would
work without any terminator). The new flag tells lex_next_chunk not
to do anything at eof (not even closing handles and resetting the
parser state), but to return false and leave everything as it was.
Father Chrysostomos [Mon, 20 Aug 2012 05:41:08 +0000 (22:41 -0700)]
heredoc.t: Suppress deprecation warnings
Michael G. Schwern [Fri, 12 Jun 2009 22:35:00 +0000 (15:35 -0700)]
Clean up heredoc.t
* Made the tests more independent, mostly by decoupling the use of
a single $string. This will make it easier to expand on the test file
later.
* Replace ok( $foo eq $bar ) with is() for better diagnostics
* Remove unnecessary STDERR redirection. fresh_perl does that for you.
* fix fresh_perl to honor progfile and stderr arguments passed in
rather than just blowing over them
David Nicol [Mon, 20 Aug 2012 05:16:13 +0000 (22:16 -0700)]
[perl #65838] Tests for here-docs without final newlines
and a few error cases
Father Chrysostomos [Sun, 19 Aug 2012 09:45:38 +0000 (02:45 -0700)]
[perl #114040] Parse here-docs correctly in quoted constructs
When parsing code outside a string eval or quoted construct, the lexer
reads one line at a time into PL_linestr.
To parse a here-doc (hereinafter ‘deer hock’, because I spike
lunarisms), the lexer has to pull extra lines out of the input stream ahead
of the current line, the value of PL_linestr remaining the same.
In a string eval, the entire piece of code being parsed is in
PL_linestr.
To parse a deer hock inside a string eval, the lexer has to fiddle
with the contents of PL_linestr, scanning for newline characters.
Originally, S_scan_heredoc just followed those two approaches.
When the lexer encounters a quoted construct, it looks for the
ending delimiter (reading from the input stream if necessary), puts the
entire quoted thing (minus quotes) in PL_linestr, and then starts an
inner lexing scope.
This means that deer hocks would not nest properly outside of a string
eval, because the body of the inner deer hock would be pulled out of
the input stream *after* the outer deer hock.
Larry Wall fixed that in commit fd2d095329 (Jan. 1997), so that this
would work:
<<foo
${\<<bar}
ber
bar
foo
He did so by following the string eval approach (looking for the deer
hock body in PL_linestr) if the deer hock was inside another quoted
construct.
Later, commit a2c066523a (Mar. 1998) fixed this:
s/^not /substr(<<EOF, 0, 0)/e;
Ignored
EOF
by following the string eval approach only if the deer hock was inside
another non-backtick deer hock, not just any quoted construct.
The problem with the string eval approach inside a substitution
is that it only looks in PL_linestr, which only contains
‘substr(<<EOF, 0, 0)’ when the lexer is handling the second part of
the s/// operator.
But that unfortunately broke this:
s/^not /substr(<<EOF, 0, 0)
Ignored
EOF
/e;
and this:
print <<`EOF`;
${\<<EOG}
echo stuff
EOG
EOF
reverting it to the pre-fd2d095329 behaviour, because the outer quoted
construct was treated as one line.
Later on, commit 0244c3a403 (Mar. 1999) fixed this:
eval 's/.../<<FOO/e
stuff
FOO
';
which required a new approach not used before. When the replacement
part of the s/// is being parsed, PL_linestr contains ‘<<FOO’. The
body of the deer hock is not in the input stream (there isn’t one),
but in what was the previous value of PL_linestr before the lexer
encountered s///.
So 0244c3a403 fixed that by recording pointers into the outer string
and using them in S_scan_heredoc. That commit, for some reason, was
written such that it applied only to substitutions, and not to other
quoted constructs.
It also failed to take interpolation into account, and did not record
the outer buffer position, but then tried to use it anyway, resulting
in crashes in both these cases:
eval 's/${ <<END }//';
eval 's//${ <<END }//';
It also failed to take multiline s///’s into account, resulting in
neither of these working, because it lost track of the current cursor,
leaving it at 'D' instead of the line break following it:
eval '
s//<<END
/e;
blah blah blah
END
;1' or die $@;
eval '
s//<<END
blah blah blah
END
/e;
;1' or die $@;
S_scan_heredoc currently positions the cursor (s) at the last
character of <<END if there is a line break on the same line. There is
an s++ later on to account for that, but the code added by 0244c3a403
bypassed it.
So, in the end, deer hocks could only be nested in other quoted
constructs if the outer construct was in a string eval and was not s///,
or was a non-backtick deer hock.
This commit hopefully fixes most of the problems. :-)
The s///-in-eval case is a little tricky. We have to see whether the
deer hock label is on the last line of the s///. If it is, we have
to peek into the outer buffer. Otherwise, we have to treat it like a
string eval.
This commit does not deal with <<END inside the pattern of a
multiline s/// or in nested quotes.
Father Chrysostomos [Sun, 19 Aug 2012 06:54:02 +0000 (23:54 -0700)]
[perl #70836] Fix err msg for unterminated here-doc in eval
$ perl -e '<<foo'
Can't find string terminator "foo" anywhere before EOF at -e line 1.
$ perl -e 'eval "<<foo"; die $@'
Can't find string terminator "
foo" anywhere before EOF at (eval 1) line 1.
An internal implementation detail is leaking out.
When the lexer happens to have a multiline string in its line buffer
(in a string eval or quoted construct), it looks for "\nfoo" instead
of "foo". It was passing that same string to the error-reporting code
(S_missingterm), resulting in that extraneous newline.
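
One way to picture the fix, as a sketch rather than the actual change to toke.c (terminator_for_message is a hypothetical helper): keep searching for "\nfoo", but hand the error reporter the string without its leading newline.

```c
#include <assert.h>
#include <stddef.h>

/* When scanning a multiline buffer the search string is "\nfoo" rather
 * than "foo". For the error message, skip that leading newline so the
 * user sees only the terminator they actually wrote. */
static const char *terminator_for_message(const char *search)
{
    assert(search != NULL);
    return (*search == '\n') ? search + 1 : search;
}
```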
Father Chrysostomos [Tue, 21 Aug 2012 15:25:13 +0000 (08:25 -0700)]
Increase $Module::CoreList::TieHashDelta::VERSION to 2.72
Father Chrysostomos [Tue, 21 Aug 2012 15:24:16 +0000 (08:24 -0700)]
[rt.cpan.org #79109] Avoid each $scalar in TieHashDelta.pm
This is dual-life, after all.
David Mitchell [Tue, 21 Aug 2012 09:55:00 +0000 (10:55 +0100)]
utf8 pos cache: always keep most recent value
UTF-8 strings may have magic attached that caches up to two byte position
to char position (or vice versa) mappings.
When a third position has been calculated (e.g. via sv_pos_b2u()), the
code has to decide how to update the cache: i.e. which value to discard.
Currently for each of the three possibilities, it looks at what would be
the remaining two values, and calculates the RMS sum of the three
distances between ^ ... cache A .. cache B ... $. Whichever permutation
gives the lowest result is picked. Note that this means that the most
recently calculated value may be discarded.
This makes sense if the next position request will be for a random part of
the string; however in reality, the next request is more likely to be for
the same position, or one a bit further along. Consider the following
innocuous code:
$_ = "\x{100}" x 1_000_000;
$p = pos while /./g;
This goes quadratic, and takes 150s on my system. The fix is to always
keep the newest value, and use the RMS calculation only to decide which of
the two older values to discard. With this fix, the above code takes 0.4s.
The test suite takes the same time in both cases, so there's no obvious
slowdown elsewhere with this change.
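The two eviction policies can be modelled roughly as below. This is a simplified Python sketch, not the code in sv.c: the function names are made up, positions are plain integers, and the sum of squared gaps stands in for the RMS calculation (it ranks candidates in the same order).

```python
from itertools import combinations


def gap_cost(pair, length):
    """Sum of squared gaps between start, the two cached positions, and end."""
    lo, hi = sorted(pair)
    gaps = (lo, hi - lo, length - hi)
    return sum(g * g for g in gaps)


def update_old(cached, new, length):
    """Old policy: keep whichever two of the three positions cost least.

    The newly calculated position may be the one that is discarded.
    """
    candidates = combinations((*cached, new), 2)
    return min(candidates, key=lambda pair: gap_cost(pair, length))


def update_new(cached, new, length):
    """New policy: always keep `new`; the cost only picks which old value stays."""
    candidates = [(old, new) for old in cached]
    return min(candidates, key=lambda pair: gap_cost(pair, length))


if __name__ == "__main__":
    length = 1000
    cached = (333, 666)
    # A scan that has only reached position 10 so far.
    print(update_old(cached, 10, length))
    print(update_new(cached, 10, length))
```

In the example, the old policy keeps the two evenly spread entries and drops position 10, so a scan moving forward from 10 gets no help from the cache on its next request. The new policy retains 10.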
Chris 'BinGOs' Williams [Tue, 21 Aug 2012 06:53:01 +0000 (07:53 +0100)]
Restore MANIFEST entry for Module::CoreList, sync with CPAN version
Craig A. Berry [Tue, 21 Aug 2012 00:15:23 +0000 (19:15 -0500)]
Consistent unixy path handling in File::Find::_find_opt.
Back in a1ccf0c4149b we converted the current working directory to
Unix format on VMS, but neglected to change what later gets pasted
onto it with a hard-coded slash delimiter. The resulting mongrel
filespec was invalid and of course would not appear to exist even
if the file did exist under a properly assembled name.
So this commit makes the use of Unix-style paths on VMS within
_find_opt consistent.
The bug was tickled by a recent change to Module::Pluggable, whose
tests and the tests of other modules that depend on it started
failing en masse.
jkeenan [Sat, 11 Aug 2012 00:22:13 +0000 (20:22 -0400)]
Implement name change in POD example; Chris Waggoner++.
For: RT #114314.
Steve Hay [Mon, 20 Aug 2012 16:48:47 +0000 (17:48 +0100)]
RMG - update commit reference for version bump change
Still refer to a .0 bump since it contains more than other bumps, but
update it to 5.17.0's bump rather than 5.15.0's
Steve Hay [Mon, 20 Aug 2012 16:20:23 +0000 (17:20 +0100)]
Bump version to 5.17.4
Steve Hay [Mon, 20 Aug 2012 16:03:22 +0000 (17:03 +0100)]
RMG - update commit reference for new perldelta change
Refer to one that doesn't mention pod.lst since that is gone now.
Steve Hay [Mon, 20 Aug 2012 15:57:26 +0000 (16:57 +0100)]
Make new perldelta for 5.17.4
Steve Hay [Mon, 20 Aug 2012 15:30:52 +0000 (16:30 +0100)]
Undo VERSION bump for undone code
Commit 78ed4cf4d6 undid the accidental effect of eb578fdb55 on OS2::REXX
but forgot to revert the accompanying VERSION bump, which is not otherwise
required since nothing else has changed.
Steve Hay [Mon, 20 Aug 2012 15:08:37 +0000 (16:08 +0100)]
Correct announcement date for 5.17.2's epigraph
Steve Hay [Mon, 20 Aug 2012 15:07:56 +0000 (16:07 +0100)]
Add epigraph for 5.17.3
Steve Hay [Mon, 20 Aug 2012 14:31:24 +0000 (15:31 +0100)]
Merge branch 'release-5.17.3' into blead
H.Merijn Brand [Mon, 20 Aug 2012 13:25:02 +0000 (15:25 +0200)]
Add the new smoke report test site
Better late than never!
Steve Hay [Mon, 20 Aug 2012 11:09:23 +0000 (12:09 +0100)]
Fix Module::CoreList test - TieHashDelta is to be expected too now
Steve Hay [Mon, 20 Aug 2012 10:36:53 +0000 (11:36 +0100)]
Add 5.17.3 to perlhist
Steve Hay [Mon, 20 Aug 2012 10:34:32 +0000 (11:34 +0100)]
Upgrade Module-CoreList to 2.71
Jesse Luehrs [Mon, 20 Aug 2012 10:18:00 +0000 (05:18 -0500)]
fix accidentally modified comment
Steve Hay [Mon, 20 Aug 2012 10:12:03 +0000 (11:12 +0100)]
perldelta - finalize with acknowledgements for 5.17.3
Steve Hay [Mon, 20 Aug 2012 10:08:48 +0000 (11:08 +0100)]
perldelta - Fix unescaped <>
Steve Hay [Mon, 20 Aug 2012 09:21:35 +0000 (10:21 +0100)]
Update RMG - note sync-with-cpan is untested on Windows
Steve Hay [Mon, 20 Aug 2012 08:32:19 +0000 (09:32 +0100)]
perldelta - Remove XXX sections ready for 5.17.3 release
Steve Hay [Sun, 19 Aug 2012 11:53:47 +0000 (12:53 +0100)]
Upgrade to Sys-Syslog-0.31
Steve Hay [Sun, 19 Aug 2012 11:27:06 +0000 (12:27 +0100)]
Corrections to Maintainers.pl and perldelta.pod for Text-Tabs+Wrap
Steve Hay [Sun, 19 Aug 2012 10:51:52 +0000 (11:51 +0100)]
Upgrade to Text-Tabs+Wrap-2012.0818
This incorporates earlier blead customizations to t/fill.t and t/tabs.t
Steve Hay [Sun, 19 Aug 2012 10:31:51 +0000 (11:31 +0100)]
Upgrade Module-Metadata to 1.000011
Steve Hay [Sun, 19 Aug 2012 10:24:53 +0000 (11:24 +0100)]
Upgrade Module-Build to 0.4003
Karl Williamson [Thu, 16 Aug 2012 16:50:14 +0000 (10:50 -0600)]
Omnibus removal of register declarations
This removes most register declarations in C code (and accompanying
documentation) in the Perl core. Retained are those in the ext
directory, Configure, and those that are associated with assembly
language.
See:
http://stackoverflow.com/questions/314994/whats-a-good-example-of-register-variable-usage-in-c
which says, in part:
There is no good example of register usage when using modern compilers
(read: last 10+ years) because it almost never does any good and can do
some bad. When you use register, you are telling the compiler "I know
how to optimize my code better than you do" which is almost never the
case. One of three things can happen when you use register:
- The compiler ignores it, this is most likely. In this case the only
  harm is that you cannot take the address of the variable in the
  code.
- The compiler honors your request and as a result the code runs slower.
- The compiler honors your request and the code runs faster, this is the
  least likely scenario.
Even if one compiler produces better code when you use register, there
is no reason to believe another will do the same. If you have some
critical code that the compiler is not optimizing well enough your best
bet is probably to use assembler for that part anyway but of course do
the appropriate profiling to verify the generated code is really a
problem first.
Steve Hay [Sat, 18 Aug 2012 13:10:03 +0000 (14:10 +0100)]
Tweaks to RMG
Use the simpler syntax for starting the CPAN shell. Remove notes about
needing Unix tools on Windows for CPAN and CPANPLUS when LWP is not
installed: these are not required since the likes of Net::FTP and
HTTP::Tiny are used instead.
Steve Hay [Sat, 18 Aug 2012 11:28:32 +0000 (12:28 +0100)]
Don't use /dev/tty if it happens to exist on Windows
This fixes CPAN RT#79001 and CPAN RT#79064.
Steve Hay [Sat, 18 Aug 2012 09:39:56 +0000 (10:39 +0100)]
We don't support compilers other than MS VC++ and MinGW/gcc on Windows
Steve Hay [Sat, 18 Aug 2012 09:36:12 +0000 (10:36 +0100)]
Remove two unused #defines
Steve Hay [Sat, 18 Aug 2012 09:33:13 +0000 (10:33 +0100)]
We don't support MS VC++ < 6.0
Father Chrysostomos [Sat, 18 Aug 2012 06:20:53 +0000 (23:20 -0700)]
parser.t: Correct test count
Why do I keep making these mistakes? :-(
Father Chrysostomos [Fri, 17 Aug 2012 23:52:50 +0000 (16:52 -0700)]
sv.h: Don’t repeat _XPV_HEAD
Father Chrysostomos [Fri, 17 Aug 2012 23:54:40 +0000 (16:54 -0700)]
write.t: Eek! debugging code
Father Chrysostomos [Fri, 17 Aug 2012 23:44:57 +0000 (16:44 -0700)]
perldelta entries