Karl Williamson [Tue, 3 Dec 2013 04:46:37 +0000 (21:46 -0700)]
perlapi: May want to use savesharedpv on threaded Win32
Adds a note to savepv (and similar) that on threaded Windows you may
need to use the saveshared version, because the memory is deallocated
when the thread ends.
Karl Williamson [Fri, 29 Nov 2013 17:27:47 +0000 (10:27 -0700)]
doio.c: Remove EBCDIC dependency
For non-EBCDIC this file used \012 instead of \n. This may have been a
Mac OS Classic hack, which we no longer support.
Karl Williamson [Wed, 27 Nov 2013 19:46:42 +0000 (12:46 -0700)]
regcomp.c: Slight optimization
A swash is accessed either through its inversion list or its hash (only
large swashes actually have hashes; this is usually transparent). At
this point in regcomp.c, we will only be looking at the inversion list;
by telling the swash_init function that via the flag, later accesses
automatically have a level of indirection removed.
Karl Williamson [Wed, 27 Nov 2013 17:19:22 +0000 (10:19 -0700)]
perlop: Add note about (?[])
Craig A. Berry [Sun, 1 Dec 2013 23:16:47 +0000 (17:16 -0600)]
Check unlink on directory for all users, not just root.
For cross-platform consistency, it makes more sense to reject
unlink attempts on directories in the same way for all users
rather than only for root. geteuid() always returns zero on
Windows, never returns zero on VMS, and is a poor indicator
of privilege on modern unixen, so the code really hasn't been
working as intended on all platforms anyway.
Evan Zacks [Tue, 26 Nov 2013 14:56:11 +0000 (09:56 -0500)]
Make unlink on directory as root set errno.
If unlink is called on an existing directory while running as root without -U
(PL_unsafe), the unlink call fails but does not set $!, because unlink(2) is
not actually called in this case.
If unlink is called as a user (or as root with -U), unlink(2) is invoked, so
attempting to remove a directory would set errno to EISDIR as expected. If
running as root without -U (PL_unsafe is false), lstat and S_ISDIR are called
instead. If the lstat succeeds and S_ISDIR returns true, the argument is
skipped without warning and without setting $!, meaning Perl's unlink can
return failure while leaving the previous value of errno in place.
This commit sets errno to EISDIR for this case.
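The behaviour described above can be sketched in C roughly as follows. This is a minimal illustration, not the code in doio.c: the helper name guarded_unlink is hypothetical, and it assumes a POSIX system providing lstat(2) and unlink(2).

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not perl's code): refuse to unlink a directory,
 * but report the refusal through errno the way unlink(2) would. */
static int guarded_unlink(const char *path)
{
    struct stat st;

    if (lstat(path, &st) < 0)
        return -1;          /* lstat has already set errno, e.g. ENOENT */

    if (S_ISDIR(st.st_mode)) {
        errno = EISDIR;     /* without this line errno keeps a stale value */
        return -1;
    }

    return unlink(path);    /* not a directory: let unlink(2) decide */
}
```

A caller that sees -1 can then inspect errno in both the directory case and the ordinary failure cases, instead of reading whatever value an earlier call left behind.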
Chris 'BinGOs' Williams [Mon, 2 Dec 2013 12:56:23 +0000 (12:56 +0000)]
Update Locale-Codes to CPAN version 3.28
Steve Hay [Mon, 2 Dec 2013 09:12:24 +0000 (09:12 +0000)]
Merge branch 'dirnames' into blead
Father Chrysostomos [Mon, 2 Dec 2013 06:40:19 +0000 (22:40 -0800)]
Page number for caretx.c Tolkien quote
Tom Christiansen kindly provided it in <30867.1385946564@chthon>.
Father Chrysostomos [Sun, 1 Dec 2013 20:16:09 +0000 (12:16 -0800)]
sv.c: Rewrite COW logic
for readability, maintainability, and my sanity.
The comment about swipe and COW having ‘much in common’ notwithstand-
ing (actually they only shared two lines of code), I separated those
two code paths, splitting the horribly complex ‘if’ condition into
two. I also made the code slightly more repetitive, resulting in
fewer #ifdefs and more clarity.
James E Keenan [Sun, 1 Dec 2013 17:35:54 +0000 (18:35 +0100)]
Extract subroutines used to test File-Find into separate package.
t/porting/manifest.t and pod_rules.t: Add comments describing how to handle a
MANIFEST which is not sorted properly (per recommendation by Nicholas Clark).
James E Keenan [Sun, 1 Dec 2013 15:43:47 +0000 (16:43 +0100)]
Standardize subroutine definitions between tests.
In taint.t, replace touch() with create_file_ok() and MkDir() with mkdir_ok().
Make definition of wanted_File_Dir() identical in both test files.
This is a step on the road to eliminating repeated code.
Chris 'BinGOs' Williams [Sun, 1 Dec 2013 15:21:38 +0000 (15:21 +0000)]
Update Unicode-Collate to CPAN version 1.03
[DELTA]
1.03 Sun Dec 1 21:45:46 2013
- XS: now unpack_U() uses unpack('U*') in pure perl.
avoid XS for the internal "utf8" encoding of perl.
David Mitchell [Sun, 1 Dec 2013 11:16:52 +0000 (11:16 +0000)]
don't check format args on taint_proper
My recent commit 5d37acd6b65eb enabled (among other things)
format-arg checking of taint_proper(). This was not a good idea since
taint_proper() adds extra args before it actually calls a printf-style
function. This was masked since on some gcc systems, a NULLOK format arg
disables this check.
Chris 'BinGOs' Williams [Sat, 30 Nov 2013 22:55:42 +0000 (22:55 +0000)]
Update ExtUtils-MakeMaker to CPAN version 6.84
[DELTA]
6.84 Sat Nov 30 15:22:35 GMT 2013
No changes from 6.83_06
6.83_06 Fri Nov 29 21:50:51 GMT 2013
Doc fixes:
* Correct the documentation for MAGICXS
6.83_05 Mon Nov 25 22:51:11 GMT 2013
New Features:
* Added MAGICXS attribute to explicitly enable automagic
XS building.
Bug fixes:
* RT#90780 fix Macro `BOOTSTRAP' redefined warnings
* Only enable automatic OBJECT generation if MAGICXS is true
6.83_04 Sun Nov 17 11:41:43 GMT 2013
New Features:
* OBJECT can now be specified as an array
* build C_FILES/O_FILES/OBJECT automatically from XS
6.83_03 Fri Nov 15 09:44:26 GMT 2013
Bug fixes:
* Don't recurse into stale dist dirs
6.83_02 Tue Nov 12 11:11:34 GMT 2013
Misc:
* Enable bootstrapping to work on v5.10.x again
6.83_01 Tue Nov 5 11:43:50 GMT 2013
Misc:
* disable make parallelism for pure_all target
Father Chrysostomos [Sat, 30 Nov 2013 20:56:56 +0000 (12:56 -0800)]
sv.c: Clarify COW comments further
I was getting confused even when I wrote these comments.
Craig A. Berry [Sat, 30 Nov 2013 14:47:09 +0000 (08:47 -0600)]
Fix stdin inheritance for system and backticks on VMS.
The documentation to LIB$SPAWN says that standard input will be
inherited from the parent if not specified, and we've been
depending on that. But it seems not to actually work that way
as a simple
$ perl -e "system('edit foo.tmp');"
was failing due to the input not being a terminal. So set up the
input explicitly using the same mechanism we've always used for
output and error.
Except when SYS$INPUT is a "directory," which probably means it's
a channel open on a volume that holds a command procedure.
Father Chrysostomos [Sat, 30 Nov 2013 14:43:00 +0000 (06:43 -0800)]
regen pod issues
Father Chrysostomos [Sat, 30 Nov 2013 14:16:57 +0000 (06:16 -0800)]
sv.c: Fix Darwin g++ build
Father Chrysostomos [Sat, 30 Nov 2013 13:55:54 +0000 (05:55 -0800)]
[Merge] Swiping PADTMPs’ string buffers
This branch allows PADTMPs’ string buffers to be stolen, benefiting
programs that use long strings by avoiding a few copies.
Father Chrysostomos [Sat, 30 Nov 2013 13:55:40 +0000 (05:55 -0800)]
Peek.t: Update skip version
I did not have this branch ready in time for 5.19.6.
Father Chrysostomos [Sat, 30 Nov 2013 02:58:36 +0000 (18:58 -0800)]
Increase $constant::VERSION to 1.30
Father Chrysostomos [Wed, 20 Nov 2013 14:19:05 +0000 (06:19 -0800)]
sv.c: String copy comments
This layout, which goes back to 765f542df2 in 2002, I find a bit
confusing. The comments help. (Though rewriting it might be better.)
Granted, back in 2002 there wasn’t nearly as much code in any of
the branches.
Father Chrysostomos [Thu, 14 Nov 2013 02:10:49 +0000 (18:10 -0800)]
Allow PADTMPs’ strings to be swiped
While copy-on-write does speed things up, it is not perfect. Take
this snippet for example:
$a = "$b$c";
$a .= $d;
The concatenation operator on the rhs of the first line has its own
scalar that it reuses every time that operator is called (its target).
When the assignment happens, $a and that target share the same string
buffer, which is good, because we didn’t have to copy it. But because
it is shared between two scalars, the concatenation on the second line
forces it to be copied.
While copy-on-write may be fast, string swiping surpasses it, because
it has no later bookkeeping overhead. If we allow stealing targets’
strings, then $a = "$b$c" no longer causes $a to share the same string
buffer as the target; rather, $a steals that buffer and leaves the tar-
get undefined. The result is that neither ‘$a =’ nor ‘$a .= $d’ needs
to copy any strings. Only the "$b$c" will copy strings (unavoidably).
This commit only applies that to long strings, however. This is why:
Simply swiping the string from any swipable TARG (which I tried at
first) resulted in a significant slowdown. By swiping the string from
a TARG that is going to be reused (as opposed to a TEMP about to be
freed, which is where swipe was already happening), we force it to
allocate another string next time, greatly increasing the number
of malloc calls. malloc overhead exceeds the overhead of copying
short strings.
I tried swiping TARGs for short strings only when the buffer on the
lhs was not big enough for a copy (or there wasn’t one), but simple
benchmarks with mktables show that even checking SvLEN(dstr) is enough
to slow things down, since the speed-up this provides is minimal where
short strings are involved.
Then I tried checking just the string length, and saw a consistent
speed increase. So that’s what this patch uses. Programs using short
strings will not benefit. Programs using long strings may see a 1.5%
increase in speed, due to fewer string copies.
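The trade-off between copying and swiping can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the strbuf type, the strbuf_assign function and the SWIPE_MIN_LEN cut-off are hypothetical, and the commit message does not state the length threshold perl actually uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string holder; perl's real SV is far richer. */
struct strbuf {
    char  *buf;   /* NUL-terminated data, or NULL when undefined */
    size_t len;   /* length in bytes, excluding the NUL */
};

/* Assumed cut-off, for illustration only. */
#define SWIPE_MIN_LEN 256

/* Assign src to dst. Long strings are swiped: dst takes over src's
 * buffer and src becomes undefined. Short strings are copied, so src
 * keeps its buffer for reuse and no extra malloc is needed next time.
 * Returns 0 on success, -1 on allocation failure. */
static int strbuf_assign(struct strbuf *dst, struct strbuf *src)
{
    char *copy;

    if (dst == src)
        return 0;

    if (src->buf != NULL && src->len >= SWIPE_MIN_LEN) {
        free(dst->buf);
        dst->buf = src->buf;    /* steal the buffer: no bytes copied */
        dst->len = src->len;
        src->buf = NULL;        /* the source is left undefined */
        src->len = 0;
        return 0;
    }

    if (src->buf == NULL) {     /* assigning an undefined value */
        free(dst->buf);
        dst->buf = NULL;
        dst->len = 0;
        return 0;
    }

    copy = realloc(dst->buf, src->len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, src->buf, src->len + 1);   /* include the NUL */
    dst->buf = copy;
    dst->len = src->len;
    return 0;
}
```

The sketch decides on the string length alone, which mirrors the variant the commit message reports as consistently faster than also inspecting the destination buffer.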
Father Chrysostomos [Wed, 13 Nov 2013 13:52:00 +0000 (05:52 -0800)]
op.c: Turn on read-only flag for folded constants
They are marked PADTMP, which causes them to be copied in any contexts
where readonliness makes a difference, so marking them as read-only
does not change the behaviour. What it does is allow a future commit
to implement string swiping for PADTMPs.
Father Chrysostomos [Wed, 13 Nov 2013 04:40:11 +0000 (20:40 -0800)]
constant.pm: Make elements of list consts read-only
Now that Internals::SvREADONLY turns on the PADTMP flag for all the
elements, the read-only flag on the elements themselves will not
actually make the returned elements read-only, because PADTMPs get
copied in all cases where readonliness matters. What this does is
prevent the original SV from being modified, allowing for more opti-
misations in perl’s internals (e.g., string buffers being stolen from
PADTMPs not marked read-only).
The order of the statements needs to be rearranged, otherwise we end
up setting the flag on a temporary copy of each element due to ‘for’.
David Mitchell [Sat, 30 Nov 2013 12:20:50 +0000 (12:20 +0000)]
silence -Wformat-nonliteral in pp_formline
David Mitchell [Sat, 30 Nov 2013 12:16:17 +0000 (12:16 +0000)]
revert pp_formline -Wformat-nonliteral fix
Revert the part of 5d37acd6b65eb421e938a3fde62cc1edde467dae that was
intended to silence non-literal format warnings. It's giving compiler
errors on AIX due (I suspect) to overlong macro expansions.
The next commit will fix the warning in a more prosaic fashion.
Chris 'BinGOs' Williams [Fri, 29 Nov 2013 21:07:00 +0000 (21:07 +0000)]
Update HTTP-Tiny to CPAN version 0.039
[DELTA]
0.039 2013-11-27 19:48:29 America/New_York
[FIXED]
- Temporary file created during mirror() is now opened with O_EXCL
for added security
David Mitchell [Fri, 29 Nov 2013 17:44:12 +0000 (17:44 +0000)]
fix -Wsign-compare in core
There were a few places that were doing
unsigned_var = cond ? signed_val : unsigned_val;
or similar. Fixed by suitable casts etc.
The four in utf8.c were fixed by assigning to an intermediate
unsigned var; this has the happy side-effect of collapsing
a large macro expansion, where toUPPER_LC() etc evaluate their arg
multiple times.
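As a rough illustration of the intermediate-variable approach, here is a small C sketch. The function name pick_length is hypothetical, and clamping a negative value to zero is an assumption of this example rather than something the commit describes.

```c
#include <stddef.h>

/* Hypothetical function: choose between a signed fallback and an
 * unsigned length without mixing signedness inside the ?: operator. */
static size_t pick_length(int use_fallback, int fallback, size_t len)
{
    /* Writing
     *     return use_fallback ? fallback : len;
     * mixes int and size_t in the two arms, which gcc reports under
     * -Wsign-compare. Converting the signed value first, through an
     * intermediate unsigned variable, makes the conversion explicit
     * and evaluates the signed expression only once. */
    const size_t fallback_u = fallback < 0 ? 0 : (size_t)fallback;

    return use_fallback ? fallback_u : len;
}
```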
David Mitchell [Fri, 29 Nov 2013 16:12:20 +0000 (16:12 +0000)]
Move Cwd and List-Util: UNIX followup
the previous commit worked on win32; this commit makes it work under UNIX
too.
Basically Configure determines a list of "logical" extension names
such as "IPC/SysV", based on physical dirs under cpan/ etc such as
"IPC-SysV".
In this case, keep the original logical names "Cwd" and "List/Util",
even though the physical paths have been changed to "PathTools" and
"Scalar/List/Utils".
Father Chrysostomos [Fri, 29 Nov 2013 00:05:30 +0000 (16:05 -0800)]
Revert part of 2efab60d9
As Karl Williamson pointed out in <5296C04E.10103@khwilliamson.com>,
sv_setpv will do SvOK_off if Strerror returns null, so this SvOK check
is not redundant.
David Mitchell [Thu, 28 Nov 2013 20:13:50 +0000 (20:13 +0000)]
GCC_DIAG_IGNORE() only with >= 4.6
The new GCC_DIAG_IGNORE() pragma I added a few commits ago was only
supposed to be enabled on gcc 4.6+; my code inadvertently did it on
4.2+
Steve Hay [Thu, 28 Nov 2013 18:20:18 +0000 (18:20 +0000)]
Document how to build perl for WinCE using EVC4
These notes are largely copied from those supplied by Daniel Dragan
<bulk88@hotmail.com> in [perl #120365] after verification by the committer.
David Mitchell [Thu, 28 Nov 2013 16:46:15 +0000 (16:46 +0000)]
silence -Wformat-nonliteral compiler warnings
Due to the security risks associated with user-supplied formats
being passed to C-level printf() style functions (eg %n),
gcc has a -Wformat-nonliteral warning that complains whenever such a
function is passed a non-literal format string.
This commit silences all such warnings in core and ext/.
The main changes are
1) the 'f' (format) flag in embed.fnc is now handled slightly more
cleverly. Rather than just applying to functions whose last arg is '...'
(and where the format arg is assumed to be the previous arg), it
can now handle non-'...' functions: arg checking is disabled, but format
checking is still done: it works by assuming that an arg called 'fmt',
'pat' or 'f' is the format string (and dies if it fails to find exactly one
such arg).
2) with the new embed.fnc functionality, more functions have been marked
with the 'f' flag. When such a function passes its fmt arg onto an inner
printf-like function, we simply disable the warning for that call using
GCC_DIAG_IGNORE(-Wformat-nonliteral), since we know that the caller must
have already checked it.
3) In quite a few places the format string isn't literal, but it *is*
constant (e.g. PL_warn_uninit_sv). For those cases, again disable the
warning.
4) In pp_formline(), a particular format was one of several different
literal strings depending on circumstances. Rather than assigning this
string to a temporary variable, incorporate the ?: branches directly in
the function call arg. gcc is clever enough to decide the arg is then
always literal.
David Mitchell [Wed, 27 Nov 2013 16:56:59 +0000 (16:56 +0000)]
add GCC_DIAG_IGNORE(), GCC_DIAG_RESTORE macros
These allow you to temporarily disable a specific gcc or clang warning;
e.g.
GCC_DIAG_IGNORE(-Wmultichar);
char b = 'ab';
GCC_DIAG_RESTORE;
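One way such macros can be built is with the C99 _Pragma operator. The sketch below uses hypothetical MY_DIAG_* names so that it is not mistaken for perl's own definitions, which live in perl's headers and may differ; the format_count function is likewise only an illustration.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the macros described above. */
#if defined(__GNUC__)
#  define MY_DIAG_PRAGMA(x)  _Pragma(#x)
#  define MY_DIAG_IGNORE(w)  _Pragma("GCC diagnostic push") \
                             MY_DIAG_PRAGMA(GCC diagnostic ignored #w)
#  define MY_DIAG_RESTORE    _Pragma("GCC diagnostic pop")
#else
#  define MY_DIAG_IGNORE(w)
#  define MY_DIAG_RESTORE
#endif

/* Passes a caller-supplied format on to snprintf(). The caller is
 * assumed to have checked the format already, so the non-literal
 * format warning is disabled for this one call only. */
static int format_count(char *out, size_t outlen, const char *fmt, int n)
{
    int written;

    MY_DIAG_IGNORE(-Wformat-nonliteral);
    written = snprintf(out, outlen, fmt, n);
    MY_DIAG_RESTORE;

    return written;
}
```

Pairing the ignore with a push and the restore with a pop keeps the suppression local, so warnings elsewhere in the file are unaffected.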
David Mitchell [Tue, 26 Nov 2013 15:50:45 +0000 (15:50 +0000)]
mark Perl_my_strftime with format attribute
mark this function with
__attribute__format__null_ok__(__strftime__,pTHX_1,0)
so that the compiler can check strftime-style format args and warn
about them.
Rather than adding new flag(s) to embed.fnc, I just enhanced the f flag
to treat it as strftime-style rather than printf if the function name
matches /strftime/. This was quicker, and we're unlikely to have many
such functions.
Daniel Dragan [Fri, 15 Nov 2013 06:52:44 +0000 (01:52 -0500)]
remove almost unreachable NULL sv arg code from sv_2*n_flags
The NULL sv code being removed dates to commit e334a159a5 (Perl 1.0), as
the pre-SV str_2ptr and str_2num calls. When SVs were introduced in
commit 79072805bf (Perl 5.0 alpha 2), the NULL sv code was copied to the new
SV functions. The functions were bulk marked non-NULL in commit f54cb97a39
during 5.9.3 development. The docs were corrected to say NULLOK support
in commit 53e8571218 during 5.11.0.
See the perldelta part of this patch for the rest of commit body.
Father Chrysostomos [Wed, 27 Nov 2013 15:18:59 +0000 (07:18 -0800)]
mg.c: Remove redundant SvOK checks
Chris 'BinGOs' Williams [Thu, 28 Nov 2013 00:06:04 +0000 (00:06 +0000)]
Update Module-Build to CPAN version 0.4203
[DELTA]
0.4203 - Wed Nov 27 19:09:05 CET 2013
[BUG FIXES]
- Map recommends back to runtime recommends [Leon Timmermans]
- Map restrictive license to restricted in meta 2.0 [Leon Timmermans]
Chris 'BinGOs' Williams [Wed, 27 Nov 2013 20:13:17 +0000 (20:13 +0000)]
Add deprecation for CGI.pm
Father Chrysostomos [Wed, 27 Nov 2013 15:13:15 +0000 (07:13 -0800)]
Revert "Squash COWs in the char* typemap"
This reverts commit 77ca9de6373481d905eed6af2904599353a658b3.
Father Chrysostomos [Wed, 27 Nov 2013 15:13:09 +0000 (07:13 -0800)]
Revert "Increase $XS::Typemap::VERSION to 0.13"
This reverts commit 3407f4a92d7d9731d099e0290b68d7e983ff2497.
Steve Hay [Wed, 27 Nov 2013 08:37:23 +0000 (08:37 +0000)]
Upgrade File-Fetch from version 0.44 to 0.46
* Blacklist 'lftp' on DragonflyBSD
Steve Hay [Wed, 27 Nov 2013 08:34:21 +0000 (08:34 +0000)]
Update CGI from version 3.63 to 3.64
[BUG FIXES]
- Avoid warning about "undefined variable" in user_agent in some cases (RT#72882)
[INTERNALS]
- Avoid warning about "uninitialized value" when calling user_agent() in some cases. (RT#72882, perl@max-maurer.de)
- Update minimum required version in Makefile.PL to 5.8.1. It had already been
updated to 5.8.1 in the CGI.pm module in 3.53.
- Fix POD errors reported by newer pod2man (Thanks to jmdh)
- Typo fixes, (dsteinbrunner).
- use deprecate.pm on perls 5.19.0 and later. (rjbs).
[DOCUMENTATION]
- Update CGI::Cookie docs to reflect that HttpOnly is widely supported now.
Karl Williamson [Thu, 17 Oct 2013 03:25:26 +0000 (21:25 -0600)]
Respect 'use bytes' in returning $! and $^E
This addresses some of the field problems caused by commit
1500bd919ffeae0f3252f8d1bb28b03b043d328e, but by no means all.
If the stringification of $^E or $! is done in the scope of 'use bytes',
the UTF-8 flag on the result is now never set. In such scope, the
behavior is then the same as it was prior to that commit. The actual
behavior will change before v5.20 ships.
Karl Williamson [Thu, 17 Oct 2013 03:21:52 +0000 (21:21 -0600)]
$^E should have same handling as $! for Win32 and OS/2
Commit 1500bd919ffeae0f3252f8d1bb28b03b043d328e changed the handling of
$! to look for UTF-8 messages. This issue is also present in Win32 for
$^E. The two should have uniform treatment, so this commit causes the
actual same code to be executed for both. OS/2 is also subject to
locale issues, and so it also is changed here to use the same code, so
that future changes will apply to it automatically.
VMS doesn't use locales, so it retains its current behavior.
Note that 1500bd919 has created some field problems, so that the changes
it introduced will be further changed or reverted. The current commit
just makes sure that whatever those further changes are will be
automatically propagated to all necessary platforms.
Karl Williamson [Thu, 17 Oct 2013 03:09:00 +0000 (21:09 -0600)]
mg.c: Extract code into a function.
This is in preparation for the same code to be used in additional
places. There should be no logic changes.
Karl Williamson [Tue, 10 Sep 2013 02:09:13 +0000 (20:09 -0600)]
mg.c: White-space only
Outdent this code as a result of removing outer blocks
Karl Williamson [Tue, 10 Sep 2013 02:04:39 +0000 (20:04 -0600)]
mg.c: Use $! code for $^E on platforms where they are the same
Only a few platforms have $^E distinguished from $!. On all others they
should behave identically. Previous commits have caused these to get
out-of-sync. This causes them to share their code on platforms where
they mean the same thing, so this won't happen again.
Karl Williamson [Tue, 10 Sep 2013 01:39:07 +0000 (19:39 -0600)]
mg.c: Reorder if else clauses
This is to allow two cases in the switch statement to be combined in the
next commit. There should be no effective logic change
Karl Williamson [Tue, 10 Sep 2013 01:05:53 +0000 (19:05 -0600)]
mg.c: Reorder cases in a switch()
This is just cut and paste with no other changes; it is in preparation
for combining two cases in a future commit
Karl Williamson [Sat, 23 Nov 2013 17:04:57 +0000 (10:04 -0700)]
handy.h: Slightly refactor READ_XDIGIT macro
This adds comments as to how it works, factors out the mask to be
specified only once, and uses isDIGIT instead of isALPHA, as the former
is likely to be slightly more efficient (because isDIGIT doesn't have to
worry about there being non-ASCII digits, and isALPHA does have to worry
about non-ASCII alphas). The result makes it easier to understand
what's going on.
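The single-mask idea can be sketched as below. This is an ASCII-only illustration with a hypothetical function name, not the handy.h macro itself, and it assumes the caller has already verified that the character is a hex digit.

```c
#include <ctype.h>

/* Hypothetical ASCII-only helper.
 * Precondition: c is one of 0-9, a-f or A-F. */
static int xdigit_value(unsigned char c)
{
    /* '0'..'9' are 0x30..0x39, so the low nibble is already the value.
     * 'A'..'F' are 0x41..0x46 and 'a'..'f' are 0x61..0x66; adding 9
     * moves their low nibble to 0xA..0xF. The mask is written once
     * and covers every case. */
    return 0xf & (isdigit(c) ? c : c + 9);
}
```

Testing for a digit rather than for a letter leaves the alphabetic branch as the fallback, which is valid only because of the stated precondition.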
Steve Hay [Tue, 26 Nov 2013 09:06:30 +0000 (09:06 +0000)]
Extend Intel C++ compiler support (a48cc4c427) to dmake builds
David Mitchell [Tue, 26 Nov 2013 14:04:17 +0000 (14:04 +0000)]
bump DynaLoader version after previous commits
David Mitchell [Tue, 26 Nov 2013 13:55:59 +0000 (13:55 +0000)]
dl_hpux.xs: fix PREINIT boundary
my previous commit moved some var declarations from CODE: to PREINIT:;
but I was slightly overzealous and moved a non-declaration line too.
David Mitchell [Tue, 26 Nov 2013 13:29:08 +0000 (13:29 +0000)]
Dynaloader: use PREINIT to avoid compiler errors
A recent change to the typemap definitions makes a var declaration
generate code. Move this declaration from CODE: to PREINIT: to avoid a
'Mixed declarations' compiler error.
This was spotted in dl_hpux.xs by bulk88, which I've blindly fixed.
From visual inspection there were also declarations in dl_vms.xs
that needed moving (although in that case it doesn't trigger the typemap
code emission).
Karl Williamson [Tue, 26 Nov 2013 02:26:17 +0000 (19:26 -0700)]
perlunicode: Nits
User-defined \p{}, \P{} are valid on not just Unicode code points; and
use C<> around a code fragment
Karl Williamson [Tue, 26 Nov 2013 02:14:46 +0000 (19:14 -0700)]
perlrecharclass: Add statement about above-Unicode and (?[])
The extended bracketed character class does not raise warnings when
non-Unicode code points are matched against it.
Karl Williamson [Tue, 26 Nov 2013 02:03:29 +0000 (19:03 -0700)]
perluniprops: Tweaks
Every time I read over something I wrote, I see some improvements in the
wording, and in this case the layout as well: a couple of lists are set
off by indentation.
Karl Williamson [Tue, 26 Nov 2013 01:24:17 +0000 (18:24 -0700)]
perlapi: Refer to correct formal parameter name
Karl Williamson [Sat, 23 Nov 2013 16:55:04 +0000 (09:55 -0700)]
handy.h: Remove duplicate line
Two adjacent lines were identical. Only one is needed.
Ricardo Signes [Mon, 25 Nov 2013 23:18:08 +0000 (18:18 -0500)]
document the deprecation of literal ctrl char varnames
Steve Hay [Mon, 25 Nov 2013 18:05:33 +0000 (18:05 +0000)]
List another VC++ 2013 test failure
The first two happened when in BST but don't happen now, back in GMT; but
lib/File/Copy.t now fails instead!
This is all due to _utime() being broken in VC++ 2013's CRT. Microsoft have
acknowledged there is a regression from previous versions of the CRT in a
support ticket that I logged with them, and will publish a Knowledge Base
article about it in due course.
(Users can work around it by using the Win32::UTCFileTime module on CPAN,
which exists to fix other (long-standing) issues with _stat() and _utime(),
but also fixes this new breakage too.)
Father Chrysostomos [Mon, 25 Nov 2013 07:30:09 +0000 (23:30 -0800)]
Test that (??{$pvlvregexp}) does not recompile the regexp
2685dc2d9a accidentally fixed this. Oops! :-)
Father Chrysostomos [Mon, 25 Nov 2013 07:11:57 +0000 (23:11 -0800)]
Reënable qr caching for (??{}) retval where possible
When a scalar is returned from (??{...}) inside a regexp, it gets com-
piled into a regexp if it is not one already. Then the regexp is sup-
posed to be cached on that scalar (in magic), so that the same scalar
returned again will not require another compilation.
Commit e4bfbed39b disabled caching except on references to overloaded
objects. But in that one case the caching caused erroneous behaviour,
which was just fixed by 636209429f and this commit’s parent,
effectively disabling the cache altogether.
The cache is disabled because it does not apply to TEMP variables
(those about to be freed anyway, for which caching would be a waste
of CPU), and all non-overloaded non-qr thingies get copied into
new mortal (TEMP) scalars (as of e4bfbed39b) before reaching the
caching code.
This commit skips the copy if the return value is already a non-magi-
cal string or number. It also allows the caching to happen on con-
stants, which has never been permitted before. (There is actually no
reason for disallowing qr magic on read-only variables.)
Father Chrysostomos [Mon, 25 Nov 2013 07:10:42 +0000 (23:10 -0800)]
Don’t cache qr magic on references
When a scalar is returned from (??{...}) inside a regexp, it gets com-
piled into a regexp if it is not one already. Then the regexp is sup-
posed to be cached on that scalar (in magic), so that the same scalar
returned again will not require another compilation.
This has never worked correctly with references, because the value was
being cached against the returned scalar itself, whereas the *refer-
ent* of a returned reference was being checked for qr magic.
Commit 636209429 (recent) attempted to fix the resulting bug, but
ended up exposing another, older bug, that e4bfbed39b (5.18)
accidentally (?) fixed. The stringification of a reference can easily change
without the reference itself being touched. So set-magic (which
clears the qr cache) is never triggered:
{ package o; use overload '""'=>sub{"abc"} }
$x = bless \$y, "o";
sub foo { warn "abc" =~ /(??{$x})/ }
foo();
bless \$y;
warn "$x";
foo();
__END__
Output:
1 at - line 3.
main=SCALAR(0x7fcbc3027478) at - line 6.
1 at - line 3.
Blessing \$y into main causes it to stringify as
main=SCALAR(0x7fcbc3027478). So how can "abc" match
/main=SCALAR(0x7fcbc3027478)/?
Skipping the cache for references obviously fixes this. The cache was
only being stored on refs to overloaded objects, which don’t use the
cache. The only case in which it was being used was when the
overloaded object was blessed into a non-overloaded class, and then it
was incorrect.
Father Chrysostomos [Mon, 25 Nov 2013 02:12:04 +0000 (18:12 -0800)]
Make (??{$tied_ovrld}) see the right $1
I can return $1 from a regexp code block and it refers to the last
match *within* the block:
"aab" =~ /(a)((??{"b" =~ m|(.)|; $1}))/;
print "[$1 $2]\n";
Output:
[a b]
Even via a tied variable’s FETCH method:
sub ReEvalTieTest::TIESCALAR {bless[], "ReEvalTieTest"}
sub ReEvalTieTest::FETCH { "$1" }
tie my $t, "ReEvalTieTest";
"aab" =~ /(a)((??{"b" =~ m|(.)|; $t}))/;
print "[$1 $2]\n";
Output:
[a b]
But not if I assign a reference to an overloaded object to the tied
variable first:
sub ReEvalTieTest::TIESCALAR {bless[], "ReEvalTieTest"}
sub ReEvalTieTest::STORE{}
sub ReEvalTieTest::FETCH { "$1" }
tie my $t, "ReEvalTieTest";
{ package o; use overload '""'=>sub { "abc" } }
$t = bless [], "o";
"aab" =~ /(a)((??{"b" =~ m|(.)|; $t}))/;
print "[$1 $2]\n";
Output:
[a a]
$1 now refers to the outer pattern, not the inner pattern.
The code that handles the return value of code blocks was not check-
ing get-magic before overloading.
This commit fixes it to do that.
Craig A. Berry [Mon, 25 Nov 2013 00:40:32 +0000 (18:40 -0600)]
Improve prefix removal from PPF translations.
When doing a logical name translation of a process-permanent file
(SYS$INPUT, SYS$OUTPUT, SYS$ERROR, or SYS$COMMAND), we need to
remove the special 0x001b prefix from the translation string
regardless of whether we are combining a search list into a
longer equivalence string or just doing a simple, index-free
lookup.
Since we now have two places needing the same logic, move that
logic into a static inline function.
Craig A. Berry [Mon, 25 Nov 2013 00:27:00 +0000 (18:27 -0600)]
Fix VMS-specific wraparound error in S_mayberelocate.
In trimming the trailing slash from a Unix path spec, we haven't
(since 5.003 or so) been ensuring that we weren't stepping off
the beginning of the string. No, it's not normal to have '/' as
a library path, but if it happens we shouldn't allow a zero or
negative (actually wraparound since unsigned) value for the path
length.
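The wraparound hazard and its guard can be sketched in C as follows. The function name trimmed_length is hypothetical, and keeping a bare "/" at length 1 is an assumption of this example, chosen so that the result is never zero.

```c
#include <stddef.h>

/* Hypothetical helper: return the length of path with one trailing
 * '/' removed, never shrinking the root path "/" below length 1. */
static size_t trimmed_length(const char *path, size_t len)
{
    /* With an unsigned length, computing len - 1 when len == 0 wraps
     * around to a huge value, so check the bounds before subtracting
     * or indexing. */
    if (len > 1 && path[len - 1] == '/')
        return len - 1;

    return len;
}
```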
Father Chrysostomos [Sun, 24 Nov 2013 23:58:02 +0000 (15:58 -0800)]
Don’t skip pat_re_eval.t under miniperl
re.pm no longer loads the compiled C code under miniperl, and this
test does not use any features that re.xs provides; so the skip
is unnecessary.
Father Chrysostomos [Sun, 24 Nov 2013 23:55:18 +0000 (15:55 -0800)]
Fix bug with (??{$overload}) regexp caching
When a scalar is returned from (??{...}) inside a regexp, it gets
compiled into a regexp if it is not one already. Then the regexp is
supposed to be cached on that scalar (in magic), so that the same scalar
returned again will not require another compilation.
This has never worked correctly with references, because the value was
being cached against the returned scalar itself, whereas the *refer-
ent* of a returned reference was being checked for qr magic.
Commit e4bfbed39b disabled caching for all scalars except references
to overloaded objects. This is the result of copying the return value
to a new mortal scalar. The actual returned scalar then remains
untouched.
So the only case in which the cache value was used was incorrect:
namely, when the regexp was cached against a reference to an over-
loaded object, and a later code block returned a reference to that
reference:
$\="\n";
{ package o; use overload '""'=>sub { "abc" } }
$x = bless [],"o";
$y = \$x;
($y_addr = "$y") =~ y/()//d; # REF(0x7fcb9c02ef08) -> REF0x7fcb9c02ef08
print "$x$y";
print "abc$y_addr" =~ /$x$y/;
print "abc$y_addr" =~ /(??{$x})(??{$y})/; # does not match; should
print "abcabc" =~ /(??{$x})(??{$y})/; # matches!
print "__END__";
__END__
Output:
abcREF(0x7ff37182ef68)
0x7ff37182ef68
1
__END__
Should be:
abcREF(0x7ff37182ef68)
0x7ff37182ef68
1
__END__
This commit corrects the logic that checks for cached qr magic, effec-
tively disabling the cache altogether.
A forthcoming commit will reënable it (if all goes as planned).
Father Chrysostomos [Sun, 24 Nov 2013 05:31:24 +0000 (21:31 -0800)]
->$#*
David Mitchell [Sun, 24 Nov 2013 19:58:26 +0000 (19:58 +0000)]
fix Gconvert 'ignoring return value' warnings
On some systems, Gconvert() is #defined to gcvt(), and on Linux,
that function has a mandatory return value, so you get lots of
'ignoring return value' warnings. So define a V_Gconvert()
macro that does Gconvert() in a void context. Ideally this macro
would be part of the original definition of Gconvert() in config.sh,
but since Gconvert() is only used in sv.c, it was easier
to just define V_Gconvert() locally there.
David Mitchell [Sun, 24 Nov 2013 19:44:41 +0000 (19:44 +0000)]
fix 'ignoring return value' compiler warnings
Various system functions like write() are marked with the
__warn_unused_result__ attribute, which causes an 'ignoring return value'
warning to be emitted, even if the function call result is cast to (void).
The generic solution seems to be
int rc = write(...);
PERL_UNUSED_VAR(rc);
David Mitchell [Sun, 24 Nov 2013 14:44:40 +0000 (14:44 +0000)]
XS::APItest: remove unused var
(see Message-ID: <20131121221646.GB21945@fysh.org>)
Yves Orton [Sun, 24 Nov 2013 16:57:33 +0000 (17:57 +0100)]
better diagnostics of RExC_seen in regcomp.c
Yves Orton [Sun, 24 Nov 2013 15:24:16 +0000 (16:24 +0100)]
Avoid pointer churn in study_chunk recursion bitmap allocation
Since we can only recurse into a given paren (or the entire pattern)
once, we know that the maximum recursion depth is the number of parens
in the pattern (plus one for "whole pattern"). This means we can
preallocate one large bitmap, and then use different chunks of it
for each level. That avoids SAVEFREEPV costs for each bitmap, which
are likely short anyway. (One could imagine an optimization where a
flag somewhere lets us use the RExC_study_chunk_recursed pointer
as a bitmap, so we don't have to allocate at all when we have fewer than
32 parens.)
This removes the "recursed" argument from study_chunk() and replaces
it with a "recursive_depth" argument which counts how deep we
are in the bitmap "stack".
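The preallocation scheme might look roughly like the following C sketch. All type and function names are hypothetical and the layout is an assumption; regcomp.c organises this differently. The sketch also assumes that entering a deeper level starts from a copy of the parent's bits.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure: one allocation sliced into per-depth bitmaps. */
struct recursed_stack {
    unsigned char *bits;            /* levels * bytes_per_level bytes */
    size_t         bytes_per_level; /* enough bits for every paren + 1 */
    size_t         levels;          /* maximum recursion depth */
};

static int recursed_init(struct recursed_stack *rs, size_t nparens)
{
    /* One bit per paren, plus one for the whole pattern. */
    rs->bytes_per_level = (nparens + 1 + 7) / 8;
    rs->levels = nparens + 1;
    rs->bits = calloc(rs->levels, rs->bytes_per_level);
    return rs->bits ? 0 : -1;
}

/* The bitmap belonging to a given depth is just an offset into the block. */
static unsigned char *recursed_level(struct recursed_stack *rs, size_t depth)
{
    return rs->bits + depth * rs->bytes_per_level;
}

/* Enter depth + 1, starting from a copy of the parent's bitmap.
 * Returns NULL if the preallocated depth would be exceeded. */
static unsigned char *recursed_push(struct recursed_stack *rs, size_t depth)
{
    unsigned char *child;

    if (depth + 1 >= rs->levels)
        return NULL;
    child = recursed_level(rs, depth + 1);
    memcpy(child, recursed_level(rs, depth), rs->bytes_per_level);
    return child;
}

static void bit_set(unsigned char *bm, size_t paren)
{
    bm[paren / 8] |= (unsigned char)(1u << (paren % 8));
}

static int bit_test(const unsigned char *bm, size_t paren)
{
    return (bm[paren / 8] >> (paren % 8)) & 1;
}
```

Leaving a level needs no cleanup in this layout: the caller simply goes back to using the parent's slice, and the child's slice is overwritten the next time that depth is entered.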
Father Chrysostomos [Sun, 24 Nov 2013 06:37:30 +0000 (22:37 -0800)]
regexec.c: Remove redundant S_regcp_restore call
made redundant by e4bfbed3.
Father Chrysostomos [Sun, 24 Nov 2013 06:36:26 +0000 (22:36 -0800)]
Test line number for (??{$str}) regexp warnings
If the (??{}) does not occur on the same line as the match operator
(e.g., if we have /foo$qr/ and $qr has (??{}) in it), we need to
make sure warnings that occur when the return value of (??{}) is
compiled use the same line number as the /.../, since conceptually
it is part of matching, not part of returning.
I tried breaking this and making the line number wrong, but all
tests passed, showing that it is untested. Most regcomp tests are in
t/re/reg_mesg.t, but that file does not have infrastructure for
checking line numbers.
Yves Orton [Sun, 24 Nov 2013 12:07:20 +0000 (13:07 +0100)]
First steps to resolving RT #120618, better fix for RT #120600
Commit 099ec7dcf9e085a650e6d9010c12ad9649209bf4 tried to fix RT #120600,
however it resulted in RT #120618:
[perl #120618] Bleadperl v5.19.6-15-g099ec7d breaks ABIGAIL/Regexp-Common-2013031301.tar.gz
Which includes breakage to:
MAUKE/Function-Parameters-1.0401.tar.gz
ABIGAIL/Regexp-Common-2013031301.tar.gz
DCONWAY/Regexp-Grammars-1.033.tar.gz
AMBS/Text/Text-RewriteRules-0.25.tar.gz
To put it bluntly, I didn't like the fix in 099ec7dcf9, and it doesn't
entirely surprise me that it broke extreme modules like these.
This is a much better fix, and includes better debug output for frames
and recursion.
Both of the old strategies were flawed. The new strategy is much more
sound. Every time we recurse we create a new recursed bitmap, if
necessary copying the existing bitmap (thus treating a NULL "recursed"
pointer as an all-zero bitmap). Instead of turning off the flag when
we exit, we simply "throw away" that bitmap, restoring the state of
the parent. Thus what occurred in a child does not contaminate the
parent.
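The copy-on-recurse, discard-on-exit strategy can be sketched with a toy model. Everything below is hypothetical and far simpler than study_chunk(): a pattern is reduced to a table of groups, each listing the groups it calls, and the bitmap is a single uint32_t passed by value (so at most 32 groups), which gives every recursion level its own private copy.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CALLS 4

/* Toy model: each group lists the groups it calls, terminated by -1. */
typedef struct {
    int calls[TOY_MAX_CALLS];
} toy_group;

/* Return 1 if following calls from group g re-enters a group that is
 * already on the current call path, 0 otherwise.  'recursed' is passed
 * by value: setting a bit changes only this level's copy, and returning
 * discards it, so sibling calls never see each other's bits. */
static int
toy_recurses_infinitely(const toy_group *groups, int g, uint32_t recursed)
{
    int i;

    if (recursed & (UINT32_C(1) << g))
        return 1;                      /* g is already on this path */
    recursed |= UINT32_C(1) << g;      /* private copy only */

    for (i = 0; i < TOY_MAX_CALLS && groups[g].calls[i] >= 0; i++) {
        if (toy_recurses_infinitely(groups, groups[g].calls[i], recursed))
            return 1;
    }
    return 0;                          /* caller's bitmap is untouched */
}
```

In this model a group that calls the same subpattern twice in sequence is not flagged, because the bit set during the first call is gone by the time the second call starts; a group that calls itself is still caught.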
All of this is a bit confusing as there are two levels of recursion
at work here. First there is recursion, and pseudo-recursion, in
study_chunk(), which is distinct from recursive patterns, even though
the implementation of recursive patterns uses pseudo-recursion in
study_chunk(). Anyway, to make the new bitmap pattern work I had to
extend the frame mechanism, and add diagnostics to it, which are
visible via -Mre=Debug,ALL.
I haven't tested that this fixes the modules, I just know that it
is conceptually a much better and cleaner fix.
James E Keenan [Sun, 24 Nov 2013 04:37:50 +0000 (05:37 +0100)]
Require Test::More v0.88 to use done_testing in Spec.t.
Reported in https://rt.cpan.org/Ticket/Display.html?id=87574.
Father Chrysostomos [Sun, 24 Nov 2013 00:39:38 +0000 (16:39 -0800)]
Increase $XS::Typemap::VERSION to 0.13
Father Chrysostomos [Sun, 24 Nov 2013 00:38:30 +0000 (16:38 -0800)]
Squash COWs in the char* typemap
Some XS modules expect to be able to modify strings passed in as char
pointers. Copy-on-write breaks that ability. So this commit makes
the T_PV typemap uncow mutable COWs when passing them.
const char* is now mapped to the new T_ROPV entry, to avoid unnecessarily
slowing it down.
Steffen Müller writes in <52912E9C.3030403@cpan.org> that the typemap
is not dual-lifed, so it is not necessary to make this 5.16-compatible.
However, I had already written the patch, and I think it is good
to keep it possible to drop this typemap into a CPAN distribution.
Any self-respecting C compiler should be able to optimise away the
extra SvIsCOW(t_pv_tmp_sv) == 1 check, so there is no slowdown as a
result of compatibility.
Father Chrysostomos [Sat, 23 Nov 2013 12:43:37 +0000 (04:43 -0800)]
Increase $PerlIO::via::VERSION to 0.14
Father Chrysostomos [Sat, 23 Nov 2013 04:34:17 +0000 (20:34 -0800)]
Push a new stack in sv_recode_to_utf8
That prevents this from happening under STRESS_REALLOC:
$ ./perl -Ilib -Mencoding=johab -e '(chr(0x7f) eq "\x7f")'
Use of the encoding pragma is deprecated at -e line 0.
perl(3939) malloc: *** error for object 0x7fac20c03968: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
(That is an excerpt from t/uni/chr.t.)
It may also fix other instances of encoding.pm crashing, which I had
almost vowed not to do. Oh well.
Constant folding is generally expected not to reallocate the stack,
because it should never need to extend it, just reduce the number of
items on it.
sv_recode_to_utf8 breaks that assumption.
Nicholas Clark [Fri, 4 Oct 2013 13:33:49 +0000 (15:33 +0200)]
No need to wrap calls to Perl_load_module() in ENTER/LEAVE
As of commit 53a7735b62aee146 (May 2007) Perl_vload_module() wraps its call
to Perl_utilize() with ENTER/LEAVE, so there's no longer a need for callers
of Perl_load_module() to also wrap with ENTER/LEAVE.
Nicholas Clark [Fri, 4 Oct 2013 13:15:56 +0000 (15:15 +0200)]
Perl_load_module() no longer moves the current stack, so no need to save it.
Nicholas Clark [Fri, 4 Oct 2013 12:54:00 +0000 (14:54 +0200)]
S_process_special_blocks() should use a new stack for BEGIN blocks.
This avoids the stack moving underneath anything that directly or indirectly
calls Perl_load_module().
[Committer’s note:
This fixes bug #119993.
Furthermore, under STRESS_REALLOC, t/io/layers.t was crashing like this:
$ ./perl -Ilib -e ' open(UTF, "<:raw:encoding(utf8)", 'tmp75851B') or die $!; ref #blahblahblahblahblahblahblahblahblah'
Segmentation fault: 11
(The comment seems to be necessary to make it crash.)
It was happening because open() was causing a module to be loaded
while the arguments to open() were still on the stack.
]
Nicholas Clark [Fri, 4 Oct 2013 11:28:58 +0000 (13:28 +0200)]
Remove redundant SPAGAIN & PUTBACK after PUSHSTACKi().
PUSHSTACKi() calls SWITCHSTACK(), which sets PL_stack_sp and sp like this:
sp = PL_stack_sp = PL_stack_base + AvFILLp(t)
Hence after PUSHSTACKi() both are identical, so use of SPAGAIN or PUTBACK
to assign one to the other is redundant.
The use of SPAGAIN in encoding.xs and via.xs was added with commit
24f59afc531955e5 (April 2002) which added the use of PUSHSTACKi(). It feels
like cargo-cult.
The use of PUTBACK in Perl_amagic_call() predates the introduction of nested
stacks and PUSHSTACKi() in commit e336de0d01f30cc4 (April 1998). It dates from
perl 5.000, but it's not clear that it was ever needed, as the code in
question looked like this, and nothing could have moved the stack between
the dSP and PUTBACK:
dSP;
BINOP myop;
SV* res;
Zero(&myop, 1, BINOP);
myop.op_last = (OP *) &myop;
myop.op_next = Nullop;
myop.op_flags = OPf_KNOW|OPf_STACKED;
ENTER;
SAVESPTR(op);
op = (OP *) &myop;
PUTBACK;
The PUTBACK and SPAGAIN in Perl_require_pv() were added by commit
d3acc0f7e5197310 (June 1998) which also added the PUSHSTACKi(). They have
both been redundant since they were added.
Father Chrysostomos [Thu, 21 Nov 2013 04:39:56 +0000 (20:39 -0800)]
Extend STRESS_REALLOC to move the stack with every EXTEND
This allows us easily to catch cases where the stack could move to a
new memory address while code still holds pointers to the old
location. Indeed, this causes test failures.
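To illustrate why moving on every extension flushes out such bugs, here is a toy stack (hypothetical names, unrelated to perl's actual EXTEND macro) that always relocates its storage when asked to grow. Any pointer into the old block becomes dangling immediately, rather than only on the rare occasions when a real realloc() happens to move the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stack that relocates on every extension (stress-test behaviour). */
typedef struct {
    int   *base;   /* current storage */
    size_t len;    /* items in use */
    size_t cap;    /* items allocated */
    size_t moves;  /* how many times the storage has been relocated */
} toy_stack;

/* Always allocate a fresh block before freeing the old one, so the new
 * address is guaranteed to differ from the old one.  Returns 0 on failure. */
static int
toy_extend(toy_stack *s, size_t extra)
{
    size_t new_cap = s->len + extra;
    int *fresh = malloc((new_cap ? new_cap : 1) * sizeof *fresh);

    if (!fresh)
        return 0;
    if (s->len)
        memcpy(fresh, s->base, s->len * sizeof *fresh);
    free(s->base);             /* pointers into the old block now dangle */
    s->base = fresh;
    s->cap  = new_cap;
    s->moves++;
    return 1;
}

/* Push one value, extending (and therefore moving) first. */
static int
toy_push(toy_stack *s, int v)
{
    if (!toy_extend(s, 1))
        return 0;
    s->base[s->len++] = v;
    return 1;
}
```

Code that caches s->base (or a pointer derived from it) across a toy_push() call is wrong in both the normal and the stress build; the stress build just makes it fail reliably, especially under a memory checker.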
Father Chrysostomos [Thu, 21 Nov 2013 04:38:27 +0000 (20:38 -0800)]
Get perl to build under STRESS_REALLOC once more
It had been broken since v5.17.6-144-ga3444cc.
That commit added, inter alia, this comment to scope.h:
+ * Of course, doing the size check *after* pushing means we must always
+ * ensure there are SS_MAXPUSH free slots on the savestack
But STRESS_REALLOC makes the initial savestack size just 1, and
SS_MAXPUSH is 4. So ‘./miniperl -e0’ failed an assertion.
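The invariant behind that comment can be modelled as follows. This is only a sketch under assumptions: toy_savestack, TOY_MAXPUSH and the choice to reserve the extra slots at initialisation are all hypothetical, and it does not claim to show how the commit actually repaired the build.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAXPUSH 4   /* most slots a single push operation may use */

typedef struct {
    long  *slots;
    size_t ix;    /* next free slot */
    size_t max;   /* allocated slots */
} toy_savestack;

/* Reserve TOY_MAXPUSH slots beyond the requested size, so the invariant
 * "at least TOY_MAXPUSH free slots" holds even for an initial size of 1. */
static int
toy_ss_init(toy_savestack *ss, size_t initial)
{
    ss->max   = initial + TOY_MAXPUSH;
    ss->ix    = 0;
    ss->slots = malloc(ss->max * sizeof *ss->slots);
    return ss->slots != NULL;
}

/* Push first, check the size afterwards.  This is only safe because the
 * invariant guarantees room for the push that has just happened. */
static int
toy_ss_push(toy_savestack *ss, const long *vals, size_t n)
{
    size_t i;

    assert(n <= TOY_MAXPUSH);
    assert(ss->ix + TOY_MAXPUSH <= ss->max);   /* the invariant */

    for (i = 0; i < n; i++)
        ss->slots[ss->ix++] = vals[i];

    if (ss->ix + TOY_MAXPUSH > ss->max) {      /* restore the invariant */
        size_t new_max = ss->max * 2;
        long *p = realloc(ss->slots, new_max * sizeof *p);
        if (!p)
            return 0;
        ss->slots = p;
        ss->max   = new_max;
    }
    return 1;
}
```

If toy_ss_init() allocated exactly the requested size, an initial size of 1 would trip the second assertion on the very first push, which mirrors the failure described above.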
François Perrad [Sat, 23 Nov 2013 00:49:11 +0000 (01:49 +0100)]
Skip dist/ExtUtils-Install/t/InstallWithMM.t when cross-compiling.
The toolchain is not installed on the target when cross-compiling.
So, this test must be skipped, see patch below.
(same problem as RT#119769 and RT#120398)
For: RT #120615
Chris 'BinGOs' Williams [Fri, 22 Nov 2013 14:37:33 +0000 (14:37 +0000)]
Module-CoreList on CPAN is 3.01
Father Chrysostomos [Thu, 21 Nov 2013 22:43:26 +0000 (14:43 -0800)]
Remove unused var from APItest.xs
This was added at the same time as the containing function in 27fcb6ee.
It was probably due to copy and paste.
Yves Orton [Fri, 22 Nov 2013 00:08:39 +0000 (01:08 +0100)]
Fix RT #120600: Variable length lookbehind is not variable
Inside of study_chunk() we have to guard against infinite
recursion with recursive subpatterns. The existing logic
sort of worked, but didn't address all cases properly.
qr/
(?<W>a)
(?<BB>
(?=(?&W))(?<=(?&W))
)
(?&BB)
/x;
The pattern in the test would fail when the optimizer
was expanding (?&BB). When it recursed, it created a bitmap
for the recursion it performed; it then jumped back to
the BB node and eventually did the first (?&W) call.
At this point the bit for (?&W) would be set in the bitmask.
When the recursion for the (?&W) exited (fake exit through
the study frame logic) the bit was not /unset/. When the parser
then entered the (?&W) again it was treated as a nested and
potentially infinite length pattern.
The fake-recursion in study_chunk made it a little less obvious
what was going on in the debug output.
By reorganizing the code and adding logic to unset the bitmap
when exiting this bug was fixed. Unfortunately this also revealed
another little issue with patterns like this:
qr/x|(?0)/
qr/(x|(?1))/
which forced the creation of a new bitmask for each branch.
Effectively study_chunk treats each branch as an independent
pattern, so when we are expanding (?1) via the 'x' branch
we don't want that to prevent us from detecting the infinite recursion
in the (?1) branch. If you were to think of trips through study_chunk
as paths, and [] as recursive processing you would get something like:
BRANCH 'x' END
BRANCH (?0) [ 'x' END ]
BRANCH (?0) [ (?0) [ 'x' END ] ]
...
When we want something like:
BRANCH 'x' END
BRANCH (?0) [ 'x' END ]
BRANCH (?0) [ (?0) INFINITE_RECURSION ]
So when we deal with a branch we need to make a new recursion bitmask.
David Mitchell [Thu, 21 Nov 2013 17:11:40 +0000 (17:11 +0000)]
APItest.xs: fix various compiler warnings
David Mitchell [Thu, 21 Nov 2013 17:10:04 +0000 (17:10 +0000)]
toLOWER_LC(), toUPPER_LC(): fix signedness
They are documented to return UV, but in one definition they return
tolower()/toupper(), which on Linux return a signed value. So
cast away the compiler warnings.
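The shape of such a cast, as a sketch only: toy_uv stands in for perl's UV and TOY_TOLOWER_LC is a hypothetical macro, not the handy.h definition.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for perl's UV (an unsigned integer type). */
typedef unsigned long toy_uv;

/* tolower() returns int; the explicit cast makes the macro's result
 * unsigned, matching its documented type, so callers comparing it with
 * unsigned values no longer mix signedness.  The argument is converted
 * to unsigned char because tolower() requires a value in that range. */
#define TOY_TOLOWER_LC(c) ((toy_uv) tolower((unsigned char)(c)))
```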
David Mitchell [Thu, 21 Nov 2013 16:21:38 +0000 (16:21 +0000)]
Storable: silence compiler 'unused func' warnings
Two static functions are only used within asserts, so only define them
if asserts are enabled
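The general technique looks like this. The sketch uses the standard NDEBUG convention and invented function names; Storable's own assertion switch may well be a different macro.

```c
#include <assert.h>
#include <stddef.h>

#ifndef NDEBUG
/* Only referenced from assert(), so compile it only when assert() is
 * active; otherwise the compiler warns about an unused static function. */
static int
toy_is_sorted(const int *a, size_t n)
{
    size_t i;
    for (i = 1; i < n; i++) {
        if (a[i - 1] > a[i])
            return 0;
    }
    return 1;
}
#endif

/* Return the smallest element of an array the caller promises is sorted. */
static int
toy_first_of_sorted(const int *a, size_t n)
{
    assert(n > 0);
    assert(toy_is_sorted(a, n));   /* expands to nothing under NDEBUG */
    return a[0];
}
```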
David Mitchell [Thu, 21 Nov 2013 16:15:24 +0000 (16:15 +0000)]
Storable: silence some unused var warnings