Brian Fraser [Fri, 30 Sep 2011 20:26:26 +0000 (13:26 -0700)]
op.c: Scalar filehandles in errors UTF8 cleanup.
Father Chrysostomos [Fri, 30 Sep 2011 19:15:09 +0000 (12:15 -0700)]
Modify S_pending_ident to use sv_catpvn_flags
with the new SV_CAT* constants, since that’s faster than creating an
SV to pass to sv_catsv.
Brian Fraser [Thu, 6 Oct 2011 05:16:32 +0000 (22:16 -0700)]
TODO tests for parsing our() now pass
Father Chrysostomos [Fri, 30 Sep 2011 19:10:51 +0000 (12:10 -0700)]
Oust cv_ckproto_len
It is no longer used in core (having been superseded by
cv_ckproto_len_flags), is unused on CPAN, and is not part of the API.
The cv_ckproto ‘public’ macro is modified to use the _flags version.
I put ‘public’ in quotes because, even before this commit, cv_ckproto
was using a non-exported function, and hence could never have worked
on a strict linker (or whatever you call it).
Brian Fraser [Fri, 30 Sep 2011 13:25:45 +0000 (06:25 -0700)]
toke.c, op.c, sv.c: Prototype parsing and checking are nul-and-UTF8 clean.
This means that eval "sub foo ($;\0whoops) { say @_ }" will correctly
include \0whoops in the CV's prototype (while complaining about illegal
characters), and that
use utf8;
BEGIN { $::{"foo"} = "\$\0L\351on" }
BEGIN { eval "sub foo (\$\0L\x{c3}\x{a9}on) {};"; }
will not warn about a mismatched prototype.
Brian Fraser [Mon, 11 Jul 2011 17:50:10 +0000 (18:50 +0100)]
gv.c, op.c, pp.c: Stash-injected prototypes and prototype() are UTF-8 clean.
This makes perl -E '$::{example} = "\x{30cb}"; say prototype example;'
store and fetch the correctly flagged prototype.
With this, all TODO tests in gv.t pass; the next commit will deal
with making the parsing of prototypes nul-clean.
Brian Fraser [Mon, 11 Jul 2011 17:17:32 +0000 (18:17 +0100)]
pp.c: Got pp_gelem nul-clean.
Brian Fraser [Sun, 10 Jul 2011 11:20:26 +0000 (08:20 -0300)]
toke.c: Some simple mending to get readline() working with UTF-8 filehandles
Brian Fraser [Sun, 10 Jul 2011 11:18:57 +0000 (08:18 -0300)]
pp_sys.c: pp_select UTF8 cleanup.
Brian Fraser [Thu, 7 Jul 2011 12:25:18 +0000 (09:25 -0300)]
op.c: Malformed prototype warning on UTF8 sub name
Father Chrysostomos [Thu, 6 Oct 2011 05:13:35 +0000 (22:13 -0700)]
gv.c: Use name_end to avoid compiler warning
In this code path, name_cursor could be uninitialised if
gv_fetchpvn_flags is called with GV_NOTQUAL|GV_ADDWARN. Whenever it
is initialised, it is the same as name_end by the time this part
of the function is reached.
Father Chrysostomos [Thu, 6 Oct 2011 05:07:07 +0000 (22:07 -0700)]
globvar.t: Skip PL_warn_uninit_sv
Until someone can explain to me why these sorts of things are exported,
I’ll be skipping the test. Nothing is failing for me (yet), and it is
not clear that we want to support this name for ever.
Father Chrysostomos [Thu, 6 Oct 2011 04:45:05 +0000 (21:45 -0700)]
Fix diag.t failure with diag_listed_as comment
Brian Fraser [Thu, 6 Oct 2011 03:45:21 +0000 (20:45 -0700)]
Several TODO tests that now pass.
Brian Fraser [Thu, 7 Jul 2011 09:09:12 +0000 (06:09 -0300)]
util.c UTF8 cleanup
Brian Fraser [Thu, 7 Jul 2011 09:01:33 +0000 (06:01 -0300)]
More warnings tests.
Brian Fraser [Thu, 7 Jul 2011 08:57:57 +0000 (05:57 -0300)]
universal.c: VERSION UTF8 cleanup
Brian Fraser [Thu, 7 Jul 2011 08:43:43 +0000 (05:43 -0300)]
universal.c: Make croak_xs_usage account for UTF8
Brian Fraser [Thu, 7 Jul 2011 08:36:34 +0000 (05:36 -0300)]
"Use of uninitialized value..." UTF8 cleanup
Brian Fraser [Thu, 6 Oct 2011 03:42:53 +0000 (20:42 -0700)]
gv.c: Make more warnings utf8-clean
Brian Fraser [Thu, 7 Jul 2011 07:35:35 +0000 (04:35 -0300)]
mro.(c|xs): Make warnings utf8-clean
Brian Fraser [Fri, 22 Jul 2011 13:05:11 +0000 (10:05 -0300)]
t/uni/gv.t, stringify is clean, remove the TODO
Brian Fraser [Fri, 30 Sep 2011 01:27:10 +0000 (18:27 -0700)]
Tests for DATA handle in UTF8 packages
Father Chrysostomos [Fri, 30 Sep 2011 01:23:27 +0000 (18:23 -0700)]
toke.c: Take utf8 into account when creating DATA handle
This is based on work from Brian Fraser, but differs from his original
in that it does not require an intermediate SV.
Brian Fraser [Fri, 22 Jul 2011 13:10:48 +0000 (10:10 -0300)]
Tests for UTF-8 stashes.
Brian Fraser [Fri, 22 Jul 2011 13:10:34 +0000 (10:10 -0300)]
Tests for package; declarations in UTF-8
Brian Fraser [Fri, 22 Jul 2011 13:10:57 +0000 (10:10 -0300)]
More tests for t/uni/method.t
Brian Fraser [Thu, 6 Oct 2011 00:57:20 +0000 (17:57 -0700)]
sv.c: Make most warnings utf8-clean
Brian Fraser [Thu, 29 Sep 2011 21:46:35 +0000 (14:46 -0700)]
sv.c: Make cloning account for UTF8 stash names
Brian Fraser [Thu, 29 Sep 2011 21:44:55 +0000 (14:44 -0700)]
Make sv.c:sv_clear account for UTF8 keys in PL_stashcache
Brian Fraser [Wed, 6 Jul 2011 17:44:11 +0000 (14:44 -0300)]
sv.c: Pass in UNI_DISPLAY_ISPRINT in S_not_a_number
Brian Fraser [Thu, 29 Sep 2011 21:39:35 +0000 (14:39 -0700)]
pp_sys.c: Make warnings utf8-clean
Brian Fraser [Wed, 6 Jul 2011 16:45:07 +0000 (13:45 -0300)]
pp_hot.c: Make warnings utf8-clean
Father Chrysostomos [Wed, 5 Oct 2011 20:33:36 +0000 (13:33 -0700)]
Teach porting/diag.t about SVf32 and SVf256
Brian Fraser [Wed, 6 Jul 2011 16:08:37 +0000 (13:08 -0300)]
pp.c: Make warnings utf8-clean
Brian Fraser [Wed, 6 Jul 2011 15:50:59 +0000 (12:50 -0300)]
Make op.c warnings UTF8-clean
Brian Fraser [Wed, 5 Oct 2011 19:48:07 +0000 (12:48 -0700)]
Make gv.c and pp_ctl.c warnings utf8-clean
Brian Fraser [Wed, 28 Sep 2011 03:33:02 +0000 (20:33 -0700)]
doio.c: Make warnings UTF8- and nul-clean
Brian Fraser [Sat, 23 Jul 2011 21:48:51 +0000 (18:48 -0300)]
util.c for threads: stashpv_hvname_match UTF8 cleanup.
Brian Fraser [Tue, 4 Oct 2011 21:53:12 +0000 (14:53 -0700)]
Tests for DOES/isa/can with UTF8 and embedded nuls
Father Chrysostomos [Thu, 6 Oct 2011 07:02:36 +0000 (00:02 -0700)]
Document sv_does_pvn
Father Chrysostomos [Fri, 30 Sep 2011 20:44:22 +0000 (13:44 -0700)]
Correct name of sv_does_sv apidoc entry
plus other tweaks
Brian Fraser [Fri, 30 Sep 2011 20:42:31 +0000 (13:42 -0700)]
universal.c: sv_does() UTF8 cleanup.
This adds _sv, _pv, and _pvn forms to sv_does, and changes it to use
sv_ref() instead of sv_reftype().
Father Chrysostomos [Thu, 29 Sep 2011 15:48:38 +0000 (08:48 -0700)]
mro.c: Correct utf8 and bytes concatenation
The previous commit introduced some code that concatenates a pv on to
an sv and then does SvUTF8_on on the sv if the pv was utf8.
That can’t work if the sv was in Latin-1 (or single-byte) encoding
and contained extra-ASCII characters. Nor can it work if bytes are
appended to a utf8 sv. Both produce mangled utf8.
There is apparently no function apart from sv_catsv that handles
this. So I’ve modified sv_catpvn_flags to handle this if passed the
SV_CATUTF8 (concatenating a utf8 pv) or SV_CATBYTES (concatenating a
byte pv) flag.
This avoids the overhead of creating a new sv (in fact, sv_catsv
even copies its rhs in some cases, so that would mean creating two
new svs). It might even be worthwhile to redefine sv_catsv in terms
of this....
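
To illustrate why simply turning on the UTF8 flag after a byte-wise append mangles the string, here is a minimal standalone sketch of the upgrade step that appending Latin-1 bytes to a UTF-8 buffer requires. It is not the sv_catpvn_flags implementation: append_latin1_as_utf8 is a hypothetical helper, and it assumes an ASCII (non-EBCDIC) platform and a caller-provided buffer with room for two output bytes per input byte.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the perl API.
 * Appends `len` Latin-1 bytes from `src` to the UTF-8 buffer `dst`,
 * starting at offset `dst_len`, and returns the new length.
 * The caller must guarantee room for dst_len + 2 * len bytes. */
static size_t append_latin1_as_utf8(unsigned char *dst, size_t dst_len,
                                    const unsigned char *src, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII is encoded identically in Latin-1 and UTF-8. */
            dst[dst_len++] = c;
        } else {
            /* 0x80..0xFF becomes a two-byte UTF-8 sequence. */
            dst[dst_len++] = (unsigned char)(0xC0 | (c >> 6));
            dst[dst_len++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return dst_len;
}
```

Appending the four bytes "L\xE9on" this way yields five bytes, with 0xE9 expanded to 0xC3 0xA9. A plain byte copy followed by flagging the result as UTF-8 would instead leave a lone 0xE9, which is malformed.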
Brian Fraser [Wed, 6 Jul 2011 13:41:10 +0000 (10:41 -0300)]
mro UTF8 cleanup.
This patch also duplicates existing mro tests with copies that use
Unicode in identifiers, to test the mro code.
Since those tests trigger it, it also fixes a bug in the parsing
of *{...}: If the first character inside the braces is a non-ASCII
Unicode identifier character, the inside is now implicitly quoted
if it is just an identifier (just as it is with ASCII identifiers),
instead of being parsed as a bareword that would violate strict subs.
Brian Fraser [Wed, 6 Jul 2011 11:54:11 +0000 (08:54 -0300)]
universal.c: ->can UTF8 cleanup.
Brian Fraser [Tue, 27 Sep 2011 00:35:50 +0000 (17:35 -0700)]
universal.c: ->isa, sv_derived_from UTF8 cleanup.
This makes them both nul-and-UTF8 clean, although the latter
is somewhat superficial, as mro isn't clean yet.
(Tests coming once ->can and ->DOES are clean)
Brian Fraser [Wed, 6 Jul 2011 10:57:20 +0000 (07:57 -0300)]
pp_sys.c: pp_tie and untie UTF8 cleanup.
Brian Fraser [Tue, 27 Sep 2011 00:24:44 +0000 (17:24 -0700)]
pp.c: pp_substr for UTF-8 globs.
Since typeglobs may have the UTF8 flag set now, we need to avoid
testing SvCUR on a potential glob, as that would trip an assertion.
Brian Fraser [Mon, 26 Sep 2011 22:32:45 +0000 (15:32 -0700)]
pp_ctl.c: pp_caller UTF8 cleanup.
Brian Fraser [Mon, 26 Sep 2011 20:48:52 +0000 (13:48 -0700)]
sv.c: S_anonymise_cv_maybe UTF8 cleanup.
Brian Fraser [Mon, 26 Sep 2011 19:56:47 +0000 (12:56 -0700)]
pp.c & sv.c: pp_ref UTF8 and nul cleanup.
This adds a new function to sv.c, sv_ref, which is a nul-and-UTF8
clean version of sv_reftype. pp_ref now uses that.
sv_ref() not only returns the SV, but also takes in an SV
to modify, so we can say both sv_ref(TARG, obj, TRUE); and
sv = sv_ref(NULL, obj, TRUE);
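
The calling convention described above (fill in a destination if one is given, otherwise create one, and return it either way) can be sketched in standalone C. This is only an illustration of the pattern, not sv_ref itself: struct ex_str and ex_set_name are hypothetical stand-ins, and a caller-supplied destination is assumed to have its buf field set to NULL or to heap memory.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-counted string standing in for an SV. */
struct ex_str {
    char  *buf;
    size_t len;
};

/* Copies `len` bytes of `name` (which may contain embedded nuls)
 * into `dst`. If `dst` is NULL a new ex_str is allocated. The
 * filled-in string is returned in both cases, or NULL if an
 * allocation fails. */
static struct ex_str *ex_set_name(struct ex_str *dst,
                                  const char *name, size_t len)
{
    struct ex_str *out = dst;
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, name, len);
    copy[len] = '\0';

    if (out == NULL) {
        out = malloc(sizeof *out);
        if (out == NULL) {
            free(copy);
            return NULL;
        }
        out->buf = NULL;
    }
    free(out->buf);          /* release any previous contents */
    out->buf = copy;
    out->len = len;
    return out;
}
```

Because the length is passed explicitly, a name such as "Foo\0Bar" survives intact, which is the nul-clean property the commit is after.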
Brian Fraser [Thu, 6 Oct 2011 06:56:03 +0000 (23:56 -0700)]
Add a sv_sethek() function to sv.c
This is exported so that attributes.xs can use it.
Brian Fraser [Wed, 6 Jul 2011 09:16:30 +0000 (06:16 -0300)]
pp.c: pp_bless UTF8 cleanup.
Some tests in t/uni/bless.t are TODO, as ref() isn't
clean yet.
Brian Fraser [Mon, 26 Sep 2011 16:21:23 +0000 (09:21 -0700)]
op.c: Flag named methods if they are in UTF-8.
Brian Fraser [Mon, 26 Sep 2011 15:27:59 +0000 (08:27 -0700)]
pp_hot.c: method_common is UTF-8 aware.
Not really useful yet, since named methods aren't correctly
flagged; that is to access a \x{30cb} method, you'd need
to do something like Obj->${\"\x{30cb}"}.
Committer’s note: I’m also including one piece of the ‘gv.c and
pp_ctl.c warnings’ patch so that the newly-added tests in this
commit pass.
Brian Fraser [Tue, 4 Oct 2011 01:16:03 +0000 (18:16 -0700)]
gv.c: gv_fetchmethod_(flags|autoload) UTF8 cleanup.
Brian Fraser [Mon, 26 Sep 2011 05:32:52 +0000 (22:32 -0700)]
gv.c: S_gv_get_super_pkg UTF8 cleanup.
Brian Fraser [Wed, 6 Jul 2011 07:31:08 +0000 (04:31 -0300)]
gv.c: gv_fetchmeth_pvn_autoload UTF8 cleanup.
As with the previous commit, no Perl-level visible changes.
Brian Fraser [Mon, 26 Sep 2011 05:15:55 +0000 (22:15 -0700)]
gv.c: gv_fetchmeth_pvn UTF8 cleanup.
Since gv_fetchmeth_pvn is primarily used from within gv.c,
and not much of anything is passing in the flag yet, this has
no visible changes on the Perl level; so tests remain
entirely in XS::APItest for the time being.
Brian Fraser [Wed, 6 Jul 2011 06:03:15 +0000 (03:03 -0300)]
gv.c: gv_init_pvn now uses newCONSTSUB_flags.
Brian Fraser [Fri, 22 Jul 2011 12:52:28 +0000 (09:52 -0300)]
pp.c: Make pp_rv2cv use gv_autoload_pvn()
Brian Fraser [Fri, 22 Jul 2011 12:51:52 +0000 (09:51 -0300)]
pp_hot.c: pp_entersub UTF8 cleanup.
Brian Fraser [Fri, 22 Jul 2011 12:51:03 +0000 (09:51 -0300)]
pp_ctl.c: pp_goto UTF8 cleanup.
Brian Fraser [Fri, 22 Jul 2011 12:49:51 +0000 (09:49 -0300)]
gv.c: gv_autoload4 is now UTF-8 clean.
This also uncomments the UTF-8 tests in XS::APItest.
Brian Fraser [Wed, 6 Jul 2011 05:36:37 +0000 (02:36 -0300)]
gv.c: gp_free UTF8 cleanup
Brian Fraser [Wed, 6 Jul 2011 05:20:04 +0000 (02:20 -0300)]
Tests for UTF-8 GVs.
Basically t/op/gv.t with UTF-8 names. A vast majority of
the tests currently fail and are marked as TODO; except for
failures related to prototypes, these will start working
in the following commits.
Brian Fraser [Wed, 6 Jul 2011 04:50:31 +0000 (01:50 -0300)]
op.c: newCONSTSUB and newXS UTF8 cleanup.
newXS was merged into newXS_flags; added a line in the docs
recommending using that instead.
newCONSTSUB got a _flags version, which generates the CV in
the right glob if passed the UTF-8 flag.
Brian Fraser [Wed, 6 Jul 2011 04:43:51 +0000 (01:43 -0300)]
sv.c: glob_assign_glob is now UTF-8 aware.
This means that
is($t = sub { *\x{30cb} }->(), "*main::\x{30cb}");
won't fail, as $t will get the right glob.
(Though possibly not the right stash, if that also has
UTF-8 in it. That will be done later.)
Brian Fraser [Tue, 5 Jul 2011 22:24:41 +0000 (19:24 -0300)]
Basic tests for UTF-8 vars.
Brian Fraser [Sat, 23 Jul 2011 21:34:05 +0000 (18:34 -0300)]
toke.c: S_scan_inputsymbol, initial GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 21:32:19 +0000 (18:32 -0300)]
toke.c: S_checkcomma, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 21:26:51 +0000 (18:26 -0300)]
toke.c: yylex, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 21:09:03 +0000 (18:09 -0300)]
toke.c: S_find_in_my_stash, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 20:29:44 +0000 (17:29 -0300)]
toke.c: S_intuit_method, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 20:16:15 +0000 (17:16 -0300)]
toke.c: S_intuit_more, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 20:03:18 +0000 (17:03 -0300)]
toke.c: S_force_ident, GV-related UTF8 cleanup
Brian Fraser [Sat, 23 Jul 2011 19:54:00 +0000 (16:54 -0300)]
pp.c: pp_rv2gv UTF8 cleanup.
Father Chrysostomos [Sun, 2 Oct 2011 20:57:19 +0000 (13:57 -0700)]
Merge multi and flags params to gv_init_*
Since multi is a boolean (even though it’s typed as an int), there is
no need to have a separate parameter. We can just use a flag bit.
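
As a rough sketch of the idea, a boolean argument can be folded into an existing flags word like this. The names and bit values below (EX_UTF8, EX_MULTI, struct ex_glob, ex_init) are made up for illustration and are not the real GV_* constants or the gv_init_* signatures.

```c
/* Hypothetical flag bits, not the real GV_* values. */
#define EX_UTF8   0x01u
#define EX_MULTI  0x02u

/* Hypothetical stand-in for the state a glob initialiser records. */
struct ex_glob {
    int is_utf8;
    int is_multi;
};

/* One flags word replaces a separate (flags, multi) pair: each
 * boolean is recovered by testing its bit. */
static void ex_init(struct ex_glob *g, unsigned flags)
{
    g->is_utf8  = (flags & EX_UTF8)  != 0;
    g->is_multi = (flags & EX_MULTI) != 0;
}
```

A caller that used to pass a true multi argument would pass EX_MULTI ORed into flags, which keeps the parameter list from growing as more boolean options are added.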
Brian Fraser [Sat, 24 Sep 2011 18:57:27 +0000 (11:57 -0700)]
gv.c: Initial gv_fetchpvn_flags and gv_stashpvn UTF8 cleanup
Now that a glob can be initialized and fetched in UTF-8,
the next commit will introduce some changes in toke.c to
actually test this.
Committer’s note: To keep tests passing I had to incorporate
the toke.c:S_pending_ident changes in the same patch.
Father Chrysostomos [Sun, 25 Sep 2011 00:58:16 +0000 (17:58 -0700)]
constant.pm: Disable the UTF8 downgrade when unnecessary
The downgrade bug that constant.pm has to imitate is about to be fixed
in the next commit. The bug workaround is itself a bug if the bug it
is trying to work around is not present.
Father Chrysostomos [Sat, 24 Sep 2011 13:29:10 +0000 (06:29 -0700)]
Fix thinko in hek_eq_pvn_flags
Doing memEQ(str1, str2, len2) without checking the length
first will cause memEQ("forth","fort"...) to compare equal and
memEQ("fort","forth"...) to read unallocated memory.
This was only a potential future problem, as none of the callers reach
this branch.
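
A minimal standalone sketch of the length-checked comparison follows. It is not the hek_eq_pvn_flags code: bytes_eq is a hypothetical helper, and memcmp stands in for perl's memEQ macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compares two length-counted byte strings.
 * The lengths are checked first, so memcmp never reads past the
 * end of the shorter buffer and a mere prefix never compares equal. */
static int bytes_eq(const char *a, size_t alen,
                    const char *b, size_t blen)
{
    if (alen != blen)
        return 0;
    return memcmp(a, b, alen) == 0;
}
```

With the check in place, bytes_eq("forth", 5, "fort", 4) and bytes_eq("fort", 4, "forth", 5) both return 0 without touching memory beyond either string.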
Brian Fraser [Sat, 23 Jul 2011 19:51:54 +0000 (16:51 -0300)]
hv.c: Stash-related UTF-8 cleanup.
This adds a new static function to hv.c, hek_eq_pvn_flags,
which replaces several memEQs.
It also cleans up hv_name_set and has the relevant calls
to hv_common and friends made UTF-8 aware.
Finally, it changes share_hek() to modify the hash passed
in if the pv was modified when downgrading.
Brian Fraser [Tue, 5 Jul 2011 10:00:02 +0000 (07:00 -0300)]
gv.c: gv_name_set and gv_init_(etc) now initialize the GV's name as UTF-8 if passed the UTF8 flag.
newCONSTSUB is still unclean however, so constant subs are
still generated under a wrong name.
gv_fullname4 is also UTF-8 aware now; while that should've gotten
its own commit and tests, it's not possible to test the
UTF-8 part without the gv_init changes, and it's not possible
to test the gv_init changes without gv_fullname4.
Chicken and egg, as it were. So let's compromise and
wait for the relevant tests once globs can be initialized as
UTF-8 from the Perl level without XS magic.
Brian Fraser [Mon, 18 Jul 2011 16:36:09 +0000 (17:36 +0100)]
SvUTF8() for globs.
This turns on the GV's UTF8 flag in sv.c when the GV is stringified.
This works the same way overloading and references work, in that the
SvUTF8 flag is only valid immediately after SvPV.
For Nick's much more detailed explanation, see
http://www.nntp.perl.org/group/perl.perl5.porters/2011/07/msg174703.html
Father Chrysostomos [Sat, 24 Sep 2011 12:40:41 +0000 (05:40 -0700)]
Restore newGVgen to perlapi.pod
Brian Fraser [Sun, 2 Oct 2011 05:14:50 +0000 (22:14 -0700)]
gv.c: newGVgen_flags and a flags parameter for gv_get_super_pkg.
Father Chrysostomos [Sun, 2 Oct 2011 05:14:19 +0000 (22:14 -0700)]
Remove method param from gv_autoload_*
method is a boolean flag (typed I32, but used as a boolean) added by
commit 54310121b442.
These new gv_autoload_* functions have a flags parameter, so there’s
no reason for this extra effective bool. We can just use a flag bit.
Father Chrysostomos [Sun, 2 Oct 2011 05:13:26 +0000 (22:13 -0700)]
Remove 4 from new gv_autoload4_(sv|pvn?) functions
The 4 was added in commit 54310121b442 (inseparable changes during
5.003/4 development), presumably the ‘Don't look up &AUTOLOAD in @ISA
when calling plain function’ part.
Before that, gv_autoload had three arguments, so the 4 indicated the
new version (with the method argument).
Since these new functions don’t all have four arguments, and since
they have a new naming convention, there is no reason for the 4.
Father Chrysostomos [Sat, 24 Sep 2011 03:43:32 +0000 (20:43 -0700)]
Restore gv_autoload4 to perlapi.pod
Even if it’s not documented (which I hope to rectify), it should
still continue to be listed in perlapi.
Brian Fraser [Sun, 2 Oct 2011 05:12:18 +0000 (22:12 -0700)]
gv.c: Added gv_autoload4_(sv|pv|pvn)
Brian Fraser [Sun, 2 Oct 2011 05:11:42 +0000 (22:11 -0700)]
gv.c: Make Gv_AMupdate use gv_fetchmethod_sv_flags
Brian Fraser [Tue, 5 Jul 2011 07:37:42 +0000 (04:37 -0300)]
gv.c: Added gv_fetchmethod_(sv|pv|pvn)_flags.
In addition to taking a flags parameter, it also takes the
length of the method; This will eventually make method
lookup nul-clean.
Father Chrysostomos [Fri, 23 Sep 2011 03:47:39 +0000 (20:47 -0700)]
Minor correction to gv_fetchmeth_autoload.t
It was not doing the sanity check for all three functions.
Father Chrysostomos [Fri, 23 Sep 2011 03:42:33 +0000 (20:42 -0700)]
Restore gv_fetchmeth_autoload to perlapi.pod
Brian Fraser [Tue, 5 Jul 2011 06:26:09 +0000 (03:26 -0300)]
gv.c: Added gv_fetchmeth_(sv|pv|pvn)_autoload.
Father Chrysostomos [Wed, 21 Sep 2011 00:37:38 +0000 (17:37 -0700)]
Remove comment from hv.c that no longer applies
Father Chrysostomos [Tue, 20 Sep 2011 22:39:32 +0000 (15:39 -0700)]
Document and apiify hv name length/utf8 macros
Father Chrysostomos [Sat, 24 Sep 2011 17:17:58 +0000 (10:17 -0700)]
Remove some _get variants of *NAMEUTF8 macros in [gh]v.h
For macros that return flags, the _get convention implies that there
could be a _set variant some day. But we don’t do that for flags.
Father Chrysostomos [Tue, 20 Sep 2011 22:13:59 +0000 (15:13 -0700)]
Restore gv_fetchmeth to perlapi