Tony Cook [Mon, 14 Nov 2011 08:48:54 +0000 (19:48 +1100)]
Internals::SvREFCNT() now treats reference counts as unsigned
Previously, a large reference count (one that is negative when read as a
32-bit signed value) was returned as a positive number on 64-bit builds
and as a negative number on 32-bit builds.
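A minimal C sketch of the inconsistency, not the actual Internals::SvREFCNT code: the same 32-bit pattern is read as signed or as unsigned before being widened. The helper names are hypothetical, and the signed case relies on the usual two's-complement conversion.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Read the stored 32-bit count as signed, then widen: a count with
 * the top bit set comes back negative (on the usual two's-complement
 * platforms). */
static int64_t refcnt_as_signed(uint32_t raw)
{
    return (int64_t)(int32_t)raw;
}

/* Read the stored count as unsigned, then widen: the value stays
 * positive regardless of the build's word size. */
static int64_t refcnt_as_unsigned(uint32_t raw)
{
    return (int64_t)raw;
}
```

Treating the count as unsigned everywhere gives the same answer on 32-bit and 64-bit builds.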
Tony Cook [Mon, 14 Nov 2011 08:30:17 +0000 (19:30 +1100)]
[rt #103222] make Internals::SvREFCNT set/get consistent
Tony Cook [Mon, 14 Nov 2011 07:50:52 +0000 (18:50 +1100)]
match the definition of S_mro_gather_and_rename to its declaration
Based on the warning from:
http://www.nntp.perl.org/group/perl.daily-build.reports/2011/11/msg108741.html
which I haven't been able to reproduce with any other compiler.
Nicholas Clark [Sun, 13 Nov 2011 18:07:09 +0000 (19:07 +0100)]
In Perl_lex_start(), don't read the byte before SvPVX().
If len is 0, we shouldn't be reading from len - 1, as it's one before the
start of the buffer, and hence an out of bounds read.
Fixes a bug inadvertently added by commit 0abcdfa4c5da571f, restoring the
previous behaviour for the len == 0 case.
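The guard can be sketched as follows. This is not the code in Perl_lex_start(); the helper name and the choice of ';' as the byte being tested are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: does the buffer end with ';'?
 * Testing len first means buf[len - 1] is never evaluated for an
 * empty buffer, where it would read one byte before the start. */
static int ends_with_semicolon(const char *buf, size_t len)
{
    return len > 0 && buf[len - 1] == ';';
}
```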
Craig A. Berry [Sun, 13 Nov 2011 13:21:05 +0000 (07:21 -0600)]
Update string copying in vms/vms.c
In several places strncpy was being used to copy a string of known
length with null handling done separately; those cases have been
converted to use memcpy. Most uses of strncpy and strcpy have
been converted to my_strlcpy and most uses of strcat have been
converted to my_strlcat.
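A rough sketch of the two copying patterns, not the vms/vms.c code itself. bounded_copy() is a hypothetical stand-in with strlcpy-like semantics; the real my_strlcpy may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Copy exactly len bytes of a string whose length is already known,
 * then terminate it. dst must have room for len + 1 bytes. */
static void copy_known_length(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Hypothetical strlcpy-style helper: copy at most size - 1 bytes,
 * always NUL-terminate when size > 0, and return strlen(src) so the
 * caller can detect truncation (return value >= size). */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Unlike strncpy, the bounded copy always terminates the destination and does not pad the rest of the buffer with NULs.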
Chris 'BinGOs' Williams [Sat, 12 Nov 2011 23:26:25 +0000 (23:26 +0000)]
Update CGI to CPAN version 3.58
[DELTA]
Version 3.58 Nov 11th, 2011
[DOCUMENTATION]
- Clarify that using query_string() only has defined behavior when using the GET method. (RT#60813)
Karl Williamson [Sat, 12 Nov 2011 17:42:23 +0000 (10:42 -0700)]
mktables: nits in comment and pod
Karl Williamson [Sat, 12 Nov 2011 17:34:43 +0000 (10:34 -0700)]
mktables: Remove support for deprecated properties
As proposed and agreed some months ago, certain Unicode properties that
have been deprecated since 5.12, and that Perl should never have exposed,
are removed.
These are Unicode-internal properties that are proper subsets of the
properties that should be used instead; and are used by Unicode for
stability reasons in constructing those supersets.
Karl Williamson [Sat, 12 Nov 2011 16:01:00 +0000 (09:01 -0700)]
perlunicode: Update reference to Unicode standard
Karl Williamson [Sat, 12 Nov 2011 15:46:41 +0000 (08:46 -0700)]
utf8.c: typo in comment
Karl Williamson [Sat, 12 Nov 2011 15:45:40 +0000 (08:45 -0700)]
pp.c: Make sure variable is initialized
A compiler generated a warning about this. It is the degenerate case
with an empty input, so it isn't really a problem, but this silences the
warning.
Karl Williamson [Fri, 11 Nov 2011 20:08:18 +0000 (13:08 -0700)]
Add new mktables generated files to makefiles
This step was omitted from commit 0765b2b82b120f4e57571368932aa4863aba62f6.
Tony Cook [Sat, 12 Nov 2011 00:55:24 +0000 (11:55 +1100)]
ignore some newer build generated files
Chris 'BinGOs' Williams [Fri, 11 Nov 2011 20:21:41 +0000 (20:21 +0000)]
Update CPANPLUS to CPAN version 0.9112
[DELTA]
Changes for 0.9112 Fri Nov 11 11:10:59 2011
================================================
* The 'perlwrapper' is no longer required.
Karl Williamson [Fri, 11 Nov 2011 20:05:19 +0000 (13:05 -0700)]
regexec.c: Bypass unneeded step
We don't have to convert from utf8 to code point to fold; instead we can
call the function that starts from utf8.
Karl Williamson [Fri, 11 Nov 2011 20:03:58 +0000 (13:03 -0700)]
regcomp.c: Bypass unneeded step
We don't have to convert from utf8 to code point to fold; instead we can
call the function that starts from utf8.
Karl Williamson [Fri, 11 Nov 2011 20:33:12 +0000 (13:33 -0700)]
Use new/revised case-changing functions in pp.c
The intelligence that formerly was only in the functions in pp.c has in
previous commits migrated downward to the base level functions in
utf8.c. Most of this was enabled by removing user-defined case
changing. It makes the code smaller and more maintainable not to have to
repeat the intelligence needed to handle the special cases, at the cost
of one or two function call overheads, which could be eliminated by
converting to in-line functions.
Some of this knowledge of special case changing is retained in
converting to titlecase, simply because I didn't see a way to eliminate
it, and in uppercase, because I felt it was worth it to keep the tight
loop.
Karl Williamson [Fri, 11 Nov 2011 20:01:35 +0000 (13:01 -0700)]
utf8.c: Skip extra function calls
The function to_uni_fold() works without requiring conversion first to
utf8.
Karl Williamson [Fri, 11 Nov 2011 19:23:00 +0000 (12:23 -0700)]
pp.c: Call subroutine instead of repeat code
Now that there is a function that can convert a latin1 character to
title or upper case without going out to swashes, we can call it instead
of repeating the code. There is the additional overhead of a function
call, but this could be avoided if it comes down to it by making it
in-line.
Karl Williamson [Fri, 11 Nov 2011 18:59:05 +0000 (11:59 -0700)]
pp.c: Remove macro no-longer called
Karl Williamson [Fri, 11 Nov 2011 18:45:29 +0000 (11:45 -0700)]
pp.c: Call subroutine instead of repeat code
Now that there is a function that can convert a latin1 character to
title or upper case without going out to swashes, we can call it
instead of repeating the code. There is the additional overhead of a
function call, but this could be avoided if it comes down to it by
making it in-line. And this only happens when upper-casing y with
diaeresis and the micro sign.
Karl Williamson [Fri, 11 Nov 2011 18:03:35 +0000 (11:03 -0700)]
embed.fnc: Make _to_upper_title_latin1() avail to pp.c
If something like this were to be made more generally available, it
would be better to have two in-line functions, to_upper_latin1() and
to_title_latin1() that just call this underlying one with the correct
final parameter.
Karl Williamson [Fri, 11 Nov 2011 17:45:27 +0000 (10:45 -0700)]
pp.c: White-space only
This outdents and reflows comments as a result of the removal of a
surrounding block
Karl Williamson [Fri, 11 Nov 2011 17:42:13 +0000 (10:42 -0700)]
pp.c: Call subroutine instead of repeat code
Now that toLOWER_utf8() and toTITLE_utf8() have the intelligence to skip
going out to swashes for Latin1 code points, it's not so critical to
bypass calling them for these (for speed). It simplifies things not to
have the intelligence repeated. There is the additional overhead of two
function calls (minus the branches saved), but these could be avoided if
it comes down to it by making them in-line.
Karl Williamson [Fri, 11 Nov 2011 17:38:27 +0000 (10:38 -0700)]
pp.c: White-space only
This outdents and reflows comments as a result of the removal of a
surrounding block
Karl Williamson [Fri, 11 Nov 2011 17:28:44 +0000 (10:28 -0700)]
pp.c: Call subroutine instead of repeat code
Now that toUPPER_utf8() has the intelligence to skip going out to
swashes for Latin1 code points, it's not so critical to bypass calling
it for these (for speed). It simplifies things not to have the
intelligence repeated. There is the additional overhead of two function
calls (minus the branches saved), but these could be avoided if it comes
down to it by making them in-line.
Karl Williamson [Fri, 11 Nov 2011 17:22:48 +0000 (10:22 -0700)]
pp.c: Add compiler hint
Almost always the input to uc() will be one of the other 253 Latin1
characters rather than one of the three that gets here.
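A hedged sketch of such a hint, not the pp.c code. SKETCH_UNLIKELY is a hypothetical macro built on the GCC/Clang extension __builtin_expect, and the particular three code points tested (micro sign, sharp s, y with diaeresis) are an assumption.

```c
/* Hypothetical macro; __builtin_expect is a GCC/Clang extension, so
 * other compilers fall back to the plain condition. */
#if defined(__GNUC__)
#  define SKETCH_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#  define SKETCH_UNLIKELY(cond) (cond)
#endif

/* Returns 1 for a Latin1 code point that needs special handling when
 * upper-cased. The three values chosen here are an assumption. */
static int needs_special_uc(unsigned char c)
{
    if (SKETCH_UNLIKELY(c == 0xB5 || c == 0xDF || c == 0xFF))
        return 1;   /* rare path */
    return 0;       /* common path for the other 253 characters */
}
```

The hint does not change the result; it only tells the compiler which branch to lay out as the fall-through path.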
Karl Williamson [Fri, 11 Nov 2011 17:13:28 +0000 (10:13 -0700)]
pp.c: White-space only
This outdents and reflows comments as a result of the removal of a
surrounding block
Karl Williamson [Fri, 11 Nov 2011 17:06:11 +0000 (10:06 -0700)]
pp.c: Call subroutine instead of repeat code
Now that toLOWER_utf8() has the intelligence to skip going out to
swashes for Latin1 code points, it's not so critical to bypass calling
it for these (for speed). It simplifies things not to have the
intelligence repeated. There is the additional overhead of two function
calls (minus the branches saved), but these could be avoided if it comes
down to it by making them in-line.
Karl Williamson [Fri, 11 Nov 2011 16:29:09 +0000 (09:29 -0700)]
utf8.c: Add compiler hint
It's very rare that someone will be outputting these unusual code points
Karl Williamson [Fri, 11 Nov 2011 16:28:11 +0000 (09:28 -0700)]
utf8.c: Add and revise comments
I now understand swashes enough to document them better; nits in other
comments
Nicholas Clark [Fri, 11 Nov 2011 13:54:27 +0000 (14:54 +0100)]
Re-order intrpvar.h to avoid false warnings about holes.
Under the default configuration options for ithreads on x86_64 *nix,
PERL_IMPLICIT_CONTEXT is defined. The variables specific to this are at the
end of the interpreter struct, and their size is not an integer multiple of
its alignment constraint. Hence there will always be a "hole". Move the
"hole" so that it is beyond the end of the structure. This prevents the
Linux tool "pahole", used for finding wasted space, from reporting a false
positive for a hole that can't be avoided.
Nicholas Clark [Fri, 11 Nov 2011 12:22:07 +0000 (13:22 +0100)]
Re-order intrpvar.h to avoid holes in the interpreter struct.
Because commit 45d91b83242e0418 needed to change a buffer size in a
per-thread variable, it created a hole in the ithreads interpreter struct,
as structure members after the buffer must be word aligned.
Re-order various structure members to avoid the hole.
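The effect of member order on padding can be sketched with two hypothetical structs. These are not the real interpreter struct, and exact sizes depend on the platform ABI.

```c
#include <stddef.h>

/* Hypothetical structs, for illustration only. */

/* An odd-sized buffer followed by a pointer: the compiler typically
 * inserts padding after buf so that next is properly aligned. */
struct with_hole {
    void *first;
    char  buf[13];
    void *next;
    char  flag;
};

/* Same members with the small ones grouped at the end: the only
 * padding left is the tail padding after the last member. */
struct reordered {
    void *first;
    void *next;
    char  buf[13];
    char  flag;
};
```

On a typical x86_64 ABI these come out at 40 and 32 bytes respectively. Comparing sizeof and offsetof for the two layouts is a quick way to check what a reordering saves.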
Karl Williamson [Fri, 11 Nov 2011 03:58:48 +0000 (20:58 -0700)]
podcheck.t: Add comment
This is in response to Tony Cook's noticing that the sort order changed,
but the data file was left in the old order. The next time someone
changed things and did a regen, it got sorted, but the git diff showed
more changes, as a result, than there actually were in that commit.
Karl Williamson [Fri, 11 Nov 2011 03:31:51 +0000 (20:31 -0700)]
perlunicode: Document that \p{user-defined} can match above Unicode
Karl Williamson [Fri, 11 Nov 2011 03:30:02 +0000 (20:30 -0700)]
utf8.c: Don't warn on \p{user-defined} for above-Unicode
Perl has allowed user-defined properties to match above-Unicode code
points, while falsely warning that it doesn't. This removes that
warning.
Karl Williamson [Fri, 11 Nov 2011 02:35:10 +0000 (19:35 -0700)]
utf8_heavy.pl: Pass up USER_DEFINED to outside swash
If a sub-swash is user-defined, then the master one is.
Karl Williamson [Thu, 10 Nov 2011 21:37:40 +0000 (14:37 -0700)]
utf8.c: Handle swashes at UV_MAX
The code assumed that there is a code point above the highest value we
are looking at. That is true except when we are looking at the highest
representable code point on the machine. A special case is needed for
that.
Karl Williamson [Thu, 10 Nov 2011 21:32:26 +0000 (14:32 -0700)]
utf8.c: Fix swash handling under USE_MORE_BITS
On a 32 bit machine with USE_MORE_BITS, a UV is 64 bits, but STRLEN is
32 bits. A cast was missing during a bit complement that led to loss of
32 bits.
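A small illustration of how a missing cast loses the upper half, assuming unsigned int is 32 bits wide. The function names and the round-down-to-a-block use case are hypothetical, not the utf8.c code.

```c
#include <stdint.h>

/* Hypothetical illustration; assumes unsigned int is 32 bits wide. */

/* The complement is computed in 32 bits and only then widened, so the
 * upper 32 bits of the mask are zero and are wrongly cleared from cp. */
static uint64_t round_down_no_cast(uint64_t cp, uint32_t span)
{
    return cp & ~(span - 1);
}

/* Widen first, then complement: the upper 32 bits of cp survive. */
static uint64_t round_down_with_cast(uint64_t cp, uint32_t span)
{
    return cp & ~((uint64_t)span - 1);
}
```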
Karl Williamson [Wed, 9 Nov 2011 22:11:54 +0000 (15:11 -0700)]
utf8.h: clarify comment
Karl Williamson [Wed, 9 Nov 2011 22:54:55 +0000 (15:54 -0700)]
utf8.c: Make swashes work close to UV_MAX
When a code point is checked to see whether it matches a property, a
swatch of the swash is read in. Typically this is a block of 64 code
points that contains the one desired. A bit map is set for those 64 code
points, apparently under the expectation that the program will desire code
points near the original.
However, the code just adds 63 to the original code point to get the
ending point of the block. When the original is close enough to the
maximum UV expressible on the platform, this overflows.
The patch is simply to check for overflow and if it happens use the max
possible. A special case is still needed to handle the very maximum
possible code point, and a future commit will deal with that.
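The overflow check can be sketched like this, using uint64_t as a stand-in for UV. The helper name is hypothetical and the real patch may be structured differently.

```c
#include <stdint.h>

/* Hypothetical helper: last code point of a 64-entry swatch starting
 * at start, clamped to the maximum representable value. */
static uint64_t swatch_end(uint64_t start)
{
    const uint64_t span = 63;

    /* start + span would wrap around if start > UINT64_MAX - span,
     * so compare before adding rather than after. */
    if (start > UINT64_MAX - span)
        return UINT64_MAX;
    return start + span;
}
```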
Karl Williamson [Wed, 9 Nov 2011 22:46:33 +0000 (15:46 -0700)]
intrpvar.h: Increase size of variable that stores UTF8 bytes
A code point encoded in Perl's utf8 can occupy 13 bytes. The variable only
accepted up to 11, causing a swash assertion failure for very large code
points matching Unicode properties.
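To show where the 13 comes from, here is a sketch of the encoded length as a function of the code point. It assumes Perl's extended utf8 follows the classic UTF-8 thresholds up to 7 bytes (36 bits) and uses a single 13-byte form for anything larger; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed to encode cp, assuming the
 * thresholds described above. */
static int utf8_len_sketch(uint64_t cp)
{
    if (cp < 0x80ULL)         return 1;
    if (cp < 0x800ULL)        return 2;
    if (cp < 0x10000ULL)      return 3;
    if (cp < 0x200000ULL)     return 4;
    if (cp < 0x4000000ULL)    return 5;
    if (cp < 0x80000000ULL)   return 6;
    if (cp < 0x1000000000ULL) return 7;   /* up to 36 bits */
    return 13;                            /* start byte + 12 continuation bytes */
}
```

Any buffer meant to hold one encoded code point therefore needs room for the 13-byte case on builds with 64-bit UVs.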
Chris 'BinGOs' Williams [Wed, 9 Nov 2011 21:23:59 +0000 (21:23 +0000)]
Update CGI to CPAN version 3.57
[DELTA]
Version 3.57 Nov 9th, 2011
[INTERNALS]
- test failure in t/fast.t introduced in 3.56 is fixed. (Thanks to zefram and chansen).
- Test::More requirement has been bumped to 0.98
Version 3.56 Nov 8th, 2011
[SECURITY]
Use public and documented FCGI.pm API in CGI::Fast
CGI::Fast was using an FCGI API that was deprecated and removed from
documentation more than ten years ago. Usage of this deprecated API with
FCGI >= 0.70 and FCGI <= 0.73 introduces a security issue.
<https://rt.cpan.org/Public/Bug/Display.html?id=68380>
<http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2011-2766>
(Thanks to chansen)
[INTERNALS]
- tmp files are now cleaned up on VMS ( RT#69210, thanks to cberry@cpan.org )
- Fixed test failure: done_testing() added to url.t (Thanks to Ryan Jendoubi)
- Clarify preferred bug submission location in docs, and note that Mark Stosberg
is the current maintainer.
Chris 'BinGOs' Williams [Wed, 9 Nov 2011 21:04:32 +0000 (21:04 +0000)]
Update Digest-SHA to CPAN version 5.63
[DELTA]
5.63 Tue Nov 8 02:36:42 MST 2011
- added code to allow very large data inputs all at once
-- previously limited to several hundred MB at a time
-- many thanks to Thomas Drugeon for his elegant patch
- removed outdated reference URLs from several test scripts
-- these URLs aren't essential, and often go stale
-- thanks to Leon Brocard for spotting this
-- ref. rt.cpan.org #68740
Karl Williamson [Wed, 9 Nov 2011 17:51:03 +0000 (10:51 -0700)]
UCD.t: Fix 'uninit' warning
An initialization was out of place
Karl Williamson [Mon, 10 Oct 2011 19:16:17 +0000 (13:16 -0600)]
pat_advanced.t: Display better names for a few tests
Karl Williamson [Wed, 9 Nov 2011 17:42:10 +0000 (10:42 -0700)]
regexec.c: Stop looking for match even sooner
This revises commit e067297c376fbbb5a0dc8428c65d922f11e1f4c6
slightly so that we round up to get the search stopping point.
We aren't matching partial characters, so if we were to match 3+1/3
characters, we really have to match 4 characters.
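Rounding up an integer division can be sketched as below. The helper is hypothetical and is not the expression used in regexec.c.

```c
#include <stddef.h>

/* Hypothetical helper: smallest whole number of units that covers
 * numerator / denominator. denominator must be non-zero. */
static size_t div_round_up(size_t numerator, size_t denominator)
{
    /* Add one when there is a remainder; this form avoids the
     * overflow that (numerator + denominator - 1) can hit. */
    return numerator / denominator + (numerator % denominator != 0);
}
```

With 10 units at 3 per character, plain division gives 3, while the rounded-up result is the 4 whole characters described above.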
Karl Williamson [Wed, 9 Nov 2011 17:41:00 +0000 (10:41 -0700)]
regexec.c: revise comment
Karl Williamson [Wed, 9 Nov 2011 15:49:17 +0000 (08:49 -0700)]
mktables: Change member name for clarity
I find myself being confused by the old name.
Karl Williamson [Wed, 9 Nov 2011 15:54:23 +0000 (08:54 -0700)]
Fix volatile declaration
Commit 24efd69ba77ba76cd714519dccee88f45820d8b4 introduced a VOL
declaration. I thought I had tested this, but apparently not. It needs
to apply to the pointee instead of the pointer.
Karl Williamson [Wed, 9 Nov 2011 15:31:17 +0000 (08:31 -0700)]
regexec.c: typo in comment
Karl Williamson [Wed, 9 Nov 2011 15:29:58 +0000 (08:29 -0700)]
Change __attribute_unused__ to PERL_UNUSED_DECL
The latter is the Perl standard way of making this declaration
Karl Williamson [Wed, 9 Nov 2011 05:16:39 +0000 (22:16 -0700)]
utf8.c: Faster latin1 folding
This adds a function similar to the ones for the other three case
changing operations that works on latin1 characters only, and avoids
having to go out to swashes. It changes to_uni_fold() and
to_utf8_fold() to call it on the appropriate input
Karl Williamson [Wed, 9 Nov 2011 05:20:37 +0000 (22:20 -0700)]
regcomp.c: Change char used to force reading in fold swashes
Future commits will change things so that a latin1 character no longer
will go out to disk to load a swash.
Karl Williamson [Wed, 9 Nov 2011 05:20:08 +0000 (22:20 -0700)]
regcomp.c: Add assertion
Karl Williamson [Wed, 9 Nov 2011 04:51:07 +0000 (21:51 -0700)]
utf8.c: Faster latin1 upper/title casing
This creates a new function to handle upper/title casing code points in
the latin1 range, and avoids using a swash to compute the case. This is
because the correct values are compiled-in.
And it calls this function when appropriate for both title and upper
casing, in both utf8 and uni forms.
Unlike the similar function for lower casing, it may make sense for this
function to be called from outside utf8.c, but inside the core, so it is
not static, but its name begins with an underscore.
Karl Williamson [Wed, 9 Nov 2011 05:05:25 +0000 (22:05 -0700)]
utf8.c: Expand use of refactored to_uni_lower
The new function split out from to_uni_lower is now called when
appropriate from to_utf8_lower.
And to_uni_lower no longer calls to_utf8_lower, using the macro instead,
saving a function call and duplicate work
Karl Williamson [Wed, 9 Nov 2011 01:55:09 +0000 (18:55 -0700)]
utf8.c: Refactor to_uni_lower()
The portion that deals with Latin1 range characters is refactored into a
separate (static) function, so that it can be called from more than one place.
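As an illustration of why the Latin1 range needs no swash, lower-casing there can be done with fixed rules. This sketch uses arithmetic rather than a compiled-in table, and the function name is hypothetical; it is not the utf8.c helper.

```c
/* Hypothetical helper: lower-case a Latin1 (ISO 8859-1) code point.
 * Every result stays within Latin1, so no external data is needed. */
static unsigned char latin1_lower_sketch(unsigned char c)
{
    /* ASCII A-Z */
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c + 0x20);

    /* 0xC0-0xDE are upper-case letters, except 0xD7, which is the
     * multiplication sign and has no case. */
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return (unsigned char)(c + 0x20);

    return c;   /* already lower case, or not a cased letter */
}
```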
Karl Williamson [Wed, 9 Nov 2011 01:30:12 +0000 (18:30 -0700)]
utf8.c: Refactor case-changing calls into macros
Future commits will use these in additional places, so macroize
Karl Williamson [Sat, 5 Nov 2011 18:16:42 +0000 (12:16 -0600)]
regcomp.c: Silence compiler warning about longjmp
I believe that there isn't a code path that can screw this up, but one
compiler at least believes otherwise. Declaring it volatile should fix
that.
Karl Williamson [Tue, 8 Nov 2011 15:14:38 +0000 (08:14 -0700)]
Add functions to Unicode::UCD
This merges a topic branch whose primary purpose is to add 4 functions
to Unicode::UCD that allow complete programmatic access to the Unicode
character data base
Karl Williamson [Sun, 6 Nov 2011 23:59:23 +0000 (16:59 -0700)]
README.perl: Add step to new-Unicode release activities
Karl Williamson [Sun, 6 Nov 2011 23:58:03 +0000 (16:58 -0700)]
mktables: Remove blanks in files for non-DEBUGGING builds
This will save some disk space
Karl Williamson [Sun, 6 Nov 2011 23:41:35 +0000 (16:41 -0700)]
Deprecate direct use of Unicode db files
Unicode::UCD has been enhanced to provide a stable API for accessing the
Unicode data base. Some applications have needed data that has
previously only effectively been available from files stored on disk.
It may be that some time in the future we will want to change or remove
these files, so a warning has been added to their headers to that
effect.
Already, it would have been more convenient to change the
formats somewhat in some of these, and I have had to do some hoop
jumping to avoid this.
I don't see any call to change them now or for many releases down the
road, but, for example, we may choose to store more of the db in memory,
and would then no longer need these.
Karl Williamson [Sun, 6 Nov 2011 23:34:46 +0000 (16:34 -0700)]
mktables: Use re /aa
All the data this operates on is ASCII, and this speeds things up just
a bit for that, as well as avoiding stressing miniperl by potentially
trying to load utf8_heavy.pl.
Karl Williamson [Sun, 6 Nov 2011 23:29:20 +0000 (16:29 -0700)]
mktables: Remove outdated documentation notes
The core_access member is removed. Everything is now documented
via other means.
Karl Williamson [Sun, 6 Nov 2011 23:22:38 +0000 (16:22 -0700)]
mktables: Add notes about new access to properties
Unicode::UCD can now access these properties. Indicate that in the
comments in the files.
Karl Williamson [Sun, 6 Nov 2011 23:15:25 +0000 (16:15 -0700)]
Unicode::UCD.pm: Localize $_, $/
So that they don't affect code outside the module and vice-versa.
Karl Williamson [Sun, 6 Nov 2011 23:13:34 +0000 (16:13 -0700)]
perluniprops: Change section name
This is no longer about just regular expression properties, but about
character properties.
Karl Williamson [Sun, 6 Nov 2011 23:03:50 +0000 (16:03 -0700)]
perluniprops: Remove Unicode db files section
Now that Unicode::UCD presents an API for accessing all the files, there
is no need to document this less-favored method.
Karl Williamson [Sun, 6 Nov 2011 22:51:27 +0000 (15:51 -0700)]
perluniprops: Document prop_invmap() properties
mktables is changed to add a section to perluniprops to document the
Unicode properties accessible via Unicode::UCD
Karl Williamson [Sun, 6 Nov 2011 21:31:26 +0000 (14:31 -0700)]
Unicode::UCD: Add prop_invmap()
Karl Williamson [Sun, 6 Nov 2011 22:06:42 +0000 (15:06 -0700)]
mktables: Generate file for NameAlias property
This is needed for future commits in Unicode::UCD. The contents of the file
could be figured out from the Name.pl file, but that would be slow, and
the file is quite small.
Karl Williamson [Sun, 6 Nov 2011 04:09:41 +0000 (22:09 -0600)]
utf8_heavy: Return values for binary property requested as map
Future commits will make Unicode::UCD return maps of all properties.
Instead of storing these maps on disk, they can be inferred from the
files that are already there that give the code points that match the
property.
This commit causes a request for the mapping of such a property to
instead return the data from the binary definition file.
It is left for the caller to convert this data into a map. These files
do not have SwashInfo defined; and the returned BITS field in the swash
will be 1.
Karl Williamson [Sun, 6 Nov 2011 03:52:28 +0000 (21:52 -0600)]
utf8_heavy.pl: Turn on $| if debugging
Karl Williamson [Sun, 6 Nov 2011 03:51:41 +0000 (21:51 -0600)]
utf8_heavy: add comments
Karl Williamson [Sat, 5 Nov 2011 17:50:59 +0000 (11:50 -0600)]
utf8_heavy.pl: Remove no longer needed code
Previous commits have extended the capabilities of utf8_heavy to handle
any mapping file generated by mktables, and have changed the names of
the maps to look up in utf8.c to correspond to the mktables names. We
thus no longer need code that knows the names of those properties, and
use the more general mechanism instead.
Karl Williamson [Sat, 5 Nov 2011 17:31:47 +0000 (11:31 -0600)]
utf8.c: Use proper Unicode property names
There are five functions in utf8.c that look up Unicode maps--the case
changing functions. They look up these maps under the names ToDigit,
ToFold, ToLower, ToTitle, and ToUpper. The imminent expansion of Unicode::UCD
to return the mappings for all properties creates a naming conflict, as
three of those names are the same as other properties, Upper, Lower, and
Title.
It was an unfortunate choice of names originally. Now mktables has been
changed to create a list of mapping properties that utf8_heavy.pl reads.
It uses the official names of those properties, so change utf8.c to
correspond.
Karl Williamson [Sat, 5 Nov 2011 17:12:39 +0000 (11:12 -0600)]
utf8_heavy.pl: Find mapping files from table
Previously, utf8_heavy.pl only returned 4 mapping files, the ones that
change case, and their names are known to it. mktables now generates a
list of mapping files that it outputs. This adds these to utf8_heavy's
repertoire.
Karl Williamson [Sat, 5 Nov 2011 16:52:45 +0000 (10:52 -0600)]
utf8_heavy.pl: white-space only
Indenting to reflect being in a new block
Karl Williamson [Sat, 5 Nov 2011 16:34:01 +0000 (10:34 -0600)]
utf8_heavy: Reorder 2 if's
This saves a little redundant code, and will be useful in future commits
Karl Williamson [Sat, 5 Nov 2011 16:18:48 +0000 (10:18 -0600)]
mktables: Add %file_to_swash_name to utf8_heavy.pl
Karl Williamson [Sat, 5 Nov 2011 16:11:02 +0000 (10:11 -0600)]
mktables: Fix comment
utf8_heavy.pl is now being used by Unicode::UCD.
Karl Williamson [Sat, 5 Nov 2011 16:07:46 +0000 (10:07 -0600)]
mktables: Add %loose_property_to_file_of to utf8_heavy.pl
Karl Williamson [Sat, 5 Nov 2011 15:50:04 +0000 (09:50 -0600)]
mktables: Add comment to generated files
Karl Williamson [Sat, 5 Nov 2011 15:25:46 +0000 (09:25 -0600)]
mktables: Add %algorithmic_named_code_points to UCD.pl
Karl Williamson [Sat, 5 Nov 2011 15:17:07 +0000 (09:17 -0600)]
Unicode::UCD: pod: document new/old style block property names
Karl Williamson [Sat, 5 Nov 2011 15:08:19 +0000 (09:08 -0600)]
Unicode::UCD: Add prop_invlist()
This function returns a data structure of all the code points matching
a binary Unicode property or property-value
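The returned structure is an inversion list: a sorted list of code points in which each element starts a range that alternates between "in the set" and "not in the set". A membership test over such a list might look like the following C sketch; the function name is hypothetical, and the real module is written in Perl.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. list is sorted ascending; elements at even
 * indices start a range that is in the set, elements at odd indices
 * start a range that is not. */
static int invlist_contains(const uint32_t *list, size_t n, uint32_t cp)
{
    size_t lo = 0, hi = n;

    /* Binary search for the number of elements <= cp. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* An odd count means the nearest range start at or below cp
     * opened an "in the set" range. */
    return (lo % 2) == 1;
}
```

For example, the list { 0x30, 0x3A, 0x41, 0x5B } describes the ASCII digits and upper-case letters: 0x30 through 0x39 and 0x41 through 0x5A are in the set, and everything else is out.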
Karl Williamson [Sat, 5 Nov 2011 14:56:37 +0000 (08:56 -0600)]
mktables: Add %loose_defaults to UCD.pl
Karl Williamson [Sat, 5 Nov 2011 14:49:48 +0000 (08:49 -0600)]
utf8_heavy.pl: Correct debugging statement
This was printing out the value before setting it (hence getting the old
value).
Karl Williamson [Sat, 5 Nov 2011 14:34:00 +0000 (08:34 -0600)]
utf8_heavy.pl: Return that property is user-defined
This adds an element to the returned hash that is a boolean as to
whether or not the property is user-defined.
Karl Williamson [Mon, 31 Oct 2011 20:07:25 +0000 (14:07 -0600)]
Unicode::UCD: add prop_aliases(), prop_value_aliases()
Karl Williamson [Fri, 4 Nov 2011 22:50:20 +0000 (16:50 -0600)]
mktables: Add %suppressed_properties to UCD.pl
Karl Williamson [Fri, 4 Nov 2011 22:28:27 +0000 (16:28 -0600)]
mktables: Add %ambiguous_names to UCD.pl
Karl Williamson [Fri, 4 Nov 2011 22:39:05 +0000 (16:39 -0600)]
mktables: Output ISO_Comment table
This is done for the reasons cited in the comment. The table is trivial
in size.
Karl Williamson [Fri, 4 Nov 2011 22:05:33 +0000 (16:05 -0600)]
mktables: Add %prop_value_aliases to UCD.pl
Karl Williamson [Fri, 4 Nov 2011 21:46:42 +0000 (15:46 -0600)]
mktables: Add %prop_aliases in UCD.pl
Karl Williamson [Fri, 4 Nov 2011 21:32:47 +0000 (15:32 -0600)]
mktables: Add %string_property_loose_to_name for UCD.pl
Karl Williamson [Fri, 4 Nov 2011 21:16:24 +0000 (15:16 -0600)]
mktables: White-space only
Earlier commits removed and inserted blocks. This changes the
indentation to correspond
Karl Williamson [Fri, 4 Nov 2011 21:06:52 +0000 (15:06 -0600)]
mktables: store method value in variable
This variable will have later use. This also changes 'table' to
'property' for more clarity. Here they point to the same thing