Behdad Esfahbod [Wed, 14 Nov 2012 23:07:36 +0000 (15:07 -0800)]
[Indic] Exchange abort() for assert()
Behdad Esfahbod [Wed, 14 Nov 2012 23:05:19 +0000 (15:05 -0800)]
Don't route Kharoshthi through the Indic shaper
It's a simple right-to-left script.
Behdad Esfahbod [Wed, 14 Nov 2012 23:00:53 +0000 (15:00 -0800)]
[Indic] Handle overstruck matra position
Behdad Esfahbod [Wed, 14 Nov 2012 22:09:46 +0000 (14:09 -0800)]
Reposition Lao marks
Lao marks are center-aligned, unlike Thai ones.
Behdad Esfahbod [Wed, 14 Nov 2012 21:48:26 +0000 (13:48 -0800)]
Don't do fallback positioning for Indic and Thai shapers
Behdad Esfahbod [Wed, 14 Nov 2012 21:38:16 +0000 (13:38 -0800)]
[Indic] If Khmer fonts have a 'liga' feature, use generic shaper
Seems to produce more coherent results than trying the Indic shaper on
them. I'm looking at you, Kh-* fonts...
Behdad Esfahbod [Wed, 14 Nov 2012 19:38:50 +0000 (11:38 -0800)]
Adjust diff rule for the new hb-shape output format
Behdad Esfahbod [Wed, 14 Nov 2012 19:37:04 +0000 (11:37 -0800)]
[Indic] Don't move virama with left matra
This is important for the Sinhala U+0DDA split matra since it decomposes
to U+0DD9,U+0DCA where U+0DD9 is a left matra and U+0DCA is the virama.
We don't want to move the virama with the left matra.
TEST: U+0D9A,U+0DDA
Note that we were already doing this in the Uniscribe bug compatibility
mode. We now do it all the time.
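For illustration only, a minimal C sketch of the behavior described above, under a simplified model in which a left matra moves in front of the immediately preceding character. The function name and the one-step reordering are hypothetical and are not taken from the Indic shaper; only the code points come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decompose U+0DDA into U+0DD9,U+0DCA, then move only
 * the left matra U+0DD9 before the preceding character. The virama U+0DCA
 * stays where the decomposition put it. Returns the output length, or 0 if
 * the output buffer is too small. */
size_t sinhala_split_matra_sketch(const uint32_t *in, size_t len,
                                  uint32_t *out, size_t cap)
{
  assert(in != NULL && out != NULL);
  size_t n = 0;

  /* Step 1: decompose the split matra. */
  for (size_t i = 0; i < len; i++) {
    if (in[i] == 0x0DDAu) {
      if (n + 2 > cap) return 0;
      out[n++] = 0x0DD9u; /* left matra */
      out[n++] = 0x0DCAu; /* virama */
    } else {
      if (n + 1 > cap) return 0;
      out[n++] = in[i];
    }
  }

  /* Step 2: swap each left matra with the character before it.
   * The virama is deliberately not moved along with it. */
  for (size_t i = 1; i < n; i++) {
    if (out[i] == 0x0DD9u) {
      uint32_t base = out[i - 1];
      out[i - 1] = out[i];
      out[i] = base;
    }
  }
  return n;
}
```

With the test input U+0D9A,U+0DDA this sketch yields U+0DD9,U+0D9A,U+0DCA: the left matra ends up before the consonant while the virama remains after it.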
Behdad Esfahbod [Wed, 14 Nov 2012 18:56:02 +0000 (10:56 -0800)]
Add Sinhala test case for split matra U+0DDA
Behdad Esfahbod [Wed, 14 Nov 2012 18:53:10 +0000 (10:53 -0800)]
Fix test
Behdad Esfahbod [Wed, 14 Nov 2012 00:50:45 +0000 (16:50 -0800)]
Minor
Behdad Esfahbod [Wed, 14 Nov 2012 00:26:32 +0000 (16:26 -0800)]
API change: Remove "mask" from hb_buffer_add()
I don't expect anybody to be using hb_buffer_add(), so this shouldn't break
anyone's code.
Behdad Esfahbod [Tue, 13 Nov 2012 23:33:27 +0000 (15:33 -0800)]
[util] Add --bot / --eot / --preserve-default-ignorables
Behdad Esfahbod [Tue, 13 Nov 2012 23:15:09 +0000 (15:15 -0800)]
Minor
Behdad Esfahbod [Tue, 13 Nov 2012 23:12:24 +0000 (15:12 -0800)]
[util] Add --text-before and --text-after to hb-shape / hb-view
Use with Arabic, for example, to see the effect on joining.
Behdad Esfahbod [Tue, 13 Nov 2012 23:12:06 +0000 (15:12 -0800)]
Fix UTF-8 backward iteration
Ouch!
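The message does not say what the bug was. As a generic reminder of what backward iteration over UTF-8 involves, here is a small C sketch; the function name is hypothetical and this is not the code that was fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer to the first byte of the character
 * that ends just before `p`, never moving before `start`.
 * Continuation bytes look like 10xxxxxx; a well-formed character has at
 * most three of them, so we step back over at most three. */
const unsigned char *utf8_prev_sketch(const unsigned char *p,
                                      const unsigned char *start)
{
  assert(p > start);
  const unsigned char *q = p - 1;
  while (q > start && (*q & 0xC0) == 0x80 && p - q < 4)
    q--;
  return q;
}
```

The caller would then decode forward from the returned pointer and validate that the decoded character really ends at `p`; that validation step is left out of the sketch.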
Behdad Esfahbod [Tue, 13 Nov 2012 23:11:51 +0000 (15:11 -0800)]
[Arabic] Fix post-context handling
Ouch!
Behdad Esfahbod [Tue, 13 Nov 2012 22:42:35 +0000 (14:42 -0800)]
Add buffer flags
New API:
hb_buffer_flags_t
HB_BUFFER_FLAGS_DEFAULT
HB_BUFFER_FLAG_BOT
HB_BUFFER_FLAG_EOT
HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES
hb_buffer_set_flags()
hb_buffer_get_flags()
We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.
The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
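A rough C sketch of how such flags might be consulted. The SKETCH_* names and their bit values are assumptions made for this example so that it compiles on its own; they only mirror the API names listed above and are not the library's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the flags listed above; values are assumed. */
enum {
  SKETCH_FLAGS_DEFAULT                    = 0x0u,
  SKETCH_FLAG_BOT                         = 0x1u, /* beginning of text */
  SKETCH_FLAG_EOT                         = 0x2u, /* end of text */
  SKETCH_FLAG_PRESERVE_DEFAULT_IGNORABLES = 0x4u
};

/* Insert a dotted circle only when the buffer starts the text and its
 * first character is a combining mark. */
bool sketch_wants_dotted_circle(unsigned flags, bool first_is_mark)
{
  assert((flags & ~0x7u) == 0); /* unknown bits are a caller bug here */
  return first_is_mark && (flags & SKETCH_FLAG_BOT) != 0;
}

/* Default ignorables (ZWNJ, ZWJ, ...) are dropped unless the caller asked
 * to preserve them. */
bool sketch_removes_default_ignorable(unsigned flags, bool is_default_ignorable)
{
  assert((flags & ~0x7u) == 0);
  return is_default_ignorable &&
         (flags & SKETCH_FLAG_PRESERVE_DEFAULT_IGNORABLES) == 0;
}
```

A caller shaping a whole paragraph in one buffer would presumably set both BOT and EOT; a caller shaping a run from the middle of a paragraph would set neither.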
Behdad Esfahbod [Tue, 13 Nov 2012 22:42:22 +0000 (14:42 -0800)]
Minor fix
Ouch
Behdad Esfahbod [Tue, 13 Nov 2012 22:10:19 +0000 (14:10 -0800)]
Minor
Behdad Esfahbod [Tue, 13 Nov 2012 21:57:52 +0000 (13:57 -0800)]
Add hb_buffer_clear()
It is like hb_buffer_reset(), but does NOT clear unicode-funcs.
Behdad Esfahbod [Tue, 13 Nov 2012 21:48:26 +0000 (13:48 -0800)]
0.9.6
Behdad Esfahbod [Tue, 13 Nov 2012 20:35:35 +0000 (12:35 -0800)]
[Indic] Decompose Sinhala split matras the way old HarfBuzz / Pango did
Had to do some refactoring to make this happen...
Under Uniscribe bug compatibility mode, we still split them
Uniscribe-style, but Jonathan and I convinced ourselves that there is no
harm in doing this the Unicode way. This change makes that happen, and
unbreaks free Sinhala fonts.
Behdad Esfahbod [Tue, 13 Nov 2012 19:07:20 +0000 (11:07 -0800)]
[hb-shape] Adjust positioning output format
1. If there is any offset (x or y), print out both x and y offsets.
2. Always print out the advance in the major direction of the buffer.
I.e., even for zero-advance glyphs, print a "+0". This is more intuitive.
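The two rules can be sketched in C as follows. The function name, the parameter list, and the exact "@x,y" and "+advance" spelling are assumptions for this example; the real hb-shape output contains more fields than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical formatter for the position part of one glyph.
 * Rule 1: if either offset is nonzero, print both offsets.
 * Rule 2: always print the advance in the major direction, even if zero.
 * Returns what snprintf returns, so a value >= size means truncation.
 * Negative advances are printed naively after the '+'. */
int format_position_sketch(char *buf, size_t size,
                           int x_offset, int y_offset,
                           int x_advance, int y_advance,
                           bool vertical)
{
  assert(buf != NULL && size > 0);
  int major_advance = vertical ? y_advance : x_advance;

  if (x_offset != 0 || y_offset != 0)
    return snprintf(buf, size, "@%d,%d+%d", x_offset, y_offset, major_advance);
  return snprintf(buf, size, "+%d", major_advance);
}
```

Under these assumptions a zero-advance glyph with no offsets is written as "+0", and a mark shifted 10 units horizontally is written as "@10,0+0".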
Behdad Esfahbod [Tue, 13 Nov 2012 02:42:18 +0000 (18:42 -0800)]
[Indic] Update auto-generated Indic machine to reflect previous commit
Behdad Esfahbod [Tue, 13 Nov 2012 02:41:22 +0000 (18:41 -0800)]
[Indic] Allow Consonant_Medial's after Consonant's
Mostly affects Myanmar, but also Tai Tham, Javanese, and Cham. The
latter three are untested (no fonts!).
Behdad Esfahbod [Tue, 13 Nov 2012 02:38:06 +0000 (18:38 -0800)]
[Indic] Categorize Myanmar "tone marks" as nuktas
Behdad Esfahbod [Tue, 13 Nov 2012 02:37:20 +0000 (18:37 -0800)]
[Indic] Add config for Myanmar
Behdad Esfahbod [Tue, 13 Nov 2012 02:36:10 +0000 (18:36 -0800)]
[Indic] Route "new" Myanmar tag through the Indic shaper
Windows 8 adds a Myanmar shaper using the 'mym2' tag. Route that
through the Indic shaper. It's still very broken, but at least this
does NOT break old-style Myanmar shaping using the generic shaper.
Behdad Esfahbod [Tue, 13 Nov 2012 02:27:42 +0000 (18:27 -0800)]
Choose shaper based on chosen OT script tag
For Arabic and Indic shapers, if the font doesn't have a script system
for the script, use default shaper.
Make an exception for Arabic script since we have fallback logic for
that one.
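A C sketch of the selection rule as described, with hypothetical type and function names. How the font is queried for a script system is not shown; the result of that query is passed in as a boolean.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
  SHAPER_SKETCH_DEFAULT,
  SHAPER_SKETCH_ARABIC,
  SHAPER_SKETCH_INDIC
} shaper_sketch_t;

/* `preferred` is the shaper the script alone would select.
 * `is_arabic_script` is true only for the Arabic script itself, not for
 * other scripts that also go through the Arabic shaper. */
shaper_sketch_t choose_shaper_sketch(shaper_sketch_t preferred,
                                     bool is_arabic_script,
                                     bool font_has_script_system)
{
  if (preferred == SHAPER_SKETCH_DEFAULT || font_has_script_system)
    return preferred;
  /* Exception: Arabic has fallback logic, so keep the Arabic shaper. */
  if (preferred == SHAPER_SKETCH_ARABIC && is_arabic_script)
    return SHAPER_SKETCH_ARABIC;
  return SHAPER_SKETCH_DEFAULT;
}

/* Small self-check showing the three interesting cases. */
void choose_shaper_sketch_selfcheck(void)
{
  assert(choose_shaper_sketch(SHAPER_SKETCH_INDIC, false, false) == SHAPER_SKETCH_DEFAULT);
  assert(choose_shaper_sketch(SHAPER_SKETCH_INDIC, false, true) == SHAPER_SKETCH_INDIC);
  assert(choose_shaper_sketch(SHAPER_SKETCH_ARABIC, true, false) == SHAPER_SKETCH_ARABIC);
}
```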
Behdad Esfahbod [Tue, 13 Nov 2012 02:23:38 +0000 (18:23 -0800)]
Make planner available to complex shaper choosing logic
Behdad Esfahbod [Tue, 13 Nov 2012 01:57:24 +0000 (17:57 -0800)]
Refactoring ot-map building to make chosen script available earlier
Behdad Esfahbod [Tue, 13 Nov 2012 01:48:26 +0000 (17:48 -0800)]
Minor TODO
Behdad Esfahbod [Tue, 13 Nov 2012 01:27:51 +0000 (17:27 -0800)]
Add "new" Myanmar OT Script tag
Windows 8 added support for Myanmar shaping using the "mym2" script tag,
even though Windows never supported the old "mymr" tag.
Behdad Esfahbod [Tue, 13 Nov 2012 00:54:03 +0000 (16:54 -0800)]
Add Myanmar tests from UTN#11
Behdad Esfahbod [Mon, 12 Nov 2012 22:57:02 +0000 (14:57 -0800)]
Break build when ragel is needed and missing
Behdad Esfahbod [Mon, 12 Nov 2012 22:48:33 +0000 (14:48 -0800)]
[Indic] Make more room in the table
To be used in upcoming commits.
Behdad Esfahbod [Mon, 12 Nov 2012 22:27:33 +0000 (14:27 -0800)]
Typo
Behdad Esfahbod [Mon, 12 Nov 2012 22:09:40 +0000 (14:09 -0800)]
[Indic] Port 'pref' logic to look into font tables
...instead of using a hardcoded list of Ra characters.
Behdad Esfahbod [Mon, 12 Nov 2012 22:02:02 +0000 (14:02 -0800)]
[Indic] Port reph handling logic to look into font features
...instead of using a hardcoded list of Ra characters.
Behdad Esfahbod [Mon, 12 Nov 2012 21:34:17 +0000 (13:34 -0800)]
Route MEETEI_MAYEK through the Indic shaper
Since it has a couple of left-"matras".
Behdad Esfahbod [Mon, 12 Nov 2012 21:02:20 +0000 (13:02 -0800)]
Minor
Behdad Esfahbod [Mon, 12 Nov 2012 19:16:57 +0000 (11:16 -0800)]
Work around older compilers
As reported on the list:
I am seeing a similar problem building harfbuzz 0.9.5 with Apple gcc
4.0.1 on OS X 10.5 Leopard:
hb-ot-layout-common-private.hh:406: error: 'struct OT::CoverageFormat1::Iter' is private
hb-ot-layout-common-private.hh:646: error: within this context
hb-ot-layout-common-private.hh:500: error: 'struct OT::CoverageFormat2::Iter' is private
hb-ot-layout-common-private.hh:647: error: within this context
make[4]: *** [libharfbuzz_la-hb-ot-layout.lo] Error 1
Also reported as happening with MSVC 2005.
Behdad Esfahbod [Mon, 12 Nov 2012 19:02:56 +0000 (11:02 -0800)]
[Indic] Don't apply 'liga'
Uniscribe doesn't. And some fonts abuse this feature to get Indic
shaping working in non-complex applications like Adobe's apps.
No change in numbers:
BENGALI: 353897 out of 354188 tests passed. 291 failed (0.0821598%)
DEVANAGARI: 707337 out of 707394 tests passed. 57 failed (0.00805774%)
GUJARATI: 366440 out of 366457 tests passed. 17 failed (0.00463902%)
GURMUKHI: 60704 out of 60747 tests passed. 43 failed (0.0707854%)
KANNADA: 951046 out of 951913 tests passed. 867 failed (0.0910798%)
KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048011 out of 1048334 tests passed. 323 failed (0.0308108%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
TELUGU: 970557 out of 970573 tests passed. 16 failed (0.00164851%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
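The percentages in these summaries are the failed count divided by the total, times 100. A tiny C sketch of that arithmetic, with a hypothetical function name (the test harness that printed the numbers is not shown in this log):

```c
#include <assert.h>

/* Failure rate in percent: 100 * failed / total. */
double failure_percent_sketch(unsigned long failed, unsigned long total)
{
  assert(total > 0 && failed <= total);
  return 100.0 * (double) failed / (double) total;
}
```

For the BENGALI line, failure_percent_sketch(291, 354188) comes out at roughly 0.08216, in line with the 0.0821598% reported.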
Behdad Esfahbod [Mon, 12 Nov 2012 18:26:50 +0000 (10:26 -0800)]
Fix hb-ft glyph name for broken fonts that return empty glyph names
Behdad Esfahbod [Mon, 12 Nov 2012 18:07:28 +0000 (10:07 -0800)]
Minor
Behdad Esfahbod [Thu, 8 Nov 2012 23:08:26 +0000 (15:08 -0800)]
U+A872 PHAGS-PA SUPERFIXED LETTER RA is "Right"-Joining
Behdad Esfahbod [Mon, 5 Nov 2012 23:20:10 +0000 (15:20 -0800)]
Adjust Mongolian shaping
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
For now, match Uniscribe.
Behdad Esfahbod [Mon, 5 Nov 2012 23:18:49 +0000 (15:18 -0800)]
Add test for non-joining Mongolian letters
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
Behdad Esfahbod [Mon, 5 Nov 2012 00:48:45 +0000 (16:48 -0800)]
Minor
Behdad Esfahbod [Mon, 5 Nov 2012 00:44:47 +0000 (16:44 -0800)]
Minor
Behdad Esfahbod [Fri, 2 Nov 2012 20:53:18 +0000 (13:53 -0700)]
Add Tifinagh test data
Behdad Esfahbod [Fri, 2 Nov 2012 20:38:55 +0000 (13:38 -0700)]
Minor
Behdad Esfahbod [Fri, 2 Nov 2012 17:21:26 +0000 (10:21 -0700)]
Add Mongolian and 'Phags-pa joining test cases
Behdad Esfahbod [Fri, 2 Nov 2012 03:05:04 +0000 (20:05 -0700)]
Implement 'Phags-pa shaping
Through the Arabic shaper. It's similar to Mongolian.
Behdad Esfahbod [Thu, 1 Nov 2012 23:26:01 +0000 (16:26 -0700)]
Minor build fix
Behdad Esfahbod [Wed, 31 Oct 2012 20:45:30 +0000 (13:45 -0700)]
Don't clear buffer pre-context if no new context is being provided
Patch from Jonathan Kew.
Part of fixing:
Mozilla Bug 801410 - avoid inserting dotted-circle for run-initial
Unicode combining characters in "simple" scripts such as Latin
https://bugzilla.mozilla.org/show_bug.cgi?id=801410
Behdad Esfahbod [Tue, 30 Oct 2012 05:02:45 +0000 (22:02 -0700)]
[OT] Fix ReverseChainingSubst
We should make it clear that we don't want output buffer in this case,
otherwise buffer->backtrack_len() would be wrong.
Behdad Esfahbod [Tue, 30 Oct 2012 04:51:56 +0000 (21:51 -0700)]
More tracing fixups
Behdad Esfahbod [Tue, 30 Oct 2012 04:49:33 +0000 (21:49 -0700)]
[Arabic] Enable dlig and mset for Arabic
That's what the spec says, and what Uniscribe does.
Behdad Esfahbod [Tue, 30 Oct 2012 02:42:19 +0000 (19:42 -0700)]
Ignore gid0 in test results
Behdad Esfahbod [Tue, 30 Oct 2012 02:03:55 +0000 (19:03 -0700)]
Add missing TRACE_RETURN
Behdad Esfahbod [Tue, 30 Oct 2012 01:18:24 +0000 (18:18 -0700)]
Add Ethiopic test case
This sequence: U+120B,U+135F,U+120B with the Nyala font from Win7
exposes a GPOS bug in Uniscribe, in that the positioned mark is wrongly
moved as a result of a following kern.
This is the one "failure" in the Ethiopic test suite :-).
ETHIOPIC: 118900 out of 118901 tests passed. 1 failed (0.000841036%)
Behdad Esfahbod [Mon, 29 Oct 2012 23:27:02 +0000 (16:27 -0700)]
[Indic] Position pre-base reordering Ra after Chillus in Malayalam
The logic for pre-base reordering follows the left matra logic.
We had an exception for Malayalam/Tamil in the left matra repositioning
which was not reflected in pre-base reordering.
Malayalam failures down from 337 to 323.
BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048011 out of 1048334 tests passed. 323 failed (0.0308108%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271726 out of 271847 tests passed. 121 failed (0.0445103%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Behdad Esfahbod [Mon, 29 Oct 2012 21:21:09 +0000 (14:21 -0700)]
Add missed file
Behdad Esfahbod [Mon, 29 Oct 2012 17:56:04 +0000 (10:56 -0700)]
Include config.h.in in tree
I typically don't like including generated files in tree. But I'd like to
make an exception for this, since this forms the canonical list of
options one would need to go through when building with alternative
build systems.
Behdad Esfahbod [Mon, 29 Oct 2012 04:26:19 +0000 (21:26 -0700)]
Improve license information
Behdad Esfahbod [Mon, 29 Oct 2012 03:27:25 +0000 (20:27 -0700)]
Minor
Behdad Esfahbod [Mon, 29 Oct 2012 03:11:47 +0000 (20:11 -0700)]
Fix hb_buffer_set_length(buffer, 0)
Was causing invalid realloc()s.
Behdad Esfahbod [Mon, 29 Oct 2012 03:11:42 +0000 (20:11 -0700)]
Add XXX
Behdad Esfahbod [Mon, 29 Oct 2012 02:18:11 +0000 (19:18 -0700)]
Port to ICU LayoutEngine C API
Incidentally, this makes it not crash with icu-le-hb anymore...
I'm not smart / stupid enough to spend two more days debugging C++
linking issues, and this is ABI-stable at least.
Behdad Esfahbod [Fri, 26 Oct 2012 20:48:06 +0000 (13:48 -0700)]
Remove unused members
Behdad Esfahbod [Thu, 25 Oct 2012 23:32:54 +0000 (16:32 -0700)]
Rename and revamp is_zero_width() to be is_default_ignorable()
That's really the logic desired. Except that MONGOLIAN VOWEL SEPARATOR
is not default_ignorable but it really should be. Reported to Unicode.
Based on suggestion from Konstantin Ritt.
Behdad Esfahbod [Wed, 24 Oct 2012 21:02:15 +0000 (14:02 -0700)]
Update TODO
Behdad Esfahbod [Sun, 14 Oct 2012 23:37:09 +0000 (18:37 -0500)]
0.9.5
Behdad Esfahbod [Sun, 7 Oct 2012 21:19:58 +0000 (17:19 -0400)]
Fixup hb_ot_shape_closure()
Broke it when I merged cmap mapping and normalizer. Ouch!
Behdad Esfahbod [Sun, 7 Oct 2012 21:13:46 +0000 (17:13 -0400)]
Mark debug message functions static
Behdad Esfahbod [Wed, 3 Oct 2012 00:44:43 +0000 (20:44 -0400)]
Update UCDN to upstream commit
3f159c87824230b59af56e40e2db32caf6afa51a
- Unicode 6.2.0 goodness,
- Unassigned codepoints now have correct properties. Passes test suite.
Behdad Esfahbod [Tue, 2 Oct 2012 21:42:13 +0000 (17:42 -0400)]
Fix visibility of UCDN symbols
Behdad Esfahbod [Tue, 2 Oct 2012 20:03:18 +0000 (16:03 -0400)]
Import UCDN into source tree
https://github.com/grigorig/ucdn
Behdad Esfahbod [Tue, 2 Oct 2012 18:59:00 +0000 (14:59 -0400)]
Remove Glib thread-safety support
Now that we have pthread detection in configure, we don't need Glib
anymore. Glib will only be a Unicode data provider.
Behdad Esfahbod [Tue, 2 Oct 2012 18:55:32 +0000 (14:55 -0400)]
Check for pthreads
Behdad Esfahbod [Tue, 2 Oct 2012 18:46:34 +0000 (14:46 -0400)]
Add ax_pthread.m4
Behdad Esfahbod [Tue, 2 Oct 2012 18:46:04 +0000 (14:46 -0400)]
Add pkg.m4 to git repo
Behdad Esfahbod [Tue, 2 Oct 2012 18:44:47 +0000 (14:44 -0400)]
Add AC_CONFIG_MACRODIR
Behdad Esfahbod [Wed, 26 Sep 2012 01:35:35 +0000 (21:35 -0400)]
[OT] Only insert dottedcircle if at the beginning of paragraph
If the first char in the run is a combining mark, but there is text
before the run, don't insert dottedcircle.
Part of addressing:
https://bugzilla.redhat.com/show_bug.cgi?id=858736
Behdad Esfahbod [Wed, 26 Sep 2012 01:32:35 +0000 (21:32 -0400)]
[Arabic] Respect Arabic joining from neighboring context
Now we respect Arabic joining across runs.
Behdad Esfahbod [Tue, 25 Sep 2012 21:44:53 +0000 (17:44 -0400)]
[buffer] Save pre/post textual context
To be used for a variety of purposes. We save up to five characters
in each direction. No public API changes, everything is taken care
of already. All clients need to do is to call hb_buffer_add_utf* with
the full text + segment info (or at least some context) instead of
just passing in the segment.
Various operations (hb_buffer_reset, hb_buffer_set_length,
hb_buffer_add*) automatically reset the relevant contexts.
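To make the idea concrete, a C sketch of collecting up to five characters of context on each side of a segment. The struct, the function name, and the choice to store the pre-context nearest-first are assumptions of this example, not a description of the buffer internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONTEXT_LENGTH_SKETCH 5

typedef struct {
  uint32_t pre[CONTEXT_LENGTH_SKETCH];  /* nearest character first */
  size_t pre_len;
  uint32_t post[CONTEXT_LENGTH_SKETCH]; /* nearest character first */
  size_t post_len;
} context_sketch_t;

/* `text` is the full text as code points; the segment to shape is
 * text[item_offset .. item_offset + item_len). */
void save_context_sketch(const uint32_t *text, size_t text_len,
                         size_t item_offset, size_t item_len,
                         context_sketch_t *ctx)
{
  assert(text != NULL && ctx != NULL);
  assert(item_offset <= text_len && item_len <= text_len - item_offset);

  /* Walk backward from the start of the segment. */
  ctx->pre_len = 0;
  for (size_t i = item_offset; i > 0 && ctx->pre_len < CONTEXT_LENGTH_SKETCH; i--)
    ctx->pre[ctx->pre_len++] = text[i - 1];

  /* Walk forward from the end of the segment. */
  ctx->post_len = 0;
  for (size_t i = item_offset + item_len;
       i < text_len && ctx->post_len < CONTEXT_LENGTH_SKETCH; i++)
    ctx->post[ctx->post_len++] = text[i];
}
```

This is why passing the full text plus segment offsets matters: if only the segment is passed, both context lengths come out as zero and there is nothing for joining or dotted-circle decisions to look at.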
Behdad Esfahbod [Tue, 25 Sep 2012 17:59:24 +0000 (13:59 -0400)]
Add hb_utf_prev()
Behdad Esfahbod [Tue, 25 Sep 2012 16:30:16 +0000 (12:30 -0400)]
Slightly optimize UTF-8 parsing
Behdad Esfahbod [Tue, 25 Sep 2012 16:26:12 +0000 (12:26 -0400)]
[buffer] Cleanup / optimize UTF-16 parsing a bit
Behdad Esfahbod [Tue, 25 Sep 2012 15:42:16 +0000 (11:42 -0400)]
Add hb_utf_strlen()
Speeds up UTF-8 parsing by calling strlen().
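The idea, sketched in C with hypothetical names: for 8-bit text the length of a NUL-terminated string can be delegated to the C library's strlen(), which is typically well optimized, while wider code units need a plain loop. Both return a count of code units, not characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* UTF-8: one code unit is one byte, so strlen() does the job. */
size_t utf8_strlen_sketch(const char *text)
{
  assert(text != NULL);
  return strlen(text);
}

/* UTF-16: no library shortcut assumed here, so scan for the terminator. */
size_t utf16_strlen_sketch(const uint16_t *text)
{
  assert(text != NULL);
  size_t n = 0;
  while (text[n] != 0)
    n++;
  return n;
}
```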
Behdad Esfahbod [Tue, 25 Sep 2012 15:22:28 +0000 (11:22 -0400)]
[buffer] Templatize UTF handling
Also move UTF routines into a separate file, to be reused from shapers
that need it.
Behdad Esfahbod [Tue, 25 Sep 2012 15:09:04 +0000 (11:09 -0400)]
[buffer] Towards template'izing different UTF adders
Behdad Esfahbod [Tue, 25 Sep 2012 15:04:41 +0000 (11:04 -0400)]
Minor
Behdad Esfahbod [Tue, 25 Sep 2012 14:50:41 +0000 (10:50 -0400)]
Remove unused indic.cc
Behdad Esfahbod [Tue, 25 Sep 2012 01:51:13 +0000 (21:51 -0400)]
[Indic] Import ragel-generated Indic machine in git
I don't expect ragel to be creating too much noise in its generated
output, and including this in-tree helps users right now. We can
revisit this later if it proves to be too much trouble.
Behdad Esfahbod [Tue, 25 Sep 2012 00:23:00 +0000 (20:23 -0400)]
Use a C++ linker on Windows
On Windows we don't care whether or not we link to libstdc++.
Seems to fix build with mingw32 on msys, as reported by Werner.
Behdad Esfahbod [Tue, 18 Sep 2012 23:42:06 +0000 (19:42 -0400)]
Better autofoo
Behdad Esfahbod [Tue, 18 Sep 2012 00:59:09 +0000 (20:59 -0400)]
Fix dependencies