<title>The distinction between levels 0 and 1: HarfBuzz Manual</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="index.html" title="HarfBuzz Manual">
-<link rel="up" href="clusters.html" title="">
+<link rel="up" href="clusters.html" title="Clusters">
<link rel="prev" href="reordering-in-levels-0-and-1.html" title="Reordering in levels 0 and 1">
<link rel="next" href="level-2.html" title="Level 2">
-<meta name="generator" content="GTK-Doc V1.27.1 (XML mode)">
+<meta name="generator" content="GTK-Doc V1.25 (XML mode)">
<link rel="stylesheet" href="style.css" type="text/css">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<td><a accesskey="p" href="reordering-in-levels-0-and-1.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td>
<td><a accesskey="n" href="level-2.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td>
</tr></table>
-<div class="sect1">
+<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="the-distinction-between-levels-0-and-1"></a>The distinction between levels 0 and 1</h2></div></div></div>
<p>
- So, the above is pretty much what cluster levels 0 and 1 do. The
- only difference between the two is this: in level 0, at the very
- beginning of the shaping process, we also merge clusters between
- base characters and all Unicode marks (combining or not) following
- them. E.g.:
- </p>
+ The preceding examples demonstrate the main effects of using
+ cluster levels 0 and 1. The only difference between the two
+ levels is this: in level 0, at the very beginning of the shaping
+ process, HarfBuzz merges the cluster of each base character
+ with the clusters of all Unicode marks (combining or not) and
+ modifiers that follow it.
+ </p>
+<p>
+ For example, let us start with the following character sequence
+ (top row) and accompanying initial cluster values (bottom row):
+ </p>
<pre class="programlisting">
- A,acute,B
- 0,1 ,2
-</pre>
+ A,acute,B
+ 0,1 ,2
+ </pre>
<p>
- will become:
- </p>
+ The <code class="literal">acute</code> is a Unicode mark. If HarfBuzz is
+ using cluster level 0 on this sequence, then the
+ <code class="literal">A</code> and <code class="literal">acute</code> clusters will
+ merge, and the result will become:
+ </p>
<pre class="programlisting">
- A,acute,B
- 0,0 ,2
-</pre>
+ A,acute,B
+ 0,0 ,2
+ </pre>
+<p>
+ This merger is performed before any other script-shaping
+ steps.
+ </p>
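+<p>
+ The level 0 merging step described above can be sketched in C as
+ follows. This is a minimal illustration, not HarfBuzz code: the
+ <code class="literal">item_t</code> type, its <code class="literal">is_mark</code>
+ flag, and the <code class="literal">merge_marks_into_bases</code> function
+ are hypothetical names, and the sketch assumes the caller has already
+ classified each character as a mark (or modifier) or as a base.
+ </p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item record for illustration; not a HarfBuzz structure. */
typedef struct {
    const char *name;    /* label used only for readability */
    bool        is_mark; /* assumed: true for marks and modifiers */
    unsigned    cluster; /* initial cluster value */
} item_t;

/*
 * Sketch of the level 0 initial step: every mark takes over the cluster
 * value of the item before it. Because the loop runs left to right, a
 * run of several marks all end up with the cluster of their base.
 */
static void merge_marks_into_bases(item_t *items, size_t n)
{
    assert(items != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (items[i].is_mark)
            items[i].cluster = items[i - 1].cluster;
    }
}
```

+<p>
+ Applied to the sequence <code class="literal">A,acute,B</code> with initial
+ clusters <code class="literal">0,1,2</code>, this sketch yields
+ <code class="literal">0,0,2</code>. Skipping the call leaves the values
+ untouched, which corresponds to the level 1 starting point.
+ </p>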
+<p>
+ This initial cluster merging is the default behavior of the
+ Windows shaping engine, and the old HarfBuzz codebase copied
+ that behavior to maintain compatibility. Consequently, it has
+ remained the default behavior in the new HarfBuzz codebase.
+ </p>
+<p>
+ But this initial cluster-merging behavior makes it impossible
+ for client programs to implement some features (such as
+ coloring diacritic marks differently from their base
+ characters). That is why, in level 1, HarfBuzz does not perform
+ the initial merging step.
+ </p>
<p>
- This is the default behavior. We do it because Windows did it and
- old HarfBuzz did it, so this remained the default. But this behavior
- makes it impossible to color diacritic marks differently from their
- base characters. That's why in level 1 we do not perform this
- initial merging step.
- </p>
+ For client programs that rely on HarfBuzz cluster values to
+ perform cursor positioning, level 0 is more convenient. But
+ relying on cluster boundaries for cursor positioning is wrong: cursor
+ positions should be determined based on Unicode grapheme
+ boundaries, not on shaping-cluster boundaries. As such, using
+ level 1 clustering behavior is recommended.
+ </p>
<p>
- For clients, level 0 is more convenient if they rely on HarfBuzz
- clusters for cursor positioning. But that's wrong anyway: cursor
- positions should be determined based on Unicode grapheme boundaries,
- NOT shaping clusters. As such, level 1 clusters are preferred.
- </p>
+ One final facet of levels 0 and 1 is worth noting. HarfBuzz
+ currently does not allow any
+ <span class="emphasis"><em>multiple-substitution</em></span> GSUB lookups to
+ replace a glyph with zero glyphs (in other words, to delete a
+ glyph).
+ </p>
<p>
- One last note about levels 0 and 1. We currently don't allow a
- <code class="literal">MultipleSubst</code> lookup to replace a glyph with zero
- glyphs (i.e., to delete a glyph). But in some other situations,
- glyphs can be deleted. In those cases, if the glyph being deleted is
- the last glyph of its cluster, we make sure to merge the cluster
- with a neighboring cluster.
- </p>
+ In some other situations, however, glyphs can be deleted. In
+ those cases, if the glyph being deleted is the last glyph of its
+ cluster, HarfBuzz makes sure to merge the deleted glyph's
+ cluster with a neighboring cluster.
+ </p>
<p>
- This is, primarily, to make sure that the starting cluster of the
- text always has the cluster index pointing to the start of the text
- for the run; more than one client currently relies on this
- guarantee.
- </p>
+ This is done primarily to make sure that the first cluster of a
+ run always carries the cluster value that points to the start of
+ the run's text; more than one client program currently relies on
+ this guarantee.
+ </p>
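+<p>
+ The deletion rule and the start-of-run guarantee described above can be
+ sketched in C as follows. This is a simplified illustration under stated
+ assumptions, not the HarfBuzz implementation: the
+ <code class="literal">glyph_t</code> type and the
+ <code class="literal">delete_glyph</code> function are hypothetical, and the
+ sketch reads "last glyph of its cluster" as "no neighboring glyph shares
+ its cluster value" and always keeps the smaller value when merging.
+ </p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical glyph record for illustration; not a HarfBuzz structure. */
typedef struct {
    unsigned glyph_id;
    unsigned cluster;
} glyph_t;

/*
 * Remove glyphs[idx] from an array holding *len glyphs. If no neighbor
 * shares the deleted glyph's cluster, fold that cluster into a neighbor
 * (the previous glyph if there is one, otherwise the next one) by giving
 * the neighbor's whole cluster the smaller of the two values.
 */
static void delete_glyph(glyph_t *glyphs, size_t *len, size_t idx)
{
    assert(glyphs != NULL && len != NULL && idx < *len);

    unsigned cluster = glyphs[idx].cluster;
    int shared_prev = idx > 0 && glyphs[idx - 1].cluster == cluster;
    int shared_next = idx + 1 < *len && glyphs[idx + 1].cluster == cluster;

    if (!shared_prev && !shared_next) {
        if (idx > 0) {
            /* Merge backward into the preceding cluster. */
            unsigned old = glyphs[idx - 1].cluster;
            unsigned merged = cluster < old ? cluster : old;
            for (size_t i = idx; i > 0 && glyphs[i - 1].cluster == old; i--)
                glyphs[i - 1].cluster = merged;
        } else if (idx + 1 < *len) {
            /* First glyph of the run: merge forward instead. */
            unsigned old = glyphs[idx + 1].cluster;
            unsigned merged = cluster < old ? cluster : old;
            for (size_t i = idx + 1; i < *len && glyphs[i].cluster == old; i++)
                glyphs[i].cluster = merged;
        }
    }

    /* Close the gap left by the deleted glyph. */
    for (size_t i = idx; i + 1 < *len; i++)
        glyphs[i] = glyphs[i + 1];
    (*len)--;
}
```

+<p>
+ With three glyphs whose clusters are <code class="literal">0,1,2</code>,
+ deleting the first glyph in this sketch leaves two glyphs with clusters
+ <code class="literal">0,2</code>, so the first remaining glyph still points
+ to the start of the run.
+ </p>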
<p>
- Incidentally, Apple's CoreText does something else to maintain the
- same promise: it inserts a glyph with id 65535 at the beginning of
- the glyph string if the glyph corresponding to the first character
- in the run was deleted. HarfBuzz might do something similar in the
- future.
- </p>
+ Incidentally, Apple's CoreText does something different to
+ maintain the same promise: it inserts a glyph with id 65535 at
+ the beginning of the glyph string if the glyph corresponding to
+ the first character in the run was deleted. HarfBuzz might do
+ something similar in the future.
+ </p>
</div>
<div class="footer">
-<hr>Generated by GTK-Doc V1.27.1</div>
+<hr>Generated by GTK-Doc V1.25</div>
</body>
</html>
\ No newline at end of file