doc/libunistring.info

   1 This is libunistring.info, produced by makeinfo version 4.13 from
   2 libunistring.texi.
   3
   4 INFO-DIR-SECTION Software development
   5 START-INFO-DIR-ENTRY
   6 * GNU libunistring: (libunistring).     Unicode string library.
   7 END-INFO-DIR-ENTRY
   8
   9    This manual is for GNU libunistring.
  10
  11 \1f
  12 File: libunistring.info,  Node: Top,  Next: Introduction,  Up: (dir)
  13
  14 GNU libunistring
  15 ****************
  16
  17 * Menu:
  18
  19 * Introduction::                Who may need Unicode strings?
  20 * Conventions::                 Conventions used in this manual
  21 * unitypes.h::                  Elementary types
  22 * unistr.h::                    Elementary Unicode string functions
  23 * uniconv.h::                   Conversions between Unicode and encodings
  24 * unistdio.h::                  Output with Unicode strings
  25 * uniname.h::                   Names of Unicode characters
  26 * unictype.h::                  Unicode character classification and properties
  27 * uniwidth.h::                  Display width
  28 * uniwbrk.h::                   Word breaks in strings
  29 * unilbrk.h::                   Line breaking
  30 * uninorm.h::                   Normalization forms
  31 * unicase.h::                   Case mappings
  32 * uniregex.h::                  Regular expressions
  33 * Using the library::           How to link with the library and use it?
  34 * More functionality::          More advanced functionality
  35 * Licenses::                    Licenses
  36
  37 * Index::                       General Index
  38
  39  --- The Detailed Node Listing ---
  40
  41 Introduction
  42
  43 * Unicode::                     What is Unicode?
  44 * Unicode and i18n::            Unicode and internationalization
  45 * Locale encodings::            What is a locale encoding?
  46 * In-memory representation::    How to represent strings in memory?
  47 * char * strings::              What to keep in mind with `char *' strings
  48 * The wchar_t mess::            Why `wchar_t *' strings are useless
  49 * Unicode strings::             How are Unicode strings represented?
  50
  51 unistr.h
  52
  53 * Elementary string checks::
  54 * Elementary string conversions::
  55 * Elementary string functions::
  56 * Elementary string functions with memory allocation::
  57 * Elementary string functions on NUL terminated strings::
  58
  59 unictype.h
  60
  61 * General category::
  62 * Canonical combining class::
  63 * Bidirectional category::
  64 * Decimal digit value::
  65 * Digit value::
  66 * Numeric value::
  67 * Mirrored character::
  68 * Properties::
  69 * Scripts::
  70 * Blocks::
  71 * ISO C and Java syntax::
  72 * Classifications like in ISO C::
  73
  74 General category
  75
  76 * Object oriented API::
  77 * Bit mask API::
  78
  79 Properties
  80
  81 * Properties as objects::
  82 * Properties as functions::
  83
  84 uniwbrk.h
  85
  86 * Word breaks in a string::
  87 * Word break property::
  88
  89 uninorm.h
  90
  91 * Decomposition of characters::
  92 * Composition of characters::
  93 * Normalization of strings::
  94 * Normalizing comparisons::
  95 * Normalization of streams::
  96
   97 unicase.h
  98
  99 * Case mappings of characters::
 100 * Case mappings of strings::
 101 * Case mappings of substrings::
 102 * Case insensitive comparison::
 103 * Case detection::
 104
 105 Using the library
 106
 107 * Installation::
 108 * Compiler options::
 109 * Include files::
 110 * Autoconf macro::
 111 * Reporting problems::
 112
 113 Licenses
 114
 115 * GNU GPL::                     GNU General Public License
 116 * GNU LGPL::                    GNU Lesser General Public License
 117 * GNU FDL::                     GNU Free Documentation License
 118
 119 \1f
 120 File: libunistring.info,  Node: Introduction,  Next: Conventions,  Prev: Top,  Up: Top
 121
 122 1 Introduction
 123 **************
 124
 125    This library provides functions for manipulating Unicode strings and
 126 for manipulating C strings according to the Unicode standard.
 127
 128    It consists of the following parts:
 129
 130 `<unistr.h>'
 131      elementary string functions
 132
 133 `<uniconv.h>'
 134      conversion from/to legacy encodings
 135
 136 `<unistdio.h>'
 137      formatted output to strings
 138
 139 `<uniname.h>'
 140      character names
 141
 142 `<unictype.h>'
 143      character classification and properties
 144
 145 `<uniwidth.h>'
 146      string width when using nonproportional fonts
 147
 148 `<uniwbrk.h>'
 149      word breaks
 150
 151 `<unilbrk.h>'
 152      line breaking algorithm
 153
 154 `<uninorm.h>'
 155      normalization (composition and decomposition)
 156
 157 `<unicase.h>'
 158      case folding
 159
 160 `<uniregex.h>'
 161      regular expressions (not yet implemented)
 162
 163    libunistring is for you if your application involves non-trivial text
 164 processing, such as upper/lower case conversions, line breaking,
 165 operations on words, or more advanced analysis of text.  Text provided
 166 by the user can, in general, contain characters of all kinds of
 167 scripts.  The text processing functions provided by this library handle
 168 all scripts and all languages.
 169
 170    libunistring is for you if your application already uses the ISO C /
 171 POSIX `<ctype.h>', `<wctype.h>' functions and the text it operates on is
 172 provided by the user and can be in any language.
 173
 174    libunistring is also for you if your application uses Unicode
 175 strings as internal in-memory representation.
 176
 177 * Menu:
 178
 179 * Unicode::                     What is Unicode?
 180 * Unicode and i18n::            Unicode and internationalization
 181 * Locale encodings::            What is a locale encoding?
 182 * In-memory representation::    How to represent strings in memory?
 183 * char * strings::              What to keep in mind with `char *' strings
 184 * The wchar_t mess::            Why `wchar_t *' strings are useless
 185 * Unicode strings::             How are Unicode strings represented?
 186
 187 \1f
 188 File: libunistring.info,  Node: Unicode,  Next: Unicode and i18n,  Up: Introduction
 189
 190 1.1 Unicode
 191 ===========
 192
 193    Unicode is a standardized repertoire of characters that contains
 194 characters from all scripts of the world, from Latin letters to Chinese
 195 ideographs and Babylonian cuneiform glyphs.  It also specifies how
 196 these characters are to be rendered on a screen or on paper, and how
 197 common text processing (word selection, line breaking, uppercasing of
 198 page titles etc.) is supposed to behave on Unicode text.
 199
 200    Unicode also specifies three ways of storing sequences of Unicode
 201 characters in a computer whose basic unit of data is an 8-bit byte:
 202 UTF-8
 203      Every character is represented as 1 to 4 bytes.
 204
 205 UTF-16
 206      Every character is represented as 1 to 2 units of 16 bits.
 207
 208 UTF-32, a.k.a. UCS-4
 209      Every character is represented as 1 unit of 32 bits.
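
   The unit counts listed above can be sketched as follows.  This is a
minimal illustration, not part of libunistring; the function names
`utf8_units', `utf16_units' and `utf32_units' are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 8-bit units needed to store code point cp in UTF-8. */
int utf8_units(uint32_t cp)
{
    assert(cp <= 0x10FFFF);          /* highest Unicode code point */
    if (cp < 0x80)    return 1;      /* ASCII */
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;      /* rest of the Basic Multilingual Plane */
    return 4;                        /* supplementary planes */
}

/* Number of 16-bit units needed in UTF-16; 2 means a surrogate pair. */
int utf16_units(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 1 : 2;
}

/* UTF-32 always uses exactly one 32-bit unit. */
int utf32_units(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return 1;
}
```

   For example, U+20AC EURO SIGN takes 3 units in UTF-8 and 1 unit in
UTF-16, while a character outside the Basic Multilingual Plane such as
U+1F600 takes 4 units in UTF-8 and 2 units in UTF-16.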
 210
 211    For encoding Unicode text in a file, UTF-8 is usually used.  For
  212 encoding Unicode strings in memory for a program, any of the three
  213 encoding forms can reasonably be used.
 214
 215    Unicode is widely used on the web.  Prior to the use of Unicode, web
 216 pages were in many different encodings (ISO-8859-1 for English, French,
 217 Spanish, ISO-8859-2 for Polish, ISO-8859-7 for Greek, KOI8-R for
 218 Russian, GB2312 or BIG5 for Chinese, ISO-2022-JP-2 or EUC-JP or
  219 Shift_JIS for Japanese, and many, many others).  It was next to
  220 impossible to create a single document that contained both Chinese
  221 and Polish text.  Due to the many encodings for Japanese, even the
  222 processing of pure Japanese text was error prone.
 223
 224    References:
 225    * The Unicode standard: `http://www.unicode.org/'
 226
 227    * Definition of UTF-8: `http://www.rfc-editor.org/rfc/rfc3629.txt'
 228
 229    * Definition of UTF-16: `http://www.rfc-editor.org/rfc/rfc2781.txt'
 230
 231    * Markus Kuhn's UTF-8 and Unicode FAQ:
 232      `http://www.cl.cam.ac.uk/~mgk25/unicode.html'
 233
 234 \1f
 235 File: libunistring.info,  Node: Unicode and i18n,  Next: Locale encodings,  Prev: Unicode,  Up: Introduction
 236
 237 1.2 Unicode and Internationalization
 238 ====================================
 239
 240    Internationalization is the process of changing the source code of a
 241 program so that it can meet the expectations of users in any culture,
 242 if culture specific data (translations, images etc.) are provided.
 243
 244    Use of Unicode is not strictly required for internationalization,
 245 but it makes internationalization much easier, because operations that
 246 need to look at specific characters (like hyphenation, spell checking,
 247 or the automatic conversion of double-quotes to opening and closing
 248 double-quote characters) don't need to consider multiple possible
 249 encodings of the text.
 250
  251    Use of Unicode also enables multilingualization: the ability to
  252 have text in multiple languages present in the same document or even
  253 in the same line of text.
 254
 255    But use of Unicode is not everything.  Internationalization usually
 256 consists of three features:
 257    * Use of Unicode where needed for text processing.  This is what
 258      this library is for.
 259
  260    * Use of message catalogs for messages shown to the user.  This is
 261      what GNU gettext is about.
 262
 263    * Use of locale specific conventions for date and time formats, for
 264      numeric formatting, or for sorting of text.  This can be done
 265      adequately with the POSIX APIs and the implementation of locales
 266      in the GNU C library.
 267
 268 \1f
 269 File: libunistring.info,  Node: Locale encodings,  Next: In-memory representation,  Prev: Unicode and i18n,  Up: Introduction
 270
 271 1.3 Locale encodings
 272 ====================
 273
 274    A locale is a set of cultural conventions.  According to POSIX, for
 275 a program, at any moment, there is one locale being designated as the
  276 "current locale".  (Actually, POSIX also supports one locale per
  277 thread, but this feature is not yet universally implemented and not
  278 widely used.)  The locale is partitioned into several aspects, called
  279 the "categories" of the locale.  The main categories are:
 280    * The character encoding and the character properties.  This is the
 281      `LC_CTYPE' category.
 282
 283    * The sorting rules for text.  This is the `LC_COLLATE' category.
 284
 285    * The language specific translations of messages.  This is the
 286      `LC_MESSAGES' category.
 287
 288    * The formatting rules for numbers, such as the decimal separator.
 289      This is the `LC_NUMERIC' category.
 290
 291    * The formatting rules for amounts of money.  This is the
 292      `LC_MONETARY' category.
 293
 294    * The formatting of date and time.  This is the `LC_TIME' category.
 295
 296    In particular, the `LC_CTYPE' category of the current locale
 297 determines the character encoding.  This is the encoding of `char *'
 298 strings.  We also call it the "locale encoding".  GNU libunistring has
 299 a function, `locale_charset', that returns a standardized (platform
 300 independent) name for this encoding.
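
   As a point of comparison, the following sketch queries the encoding
of the current locale with the POSIX function `nl_langinfo' rather than
with `locale_charset'.  It assumes a POSIX system; the helper name
`current_codeset' is hypothetical, and the returned name is spelled
however the platform chooses, which is the platform dependence that a
standardized name avoids.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Adopt the locale named by the environment and return the name of its
   character encoding as the C library reports it.  The spelling of the
   name varies between platforms. */
const char *current_codeset(void)
{
    /* Without a setlocale call the program stays in the "C" locale. */
    if (setlocale(LC_CTYPE, "") == NULL)
        setlocale(LC_CTYPE, "C");    /* environment names an unknown locale */
    const char *name = nl_langinfo(CODESET);
    assert(name != NULL);
    return name;
}
```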
 301
 302    All locale encodings used on glibc systems are essentially ASCII
 303 compatible: Most graphic ASCII characters have the same representation,
 304 as a single byte, in that encoding as in ASCII.
 305
 306    Among the possible locale encodings are UTF-8 and GB18030.  Both
  307 can represent any Unicode character as a sequence of bytes.  UTF-8
 308 is used in most of the world, whereas GB18030 is used in the People's
 309 Republic of China, because it is backward compatible with the GB2312
 310 encoding that was used in this country earlier.
 311
 312    The legacy locale encodings, ISO-8859-15 (which supplanted
 313 ISO-8859-1 in most of Europe), ISO-8859-2, KOI8-R, EUC-JP, etc., are
 314 still in use in many places, though.
 315
 316    UTF-16 and UTF-32 are not used as locale encodings, because they are
 317 not ASCII compatible.
 318
 319 \1f
 320 File: libunistring.info,  Node: In-memory representation,  Next: char * strings,  Prev: Locale encodings,  Up: Introduction
 321
 322 1.4 Choice of in-memory representation of strings
 323 =================================================
 324
  325    There are three ways of representing strings in the memory of a
  326 running program.
 327    * As `char *' strings.  Such strings are represented in locale
 328      encoding.  This approach is employed when not much text processing
 329      is done by the program.  When some Unicode aware processing is to
 330      be done, a string is converted to Unicode on the fly and back to
 331      locale encoding afterwards.
 332
 333    * As UTF-8 or UTF-16 or UTF-32 strings.  This implies that
 334      conversion from locale encoding to Unicode is performed on input,
 335      and in the opposite direction on output.  This approach is
 336      employed when the program does a significant amount of text
 337      processing, or when the program has multiple threads operating on
 338      the same data but in different locales.
 339
  340    * As `wchar_t *', a.k.a. "wide strings".  This approach is misguided;
 341      see *note The wchar_t mess::.
 342
 343 \1f
 344 File: libunistring.info,  Node: char * strings,  Next: The wchar_t mess,  Prev: In-memory representation,  Up: Introduction
 345
 346 1.5 `char *' strings
 347 ====================
 348
  349    The classical C strings, with their C library support standardized by
 350 ISO C and POSIX, can be used in internationalized programs with some
 351 precautions.  The problem with this API is that many of the C library
 352 functions for strings don't work correctly on strings in locale
 353 encodings, leading to bugs that only people in some cultures of the
 354 world will experience.
 355
 356    The first problem with the C library API is the support of multibyte
 357 locales.  According to the locale encoding, in general, every character
 358 is represented by one or more bytes (up to 4 bytes in practice -- but
 359 use `MB_LEN_MAX' instead of the number 4 in the code).  When every
 360 character is represented by only 1 byte, we speak of an "unibyte
 361 locale", otherwise of a "multibyte locale".  It is important to realize
 362 that the majority of Unix installations nowadays use UTF-8 or GB18030
 363 as locale encoding; therefore, the majority of users are using
 364 multibyte locales.
 365
 366    The important fact to remember is: _A `char' is a byte, not a
 367 character._
 368
 369    As a consequence:
 370    * The `<ctype.h>' API is useless in this context; it does not work in
 371      multibyte locales.
 372
 373    * The `strlen' function does not return the number of characters in
 374      a string.  Nor does it return the number of screen columns occupied
 375      by a string after it is output.  It merely returns the number of
 376      _bytes_ occupied by a string.
 377
 378    * Truncating a string, for example, with `strncpy', can have the
 379      effect of truncating it in the middle of a multibyte character.
 380      Such a string will, when output, have a garbled character at its
 381      end, often represented by a hollow box.
 382
 383    * `strchr' and `strrchr' do not work with multibyte strings if the
 384      locale encoding is GB18030 and the character to be searched is a
 385      digit.
 386
 387    * `strstr' does not work with multibyte strings if the locale
 388      encoding is different from UTF-8.
 389
 390    * `strcspn', `strpbrk', `strspn' cannot work correctly in multibyte
 391      locales: they assume the second argument is a list of single-byte
 392      characters.  Even in this simple case, they do not work with
 393      multibyte strings if the locale encoding is GB18030 and one of the
 394      characters to be searched is a digit.
 395
 396    * `strsep' and `strtok_r' do not work with multibyte strings unless
 397      all of the delimiter characters are ASCII characters < 0x30.
 398
 399    * The `strcasecmp', `strncasecmp', and `strcasestr' functions do not
 400      work with multibyte strings.
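
   The `strlen' and truncation pitfalls listed above can be illustrated
as follows.  This is a minimal sketch that assumes the string is valid
UTF-8; it does not apply to other locale encodings.  The function names
`utf8_char_count' and `utf8_truncation_point' are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the characters in a NUL-terminated UTF-8 string by counting the
   bytes that are not continuation bytes (10xxxxxx). */
size_t utf8_char_count(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            count++;
    return count;
}

/* Largest length <= max_bytes at which s can be cut without splitting a
   multibyte character. */
size_t utf8_truncation_point(const char *s, size_t max_bytes)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len <= max_bytes)
        return len;
    size_t cut = max_bytes;
    /* Back up while the byte at the cut is a continuation byte. */
    while (cut > 0 && ((unsigned char) s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

   For the 5-character string "h\xC3\xA9llo", `strlen' returns 6.
Cutting it after 2 bytes would split the second character, so the
second function backs up to 1.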
 401
 402    The workarounds can be found in GNU gnulib
 403 `http://www.gnu.org/software/gnulib/'.
  404    * gnulib has modules `mbchar', `mbiter', `mbuiter' that represent
  405      multibyte characters and make it possible to iterate across a
  406      multibyte string with the same ease as through a unibyte string.
 407
 408    * gnulib has functions `mbslen' and `mbswidth' that can be used
 409      instead of `strlen' when the number of characters or the number of
 410      screen columns of a string is requested.
 411
 412    * gnulib has functions `mbschr' and `mbsrrchr' that are like
 413      `strchr' and `strrchr', but work in multibyte locales.
 414
  415    * gnulib has a function `mbsstr' that is like `strstr', but works
  416      in multibyte locales.
 417
 418    * gnulib has functions `mbscspn', `mbspbrk', `mbsspn' that are like
 419      `strcspn', `strpbrk', `strspn', but work in multibyte locales.
 420
 421    * gnulib has functions `mbssep' and `mbstok_r' that are like
 422      `strsep' and `strtok_r' but work in multibyte locales.
 423
 424    * gnulib has functions `mbscasecmp', `mbsncasecmp', `mbspcasecmp',
 425      and `mbscasestr' that are like `strcasecmp', `strncasecmp', and
 426      `strcasestr', but work in multibyte locales.  Still, the function
 427      `ulc_casecmp' is preferable to these functions; see below.
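
   The gnulib functions themselves are not shown here.  As a rough
illustration of character-by-character iteration over a string in
locale encoding, the following sketch counts characters with the ISO C
function `mbrlen'.  The name `locale_char_count' is hypothetical, and
this is not gnulib's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count the characters of s in the current locale encoding.
   Returns (size_t) -1 if s contains an invalid or incomplete sequence. */
size_t locale_char_count(const char *s)
{
    assert(s != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);        /* initial conversion state */
    size_t remaining = strlen(s);
    size_t count = 0;
    while (remaining > 0) {
        size_t len = mbrlen(s, remaining, &state);
        if (len == (size_t) -1 || len == (size_t) -2)
            return (size_t) -1;
        if (len == 0)                       /* defensive: NUL character */
            len = 1;
        s += len;
        remaining -= len;
        count++;
    }
    return count;
}
```

   The result depends on the `LC_CTYPE' category in effect, so a program
would normally call `setlocale' before using such a function.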
 428
 429    The second problem with the C library API is that it has some
 430 assumptions built-in that are not valid in some languages:
 431    * It assumes that there are only two forms of every character:
 432      uppercase and lowercase.  This is not true for Croatian, where the
 433      character LETTER DZ WITH CARON comes in three forms: LATIN CAPITAL
 434      LETTER DZ WITH CARON (DZ), LATIN CAPITAL LETTER D WITH SMALL
 435      LETTER Z WITH CARON (Dz), LATIN SMALL LETTER DZ WITH CARON (dz).
 436
 437    * It assumes that uppercasing of 1 character leads to 1 character.
 438      This is not true for German, where the LATIN SMALL LETTER SHARP S,
 439      when uppercased, becomes `SS'.
 440
 441    * It assumes that there is 1:1 mapping between uppercase and
 442      lowercase forms.  This is not true for the Greek sigma: GREEK
 443      CAPITAL LETTER SIGMA is the uppercase of both GREEK SMALL LETTER
 444      SIGMA and GREEK SMALL LETTER FINAL SIGMA.
 445
 446    * It assumes that the upper/lowercase mappings are position
 447      independent.  This is not true for the Greek sigma and the
 448      Lithuanian i.
 449
 450    The correct way to deal with this problem is
 451   1. to provide functions for titlecasing, as well as for upper- and
 452      lowercasing,
 453
  454   2. to view case transformations as functions that operate on strings,
 455      rather than on characters.
 456
 457    This is implemented in this library, through the functions declared
 458 in `<unicase.h>', see *note unicase.h::.
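
   To see why case mapping has to operate on strings, consider the
following toy sketch.  It is not the library's algorithm: the
hypothetical function `toy_upper' handles only ASCII letters and the
German sharp s in a UTF-8 string, and copies everything else unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy uppercasing of a NUL-terminated UTF-8 string.  Handles only ASCII
   letters and U+00DF LATIN SMALL LETTER SHARP S (bytes C3 9F), which
   becomes "SS".  Returns a malloc'ed string, or NULL if out of memory. */
char *toy_upper(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    /* The 2-byte sharp s becomes 2 bytes, so len + 1 bytes suffice. */
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) s[i];
        if (c == 0xC3 && (unsigned char) s[i + 1] == 0x9F) {
            out[o++] = 'S';                 /* one character in ... */
            out[o++] = 'S';                 /* ... two characters out */
            i++;
        } else if (c >= 'a' && c <= 'z') {
            out[o++] = (char) (c - 'a' + 'A');
        } else {
            out[o++] = (char) c;
        }
    }
    out[o] = '\0';
    return out;
}
```

   The input "stra\xC3\x9F" "e" has 6 characters and the result
"STRASSE" has 7, which a function mapping one character to one
character cannot produce.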
 459
 460 \1f
 461 File: libunistring.info,  Node: The wchar_t mess,  Next: Unicode strings,  Prev: char * strings,  Up: Introduction
 462
 463 1.6 The `wchar_t' mess
 464 ======================
 465
 466    The ISO C and POSIX standard creators made an attempt to fix the
 467 first problem mentioned in the previous section.  They introduced
 468    * a type `wchar_t', designed to encapsulate an entire character,
 469
 470    * a "wide string" type `wchar_t *', and
 471
 472    * functions declared in `<wctype.h>' that were meant to supplant the
 473      ones in `<ctype.h>'.
 474
  475    Unfortunately, this API and its implementation have numerous problems:
 476
 477    * On AIX and Windows platforms, `wchar_t' is a 16-bit type.  This
 478      means that it can never accommodate an entire Unicode character.
 479      Either the `wchar_t *' strings are limited to characters in UCS-2
 480      (the "Basic Multilingual Plane" of Unicode), or -- if `wchar_t *'
 481      strings are encoded in UTF-16 -- a `wchar_t' represents only half
 482      of a character in the worst case, making the `<wctype.h>' functions
 483      pointless.
 484
 485    * On Solaris and FreeBSD, the `wchar_t' encoding is locale dependent
 486      and undocumented.  This means, if you want to know any property of
 487      a `wchar_t' character, other than the properties defined by
 488      `<wctype.h>' -- such as whether it's a dash, currency symbol,
 489      paragraph separator, or similar --, you have to convert it to
 490      `char *' encoding first, by use of the function `wctomb'.
 491
 492    * When you read a stream of wide characters, through the functions
 493      `fgetwc' and `fgetws', and when the input stream/file is not in
 494      the expected encoding, you have no way to determine the invalid
 495      byte sequence and do some corrective action.  If you use these
 496      functions, your program becomes "garbage in - more garbage out" or
 497      "garbage in - abort".
 498
 499    As a consequence, it is better to use multibyte strings, as
 500 explained in the previous section.  Such multibyte strings can bypass
 501 limitations of the `wchar_t' type, if you use functions defined in
 502 gnulib and libunistring for text processing.  They can also faithfully
 503 transport malformed characters that were present in the input, without
 504 requiring the program to produce garbage or abort.
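
   The 16-bit limitation can be made concrete with a sketch of the
UTF-16 encoding rule from RFC 2781.  The function name `utf16_encode'
is hypothetical; the point is that a character outside the Basic
Multilingual Plane needs two 16-bit units, so a single 16-bit `wchar_t'
can hold only one half of it.

```c
#include <assert.h>
#include <stdint.h>

/* Encode code point cp as UTF-16 into out[0..1].
   Returns the number of 16-bit units written (1 or 2). */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF);
    assert(cp < 0xD800 || cp > 0xDFFF);     /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (uint16_t) cp;
        return 1;
    }
    cp -= 0x10000;                                /* 20 bits remain */
    out[0] = (uint16_t) (0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```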
 505
 506 \1f
 507 File: libunistring.info,  Node: Unicode strings,  Prev: The wchar_t mess,  Up: Introduction
 508
 509 1.7 Unicode strings
 510 ===================
 511
 512    libunistring supports Unicode strings in three representations:
 513    * UTF-8 strings, through the type `uint8_t *'.  The units are bytes
 514      (`uint8_t').
 515
  516    * UTF-16 strings, through the type `uint16_t *'.  The units are
 517      16-bit memory words (`uint16_t').
 518
 519    * UTF-32 strings, through the type `uint32_t *'.  The units are
 520      32-bit memory words (`uint32_t').
 521
 522    As with C strings, there are two variants:
 523    * Unicode strings with a terminating NUL character are represented as
 524      a pointer to the first unit of the string.  There is a unit
 525      containing a 0 value at the end.  It is considered part of the
 526      string for all memory allocation purposes, but is not considered
 527      part of the string for all other logical purposes.
 528
 529    * Unicode strings where embedded NUL characters are allowed.  These
 530      are represented by a pointer to the first unit and the number of
 531      units (not bytes!) of the string.  In this setting, there is no
 532      trailing zero-valued unit used as "end marker".
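
   The difference between the two variants, and between units and
bytes, can be sketched for UTF-16 as follows.  The helper names
`units_until_nul' and `bytes_to_allocate' are hypothetical and not part
of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in units of a NUL-terminated UTF-16 string, not counting the
   terminating zero unit. */
size_t units_until_nul(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Bytes needed to store a NUL-terminated UTF-16 string of n units:
   the terminator counts for allocation purposes. */
size_t bytes_to_allocate(size_t n)
{
    return (n + 1) * sizeof (uint16_t);
}
```

   A string of 2 units therefore needs 6 bytes in the NUL-terminated
variant.  An array such as `{ 0x41, 0, 0x42 }' can only be handled in
the second variant, as a pointer together with the unit count 3,
because scanning for the terminator would stop after the first unit.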
 533
 534 \1f
 535 File: libunistring.info,  Node: Conventions,  Next: unitypes.h,  Prev: Introduction,  Up: Top
 536
 537 2 Conventions
 538 *************
 539
 540    This chapter explains conventions valid throughout the libunistring
 541 library.
 542
 543    Variables of type `char *' denote C strings in locale encoding.  See
 544 *note Locale encodings::.
 545
 546    Variables of type `uint8_t *' denote UTF-8 strings.  Their units are
 547 bytes.
 548
 549    Variables of type `uint16_t *' denote UTF-16 strings, without byte
 550 order mark.  Their units are 2-byte words.
 551
 552    Variables of type `uint32_t *' denote UTF-32 strings, without byte
 553 order mark.  Their units are 4-byte words.
 554
 555    Argument pairs `(S, N)' denote a string `S[0..N-1]' with exactly N
 556 units.
 557
 558    All functions with prefix `ulc_' operate on C strings in locale
 559 encoding.
 560
 561    All functions with prefix `u8_' operate on UTF-8 strings.
 562
 563    All functions with prefix `u16_' operate on UTF-16 strings.
 564
 565    All functions with prefix `u32_' operate on UTF-32 strings.
 566
 567    For every function with prefix `u8_', operating on UTF-8 strings,
 568 there is also a corresponding function with prefix `u16_', operating on
 569 UTF-16 strings, and a corresponding function with prefix `u32_',
 570 operating on UTF-32 strings.  Their description is analogous; in this
 571 documentation we describe only the function that operates on UTF-8
 572 strings, for brevity.
 573
 574    A declaration with a variable N denotes the three concrete
 575 declarations with N = 8, N = 16, N = 32.
 576
 577    All parameters starting with `str' and the parameters of functions
 578 starting with `u8_str'/`u16_str'/`u32_str' denote a NUL terminated
 579 string.
 580
 581    Error values are always returned through the `errno' variable,
 582 usually with a return value that indicates the presence of an error
  583 (NULL for functions that return a pointer, or -1 for functions that
 584 return an `int').
 585
 586    Functions returning a string result take a `(RESULTBUF, LENGTHP)'
 587 argument pair.  If RESULTBUF is not NULL and the result fits into
 588 `*LENGTHP' units, it is put in RESULTBUF, and RESULTBUF is returned.
 589 Otherwise, a freshly allocated string is returned.  In both cases,
 590 `*LENGTHP' is set to the length (number of units) of the returned
 591 string.  In case of error, NULL is returned and `errno' is set.
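
   The `(RESULTBUF, LENGTHP)' convention can be sketched with a
hypothetical function `sketch_copy' that merely copies its input.  It
is not a libunistring function; it only shows the buffer handling that
the convention implies for the callee and the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function, NOT part of libunistring: copies the n units of
   s while following the (RESULTBUF, LENGTHP) convention. */
uint8_t *sketch_copy(const uint8_t *s, size_t n,
                     uint8_t *resultbuf, size_t *lengthp)
{
    assert(lengthp != NULL);
    uint8_t *result;
    if (resultbuf != NULL && n <= *lengthp) {
        result = resultbuf;                 /* fits: reuse caller's buffer */
    } else {
        result = malloc(n > 0 ? n : 1);     /* otherwise allocate freshly */
        if (result == NULL) {
            errno = ENOMEM;
            return NULL;
        }
    }
    if (n > 0)
        memcpy(result, s, n);
    *lengthp = n;                           /* length in units, both cases */
    return result;
}
```

   A caller sets `*LENGTHP' to the capacity of its buffer before the
call, and afterwards frees the result only if it differs from the
buffer it passed in.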
 592
 593 \1f
 594 File: libunistring.info,  Node: unitypes.h,  Next: unistr.h,  Prev: Conventions,  Up: Top
 595
 596 3 Elementary types `<unitypes.h>'
 597 *********************************
 598
 599    The include file `<unitypes.h>' provides the following basic types.
 600
 601  -- Type: uint8_t
 602  -- Type: uint16_t
 603  -- Type: uint32_t
 604      These are the storage units of UTF-8/16/32 strings, respectively.
 605      The definitions are taken from `<stdint.h>', on platforms where
 606      this include file is present.
 607
 608  -- Type: ucs4_t
  609      This type represents a single Unicode character, outside of a
  610      UTF-32 string.
 611
 612 \1f
 613 File: libunistring.info,  Node: unistr.h,  Next: uniconv.h,  Prev: unitypes.h,  Up: Top
 614
 615 4 Elementary Unicode string functions `<unistr.h>'
 616 **************************************************
 617
 618    This include file declares elementary functions for Unicode strings.
 619 It is essentially the equivalent of what `<string.h>' is for C strings.
 620
 621 * Menu:
 622
 623 * Elementary string checks::
 624 * Elementary string conversions::
 625 * Elementary string functions::
 626 * Elementary string functions with memory allocation::
 627 * Elementary string functions on NUL terminated strings::
 628
 629 \1f
 630 File: libunistring.info,  Node: Elementary string checks,  Next: Elementary string conversions,  Up: unistr.h
 631
 632 4.1 Elementary string checks
 633 ============================
 634
 635    The following function is available to verify the integrity of a
 636 Unicode string.
 637
 638  -- Function: const uint8_t * u8_check (const uint8_t *S, size_t N)
 639  -- Function: const uint16_t * u16_check (const uint16_t *S, size_t N)
 640  -- Function: const uint32_t * u32_check (const uint32_t *S, size_t N)
 641      This function checks whether a Unicode string is well-formed.  It
 642      returns NULL if valid, or a pointer to the first invalid unit
 643      otherwise.
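
   What "well-formed" means for UTF-8 can be sketched with a
hand-written checker that uses the same return convention.  The
function `sketch_u8_check' is hypothetical and is not the library's
implementation; it rejects stray or missing continuation bytes,
overlong forms, surrogates, and values above 0x10FFFF, following RFC
3629.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns NULL if s[0..n-1] is well-formed UTF-8, otherwise a pointer to
   the first unit of the first malformed character. */
const uint8_t *sketch_u8_check(const uint8_t *s, size_t n)
{
    assert(s != NULL || n == 0);
    const uint8_t *end = s + n;
    while (s < end) {
        uint8_t c = s[0];
        size_t len;
        uint32_t cp, min;
        if (c < 0x80) { s++; continue; }                 /* ASCII */
        else if (c >= 0xC2 && c <= 0xDF) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if (c >= 0xE0 && c <= 0xEF) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if (c >= 0xF0 && c <= 0xF4) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return s;                                   /* invalid lead byte */
        if ((size_t) (end - s) < len)
            return s;                                    /* truncated */
        for (size_t i = 1; i < len; i++) {
            if ((s[i] & 0xC0) != 0x80)
                return s;                                /* not 10xxxxxx */
            cp = (cp << 6) | (s[i] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return s;                  /* overlong, out of range, surrogate */
        s += len;
    }
    return NULL;
}
```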
 644
 645 \1f
 646 File: libunistring.info,  Node: Elementary string conversions,  Next: Elementary string functions,  Prev: Elementary string checks,  Up: unistr.h
 647
 648 4.2 Elementary string conversions
 649 =================================
 650
 651    The following functions perform conversions between the different
 652 forms of Unicode strings.
 653
 654  -- Function: uint16_t * u8_to_u16 (const uint8_t *S, size_t N,
 655           uint16_t *RESULTBUF, size_t *LENGTHP)
  656      Converts a UTF-8 string to a UTF-16 string.
 657
 658  -- Function: uint32_t * u8_to_u32 (const uint8_t *S, size_t N,
 659           uint32_t *RESULTBUF, size_t *LENGTHP)
  660      Converts a UTF-8 string to a UTF-32 string.
 661
 662  -- Function: uint8_t * u16_to_u8 (const uint16_t *S, size_t N, uint8_t
 663           *RESULTBUF, size_t *LENGTHP)
  664      Converts a UTF-16 string to a UTF-8 string.
 665
 666  -- Function: uint32_t * u16_to_u32 (const uint16_t *S, size_t N,
 667           uint32_t *RESULTBUF, size_t *LENGTHP)
  668      Converts a UTF-16 string to a UTF-32 string.
 669
 670  -- Function: uint8_t * u32_to_u8 (const uint32_t *S, size_t N, uint8_t
 671           *RESULTBUF, size_t *LENGTHP)
  672      Converts a UTF-32 string to a UTF-8 string.
 673
 674  -- Function: uint16_t * u32_to_u16 (const uint32_t *S, size_t N,
 675           uint16_t *RESULTBUF, size_t *LENGTHP)
  676      Converts a UTF-32 string to a UTF-16 string.
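
   To show what such a conversion involves, here is a sketch of decoding
UTF-16 into UTF-32.  The function `sketch_u16_to_u32' is hypothetical:
it writes into a caller-supplied array of fixed capacity instead of
following the `(RESULTBUF, LENGTHP)' convention, and it is not the
library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes the n units of s into out, which has room for outcap code
   points.  Returns the number of code points written, or (size_t) -1 if
   s contains an unpaired surrogate or out is too small. */
size_t sketch_u16_to_u32(const uint16_t *s, size_t n,
                         uint32_t *out, size_t outcap)
{
    assert((s != NULL || n == 0) && (out != NULL || outcap == 0));
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {           /* high surrogate */
            if (i + 1 >= n || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
                return (size_t) -1;                   /* no low surrogate */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t) (s[i + 1] - 0xDC00);
            i++;                                      /* pair consumed */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t) -1;                       /* stray low surrogate */
        }
        if (count == outcap)
            return (size_t) -1;                       /* out is full */
        out[count++] = cp;
    }
    return count;
}
```

   Three UTF-16 units can thus yield only two UTF-32 units, which is why
the result length has to be reported separately from N.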
 677
 678 \1f
 679 File: libunistring.info,  Node: Elementary string functions,  Next: Elementary string functions with memory allocation,  Prev: Elementary string conversions,  Up: unistr.h
 680
 681 4.3 Elementary string functions
 682 ===============================
 683
 684    The following functions inspect and return details about the first
 685 character in a Unicode string.
 686
 687  -- Function: int u8_mblen (const uint8_t *S, size_t N)
 688  -- Function: int u16_mblen (const uint16_t *S, size_t N)
 689  -- Function: int u32_mblen (const uint32_t *S, size_t N)
 690      Returns the length (number of units) of the first character in S,
 691      which is no longer than N.  Returns 0 if it is the NUL character.
 692      Returns -1 upon failure.
 693
 694      This function is similar to `mblen', except that it operates on a
 695      Unicode string and that S must not be NULL.
 696
 697  -- Function: int u8_mbtouc_unsafe (ucs4_t *PUC, const uint8_t *S,
 698           size_t N)
 699  -- Function: int u16_mbtouc_unsafe (ucs4_t *PUC, const uint16_t *S,
 700           size_t N)
 701  -- Function: int u32_mbtouc_unsafe (ucs4_t *PUC, const uint32_t *S,
 702           size_t N)
 703      Returns the length (number of units) of the first character in S,
 704      putting its `ucs4_t' representation in `*PUC'.  Upon failure,
 705      `*PUC' is set to `0xfffd', and an appropriate number of units is
 706      returned.
 707
 708      The number of available units, N, must be > 0.
 709
 710      This function is similar to `mbtowc', except that it operates on a
 711      Unicode string, PUC and S must not be NULL, N must be > 0, and the
 712      NUL character is not treated specially.
 713
 714  -- Function: int u8_mbtouc (ucs4_t *PUC, const uint8_t *S, size_t N)
 715  -- Function: int u16_mbtouc (ucs4_t *PUC, const uint16_t *S, size_t N)
 716  -- Function: int u32_mbtouc (ucs4_t *PUC, const uint32_t *S, size_t N)
 717      This function is like `u8_mbtouc_unsafe', except that it will
 718      detect an invalid UTF-8 character, even if the library is compiled
 719      without `--enable-safety'.
 720
 721  -- Function: int u8_mbtoucr (ucs4_t *PUC, const uint8_t *S, size_t N)
 722  -- Function: int u16_mbtoucr (ucs4_t *PUC, const uint16_t *S, size_t N)
 723  -- Function: int u32_mbtoucr (ucs4_t *PUC, const uint32_t *S, size_t N)
 724      Returns the length (number of units) of the first character in S,
 725      putting its `ucs4_t' representation in `*PUC'.  Upon failure,
 726      `*PUC' is set to `0xfffd', and -1 is returned for an invalid
 727      sequence of units, -2 is returned for an incomplete sequence of
 728      units.
 729
 730      The number of available units, N, must be > 0.
 731
 732      This function is similar to `u8_mbtouc', except that the return
 733      value gives more details about the failure, similar to `mbrtowc'.
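
   The return convention of `u8_mbtoucr' can be sketched with a
hand-written UTF-8 decoder.  The function `sketch_u8_mbtoucr' is
hypothetical and uses `uint32_t' where the library uses `ucs4_t'; its
classification of borderline malformed input as invalid or incomplete
may differ from the library's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores the first character of s[0..n-1] in *puc and returns its length
   in units.  On failure stores 0xFFFD and returns -1 (invalid sequence)
   or -2 (incomplete sequence). */
int sketch_u8_mbtoucr(uint32_t *puc, const uint8_t *s, size_t n)
{
    assert(puc != NULL && s != NULL && n > 0);
    uint8_t c = s[0];
    uint32_t cp, min;
    int len;
    *puc = 0xFFFD;                                   /* failure default */
    if (c < 0x80) { *puc = c; return 1; }
    else if (c >= 0xC2 && c <= 0xDF) { len = 2; cp = c & 0x1F; min = 0x80; }
    else if (c >= 0xE0 && c <= 0xEF) { len = 3; cp = c & 0x0F; min = 0x800; }
    else if (c >= 0xF0 && c <= 0xF4) { len = 4; cp = c & 0x07; min = 0x10000; }
    else return -1;                                  /* invalid lead byte */
    for (int i = 1; i < len; i++) {
        if ((size_t) i >= n)
            return -2;                               /* input ends too early */
        if ((s[i] & 0xC0) != 0x80)
            return -1;                               /* not a continuation */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return -1;                     /* overlong, out of range, surrogate */
    *puc = cp;
    return len;
}
```

   The distinction matters when reading from a stream: -2 means that
more input may complete the character, whereas -1 means that the units
seen so far can never form one.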
 734
 735    The following function stores a Unicode character as a Unicode
 736 string in memory.
 737
 738  -- Function: int u8_uctomb (uint8_t *S, ucs4_t UC, int N)
 739  -- Function: int u16_uctomb (uint16_t *S, ucs4_t UC, int N)
 740  -- Function: int u32_uctomb (uint32_t *S, ucs4_t UC, int N)
 741      Puts the multibyte character represented by UC in S, returning its
 742      length.  Returns -1 upon failure, -2 if the number of available
 743      units, N, is too small.  The latter case cannot occur if N >= 6
 744      (UTF-8), N >= 2 (UTF-16), or N >= 1 (UTF-32).
 745
 746      This function is similar to `wctomb', except that it operates on
 747      Unicode strings, S must not be NULL, and the argument N must be
 748      specified.
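
The three possible outcomes of `u8_uctomb' (a length, -1, or -2) can be seen in a stand-alone encoder.  This is a minimal sketch under stated assumptions, not the library's code: `sketch_u8_uctomb' is a hypothetical name, `uint32_t' stands in for `ucs4_t', and surrogates and values above U+10FFFF are assumed to be the failure cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring.  Stores UC as UTF-8
   in S, which has room for N bytes.  Returns the number of bytes
   written, -1 if UC is not a valid character, -2 if N is too small.  */
int
sketch_u8_uctomb (uint8_t *s, uint32_t uc, int n)
{
  int len;

  assert (s != NULL);
  if (uc < 0x80)
    len = 1;
  else if (uc < 0x800)
    len = 2;
  else if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc <= 0xDFFF)
        return -1;              /* surrogates are not characters */
      len = 3;
    }
  else if (uc <= 0x10FFFF)
    len = 4;
  else
    return -1;                  /* beyond the Unicode code space */

  if (n < len)
    return -2;                  /* too small; nothing is written */

  switch (len)
    {
    case 1:
      s[0] = (uint8_t) uc;
      break;
    case 2:
      s[0] = (uint8_t) (0xC0 | (uc >> 6));
      s[1] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    case 3:
      s[0] = (uint8_t) (0xE0 | (uc >> 12));
      s[1] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      s[2] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    default:
      s[0] = (uint8_t) (0xF0 | (uc >> 18));
      s[1] = (uint8_t) (0x80 | ((uc >> 12) & 0x3F));
      s[2] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      s[3] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    }
  return len;
}
```

Under current Unicode rules a UTF-8 character needs at most 4 units, so this sketch never reports -2 when N >= 4; the bound of 6 quoted above is simply the more conservative one.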
 749
 750    The following functions copy Unicode strings in memory.
 751
 752  -- Function: uint8_t * u8_cpy (uint8_t *DEST, const uint8_t *SRC,
 753           size_t N)
 754  -- Function: uint16_t * u16_cpy (uint16_t *DEST, const uint16_t *SRC,
 755           size_t N)
 756  -- Function: uint32_t * u32_cpy (uint32_t *DEST, const uint32_t *SRC,
 757           size_t N)
 758      Copies N units from SRC to DEST.
 759
 760      This function is similar to `memcpy', except that it operates on
 761      Unicode strings.
 762
 763  -- Function: uint8_t * u8_move (uint8_t *DEST, const uint8_t *SRC,
 764           size_t N)
 765  -- Function: uint16_t * u16_move (uint16_t *DEST, const uint16_t *SRC,
 766           size_t N)
 767  -- Function: uint32_t * u32_move (uint32_t *DEST, const uint32_t *SRC,
 768           size_t N)
 769      Copies N units from SRC to DEST, guaranteeing correct behavior for
 770      overlapping memory areas.
 771
 772      This function is similar to `memmove', except that it operates on
 773      Unicode strings.
 774
 775    The following function fills a Unicode string.
 776
 777  -- Function: uint8_t * u8_set (uint8_t *S, ucs4_t UC, size_t N)
 778  -- Function: uint16_t * u16_set (uint16_t *S, ucs4_t UC, size_t N)
 779  -- Function: uint32_t * u32_set (uint32_t *S, ucs4_t UC, size_t N)
 780      Sets the first N characters of S to UC.  UC should be a character
 781      that occupies only 1 unit.
 782
 783      This function is similar to `memset', except that it operates on
 784      Unicode strings.
 785
 786    The following function compares two Unicode strings of the same
 787 length.
 788
 789  -- Function: int u8_cmp (const uint8_t *S1, const uint8_t *S2, size_t
 790           N)
 791  -- Function: int u16_cmp (const uint16_t *S1, const uint16_t *S2,
 792           size_t N)
 793  -- Function: int u32_cmp (const uint32_t *S1, const uint32_t *S2,
 794           size_t N)
 795      Compares S1 and S2, each of length N, lexicographically.  Returns
 796      a negative value if S1 compares smaller than S2, a positive value
 797      if S1 compares larger than S2, or 0 if they compare equal.
 798
 799      This function is similar to `memcmp', except that it operates on
 800      Unicode strings.
 801
 802    The following function compares two Unicode strings of possibly
 803 different lengths.
 804
 805  -- Function: int u8_cmp2 (const uint8_t *S1, size_t N1, const uint8_t
 806           *S2, size_t N2)
 807  -- Function: int u16_cmp2 (const uint16_t *S1, size_t N1, const
 808           uint16_t *S2, size_t N2)
 809  -- Function: int u32_cmp2 (const uint32_t *S1, size_t N1, const
 810           uint32_t *S2, size_t N2)
 811      Compares S1 and S2, lexicographically.  Returns a negative value
 812      if S1 compares smaller than S2, a positive value if S1 compares
 813      larger than S2, or 0 if they compare equal.
 814
 815      This function is similar to the gnulib function `memcmp2', except
 816      that it operates on Unicode strings.
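
For strings of different lengths, a lexicographic comparison first compares the common prefix and then lets the shorter string sort first.  The sketch below shows this for 8-bit units only; `sketch_u8_cmp2' is a hypothetical name and the code is an illustration, not the library's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libunistring.  Compares the N1
   units at S1 with the N2 units at S2.  Returns a negative value,
   zero, or a positive value.  */
int
sketch_u8_cmp2 (const uint8_t *s1, size_t n1, const uint8_t *s2, size_t n2)
{
  size_t common = n1 < n2 ? n1 : n2;
  int cmp = common > 0 ? memcmp (s1, s2, common) : 0;

  if (cmp != 0)
    return cmp;
  /* The common prefix is equal: the shorter string compares smaller.  */
  return (n1 > n2) - (n1 < n2);
}
```

For well-formed UTF-8 this unit-wise order coincides with code point order, which is why a plain `memcmp' over the common prefix suffices in the 8-bit case.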
 817
 818    The following function searches for a given Unicode character.
 819
 820  -- Function: uint8_t * u8_chr (const uint8_t *S, size_t N, ucs4_t UC)
 821  -- Function: uint16_t * u16_chr (const uint16_t *S, size_t N, ucs4_t
 822           UC)
 823  -- Function: uint32_t * u32_chr (const uint32_t *S, size_t N, ucs4_t
 824           UC)
 825      Searches the string at S for UC.  Returns a pointer to the first
 826      occurrence of UC in S, or NULL if UC does not occur in S.
 827
 828      This function is similar to `memchr', except that it operates on
 829      Unicode strings.
 830
 831    The following function counts the number of Unicode characters.
 832
 833  -- Function: size_t u8_mbsnlen (const uint8_t *S, size_t N)
 834  -- Function: size_t u16_mbsnlen (const uint16_t *S, size_t N)
 835  -- Function: size_t u32_mbsnlen (const uint32_t *S, size_t N)
 836      Counts and returns the number of Unicode characters in the N units
 837      from S.
 838
 839      This function is similar to the gnulib function `mbsnlen', except
 840      that it operates on Unicode strings.
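
The difference between units and characters can be made concrete for UTF-8.  The following sketch counts characters by counting the bytes that are not continuation bytes; it assumes well-formed input and says nothing about how `u8_mbsnlen' treats invalid sequences.  The name `sketch_u8_mbsnlen' is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring.  Counts the
   characters in the N bytes at S, assuming well-formed UTF-8.  */
size_t
sketch_u8_mbsnlen (const uint8_t *s, size_t n)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if ((s[i] & 0xC0) != 0x80)  /* every non-continuation byte starts a character */
      count++;
  return count;
}
```

For example, the six bytes 61 C3 A9 E2 82 AC hold three characters: U+0061, U+00E9, and U+20AC.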
 841
 842 \1f
 843 File: libunistring.info,  Node: Elementary string functions with memory allocation,  Next: Elementary string functions on NUL terminated strings,  Prev: Elementary string functions,  Up: unistr.h
 844
 845 4.4 Elementary string functions with memory allocation
 846 ======================================================
 847
 848    The following function copies a Unicode string.
 849
 850  -- Function: uint8_t * u8_cpy_alloc (const uint8_t *S, size_t N)
 851  -- Function: uint16_t * u16_cpy_alloc (const uint16_t *S, size_t N)
 852  -- Function: uint32_t * u32_cpy_alloc (const uint32_t *S, size_t N)
 853      Makes a freshly allocated copy of S, of length N.
 854
 855 \1f
 856 File: libunistring.info,  Node: Elementary string functions on NUL terminated strings,  Prev: Elementary string functions with memory allocation,  Up: unistr.h
 857
 858 4.5 Elementary string functions on NUL terminated strings
 859 =========================================================
 860
 861    The following functions inspect and return details about the first
 862 character in a Unicode string.
 863
 864  -- Function: int u8_strmblen (const uint8_t *S)
 865  -- Function: int u16_strmblen (const uint16_t *S)
 866  -- Function: int u32_strmblen (const uint32_t *S)
 867      Returns the length (number of units) of the first character in S.
 868      Returns 0 if it is the NUL character.  Returns -1 upon failure.
 869
 870  -- Function: int u8_strmbtouc (ucs4_t *PUC, const uint8_t *S)
 871  -- Function: int u16_strmbtouc (ucs4_t *PUC, const uint16_t *S)
 872  -- Function: int u32_strmbtouc (ucs4_t *PUC, const uint32_t *S)
 873      Returns the length (number of units) of the first character in S,
 874      putting its `ucs4_t' representation in `*PUC'.  Returns 0 if it is
 875      the NUL character.  Returns -1 upon failure.
 876
 877  -- Function: const uint8_t * u8_next (ucs4_t *PUC, const uint8_t *S)
 878  -- Function: const uint16_t * u16_next (ucs4_t *PUC, const uint16_t *S)
 879  -- Function: const uint32_t * u32_next (ucs4_t *PUC, const uint32_t *S)
 880      Forward iteration step.  Advances the pointer past the next
 881      character, or returns NULL if the end of the string has been
 882      reached.  Puts the character's `ucs4_t' representation in `*PUC'.
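
The forward iteration step is meant to be called in a loop until it returns NULL.  The sketch below shows that protocol for 32-bit units, where every character occupies exactly one unit; `sketch_u32_next' and `sketch_count_chars' are hypothetical names.  What `*PUC' holds once the end is reached is not stated above, so the sketch leaves it untouched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring.  Stores the character
   at S in *PUC and returns a pointer past it, or returns NULL when S
   points at the terminating NUL.  */
const uint32_t *
sketch_u32_next (uint32_t *puc, const uint32_t *s)
{
  if (*s == 0)
    return NULL;                /* end of string; *puc is left untouched */
  *puc = *s;
  return s + 1;
}

/* Walks a NUL terminated string with the step function above and
   returns the number of characters visited.  */
size_t
sketch_count_chars (const uint32_t *s)
{
  uint32_t uc;
  size_t count = 0;

  while ((s = sketch_u32_next (&uc, s)) != NULL)
    count++;
  return count;
}
```

The same loop shape applies to the 8-bit and 16-bit variants; only the step function has to decode more than one unit per character.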
 883
 884    The following function inspects and returns details about the
 885 previous character in a Unicode string.
 886
 887  -- Function: const uint8_t * u8_prev (ucs4_t *PUC, const uint8_t *S,
 888           const uint8_t *START)
 889  -- Function: const uint16_t * u16_prev (ucs4_t *PUC, const uint16_t
 890           *S, const uint16_t *START)
 891  -- Function: const uint32_t * u32_prev (ucs4_t *PUC, const uint32_t
 892           *S, const uint32_t *START)
 893      Backward iteration step.  Moves the pointer back to the
 894      previous character, or returns NULL if the beginning of the string
 895      has been reached.  Puts the character's `ucs4_t' representation in
 896      `*PUC'.
 897
 898    The following functions determine the length of a Unicode string.
 899
 900  -- Function: size_t u8_strlen (const uint8_t *S)
 901  -- Function: size_t u16_strlen (const uint16_t *S)
 902  -- Function: size_t u32_strlen (const uint32_t *S)
 903      Returns the number of units in S.
 904
 905      This function is similar to `strlen' and `wcslen', except that it
 906      operates on Unicode strings.
 907
 908  -- Function: size_t u8_strnlen (const uint8_t *S, size_t MAXLEN)
 909  -- Function: size_t u16_strnlen (const uint16_t *S, size_t MAXLEN)
 910  -- Function: size_t u32_strnlen (const uint32_t *S, size_t MAXLEN)
 911      Returns the number of units in S, but at most MAXLEN.
 912
 913      This function is similar to `strnlen' and `wcsnlen', except that
 914      it operates on Unicode strings.
 915
 916    The following functions copy portions of Unicode strings in memory.
 917
 918  -- Function: uint8_t * u8_strcpy (uint8_t *DEST, const uint8_t *SRC)
 919  -- Function: uint16_t * u16_strcpy (uint16_t *DEST, const uint16_t
 920           *SRC)
 921  -- Function: uint32_t * u32_strcpy (uint32_t *DEST, const uint32_t
 922           *SRC)
 923      Copies SRC to DEST.
 924
 925      This function is similar to `strcpy' and `wcscpy', except that it
 926      operates on Unicode strings.
 927
 928  -- Function: uint8_t * u8_stpcpy (uint8_t *DEST, const uint8_t *SRC)
 929  -- Function: uint16_t * u16_stpcpy (uint16_t *DEST, const uint16_t
 930           *SRC)
 931  -- Function: uint32_t * u32_stpcpy (uint32_t *DEST, const uint32_t
 932           *SRC)
 933      Copies SRC to DEST, returning the address of the terminating NUL
 934      in DEST.
 935
 936      This function is similar to `stpcpy', except that it operates on
 937      Unicode strings.
 938
 939  -- Function: uint8_t * u8_strncpy (uint8_t *DEST, const uint8_t *SRC,
 940           size_t N)
 941  -- Function: uint16_t * u16_strncpy (uint16_t *DEST, const uint16_t
 942           *SRC, size_t N)
 943  -- Function: uint32_t * u32_strncpy (uint32_t *DEST, const uint32_t
 944           *SRC, size_t N)
 945      Copies no more than N units of SRC to DEST.
 946
 947      This function is similar to `strncpy' and `wcsncpy', except that
 948      it operates on Unicode strings.
 949
 950  -- Function: uint8_t * u8_stpncpy (uint8_t *DEST, const uint8_t *SRC,
 951           size_t N)
 952  -- Function: uint16_t * u16_stpncpy (uint16_t *DEST, const uint16_t
 953           *SRC, size_t N)
 954  -- Function: uint32_t * u32_stpncpy (uint32_t *DEST, const uint32_t
 955           *SRC, size_t N)
 956      Copies no more than N units of SRC to DEST.  Returns a pointer
 957      past the last non-NUL unit written into DEST.  In other words, if
 958      the units written into DEST include a NUL, the return value is the
 959      address of the first such NUL unit, otherwise it is `DEST + N'.
 960
 961      This function is similar to `stpncpy', except that it operates on
 962      Unicode strings.
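
The return value of `u8_stpncpy' is easiest to see in code.  The sketch below mirrors the behavior of the C function `stpncpy', including the padding of the remainder of DEST with NUL units; that padding is an assumption carried over from `stpncpy', and `sketch_u8_stpncpy' is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring.  Copies at most N
   units from SRC to DEST and returns the address of the first NUL
   written, or DEST + N if no NUL was written.  */
uint8_t *
sketch_u8_stpncpy (uint8_t *dest, const uint8_t *src, size_t n)
{
  size_t i = 0;
  uint8_t *end;

  while (i < n && src[i] != 0)
    {
      dest[i] = src[i];
      i++;
    }
  end = dest + i;               /* first NUL written, or dest + n */
  for (; i < n; i++)
    dest[i] = 0;                /* pad as stpncpy does (assumption) */
  return end;
}
```

Note that when SRC has N or more units before its NUL, the result in DEST is not NUL terminated, exactly as with `strncpy'.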
 963
 964  -- Function: uint8_t * u8_strcat (uint8_t *DEST, const uint8_t *SRC)
 965  -- Function: uint16_t * u16_strcat (uint16_t *DEST, const uint16_t
 966           *SRC)
 967  -- Function: uint32_t * u32_strcat (uint32_t *DEST, const uint32_t
 968           *SRC)
 969      Appends SRC onto DEST.
 970
 971      This function is similar to `strcat' and `wcscat', except that it
 972      operates on Unicode strings.
 973
 974  -- Function: uint8_t * u8_strncat (uint8_t *DEST, const uint8_t *SRC,
 975           size_t N)
 976  -- Function: uint16_t * u16_strncat (uint16_t *DEST, const uint16_t
 977           *SRC, size_t N)
 978  -- Function: uint32_t * u32_strncat (uint32_t *DEST, const uint32_t
 979           *SRC, size_t N)
 980      Appends no more than N units of SRC onto DEST.
 981
 982      This function is similar to `strncat' and `wcsncat', except that
 983      it operates on Unicode strings.
 984
 985    The following functions compare two Unicode strings.
 986
 987  -- Function: int u8_strcmp (const uint8_t *S1, const uint8_t *S2)
 988  -- Function: int u16_strcmp (const uint16_t *S1, const uint16_t *S2)
 989  -- Function: int u32_strcmp (const uint32_t *S1, const uint32_t *S2)
 990      Compares S1 and S2, lexicographically.  Returns a negative value
 991      if S1 compares smaller than S2, a positive value if S1 compares
 992      larger than S2, or 0 if they compare equal.
 993
 994      This function is similar to `strcmp' and `wcscmp', except that it
 995      operates on Unicode strings.
 996
 997  -- Function: int u8_strcoll (const uint8_t *S1, const uint8_t *S2)
 998  -- Function: int u16_strcoll (const uint16_t *S1, const uint16_t *S2)
 999  -- Function: int u32_strcoll (const uint32_t *S1, const uint32_t *S2)
1000      Compares S1 and S2 using the collation rules of the current locale.
1001      Returns -1 if S1 < S2, 0 if S1 = S2, 1 if S1 > S2.  Upon failure,
1002      sets `errno' and returns any value.
1003
1004      This function is similar to `strcoll' and `wcscoll', except that
1005      it operates on Unicode strings.
1006
1007      Note that this function may consider different canonical
1008      normalizations of the same string as having a large distance.  It
1009      is therefore better to use the function `u8_normcoll' instead of
1010      this one; see *note uninorm.h::.
1011
1012  -- Function: int u8_strncmp (const uint8_t *S1, const uint8_t *S2,
1013           size_t N)
1014  -- Function: int u16_strncmp (const uint16_t *S1, const uint16_t *S2,
1015           size_t N)
1016  -- Function: int u32_strncmp (const uint32_t *S1, const uint32_t *S2,
1017           size_t N)
1018      Compares no more than N units of S1 and S2.
1019
1020      This function is similar to `strncmp' and `wcsncmp', except that
1021      it operates on Unicode strings.
1022
1023    The following function allocates a duplicate of a Unicode string.
1024
1025  -- Function: uint8_t * u8_strdup (const uint8_t *S)
1026  -- Function: uint16_t * u16_strdup (const uint16_t *S)
1027  -- Function: uint32_t * u32_strdup (const uint32_t *S)
1028      Duplicates S, returning an identical malloc'd string.
1029
1030      This function is similar to `strdup' and `wcsdup', except that it
1031      operates on Unicode strings.
1032
1033    The following functions search for a given Unicode character.
1034
1035  -- Function: uint8_t * u8_strchr (const uint8_t *STR, ucs4_t UC)
1036  -- Function: uint16_t * u16_strchr (const uint16_t *STR, ucs4_t UC)
1037  -- Function: uint32_t * u32_strchr (const uint32_t *STR, ucs4_t UC)
1038      Finds the first occurrence of UC in STR.
1039
1040      This function is similar to `strchr' and `wcschr', except that it
1041      operates on Unicode strings.
1042
1043  -- Function: uint8_t * u8_strrchr (const uint8_t *STR, ucs4_t UC)
1044  -- Function: uint16_t * u16_strrchr (const uint16_t *STR, ucs4_t UC)
1045  -- Function: uint32_t * u32_strrchr (const uint32_t *STR, ucs4_t UC)
1046      Finds the last occurrence of UC in STR.
1047
1048      This function is similar to `strrchr' and `wcsrchr', except that
1049      it operates on Unicode strings.
1050
1051    The following functions search for the first occurrence of some
1052 Unicode character in or outside a given set of Unicode characters.
1053
1054  -- Function: size_t u8_strcspn (const uint8_t *STR, const uint8_t
1055           *REJECT)
1056  -- Function: size_t u16_strcspn (const uint16_t *STR, const uint16_t
1057           *REJECT)
1058  -- Function: size_t u32_strcspn (const uint32_t *STR, const uint32_t
1059           *REJECT)
1060      Returns the length of the initial segment of STR which consists
1061      entirely of Unicode characters not in REJECT.
1062
1063      This function is similar to `strcspn' and `wcscspn', except that
1064      it operates on Unicode strings.
1065
1066  -- Function: size_t u8_strspn (const uint8_t *STR, const uint8_t
1067           *ACCEPT)
1068  -- Function: size_t u16_strspn (const uint16_t *STR, const uint16_t
1069           *ACCEPT)
1070  -- Function: size_t u32_strspn (const uint32_t *STR, const uint32_t
1071           *ACCEPT)
1072      Returns the length of the initial segment of STR which consists
1073      entirely of Unicode characters in ACCEPT.
1074
1075      This function is similar to `strspn' and `wcsspn', except that it
1076      operates on Unicode strings.
1077
1078  -- Function: uint8_t * u8_strpbrk (const uint8_t *STR, const uint8_t
1079           *ACCEPT)
1080  -- Function: uint16_t * u16_strpbrk (const uint16_t *STR, const
1081           uint16_t *ACCEPT)
1082  -- Function: uint32_t * u32_strpbrk (const uint32_t *STR, const
1083           uint32_t *ACCEPT)
1084      Finds the first occurrence in STR of any character in ACCEPT.
1085
1086      This function is similar to `strpbrk' and `wcspbrk', except that
1087      it operates on Unicode strings.
1088
1089    The following functions search whether a given Unicode string is a
1090 substring of another Unicode string.
1091
1092  -- Function: uint8_t * u8_strstr (const uint8_t *HAYSTACK, const
1093           uint8_t *NEEDLE)
1094  -- Function: uint16_t * u16_strstr (const uint16_t *HAYSTACK, const
1095           uint16_t *NEEDLE)
1096  -- Function: uint32_t * u32_strstr (const uint32_t *HAYSTACK, const
1097           uint32_t *NEEDLE)
1098      Finds the first occurrence of NEEDLE in HAYSTACK.
1099
1100      This function is similar to `strstr' and `wcsstr', except that it
1101      operates on Unicode strings.
1102
1103  -- Function: bool u8_startswith (const uint8_t *STR, const uint8_t
1104           *PREFIX)
1105  -- Function: bool u16_startswith (const uint16_t *STR, const uint16_t
1106           *PREFIX)
1107  -- Function: bool u32_startswith (const uint32_t *STR, const uint32_t
1108           *PREFIX)
1109      Tests whether STR starts with PREFIX.
1110
1111  -- Function: bool u8_endswith (const uint8_t *STR, const uint8_t
1112           *SUFFIX)
1113  -- Function: bool u16_endswith (const uint16_t *STR, const uint16_t
1114           *SUFFIX)
1115  -- Function: bool u32_endswith (const uint32_t *STR, const uint32_t
1116           *SUFFIX)
1117      Tests whether STR ends with SUFFIX.
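
Prefix and suffix tests on NUL terminated strings reduce to unit-wise comparisons.  The following is a minimal sketch for 8-bit units; the names `sketch_u8_startswith' and `sketch_u8_endswith' are hypothetical and the code is not taken from the library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libunistring.  */
bool
sketch_u8_startswith (const uint8_t *str, const uint8_t *prefix)
{
  for (;; str++, prefix++)
    {
      if (*prefix == 0)
        return true;            /* the whole prefix matched */
      if (*str != *prefix)
        return false;           /* mismatch, or STR ended first */
    }
}

/* Hypothetical helper, not part of libunistring.  */
bool
sketch_u8_endswith (const uint8_t *str, const uint8_t *suffix)
{
  size_t len = strlen ((const char *) str);
  size_t slen = strlen ((const char *) suffix);

  return len >= slen && memcmp (str + (len - slen), suffix, slen) == 0;
}
```

With these definitions an empty PREFIX or SUFFIX matches every string.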
1118
1119    The following function does one step in tokenizing a Unicode string.
1120
1121  -- Function: uint8_t * u8_strtok (uint8_t *STR, const uint8_t *DELIM,
1122           uint8_t **PTR)
1123  -- Function: uint16_t * u16_strtok (uint16_t *STR, const uint16_t
1124           *DELIM, uint16_t **PTR)
1125  -- Function: uint32_t * u32_strtok (uint32_t *STR, const uint32_t
1126           *DELIM, uint32_t **PTR)
1127      Divides STR into tokens separated by characters in DELIM.
1128
1129      This function is similar to `strtok_r' and `wcstok', except that
1130      it operates on Unicode strings.  Its interface is actually more
1131      similar to `wcstok' than to `strtok'.
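
The role of the PTR argument is to carry the scan position from one call to the next, as in `strtok_r'.  The sketch below shows one way such a step can work for 32-bit units, where each delimiter is a single unit; `sketch_u32_strtok' and `is_delim' are hypothetical names, and the handling of repeated delimiters follows `strtok_r' as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if UC occurs in the NUL terminated set DELIM.  */
static int
is_delim (uint32_t uc, const uint32_t *delim)
{
  for (; *delim != 0; delim++)
    if (*delim == uc)
      return 1;
  return 0;
}

/* Hypothetical helper, not part of libunistring.  Pass the string in
   STR on the first call and NULL on later calls; *PTR keeps the scan
   position in between.  Returns the next token, or NULL if none.  */
uint32_t *
sketch_u32_strtok (uint32_t *str, const uint32_t *delim, uint32_t **ptr)
{
  uint32_t *token;

  if (str == NULL)
    {
      str = *ptr;
      if (str == NULL)
        return NULL;            /* a previous call consumed everything */
    }
  while (*str != 0 && is_delim (*str, delim))
    str++;                      /* skip leading delimiters */
  if (*str == 0)
    {
      *ptr = NULL;
      return NULL;
    }
  token = str;
  while (*str != 0 && !is_delim (*str, delim))
    str++;                      /* find the end of the token */
  if (*str != 0)
    {
      *str = 0;                 /* terminate the token in place */
      *ptr = str + 1;
    }
  else
    *ptr = NULL;
  return token;
}
```

Because tokens are terminated in place, the input string is modified and must be writable.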
1132
1133 \1f
1134 File: libunistring.info,  Node: uniconv.h,  Next: unistdio.h,  Prev: unistr.h,  Up: Top
1135
1136 5 Conversions between Unicode and encodings `<uniconv.h>'
1137 *********************************************************
1138
1139    This include file declares functions for converting between Unicode
1140 strings and `char *' strings in locale encoding or in other specified
1141 encodings.
1142
1143    The following function returns the locale encoding.
1144
1145  -- Function: const char * locale_charset ()
1146      Determines the current locale's character encoding, and
1147      canonicalizes it into one of the canonical names listed in
1148      `config.charset'.  If the canonical name cannot be determined, the
1149      result is a non-canonical name.
1150
1151      The result must not be freed; it is statically allocated.
1152
1153      The result of this function can be used as an argument to the
1154      `iconv_open' function in GNU libc, in GNU libiconv, or in the
1155      gnulib provided wrapper around the native `iconv_open' function.
1156      It may not work as an argument to the native `iconv_open' function
1157      directly.
1158
1159    The handling of unconvertible characters during the conversions can
1160 be parametrized through the following enumeration type:
1161
1162  -- Type: enum iconv_ilseq_handler
1163      This type specifies how unconvertible characters in the input are
1164      handled.
1165
1166  -- Constant: enum iconv_ilseq_handler iconveh_error
1167      This handler causes the function to return with `errno' set to
1168      `EILSEQ'.
1169
1170  -- Constant: enum iconv_ilseq_handler iconveh_question_mark
1171      This handler produces one question mark `?' per unconvertible
1172      character.
1173
1174  -- Constant: enum iconv_ilseq_handler iconveh_escape_sequence
1175      This handler produces an escape sequence `\uXXXX' or `\UXXXXXXXX'
1176      for each unconvertible character.
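
The two escape forms can be produced as follows.  This is only a sketch: the function name `sketch_escape' is hypothetical, and both the choice of `\u' for code points below U+10000 and the use of upper-case hexadecimal digits are assumptions, since the text above does not specify them.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of libunistring.  Writes an escape
   sequence for UC into BUF (of SIZE bytes) and returns the number of
   characters the full escape needs, as snprintf does.  */
int
sketch_escape (char *buf, size_t size, uint32_t uc)
{
  if (uc < 0x10000)
    return snprintf (buf, size, "\\u%04lX", (unsigned long) uc);
  else
    return snprintf (buf, size, "\\U%08lX", (unsigned long) uc);
}
```

A buffer of 11 bytes is enough for either form, including the terminating NUL.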
1177
1178    The following functions convert between strings in a specified
1179 encoding and Unicode strings.
1180
1181  -- Function: uint8_t * u8_conv_from_encoding (const char *FROMCODE,
1182           enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
1183           SRCLEN, size_t *OFFSETS, uint8_t *RESULTBUF, size_t *LENGTHP)
1184  -- Function: uint16_t * u16_conv_from_encoding (const char *FROMCODE,
1185           enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
1186           SRCLEN, size_t *OFFSETS, uint16_t *RESULTBUF, size_t *LENGTHP)
1187  -- Function: uint32_t * u32_conv_from_encoding (const char *FROMCODE,
1188           enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
1189           SRCLEN, size_t *OFFSETS, uint32_t *RESULTBUF, size_t *LENGTHP)
1190      Converts an entire string, possibly including NUL bytes, from one
1191      encoding to UTF-8, UTF-16, or UTF-32 encoding, respectively.
1192
1193      Converts a memory region given in encoding FROMCODE.  FROMCODE is
1194      as for the `iconv_open' function.
1195
1196      The input is in the memory region between SRC (inclusive) and `SRC
1197      + SRCLEN' (exclusive).
1198
1199      If OFFSETS is not NULL, it should point to an array of SRCLEN
1200      integers; this array is filled with offsets into the result, i.e.
1201      the character starting at `SRC[i]' corresponds to the character
1202      starting at `RESULT[OFFSETS[i]]', and other offsets are set to
1203      `(size_t)(-1)'.
1204
1205      `RESULTBUF' and `*LENGTHP' should be a scratch buffer and its
1206      size, or `RESULTBUF' can be NULL.
1207
1208      May erase the contents of the memory at `RESULTBUF'.
1209
1210      If successful: The resulting Unicode string (non-NULL) is returned
1211      and its length stored in `*LENGTHP'.  The resulting string is
1212      `RESULTBUF' if no dynamic memory allocation was necessary, or a
1213      freshly allocated memory block otherwise.
1214
1215      In case of error: NULL is returned and `errno' is set.  Particular
1216      `errno' values: `EINVAL', `EILSEQ', `ENOMEM'.
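
The (RESULTBUF, LENGTHP) convention can be sketched with a toy converter.  The hypothetical function `sketch_latin1_to_u8' below handles only ISO-8859-1 input (chosen because it needs no tables) and ignores the HANDLER and OFFSETS arguments; it illustrates the buffer protocol, not how the library performs conversions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libunistring.  Converts SRCLEN
   bytes of ISO-8859-1 at SRC to UTF-8.  Uses RESULTBUF if it is
   non-NULL and *LENGTHP units suffice; otherwise allocates.  On
   success stores the result length in *LENGTHP.  */
uint8_t *
sketch_latin1_to_u8 (const char *src, size_t srclen,
                     uint8_t *resultbuf, size_t *lengthp)
{
  size_t needed = 0;
  size_t pos = 0;
  uint8_t *result;

  for (size_t i = 0; i < srclen; i++)
    needed += ((unsigned char) src[i] < 0x80) ? 1 : 2;

  if (resultbuf != NULL && needed <= *lengthp)
    result = resultbuf;         /* the scratch buffer is large enough */
  else
    {
      result = malloc (needed > 0 ? needed : 1);
      if (result == NULL)
        {
          errno = ENOMEM;
          return NULL;
        }
    }

  for (size_t i = 0; i < srclen; i++)
    {
      unsigned char c = (unsigned char) src[i];

      if (c < 0x80)
        result[pos++] = c;
      else
        {
          result[pos++] = (uint8_t) (0xC0 | (c >> 6));
          result[pos++] = (uint8_t) (0x80 | (c & 0x3F));
        }
    }
  *lengthp = pos;
  return result;
}
```

A caller following this convention frees the result only when it differs from the RESULTBUF it passed in, for example with `if (result != buf) free (result);'.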
1217
1218  -- Function: char * u8_conv_to_encoding (const char *TOCODE, enum
1219           iconv_ilseq_handler HANDLER, const uint8_t *SRC, size_t
1220           SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
1221  -- Function: char * u16_conv_to_encoding (const char *TOCODE, enum
1222           iconv_ilseq_handler HANDLER, const uint16_t *SRC, size_t
1223           SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
1224  -- Function: char * u32_conv_to_encoding (const char *TOCODE, enum
1225           iconv_ilseq_handler HANDLER, const uint32_t *SRC, size_t
1226           SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
1227      Converts an entire Unicode string, possibly including NUL units,
1228      from UTF-8, UTF-16, or UTF-32, respectively, to a given encoding.
1229
1230      Converts a memory region to encoding TOCODE.  TOCODE is as for the
1231      `iconv_open' function.
1232
1233      The input is in the memory region between SRC (inclusive) and `SRC
1234      + SRCLEN' (exclusive).
1235
1236      If OFFSETS is not NULL, it should point to an array of SRCLEN
1237      integers; this array is filled with offsets into the result, i.e.
1238      the character starting at `SRC[i]' corresponds to the character
1239      starting at `RESULT[OFFSETS[i]]', and other offsets are set to
1240      `(size_t)(-1)'.
1241
1242      `RESULTBUF' and `*LENGTHP' should be a scratch buffer and its
1243      size, or `RESULTBUF' can be NULL.
1244
1245      May erase the contents of the memory at `RESULTBUF'.
1246
1247      If successful: The resulting string (non-NULL) is returned
1248      and its length stored in `*LENGTHP'.  The resulting string is
1249      `RESULTBUF' if no dynamic memory allocation was necessary, or a
1250      freshly allocated memory block otherwise.
1251
1252      In case of error: NULL is returned and `errno' is set.  Particular
1253      `errno' values: `EINVAL', `EILSEQ', `ENOMEM'.
1254
1255    The following functions convert between NUL terminated strings in a
1256 specified encoding and NUL terminated Unicode strings.
1257
1258  -- Function: uint8_t * u8_strconv_from_encoding (const char *STRING,
1259           const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
1260  -- Function: uint16_t * u16_strconv_from_encoding (const char *STRING,
1261           const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
1262  -- Function: uint32_t * u32_strconv_from_encoding (const char *STRING,
1263           const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
1264      Converts a NUL terminated string from a given encoding.
1265
1266      The result is `malloc' allocated, or NULL (with `errno' set) in
1267      case of error.
1268
1269      Particular `errno' values: `EILSEQ', `ENOMEM'.
1270
1271  -- Function: char * u8_strconv_to_encoding (const uint8_t *STRING,
1272           const char *TOCODE, enum iconv_ilseq_handler HANDLER)
1273  -- Function: char * u16_strconv_to_encoding (const uint16_t *STRING,
1274           const char *TOCODE, enum iconv_ilseq_handler HANDLER)
1275  -- Function: char * u32_strconv_to_encoding (const uint32_t *STRING,
1276           const char *TOCODE, enum iconv_ilseq_handler HANDLER)
1277      Converts a NUL terminated string to a given encoding.
1278
1279      The result is `malloc' allocated, or NULL (with `errno' set) in
1280      case of error.
1281
1282      Particular `errno' values: `EILSEQ', `ENOMEM'.
1283
1284    The following functions are shorthands that convert between NUL
1285 terminated strings in locale encoding and NUL terminated Unicode
1286 strings.
1287
1288  -- Function: uint8_t * u8_strconv_from_locale (const char *STRING)
1289  -- Function: uint16_t * u16_strconv_from_locale (const char *STRING)
1290  -- Function: uint32_t * u32_strconv_from_locale (const char *STRING)
1291      Converts a NUL terminated string from the locale encoding.
1292
1293      The result is `malloc' allocated, or NULL (with `errno' set) in
1294      case of error.
1295
1296      Particular `errno' values: `ENOMEM'.
1297
1298  -- Function: char * u8_strconv_to_locale (const uint8_t *STRING)
1299  -- Function: char * u16_strconv_to_locale (const uint16_t *STRING)
1300  -- Function: char * u32_strconv_to_locale (const uint32_t *STRING)
1301      Converts a NUL terminated string to the locale encoding.
1302
1303      The result is `malloc' allocated, or NULL (with `errno' set) in
1304      case of error.
1305
1306      Particular `errno' values: `ENOMEM'.
1307
1308 \1f
1309 File: libunistring.info,  Node: unistdio.h,  Next: uniname.h,  Prev: uniconv.h,  Up: Top
1310
1311 6 Output with Unicode strings `<unistdio.h>'
1312 ********************************************
1313
1314    This include file declares functions for doing formatted output with
1315 Unicode strings.  It defines a set of functions similar to `fprintf' and
1316 `sprintf', which are declared in `<stdio.h>'.
1317
1318    These functions work like the `printf' function family.  In the
1319 format string:
1320    * The format directive `U' takes a UTF-8 string (`const uint8_t *').
1321
1322    * The format directive `lU' takes a UTF-16 string (`const uint16_t
1323      *').
1324
1325    * The format directive `llU' takes a UTF-32 string (`const uint32_t
1326      *').
1327
1328    A function name with an infix `v' indicates that a `va_list' is
1329 passed instead of multiple arguments.
1330
1331    The functions `*sprintf' have a BUF argument that is assumed to be
1332 large enough.  (_DANGEROUS!  Overflowing the buffer will crash the
1333 program._)
1334
1335    The functions `*snprintf' have a BUF argument that is assumed to be
1336 SIZE units large.  (_DANGEROUS!  The resulting string might be
1337 truncated in the middle of a multibyte character._)
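
One way to cope with such a truncation in UTF-8 output is to trim the result back to the last complete character.  The helper below is a sketch of that idea and is not provided by the library; the name `sketch_u8_complete_prefix' is hypothetical, and the input is assumed to be well-formed UTF-8 apart from a possibly cut-off final character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring.  Returns the length
   of the longest prefix of the N bytes at S that does not end in the
   middle of a multibyte character.  */
size_t
sketch_u8_complete_prefix (const uint8_t *s, size_t n)
{
  size_t i = n;
  size_t need, have;
  uint8_t lead;

  /* Step back over at most three trailing continuation bytes.  */
  while (i > 0 && n - i < 3 && (s[i - 1] & 0xC0) == 0x80)
    i--;
  if (i == 0)
    return n;                   /* no lead byte found; leave as is */

  lead = s[i - 1];
  if (lead < 0x80)
    need = 1;
  else if ((lead & 0xE0) == 0xC0)
    need = 2;
  else if ((lead & 0xF0) == 0xE0)
    need = 3;
  else if ((lead & 0xF8) == 0xF0)
    need = 4;
  else
    return n;                   /* malformed input is out of scope here */

  have = n - (i - 1);           /* bytes present for the last character */
  return have < need ? i - 1 : n;
}
```

The caller can then store a NUL unit at the returned offset to obtain a shorter but well-formed string.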
1338
1339    The functions `*asprintf' have a RESULTP argument.  The result will
1340 be freshly allocated and stored in `*RESULTP'.
1341
1342    The functions `*asnprintf' have a (RESULTBUF, LENGTHP) argument
1343 pair.  If RESULTBUF is not NULL and the result fits into `*LENGTHP'
1344 units, it is put in RESULTBUF, and RESULTBUF is returned.  Otherwise, a
1345 freshly allocated string is returned.  In both cases, `*LENGTHP' is set
1346 to the length (number of units) of the returned string.  In case of
1347 error, NULL is returned and `errno' is set.
1348
1349    The following functions take an ASCII format string and return a
1350 result that is a `char *' string in locale encoding.
1351
1352  -- Function: int ulc_sprintf (char *BUF, const char *FORMAT, ...)
1353
1354  -- Function: int ulc_snprintf (char *BUF, size_t SIZE, const char
1355           *FORMAT, ...)
1356
1357  -- Function: int ulc_asprintf (char **RESULTP, const char *FORMAT, ...)
1358
1359  -- Function: char * ulc_asnprintf (char *RESULTBUF, size_t *LENGTHP,
1360           const char *FORMAT, ...)
1361
1362  -- Function: int ulc_vsprintf (char *BUF, const char *FORMAT, va_list
1363           AP)
1364
1365  -- Function: int ulc_vsnprintf (char *BUF, size_t SIZE, const char
1366           *FORMAT, va_list AP)
1367
1368  -- Function: int ulc_vasprintf (char **RESULTP, const char *FORMAT,
1369           va_list AP)
1370
1371  -- Function: char * ulc_vasnprintf (char *RESULTBUF, size_t *LENGTHP,
1372           const char *FORMAT, va_list AP)
1373
1374    The following functions take an ASCII format string and return a
1375 result in UTF-8 format.
1376
1377  -- Function: int u8_sprintf (uint8_t *BUF, const char *FORMAT, ...)
1378
1379  -- Function: int u8_snprintf (uint8_t *BUF, size_t SIZE, const char
1380           *FORMAT, ...)
1381
1382  -- Function: int u8_asprintf (uint8_t **RESULTP, const char *FORMAT,
1383           ...)
1384
1385  -- Function: uint8_t * u8_asnprintf (uint8_t *RESULTBUF, size_t
1386           *LENGTHP, const char *FORMAT, ...)
1387
1388  -- Function: int u8_vsprintf (uint8_t *BUF, const char *FORMAT,
1389           va_list AP)
1390
1391  -- Function: int u8_vsnprintf (uint8_t *BUF, size_t SIZE, const char
1392           *FORMAT, va_list AP)
1393
1394  -- Function: int u8_vasprintf (uint8_t **RESULTP, const char *FORMAT,
1395           va_list AP)
1396
1397  -- Function: uint8_t * u8_vasnprintf (uint8_t *RESULTBUF, size_t
1398           *LENGTHP, const char *FORMAT, va_list AP)
1399
1400    The following functions take a UTF-8 format string and return a
1401 result in UTF-8 format.
1402
1403  -- Function: int u8_u8_sprintf (uint8_t *BUF, const uint8_t *FORMAT,
1404           ...)
1405
1406  -- Function: int u8_u8_snprintf (uint8_t *BUF, size_t SIZE, const
1407           uint8_t *FORMAT, ...)
1408
1409  -- Function: int u8_u8_asprintf (uint8_t **RESULTP, const uint8_t
1410           *FORMAT, ...)
1411
1412  -- Function: uint8_t * u8_u8_asnprintf (uint8_t *RESULTBUF, size_t
1413           *LENGTHP, const uint8_t *FORMAT, ...)
1414
1415  -- Function: int u8_u8_vsprintf (uint8_t *BUF, const uint8_t *FORMAT,
1416           va_list AP)
1417
1418  -- Function: int u8_u8_vsnprintf (uint8_t *BUF, size_t SIZE, const
1419           uint8_t *FORMAT, va_list AP)
1420
1421  -- Function: int u8_u8_vasprintf (uint8_t **RESULTP, const uint8_t
1422           *FORMAT, va_list AP)
1423
1424  -- Function: uint8_t * u8_u8_vasnprintf (uint8_t *RESULTBUF, size_t
1425           *LENGTHP, const uint8_t *FORMAT, va_list AP)
1426
1427    The following functions take an ASCII format string and return a
1428 result in UTF-16 format.
1429
1430  -- Function: int u16_sprintf (uint16_t *BUF, const char *FORMAT, ...)
1431
1432  -- Function: int u16_snprintf (uint16_t *BUF, size_t SIZE, const char
1433           *FORMAT, ...)
1434
1435  -- Function: int u16_asprintf (uint16_t **RESULTP, const char *FORMAT,
1436           ...)
1437
1438  -- Function: uint16_t * u16_asnprintf (uint16_t *RESULTBUF, size_t
1439           *LENGTHP, const char *FORMAT, ...)
1440
1441  -- Function: int u16_vsprintf (uint16_t *BUF, const char *FORMAT,
1442           va_list AP)
1443
1444  -- Function: int u16_vsnprintf (uint16_t *BUF, size_t SIZE, const char
1445           *FORMAT, va_list AP)
1446
1447  -- Function: int u16_vasprintf (uint16_t **RESULTP, const char
1448           *FORMAT, va_list AP)
1449
1450  -- Function: uint16_t * u16_vasnprintf (uint16_t *RESULTBUF, size_t
1451           *LENGTHP, const char *FORMAT, va_list AP)
1452
1453    The following functions take a UTF-16 format string and return a
1454 result in UTF-16 format.
1455
1456  -- Function: int u16_u16_sprintf (uint16_t *BUF, const uint16_t
1457           *FORMAT, ...)
1458
1459  -- Function: int u16_u16_snprintf (uint16_t *BUF, size_t SIZE, const
1460           uint16_t *FORMAT, ...)
1461
1462  -- Function: int u16_u16_asprintf (uint16_t **RESULTP, const uint16_t
1463           *FORMAT, ...)
1464
1465  -- Function: uint16_t * u16_u16_asnprintf (uint16_t *RESULTBUF, size_t
1466           *LENGTHP, const uint16_t *FORMAT, ...)
1467
1468  -- Function: int u16_u16_vsprintf (uint16_t *BUF, const uint16_t
1469           *FORMAT, va_list AP)
1470
1471  -- Function: int u16_u16_vsnprintf (uint16_t *BUF, size_t SIZE, const
1472           uint16_t *FORMAT, va_list AP)
1473
1474  -- Function: int u16_u16_vasprintf (uint16_t **RESULTP, const uint16_t
1475           *FORMAT, va_list AP)
1476
1477  -- Function: uint16_t * u16_u16_vasnprintf (uint16_t *RESULTBUF,
1478           size_t *LENGTHP, const uint16_t *FORMAT, va_list AP)
1479
1480    The following functions take an ASCII format string and return a
1481 result in UTF-32 format.
1482
1483  -- Function: int u32_sprintf (uint32_t *BUF, const char *FORMAT, ...)
1484
1485  -- Function: int u32_snprintf (uint32_t *BUF, size_t SIZE, const char
1486           *FORMAT, ...)
1487
1488  -- Function: int u32_asprintf (uint32_t **RESULTP, const char *FORMAT,
1489           ...)
1490
1491  -- Function: uint32_t * u32_asnprintf (uint32_t *RESULTBUF, size_t
1492           *LENGTHP, const char *FORMAT, ...)
1493
1494  -- Function: int u32_vsprintf (uint32_t *BUF, const char *FORMAT,
1495           va_list AP)
1496
1497  -- Function: int u32_vsnprintf (uint32_t *BUF, size_t SIZE, const char
1498           *FORMAT, va_list AP)
1499
1500  -- Function: int u32_vasprintf (uint32_t **RESULTP, const char
1501           *FORMAT, va_list AP)
1502
1503  -- Function: uint32_t * u32_vasnprintf (uint32_t *RESULTBUF, size_t
1504           *LENGTHP, const char *FORMAT, va_list AP)
1505
1506    The following functions take a UTF-32 format string and return a
1507 result in UTF-32 format.
1508
1509  -- Function: int u32_u32_sprintf (uint32_t *BUF, const uint32_t
1510           *FORMAT, ...)
1511
1512  -- Function: int u32_u32_snprintf (uint32_t *BUF, size_t SIZE, const
1513           uint32_t *FORMAT, ...)
1514
1515  -- Function: int u32_u32_asprintf (uint32_t **RESULTP, const uint32_t
1516           *FORMAT, ...)
1517
1518  -- Function: uint32_t * u32_u32_asnprintf (uint32_t *RESULTBUF, size_t
1519           *LENGTHP, const uint32_t *FORMAT, ...)
1520
1521  -- Function: int u32_u32_vsprintf (uint32_t *BUF, const uint32_t
1522           *FORMAT, va_list AP)
1523
1524  -- Function: int u32_u32_vsnprintf (uint32_t *BUF, size_t SIZE, const
1525           uint32_t *FORMAT, va_list AP)
1526
1527  -- Function: int u32_u32_vasprintf (uint32_t **RESULTP, const uint32_t
1528           *FORMAT, va_list AP)
1529
1530  -- Function: uint32_t * u32_u32_vasnprintf (uint32_t *RESULTBUF,
1531           size_t *LENGTHP, const uint32_t *FORMAT, va_list AP)
1532
1533    The following functions take an ASCII format string and produce
1534 output in locale encoding to a `FILE' stream.
1535
1536  -- Function: int ulc_fprintf (FILE *STREAM, const char *FORMAT, ...)
1537
1538  -- Function: int ulc_vfprintf (FILE *STREAM, const char *FORMAT,
1539           va_list AP)
1540
1541 \1f
1542 File: libunistring.info,  Node: uniname.h,  Next: unictype.h,  Prev: unistdio.h,  Up: Top
1543
1544 7 Names of Unicode characters `<uniname.h>'
1545 *******************************************
1546
1547    This include file implements the association between a Unicode
1548 character and its name.
1549
1550    The name of a Unicode character makes it possible to distinguish it
1551 from other, similar-looking characters.  For example, the character `x'
1552 has the name `"LATIN SMALL LETTER X"' and is therefore different from the
1553 character named `"MULTIPLICATION SIGN"'.
1554
1555  -- Macro: unsigned int UNINAME_MAX
1556      This macro expands to a constant that is the required size of the
1557      buffer for a Unicode character name.
1558
1559  -- Function: char * unicode_character_name (ucs4_t UC, char *BUF)
1560      Looks up the name of a Unicode character, in uppercase ASCII.  BUF
1561      must point to a buffer, at least `UNINAME_MAX' bytes in size.
1562      Returns the filled BUF, or NULL if the character does not have a
1563      name.
1564
1565  -- Function: ucs4_t unicode_name_character (const char *NAME)
1566      Looks up the Unicode character with a given name, in upper- or
1567      lowercase ASCII.  Returns the character if found, or
1568      `UNINAME_INVALID' if not found.
1569
1570  -- Macro: ucs4_t UNINAME_INVALID
1571      This macro expands to a constant that is a special return value of
1572      the `unicode_name_character' function.
1573
1574 \1f
1575 File: libunistring.info,  Node: unictype.h,  Next: uniwidth.h,  Prev: uniname.h,  Up: Top
1576
1577 8 Unicode character classification and properties `<unictype.h>'
1578 ****************************************************************
1579
1580    This include file declares functions that classify Unicode characters
1581 and that test whether Unicode characters have specific properties.
1582
1583    The classification assigns a "general category" to every Unicode
1584 character.  This is similar to the classification provided by ISO C in
1585 `<wctype.h>'.
1586
1587    Properties are the data that guides various text processing
1588 algorithms in the presence of specific Unicode characters.
1589
1590 * Menu:
1591
1592 * General category::
1593 * Canonical combining class::
1594 * Bidirectional category::
1595 * Decimal digit value::
1596 * Digit value::
1597 * Numeric value::
1598 * Mirrored character::
1599 * Properties::
1600 * Scripts::
1601 * Blocks::
1602 * ISO C and Java syntax::
1603 * Classifications like in ISO C::
1604
1605 \1f
1606 File: libunistring.info,  Node: General category,  Next: Canonical combining class,  Up: unictype.h
1607
1608 8.1 General category
1609 ====================
1610
1611    Every Unicode character or code point has a _general category_
1612 assigned to it.  This classification is important for most algorithms
1613 that work on Unicode text.
1614
1615    The GNU libunistring library provides two kinds of API for working
1616 with general categories.  The object oriented API uses a variable to
1617 denote every predefined general category value or combinations thereof.
1618 The low-level API uses a bit mask instead.  The advantage of the object
1619 oriented API is that if only a few predefined general category values
1620 are used, the data tables are relatively small.  When you combine
1621 general category values (using `uc_general_category_or',
1622 `uc_general_category_and', or `uc_general_category_and_not'), or when
1623 you use the low-level bit masks, a big table is used that holds the
1624 complete general category information for all Unicode characters.
1625
1626 * Menu:
1627
1628 * Object oriented API::
1629 * Bit mask API::
1630
1631 \1f
1632 File: libunistring.info,  Node: Object oriented API,  Next: Bit mask API,  Up: General category
1633
1634 8.1.1 The object oriented API for general category
1635 --------------------------------------------------
1636
1637  -- Type: uc_general_category_t
1638      This data type denotes a general category value.  It is an
1639      immediate type that can be copied by simple assignment, without
1640      involving memory allocation.  It is not an array type.
1641
1642    The following are the predefined general category values.  Additional
1643 general categories may be added in the future.
1644
1645  -- Constant: uc_general_category_t UC_CATEGORY_L
1646  -- Constant: uc_general_category_t UC_CATEGORY_Lu
1647  -- Constant: uc_general_category_t UC_CATEGORY_Ll
1648  -- Constant: uc_general_category_t UC_CATEGORY_Lt
1649  -- Constant: uc_general_category_t UC_CATEGORY_Lm
1650  -- Constant: uc_general_category_t UC_CATEGORY_Lo
1651  -- Constant: uc_general_category_t UC_CATEGORY_M
1652  -- Constant: uc_general_category_t UC_CATEGORY_Mn
1653  -- Constant: uc_general_category_t UC_CATEGORY_Mc
1654  -- Constant: uc_general_category_t UC_CATEGORY_Me
1655  -- Constant: uc_general_category_t UC_CATEGORY_N
1656  -- Constant: uc_general_category_t UC_CATEGORY_Nd
1657  -- Constant: uc_general_category_t UC_CATEGORY_Nl
1658  -- Constant: uc_general_category_t UC_CATEGORY_No
1659  -- Constant: uc_general_category_t UC_CATEGORY_P
1660  -- Constant: uc_general_category_t UC_CATEGORY_Pc
1661  -- Constant: uc_general_category_t UC_CATEGORY_Pd
1662  -- Constant: uc_general_category_t UC_CATEGORY_Ps
1663  -- Constant: uc_general_category_t UC_CATEGORY_Pe
1664  -- Constant: uc_general_category_t UC_CATEGORY_Pi
1665  -- Constant: uc_general_category_t UC_CATEGORY_Pf
1666  -- Constant: uc_general_category_t UC_CATEGORY_Po
1667  -- Constant: uc_general_category_t UC_CATEGORY_S
1668  -- Constant: uc_general_category_t UC_CATEGORY_Sm
1669  -- Constant: uc_general_category_t UC_CATEGORY_Sc
1670  -- Constant: uc_general_category_t UC_CATEGORY_Sk
1671  -- Constant: uc_general_category_t UC_CATEGORY_So
1672  -- Constant: uc_general_category_t UC_CATEGORY_Z
1673  -- Constant: uc_general_category_t UC_CATEGORY_Zs
1674  -- Constant: uc_general_category_t UC_CATEGORY_Zl
1675  -- Constant: uc_general_category_t UC_CATEGORY_Zp
1676  -- Constant: uc_general_category_t UC_CATEGORY_C
1677  -- Constant: uc_general_category_t UC_CATEGORY_Cc
1678  -- Constant: uc_general_category_t UC_CATEGORY_Cf
1679  -- Constant: uc_general_category_t UC_CATEGORY_Cs
1680  -- Constant: uc_general_category_t UC_CATEGORY_Co
1681  -- Constant: uc_general_category_t UC_CATEGORY_Cn
1682
1683    The following are alias names for predefined general category values.
1684
1685  -- Macro: uc_general_category_t UC_LETTER
1686      This is another name for `UC_CATEGORY_L'.
1687
1688  -- Macro: uc_general_category_t UC_UPPERCASE_LETTER
1689      This is another name for `UC_CATEGORY_Lu'.
1690
1691  -- Macro: uc_general_category_t UC_LOWERCASE_LETTER
1692      This is another name for `UC_CATEGORY_Ll'.
1693
1694  -- Macro: uc_general_category_t UC_TITLECASE_LETTER
1695      This is another name for `UC_CATEGORY_Lt'.
1696
1697  -- Macro: uc_general_category_t UC_MODIFIER_LETTER
1698      This is another name for `UC_CATEGORY_Lm'.
1699
1700  -- Macro: uc_general_category_t UC_OTHER_LETTER
1701      This is another name for `UC_CATEGORY_Lo'.
1702
1703  -- Macro: uc_general_category_t UC_MARK
1704      This is another name for `UC_CATEGORY_M'.
1705
1706  -- Macro: uc_general_category_t UC_NON_SPACING_MARK
1707      This is another name for `UC_CATEGORY_Mn'.
1708
1709  -- Macro: uc_general_category_t UC_COMBINING_SPACING_MARK
1710      This is another name for `UC_CATEGORY_Mc'.
1711
1712  -- Macro: uc_general_category_t UC_ENCLOSING_MARK
1713      This is another name for `UC_CATEGORY_Me'.
1714
1715  -- Macro: uc_general_category_t UC_NUMBER
1716      This is another name for `UC_CATEGORY_N'.
1717
1718  -- Macro: uc_general_category_t UC_DECIMAL_DIGIT_NUMBER
1719      This is another name for `UC_CATEGORY_Nd'.
1720
1721  -- Macro: uc_general_category_t UC_LETTER_NUMBER
1722      This is another name for `UC_CATEGORY_Nl'.
1723
1724  -- Macro: uc_general_category_t UC_OTHER_NUMBER
1725      This is another name for `UC_CATEGORY_No'.
1726
1727  -- Macro: uc_general_category_t UC_PUNCTUATION
1728      This is another name for `UC_CATEGORY_P'.
1729
1730  -- Macro: uc_general_category_t UC_CONNECTOR_PUNCTUATION
1731      This is another name for `UC_CATEGORY_Pc'.
1732
1733  -- Macro: uc_general_category_t UC_DASH_PUNCTUATION
1734      This is another name for `UC_CATEGORY_Pd'.
1735
1736  -- Macro: uc_general_category_t UC_OPEN_PUNCTUATION
1737      This is another name for `UC_CATEGORY_Ps' ("start punctuation").
1738
1739  -- Macro: uc_general_category_t UC_CLOSE_PUNCTUATION
1740      This is another name for `UC_CATEGORY_Pe' ("end punctuation").
1741
1742  -- Macro: uc_general_category_t UC_INITIAL_QUOTE_PUNCTUATION
1743      This is another name for `UC_CATEGORY_Pi'.
1744
1745  -- Macro: uc_general_category_t UC_FINAL_QUOTE_PUNCTUATION
1746      This is another name for `UC_CATEGORY_Pf'.
1747
1748  -- Macro: uc_general_category_t UC_OTHER_PUNCTUATION
1749      This is another name for `UC_CATEGORY_Po'.
1750
1751  -- Macro: uc_general_category_t UC_SYMBOL
1752      This is another name for `UC_CATEGORY_S'.
1753
1754  -- Macro: uc_general_category_t UC_MATH_SYMBOL
1755      This is another name for `UC_CATEGORY_Sm'.
1756
1757  -- Macro: uc_general_category_t UC_CURRENCY_SYMBOL
1758      This is another name for `UC_CATEGORY_Sc'.
1759
1760  -- Macro: uc_general_category_t UC_MODIFIER_SYMBOL
1761      This is another name for `UC_CATEGORY_Sk'.
1762
1763  -- Macro: uc_general_category_t UC_OTHER_SYMBOL
1764      This is another name for `UC_CATEGORY_So'.
1765
1766  -- Macro: uc_general_category_t UC_SEPARATOR
1767      This is another name for `UC_CATEGORY_Z'.
1768
1769  -- Macro: uc_general_category_t UC_SPACE_SEPARATOR
1770      This is another name for `UC_CATEGORY_Zs'.
1771
1772  -- Macro: uc_general_category_t UC_LINE_SEPARATOR
1773      This is another name for `UC_CATEGORY_Zl'.
1774
1775  -- Macro: uc_general_category_t UC_PARAGRAPH_SEPARATOR
1776      This is another name for `UC_CATEGORY_Zp'.
1777
1778  -- Macro: uc_general_category_t UC_OTHER
1779      This is another name for `UC_CATEGORY_C'.
1780
1781  -- Macro: uc_general_category_t UC_CONTROL
1782      This is another name for `UC_CATEGORY_Cc'.
1783
1784  -- Macro: uc_general_category_t UC_FORMAT
1785      This is another name for `UC_CATEGORY_Cf'.
1786
1787  -- Macro: uc_general_category_t UC_SURROGATE
1788      This is another name for `UC_CATEGORY_Cs'.  All code points in this
1789      category are invalid characters.
1790
1791  -- Macro: uc_general_category_t UC_PRIVATE_USE
1792      This is another name for `UC_CATEGORY_Co'.
1793
1794  -- Macro: uc_general_category_t UC_UNASSIGNED
1795      This is another name for `UC_CATEGORY_Cn'.  Some code points in
1796      this category are invalid characters.
1797
1798    The following functions combine general categories, like in a
1799 boolean algebra, except that there is no `not' operation.
1800
1801  -- Function: uc_general_category_t uc_general_category_or
1802           (uc_general_category_t CATEGORY1, uc_general_category_t
1803           CATEGORY2)
1804      Returns the union of two general categories.  This corresponds to
1805      the union of the two sets of characters.
1806
1807  -- Function: uc_general_category_t uc_general_category_and
1808           (uc_general_category_t CATEGORY1, uc_general_category_t
1809           CATEGORY2)
1810      Returns the intersection of two general categories as bit masks.
1811      This _does not_ correspond to the intersection of the two sets of
1812      characters.
1813
1814  -- Function: uc_general_category_t uc_general_category_and_not
1815           (uc_general_category_t CATEGORY1, uc_general_category_t
1816           CATEGORY2)
1817      Returns the intersection of a general category with the complement
1818      of a second general category, as bit masks.  This _does not_
1819      correspond to the intersection with complement, when viewing the
1820      categories as sets of characters.
1821
1822    The following functions associate general categories with their name.
1823
1824  -- Function: const char * uc_general_category_name
1825           (uc_general_category_t CATEGORY)
1826      Returns the name of a general category.  Returns NULL if the
1827      general category corresponds to a bit mask that does not have a
1828      name.
1829
1830  -- Function: uc_general_category_t uc_general_category_byname (const
1831           char *CATEGORY_NAME)
1832      Returns the general category given by name, e.g. `"Lu"'.
1833
1834    The following functions view general categories as sets of Unicode
1835 characters.
1836
1837  -- Function: uc_general_category_t uc_general_category (ucs4_t UC)
1838      Returns the general category of a Unicode character.
1839
1840      This function uses a big table.
1841
1842  -- Function: bool uc_is_general_category (ucs4_t UC,
1843           uc_general_category_t CATEGORY)
1844      Tests whether a Unicode character belongs to a given category.
1845      The CATEGORY argument can be a predefined general category or the
1846      combination of several predefined general categories.
1847
1848 \1f
1849 File: libunistring.info,  Node: Bit mask API,  Prev: Object oriented API,  Up: General category
1850
1851 8.1.2 The bit mask API for general category
1852 -------------------------------------------
1853
1854    The following are the predefined general category values as bit masks.
1855 Additional general categories may be added in the future.
1856
1857  -- Macro: uint32_t UC_CATEGORY_MASK_L
1858  -- Macro: uint32_t UC_CATEGORY_MASK_Lu
1859  -- Macro: uint32_t UC_CATEGORY_MASK_Ll
1860  -- Macro: uint32_t UC_CATEGORY_MASK_Lt
1861  -- Macro: uint32_t UC_CATEGORY_MASK_Lm
1862  -- Macro: uint32_t UC_CATEGORY_MASK_Lo
1863  -- Macro: uint32_t UC_CATEGORY_MASK_M
1864  -- Macro: uint32_t UC_CATEGORY_MASK_Mn
1865  -- Macro: uint32_t UC_CATEGORY_MASK_Mc
1866  -- Macro: uint32_t UC_CATEGORY_MASK_Me
1867  -- Macro: uint32_t UC_CATEGORY_MASK_N
1868  -- Macro: uint32_t UC_CATEGORY_MASK_Nd
1869  -- Macro: uint32_t UC_CATEGORY_MASK_Nl
1870  -- Macro: uint32_t UC_CATEGORY_MASK_No
1871  -- Macro: uint32_t UC_CATEGORY_MASK_P
1872  -- Macro: uint32_t UC_CATEGORY_MASK_Pc
1873  -- Macro: uint32_t UC_CATEGORY_MASK_Pd
1874  -- Macro: uint32_t UC_CATEGORY_MASK_Ps
1875  -- Macro: uint32_t UC_CATEGORY_MASK_Pe
1876  -- Macro: uint32_t UC_CATEGORY_MASK_Pi
1877  -- Macro: uint32_t UC_CATEGORY_MASK_Pf
1878  -- Macro: uint32_t UC_CATEGORY_MASK_Po
1879  -- Macro: uint32_t UC_CATEGORY_MASK_S
1880  -- Macro: uint32_t UC_CATEGORY_MASK_Sm
1881  -- Macro: uint32_t UC_CATEGORY_MASK_Sc
1882  -- Macro: uint32_t UC_CATEGORY_MASK_Sk
1883  -- Macro: uint32_t UC_CATEGORY_MASK_So
1884  -- Macro: uint32_t UC_CATEGORY_MASK_Z
1885  -- Macro: uint32_t UC_CATEGORY_MASK_Zs
1886  -- Macro: uint32_t UC_CATEGORY_MASK_Zl
1887  -- Macro: uint32_t UC_CATEGORY_MASK_Zp
1888  -- Macro: uint32_t UC_CATEGORY_MASK_C
1889  -- Macro: uint32_t UC_CATEGORY_MASK_Cc
1890  -- Macro: uint32_t UC_CATEGORY_MASK_Cf
1891  -- Macro: uint32_t UC_CATEGORY_MASK_Cs
1892  -- Macro: uint32_t UC_CATEGORY_MASK_Co
1893  -- Macro: uint32_t UC_CATEGORY_MASK_Cn
1894
1895    The following function views general categories as sets of Unicode
1896 characters.
1897
1898  -- Function: bool uc_is_general_category_withtable (ucs4_t UC,
1899           uint32_t BITMASK)
1900      Tests whether a Unicode character belongs to a given category.
1901      The BITMASK argument can be a predefined general category bitmask
1902      or the combination of several predefined general category bitmasks.
1903
1904      This function uses a big table comprising all general categories.
1905
1906 \1f
1907 File: libunistring.info,  Node: Canonical combining class,  Next: Bidirectional category,  Prev: General category,  Up: unictype.h
1908
1909 8.2 Canonical combining class
1910 =============================
1911
1912    Every Unicode character or code point has a _canonical combining
1913 class_ assigned to it.
1914
1915    What is the meaning of the canonical combining class?  Essentially,
1916 it indicates the priority with which a combining character is attached
1917 to its base character.  The characters for which the canonical
1918 combining class is 0 are the base characters, and the characters for
1919 which it is greater than 0 are the combining characters.  Combining
1920 characters are rendered near/attached/around their base character, and
1921 combining characters with small combining classes are attached "first"
1922 or "closer" to the base character.
1923
1924    The canonical combining class of a character is a number in the range
1925 0..255.  The possible values are described in the Unicode Character
1926 Database `http://www.unicode.org/Public/UNIDATA/UCD.html'.  The list
1927 here is not definitive; more values can be added in future versions.
1928
1929  -- Constant: int UC_CCC_NR
1930      The canonical combining class value for "Not Reordered" characters.
1931      The value is 0.
1932
1933  -- Constant: int UC_CCC_OV
1934      The canonical combining class value for "Overlay" characters.
1935
1936  -- Constant: int UC_CCC_NK
1937      The canonical combining class value for "Nukta" characters.
1938
1939  -- Constant: int UC_CCC_KV
1940      The canonical combining class value for "Kana Voicing" characters.
1941
1942  -- Constant: int UC_CCC_VR
1943      The canonical combining class value for "Virama" characters.
1944
1945  -- Constant: int UC_CCC_ATBL
1946      The canonical combining class value for "Attached Below Left"
1947      characters.
1948
1949  -- Constant: int UC_CCC_ATB
1950      The canonical combining class value for "Attached Below"
1951      characters.
1952
1953  -- Constant: int UC_CCC_ATAR
1954      The canonical combining class value for "Attached Above Right"
1955      characters.
1956
1957  -- Constant: int UC_CCC_BL
1958      The canonical combining class value for "Below Left" characters.
1959
1960  -- Constant: int UC_CCC_B
1961      The canonical combining class value for "Below" characters.
1962
1963  -- Constant: int UC_CCC_BR
1964      The canonical combining class value for "Below Right" characters.
1965
1966  -- Constant: int UC_CCC_L
1967      The canonical combining class value for "Left" characters.
1968
1969  -- Constant: int UC_CCC_R
1970      The canonical combining class value for "Right" characters.
1971
1972  -- Constant: int UC_CCC_AL
1973      The canonical combining class value for "Above Left" characters.
1974
1975  -- Constant: int UC_CCC_A
1976      The canonical combining class value for "Above" characters.
1977
1978  -- Constant: int UC_CCC_AR
1979      The canonical combining class value for "Above Right" characters.
1980
1981  -- Constant: int UC_CCC_DB
1982      The canonical combining class value for "Double Below" characters.
1983
1984  -- Constant: int UC_CCC_DA
1985      The canonical combining class value for "Double Above" characters.
1986
1987  -- Constant: int UC_CCC_IS
1988      The canonical combining class value for "Iota Subscript"
1989      characters.
1990
1991    The following function looks up the canonical combining class of a
1992 character.
1993
1994  -- Function: int uc_combining_class (ucs4_t UC)
1995      Returns the canonical combining class of a Unicode character.
1996
1997 \1f
1998 File: libunistring.info,  Node: Bidirectional category,  Next: Decimal digit value,  Prev: Canonical combining class,  Up: unictype.h
1999
2000 8.3 Bidirectional category
2001 ==========================
2002
2003    Every Unicode character or code point has a _bidirectional category_
2004 assigned to it.
2005
2006    The bidirectional category guides the bidirectional algorithm
2007 (`http://www.unicode.org/reports/tr9/').  The possible values are the
2008 following.
2009
2010  -- Constant: int UC_BIDI_L
2011      The bidirectional category for "Left-to-Right" characters.
2012
2013  -- Constant: int UC_BIDI_LRE
2014      The bidirectional category for "Left-to-Right Embedding"
2015      characters.
2016
2017  -- Constant: int UC_BIDI_LRO
2018      The bidirectional category for "Left-to-Right Override" characters.
2019
2020  -- Constant: int UC_BIDI_R
2021      The bidirectional category for "Right-to-Left" characters.
2022
2023  -- Constant: int UC_BIDI_AL
2024      The bidirectional category for "Right-to-Left Arabic" characters.
2025
2026  -- Constant: int UC_BIDI_RLE
2027      The bidirectional category for "Right-to-Left Embedding"
2028      characters.
2029
2030  -- Constant: int UC_BIDI_RLO
2031      The bidirectional category for "Right-to-Left Override" characters.
2032
2033  -- Constant: int UC_BIDI_PDF
2034      The bidirectional category for "Pop Directional Format" characters.
2035
2036  -- Constant: int UC_BIDI_EN
2037      The bidirectional category for "European Number" characters.
2038
2039  -- Constant: int UC_BIDI_ES
2040      The bidirectional category for "European Number Separator"
2041      characters.
2042
2043  -- Constant: int UC_BIDI_ET
2044      The bidirectional category for "European Number Terminator"
2045      characters.
2046
2047  -- Constant: int UC_BIDI_AN
2048      The bidirectional category for "Arabic Number" characters.
2049
2050  -- Constant: int UC_BIDI_CS
2051      The bidirectional category for "Common Number Separator"
2052      characters.
2053
2054  -- Constant: int UC_BIDI_NSM
2055      The bidirectional category for "Non-Spacing Mark" characters.
2056
2057  -- Constant: int UC_BIDI_BN
2058      The bidirectional category for "Boundary Neutral" characters.
2059
2060  -- Constant: int UC_BIDI_B
2061      The bidirectional category for "Paragraph Separator" characters.
2062
2063  -- Constant: int UC_BIDI_S
2064      The bidirectional category for "Segment Separator" characters.
2065
2066  -- Constant: int UC_BIDI_WS
2067      The bidirectional category for "Whitespace" characters.
2068
2069  -- Constant: int UC_BIDI_ON
2070      The bidirectional category for "Other Neutral" characters.
2071
2072    The following functions implement the association between a
2073 bidirectional category and its name.
2074
2075  -- Function: const char * uc_bidi_category_name (int CATEGORY)
2076      Returns the name of a bidirectional category.
2077
2078  -- Function: int uc_bidi_category_byname (const char *CATEGORY_NAME)
2079      Returns the bidirectional category given by name, e.g. `"LRE"'.
2080
2081    The following functions view bidirectional categories as sets of
2082 Unicode characters.
2083
2084  -- Function: int uc_bidi_category (ucs4_t UC)
2085      Returns the bidirectional category of a Unicode character.
2086
2087  -- Function: bool uc_is_bidi_category (ucs4_t UC, int CATEGORY)
2088      Tests whether a Unicode character belongs to a given bidirectional
2089      category.
2090
2091 \1f
2092 File: libunistring.info,  Node: Decimal digit value,  Next: Digit value,  Prev: Bidirectional category,  Up: unictype.h
2093
2094 8.4 Decimal digit value
2095 =======================
2096
2097    Decimal digits (like the digits from `0' to `9') exist in many
2098 scripts.  The following function converts a decimal digit character to
2099 its numerical value.
2100
2101  -- Function: int uc_decimal_value (ucs4_t UC)
2102      Returns the decimal digit value of a Unicode character.  The
2103      return value is an integer in the range 0..9, or -1 for characters
2104      that do not represent a decimal digit.
2105
2106 \1f
2107 File: libunistring.info,  Node: Digit value,  Next: Numeric value,  Prev: Decimal digit value,  Up: unictype.h
2108
2109 8.5 Digit value
2110 ===============
2111
2112    Digit characters are like decimal digit characters, possibly in
2113 special forms, such as superscript, subscript, or circled.  The
2114 following function converts a digit character to its numerical value.
2115
2116  -- Function: int uc_digit_value (ucs4_t UC)
2117      Returns the digit value of a Unicode character.  The return value
2118      is an integer in the range 0..9, or -1 for characters that do not
2119      represent a digit.
2120
2121 \1f
2122 File: libunistring.info,  Node: Numeric value,  Next: Mirrored character,  Prev: Digit value,  Up: unictype.h
2123
2124 8.6 Numeric value
2125 =================
2126
2127    There are also characters that represent numbers without a digit
2128 system, like the Roman numerals, and fractional numbers, like 1/4 or
2129 3/4.
2130
2131    The following type represents the numeric value of a Unicode
2132 character.
2133
2134  -- Type: uc_fraction_t
2135      This is a structure type with the following fields:
2136           int numerator;
2137           int denominator;
2138      An integer N is represented by `numerator = N', `denominator = 1'.
2139
2140    The following function converts a number character to its numerical
2141 value.
2142
2143  -- Function: uc_fraction_t uc_numeric_value (ucs4_t UC)
2144      Returns the numeric value of a Unicode character.  The return
2145      value is a fraction, or the pseudo-fraction `{ 0, 0 }' for
2146      characters that do not represent a number.
2147
2148 \1f
2149 File: libunistring.info,  Node: Mirrored character,  Next: Properties,  Prev: Numeric value,  Up: unictype.h
2150
2151 8.7 Mirrored character
2152 ======================
2153
2154    Character mirroring is used to associate the closing parenthesis
2155 character with the opening parenthesis character, the closing brace
2156 character with the opening brace character, and so on.
2157
2158    The following function looks up the mirrored character of a Unicode
2159 character.
2160
2161  -- Function: bool uc_mirror_char (ucs4_t UC, ucs4_t *PUC)
2162      Stores the mirrored character of a Unicode character UC in `*PUC'
2163      and returns `true', if it exists.  Otherwise it stores UC
2164      unmodified in `*PUC' and returns `false'.
2165
2166 \1f
2167 File: libunistring.info,  Node: Properties,  Next: Scripts,  Prev: Mirrored character,  Up: unictype.h
2168
2169 8.8 Properties
2170 ==============
2171
2172    This section defines boolean properties of Unicode characters.  This
2173 means that a character either has the given property or does not have it.
2174 In other words, the property can be viewed as a subset of the set of
2175 Unicode characters.
2176
2177    The GNU libunistring library provides two kinds of API for working
2178 with properties.  The object oriented API uses a type `uc_property_t'
2179 to designate a property.  In the function-based API, which is somewhat
2180 lower level, a property is merely a function.
2181
2182 * Menu:
2183
2184 * Properties as objects::
2185 * Properties as functions::
2186
2187 \1f
2188 File: libunistring.info,  Node: Properties as objects,  Next: Properties as functions,  Up: Properties
2189
2190 8.8.1 Properties as objects - the object oriented API
2191 -----------------------------------------------------
2192
2193    The following type designates a property on Unicode characters.
2194
2195  -- Type: uc_property_t
2196      This data type denotes a boolean property on Unicode characters.
2197      It is an immediate type that can be copied by simple assignment,
2198      without involving memory allocation.  It is not an array type.
2199
2200    Many Unicode properties are predefined.
2201
2202    The following are general properties.
2203
2204  -- Constant: uc_property_t UC_PROPERTY_WHITE_SPACE
2205  -- Constant: uc_property_t UC_PROPERTY_ALPHABETIC
2206  -- Constant: uc_property_t UC_PROPERTY_OTHER_ALPHABETIC
2207  -- Constant: uc_property_t UC_PROPERTY_NOT_A_CHARACTER
2208  -- Constant: uc_property_t UC_PROPERTY_DEFAULT_IGNORABLE_CODE_POINT
2209  -- Constant: uc_property_t
2210 UC_PROPERTY_OTHER_DEFAULT_IGNORABLE_CODE_POINT
2211  -- Constant: uc_property_t UC_PROPERTY_DEPRECATED
2212  -- Constant: uc_property_t UC_PROPERTY_LOGICAL_ORDER_EXCEPTION
2213  -- Constant: uc_property_t UC_PROPERTY_VARIATION_SELECTOR
2214  -- Constant: uc_property_t UC_PROPERTY_PRIVATE_USE
2215  -- Constant: uc_property_t UC_PROPERTY_UNASSIGNED_CODE_VALUE
2216
2217    The following properties are related to case folding.
2218
2219  -- Constant: uc_property_t UC_PROPERTY_UPPERCASE
2220  -- Constant: uc_property_t UC_PROPERTY_OTHER_UPPERCASE
2221  -- Constant: uc_property_t UC_PROPERTY_LOWERCASE
2222  -- Constant: uc_property_t UC_PROPERTY_OTHER_LOWERCASE
2223  -- Constant: uc_property_t UC_PROPERTY_TITLECASE
2224  -- Constant: uc_property_t UC_PROPERTY_SOFT_DOTTED
2225
2226    The following properties are related to identifiers.
2227
2228  -- Constant: uc_property_t UC_PROPERTY_ID_START
2229  -- Constant: uc_property_t UC_PROPERTY_OTHER_ID_START
2230  -- Constant: uc_property_t UC_PROPERTY_ID_CONTINUE
2231  -- Constant: uc_property_t UC_PROPERTY_OTHER_ID_CONTINUE
2232  -- Constant: uc_property_t UC_PROPERTY_XID_START
2233  -- Constant: uc_property_t UC_PROPERTY_XID_CONTINUE
2234  -- Constant: uc_property_t UC_PROPERTY_PATTERN_WHITE_SPACE
2235  -- Constant: uc_property_t UC_PROPERTY_PATTERN_SYNTAX
2236
2237    The following properties have an influence on shaping and rendering.
2238
2239  -- Constant: uc_property_t UC_PROPERTY_JOIN_CONTROL
2240  -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_BASE
2241  -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_EXTEND
2242  -- Constant: uc_property_t UC_PROPERTY_OTHER_GRAPHEME_EXTEND
2243  -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_LINK
2244
2245    The following properties relate to bidirectional reordering.
2246
2247  -- Constant: uc_property_t UC_PROPERTY_BIDI_CONTROL
2248  -- Constant: uc_property_t UC_PROPERTY_BIDI_LEFT_TO_RIGHT
2249  -- Constant: uc_property_t UC_PROPERTY_BIDI_HEBREW_RIGHT_TO_LEFT
2250  -- Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_RIGHT_TO_LEFT
2251  -- Constant: uc_property_t UC_PROPERTY_BIDI_EUROPEAN_DIGIT
2252  -- Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_SEPARATOR
2253  -- Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_TERMINATOR
2254  -- Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_DIGIT
2255  -- Constant: uc_property_t UC_PROPERTY_BIDI_COMMON_SEPARATOR
2256  -- Constant: uc_property_t UC_PROPERTY_BIDI_BLOCK_SEPARATOR
2257  -- Constant: uc_property_t UC_PROPERTY_BIDI_SEGMENT_SEPARATOR
2258  -- Constant: uc_property_t UC_PROPERTY_BIDI_WHITESPACE
2259  -- Constant: uc_property_t UC_PROPERTY_BIDI_NON_SPACING_MARK
2260  -- Constant: uc_property_t UC_PROPERTY_BIDI_BOUNDARY_NEUTRAL
2261  -- Constant: uc_property_t UC_PROPERTY_BIDI_PDF
2262  -- Constant: uc_property_t UC_PROPERTY_BIDI_EMBEDDING_OR_OVERRIDE
2263  -- Constant: uc_property_t UC_PROPERTY_BIDI_OTHER_NEUTRAL
2264
2265    The following properties deal with number representations.
2266
2267  -- Constant: uc_property_t UC_PROPERTY_HEX_DIGIT
2268  -- Constant: uc_property_t UC_PROPERTY_ASCII_HEX_DIGIT
2269
2270    The following properties deal with CJK.
2271
2272  -- Constant: uc_property_t UC_PROPERTY_IDEOGRAPHIC
2273  -- Constant: uc_property_t UC_PROPERTY_UNIFIED_IDEOGRAPH
2274  -- Constant: uc_property_t UC_PROPERTY_RADICAL
2275  -- Constant: uc_property_t UC_PROPERTY_IDS_BINARY_OPERATOR
2276  -- Constant: uc_property_t UC_PROPERTY_IDS_TRINARY_OPERATOR
2277
2278    Other miscellaneous properties are:
2279
2280  -- Constant: uc_property_t UC_PROPERTY_ZERO_WIDTH
2281  -- Constant: uc_property_t UC_PROPERTY_SPACE
2282  -- Constant: uc_property_t UC_PROPERTY_NON_BREAK
2283  -- Constant: uc_property_t UC_PROPERTY_ISO_CONTROL
2284  -- Constant: uc_property_t UC_PROPERTY_FORMAT_CONTROL
2285  -- Constant: uc_property_t UC_PROPERTY_DASH
2286  -- Constant: uc_property_t UC_PROPERTY_HYPHEN
2287  -- Constant: uc_property_t UC_PROPERTY_PUNCTUATION
2288  -- Constant: uc_property_t UC_PROPERTY_LINE_SEPARATOR
2289  -- Constant: uc_property_t UC_PROPERTY_PARAGRAPH_SEPARATOR
2290  -- Constant: uc_property_t UC_PROPERTY_QUOTATION_MARK
2291  -- Constant: uc_property_t UC_PROPERTY_SENTENCE_TERMINAL
2292  -- Constant: uc_property_t UC_PROPERTY_TERMINAL_PUNCTUATION
2293  -- Constant: uc_property_t UC_PROPERTY_CURRENCY_SYMBOL
2294  -- Constant: uc_property_t UC_PROPERTY_MATH
2295  -- Constant: uc_property_t UC_PROPERTY_OTHER_MATH
2296  -- Constant: uc_property_t UC_PROPERTY_PAIRED_PUNCTUATION
2297  -- Constant: uc_property_t UC_PROPERTY_LEFT_OF_PAIR
2298  -- Constant: uc_property_t UC_PROPERTY_COMBINING
2299  -- Constant: uc_property_t UC_PROPERTY_COMPOSITE
2300  -- Constant: uc_property_t UC_PROPERTY_DECIMAL_DIGIT
2301  -- Constant: uc_property_t UC_PROPERTY_NUMERIC
2302  -- Constant: uc_property_t UC_PROPERTY_DIACRITIC
2303  -- Constant: uc_property_t UC_PROPERTY_EXTENDER
2304  -- Constant: uc_property_t UC_PROPERTY_IGNORABLE_CONTROL
2305
2306    The following function looks up a property by its name.
2307
2308  -- Function: uc_property_t uc_property_byname (const char
2309           *PROPERTY_NAME)
2310      Returns the property given by name, e.g. `"White space"'.  If a
2311      property with the given name exists, the result will satisfy the
2312      `uc_property_is_valid' predicate.  Otherwise the result will not
2313      satisfy this predicate and must not be passed to functions that
2314      expect a `uc_property_t' argument.
2315
2316      This function references a big table of all predefined properties.
2317      Its use can significantly increase the size of your application.
2318
2319  -- Function: bool uc_property_is_valid (uc_property_t property)
2320      Returns `true' when the given property is valid, or `false'
2321      otherwise.
2322
2323    The following function views a property as a set of Unicode
2324 characters.
2325
2326  -- Function: bool uc_is_property (ucs4_t UC, uc_property_t PROPERTY)
2327      Tests whether the Unicode character UC has the given property.
2328
2329 \1f
2330 File: libunistring.info,  Node: Properties as functions,  Prev: Properties as objects,  Up: Properties
2331
2332 8.8.2 Properties as functions - the functional API
2333 --------------------------------------------------
2334
2335    The following are general properties.
2336
2337  -- Function: bool uc_is_property_white_space (ucs4_t UC)
2338  -- Function: bool uc_is_property_alphabetic (ucs4_t UC)
2339  -- Function: bool uc_is_property_other_alphabetic (ucs4_t UC)
2340  -- Function: bool uc_is_property_not_a_character (ucs4_t UC)
2341  -- Function: bool uc_is_property_default_ignorable_code_point (ucs4_t
2342           UC)
2343  -- Function: bool uc_is_property_other_default_ignorable_code_point
2344           (ucs4_t UC)
2345  -- Function: bool uc_is_property_deprecated (ucs4_t UC)
2346  -- Function: bool uc_is_property_logical_order_exception (ucs4_t UC)
2347  -- Function: bool uc_is_property_variation_selector (ucs4_t UC)
2348  -- Function: bool uc_is_property_private_use (ucs4_t UC)
2349  -- Function: bool uc_is_property_unassigned_code_value (ucs4_t UC)
2350
2351    The following properties are related to case folding.
2352
2353  -- Function: bool uc_is_property_uppercase (ucs4_t UC)
2354  -- Function: bool uc_is_property_other_uppercase (ucs4_t UC)
2355  -- Function: bool uc_is_property_lowercase (ucs4_t UC)
2356  -- Function: bool uc_is_property_other_lowercase (ucs4_t UC)
2357  -- Function: bool uc_is_property_titlecase (ucs4_t UC)
2358  -- Function: bool uc_is_property_soft_dotted (ucs4_t UC)
2359
2360    The following properties are related to identifiers.
2361
2362  -- Function: bool uc_is_property_id_start (ucs4_t UC)
2363  -- Function: bool uc_is_property_other_id_start (ucs4_t UC)
2364  -- Function: bool uc_is_property_id_continue (ucs4_t UC)
2365  -- Function: bool uc_is_property_other_id_continue (ucs4_t UC)
2366  -- Function: bool uc_is_property_xid_start (ucs4_t UC)
2367  -- Function: bool uc_is_property_xid_continue (ucs4_t UC)
2368  -- Function: bool uc_is_property_pattern_white_space (ucs4_t UC)
2369  -- Function: bool uc_is_property_pattern_syntax (ucs4_t UC)
2370
2371    The following properties have an influence on shaping and rendering.
2372
2373  -- Function: bool uc_is_property_join_control (ucs4_t UC)
2374  -- Function: bool uc_is_property_grapheme_base (ucs4_t UC)
2375  -- Function: bool uc_is_property_grapheme_extend (ucs4_t UC)
2376  -- Function: bool uc_is_property_other_grapheme_extend (ucs4_t UC)
2377  -- Function: bool uc_is_property_grapheme_link (ucs4_t UC)
2378
2379    The following properties relate to bidirectional reordering.
2380
2381  -- Function: bool uc_is_property_bidi_control (ucs4_t UC)
2382  -- Function: bool uc_is_property_bidi_left_to_right (ucs4_t UC)
2383  -- Function: bool uc_is_property_bidi_hebrew_right_to_left (ucs4_t UC)
2384  -- Function: bool uc_is_property_bidi_arabic_right_to_left (ucs4_t UC)
2385  -- Function: bool uc_is_property_bidi_european_digit (ucs4_t UC)
2386  -- Function: bool uc_is_property_bidi_eur_num_separator (ucs4_t UC)
2387  -- Function: bool uc_is_property_bidi_eur_num_terminator (ucs4_t UC)
2388  -- Function: bool uc_is_property_bidi_arabic_digit (ucs4_t UC)
2389  -- Function: bool uc_is_property_bidi_common_separator (ucs4_t UC)
2390  -- Function: bool uc_is_property_bidi_block_separator (ucs4_t UC)
2391  -- Function: bool uc_is_property_bidi_segment_separator (ucs4_t UC)
2392  -- Function: bool uc_is_property_bidi_whitespace (ucs4_t UC)
2393  -- Function: bool uc_is_property_bidi_non_spacing_mark (ucs4_t UC)
2394  -- Function: bool uc_is_property_bidi_boundary_neutral (ucs4_t UC)
2395  -- Function: bool uc_is_property_bidi_pdf (ucs4_t UC)
2396  -- Function: bool uc_is_property_bidi_embedding_or_override (ucs4_t UC)
2397  -- Function: bool uc_is_property_bidi_other_neutral (ucs4_t UC)
2398
2399    The following properties deal with number representations.
2400
2401  -- Function: bool uc_is_property_hex_digit (ucs4_t UC)
2402  -- Function: bool uc_is_property_ascii_hex_digit (ucs4_t UC)
2403
2404    The following properties deal with CJK.
2405
2406  -- Function: bool uc_is_property_ideographic (ucs4_t UC)
2407  -- Function: bool uc_is_property_unified_ideograph (ucs4_t UC)
2408  -- Function: bool uc_is_property_radical (ucs4_t UC)
2409  -- Function: bool uc_is_property_ids_binary_operator (ucs4_t UC)
2410  -- Function: bool uc_is_property_ids_trinary_operator (ucs4_t UC)
2411
2412    Other miscellaneous properties are:
2413
2414  -- Function: bool uc_is_property_zero_width (ucs4_t UC)
2415  -- Function: bool uc_is_property_space (ucs4_t UC)
2416  -- Function: bool uc_is_property_non_break (ucs4_t UC)
2417  -- Function: bool uc_is_property_iso_control (ucs4_t UC)
2418  -- Function: bool uc_is_property_format_control (ucs4_t UC)
2419  -- Function: bool uc_is_property_dash (ucs4_t UC)
2420  -- Function: bool uc_is_property_hyphen (ucs4_t UC)
2421  -- Function: bool uc_is_property_punctuation (ucs4_t UC)
2422  -- Function: bool uc_is_property_line_separator (ucs4_t UC)
2423  -- Function: bool uc_is_property_paragraph_separator (ucs4_t UC)
2424  -- Function: bool uc_is_property_quotation_mark (ucs4_t UC)
2425  -- Function: bool uc_is_property_sentence_terminal (ucs4_t UC)
2426  -- Function: bool uc_is_property_terminal_punctuation (ucs4_t UC)
2427  -- Function: bool uc_is_property_currency_symbol (ucs4_t UC)
2428  -- Function: bool uc_is_property_math (ucs4_t UC)
2429  -- Function: bool uc_is_property_other_math (ucs4_t UC)
2430  -- Function: bool uc_is_property_paired_punctuation (ucs4_t UC)
2431  -- Function: bool uc_is_property_left_of_pair (ucs4_t UC)
2432  -- Function: bool uc_is_property_combining (ucs4_t UC)
2433  -- Function: bool uc_is_property_composite (ucs4_t UC)
2434  -- Function: bool uc_is_property_decimal_digit (ucs4_t UC)
2435  -- Function: bool uc_is_property_numeric (ucs4_t UC)
2436  -- Function: bool uc_is_property_diacritic (ucs4_t UC)
2437  -- Function: bool uc_is_property_extender (ucs4_t UC)
2438  -- Function: bool uc_is_property_ignorable_control (ucs4_t UC)
2439
2440 \1f
2441 File: libunistring.info,  Node: Scripts,  Next: Blocks,  Prev: Properties,  Up: unictype.h
2442
2443 8.9 Scripts
2444 ===========
2445
2446    The Unicode characters are subdivided into scripts.
2447
2448    The following type is used to represent a script:
2449
2450  -- Type: uc_script_t
2451      This data type is a structure type that refers to statically
2452      allocated read-only data.  It contains the following fields:
2453           const char *name;
2454
2455      The `name' field contains the name of the script.
2456
2457    The following functions look up a script.
2458
2459  -- Function: const uc_script_t * uc_script (ucs4_t UC)
2460      Returns the script of a Unicode character.  Returns NULL if UC
2461      does not belong to any script.
2462
2463  -- Function: const uc_script_t * uc_script_byname (const char
2464           *SCRIPT_NAME)
2465      Returns the script given by its name, e.g. `"HAN"'.  Returns NULL
2466      if a script with the given name does not exist.
2467
2468    The following function views a script as a set of Unicode characters.
2469
2470  -- Function: bool uc_is_script (ucs4_t UC, const uc_script_t *SCRIPT)
2471      Tests whether a Unicode character belongs to a given script.
2472
2473    The following gives a global picture of all scripts.
2474
2475  -- Function: void uc_all_scripts (const uc_script_t **SCRIPTS, size_t
2476           *COUNT)
2477      Get the list of all scripts.  Stores a pointer to an array of all
2478      scripts in `*SCRIPTS' and the length of this array in `*COUNT'.
2479
2480 \1f
2481 File: libunistring.info,  Node: Blocks,  Next: ISO C and Java syntax,  Prev: Scripts,  Up: unictype.h
2482
2483 8.10 Blocks
2484 ===========
2485
2486    The Unicode characters are subdivided into blocks.  A block is an
2487 interval of Unicode code points.
2488
2489    The following type is used to represent a block.
2490
2491  -- Type: uc_block_t
2492      This data type is a structure type that refers to statically
2493      allocated data.  It contains the following fields:
2494           ucs4_t start;
2495           ucs4_t end;
2496           const char *name;
2497
2498      The `start' field is the first Unicode code point in the block.
2499
2500      The `end' field is the last Unicode code point in the block.
2501
2502      The `name' field is the name of the block.
2503
2504    The following function looks up a block.
2505
2506  -- Function: const uc_block_t * uc_block (ucs4_t UC)
2507      Returns the block a character belongs to.
2508
2509    The following function views a block as a set of Unicode characters.
2510
2511  -- Function: bool uc_is_block (ucs4_t UC, const uc_block_t *BLOCK)
2512      Tests whether a Unicode character belongs to a given block.
2513
2514    The following gives a global picture of all blocks.
2515
2516  -- Function: void uc_all_blocks (const uc_block_t **BLOCKS, size_t
2517           *COUNT)
2518      Get the list of all blocks.  Stores a pointer to an array of all
2519      blocks in `*BLOCKS' and the length of this array in `*COUNT'.
2520
2521 \1f
2522 File: libunistring.info,  Node: ISO C and Java syntax,  Next: Classifications like in ISO C,  Prev: Blocks,  Up: unictype.h
2523
2524 8.11 ISO C and Java syntax
2525 ==========================
2526
2527    The following properties are taken from language standards.  The
2528 supported language standards are ISO C 99 and Java.
2529
2530  -- Function: bool uc_is_c_whitespace (ucs4_t UC)
2531      Tests whether a Unicode character is considered whitespace in ISO
2532      C 99.
2533
2534  -- Function: bool uc_is_java_whitespace (ucs4_t UC)
2535      Tests whether a Unicode character is considered whitespace in Java.
2536
2537    The following enumerated values are the possible return values of
2538 the functions `uc_c_ident_category' and `uc_java_ident_category'.
2539
2540  -- Constant: int UC_IDENTIFIER_START
2541      This return value means that the given character is valid as first
2542      or subsequent character in an identifier.
2543
2544  -- Constant: int UC_IDENTIFIER_VALID
2545      This return value means that the given character is valid as
2546      subsequent character only.
2547
2548  -- Constant: int UC_IDENTIFIER_INVALID
2549      This return value means that the given character is not valid in
2550      an identifier.
2551
2552  -- Constant: int UC_IDENTIFIER_IGNORABLE
2553      This return value (only for Java) means that the given character
2554      is ignorable.
2555
2556    The following functions determine whether a given character can be a
2557 constituent of an identifier in the given programming language.
2558
2559  -- Function: int uc_c_ident_category (ucs4_t UC)
2560      Returns the categorization of a Unicode character with respect to
2561      the ISO C 99 identifier syntax.
2562
2563  -- Function: int uc_java_ident_category (ucs4_t UC)
2564      Returns the categorization of a Unicode character with respect to
2565      the Java identifier syntax.
2566
2567 \1f
2568 File: libunistring.info,  Node: Classifications like in ISO C,  Prev: ISO C and Java syntax,  Up: unictype.h
2569
2570 8.12 Classifications like in ISO C
2571 ==================================
2572
2573    The following character classifications mimic those declared in the
2574 ISO C header files `<ctype.h>' and `<wctype.h>'.  These functions are
2575 deprecated, because this set of functions was designed with ASCII in
2576 mind and cannot reflect the more diverse reality of the Unicode
2577 character set.  But they can be a quick-and-dirty porting aid when
2578 migrating from `wchar_t' APIs to Unicode strings.
2579
2580  -- Function: bool uc_is_alnum (ucs4_t UC)
2581      Tests for any character for which `uc_is_alpha' or `uc_is_digit' is
2582      true.
2583
2584  -- Function: bool uc_is_alpha (ucs4_t UC)
2585      Tests for any character for which `uc_is_upper' or `uc_is_lower' is
2586      true, or any character that is one of a locale-specific set of
2587      characters for which none of `uc_is_cntrl', `uc_is_digit',
2588      `uc_is_punct', or `uc_is_space' is true.
2589
2590  -- Function: bool uc_is_cntrl (ucs4_t UC)
2591      Tests for any control character.
2592
2593  -- Function: bool uc_is_digit (ucs4_t UC)
2594      Tests for any character that corresponds to a decimal-digit
2595      character.
2596
2597  -- Function: bool uc_is_graph (ucs4_t UC)
2598      Tests for any character for which `uc_is_print' is true and
2599      `uc_is_space' is false.
2600
2601  -- Function: bool uc_is_lower (ucs4_t UC)
2602      Tests for any character that corresponds to a lowercase letter or
2603      is one of a locale-specific set of characters for which none of
2604      `uc_is_cntrl', `uc_is_digit', `uc_is_punct', or `uc_is_space' is
2605      true.
2606
2607  -- Function: bool uc_is_print (ucs4_t UC)
2608      Tests for any printing character.
2609
2610  -- Function: bool uc_is_punct (ucs4_t UC)
2611      Tests for any printing character that is one of a locale-specific
2612      set of characters for which neither `uc_is_space' nor
2613      `uc_is_alnum' is true.
2614
2615  -- Function: bool uc_is_space (ucs4_t UC)
2616      Tests for any character that corresponds to a locale-specific set
2617      of characters for which none of `uc_is_alnum', `uc_is_graph', or
2618      `uc_is_punct' is true.
2619
2620  -- Function: bool uc_is_upper (ucs4_t UC)
2621      Tests for any character that corresponds to an uppercase letter or
2622      is one of a locale-specific set of characters for which none of
2623      `uc_is_cntrl', `uc_is_digit', `uc_is_punct', or `uc_is_space' is
2624      true.
2625
2626  -- Function: bool uc_is_xdigit (ucs4_t UC)
2627      Tests for any character that corresponds to a hexadecimal-digit
2628      character.
2629
2630  -- Function: bool uc_is_blank (ucs4_t UC)
2631      Tests for any character that corresponds to a standard blank
2632      character or a locale-specific set of characters for which
2633      `uc_is_alnum' is false.
2634
2635 \1f
2636 File: libunistring.info,  Node: uniwidth.h,  Next: uniwbrk.h,  Prev: unictype.h,  Up: Top
2637
2638 9 Display width `<uniwidth.h>'
2639 ******************************
2640
2641    This include file declares functions that return the display width,
2642 measured in columns, of characters or strings, when output to a device
2643 that uses non-proportional fonts.
2644
2645    Note that for some rarely used characters the actual fonts or
2646 terminal emulators can use a different width.  There is no mechanism
2647 for communicating the display width of characters across a Unix
2648 pseudo-terminal (tty).  Also, there are scripts with complex rendering,
2649 like the Indic scripts.  For these scripts, there is no such concept as
2650 non-proportional fonts.  Therefore the results of these functions are
2651 adequate for most scripts and most characters, but they can fail to
2652 represent the actual display width.
2653
2654    These functions are locale dependent.  The ENCODING argument
2655 identifies the encoding (e.g. `"ISO-8859-2"' for Polish).
2656
2657  -- Function: int uc_width (ucs4_t UC, const char *ENCODING)
2658      Determines and returns the number of column positions required for
2659      UC.  Returns -1 if UC is a control character that has an influence
2660      on the column position when output.
2661
2662  -- Function: int u8_width (const uint8_t *S, size_t N, const char
2663           *ENCODING)
2664  -- Function: int u16_width (const uint16_t *S, size_t N, const char
2665           *ENCODING)
2666  -- Function: int u32_width (const uint32_t *S, size_t N, const char
2667           *ENCODING)
2668      Determines and returns the number of column positions required for
2669      the first N units (or fewer if S ends before this) in S.  This
2670      function ignores control characters in the string.
2671
2672  -- Function: int u8_strwidth (const uint8_t *S, const char *ENCODING)
2673  -- Function: int u16_strwidth (const uint16_t *S, const char *ENCODING)
2674  -- Function: int u32_strwidth (const uint32_t *S, const char *ENCODING)
2675      Determines and returns the number of column positions required for
2676      S.  This function ignores control characters in the string.
2677
2678 \1f
2679 File: libunistring.info,  Node: uniwbrk.h,  Next: unilbrk.h,  Prev: uniwidth.h,  Up: Top
2680
2681 10 Word breaks in strings `<uniwbrk.h>'
2682 ***************************************
2683
2684    This include file declares functions for determining where in a
2685 string "words" start and end.  Here "words" are not necessarily the
2686 same as entities that can be looked up in dictionaries, but rather
2687 groups of consecutive characters that should not be split by text
2688 processing operations.
2689
2690 * Menu:
2691
2692 * Word breaks in a string::
2693 * Word break property::
2694
2695 \1f
2696 File: libunistring.info,  Node: Word breaks in a string,  Next: Word break property,  Up: uniwbrk.h
2697
2698 10.1 Word breaks in a string
2699 ============================
2700
2701    The following functions determine the word breaks in a string.
2702
2703  -- Function: void u8_wordbreaks (const uint8_t *S, size_t N, char *P)
2704  -- Function: void u16_wordbreaks (const uint16_t *S, size_t N, char *P)
2705  -- Function: void u32_wordbreaks (const uint32_t *S, size_t N, char *P)
2706  -- Function: void ulc_wordbreaks (const char *S, size_t N, char *P)
2707      Determines the word break points in S, an array of N units, and
2708      stores the result at `P[0..N-1]'.
2709     `P[i] = 1'
2710           means that there is a word boundary between `S[i-1]' and
2711           `S[i]'.
2712
2713     `P[i] = 0'
2714           means that `S[i-1]' and `S[i]' must not be separated.
2715      `P[0]' is always set to 0.  If an application wants to consider a
2716      word break to be present at the beginning of the string (before
2717      `S[0]') or at the end of the string (after `S[0..N-1]'), it has to
2718      treat these cases explicitly.
2719
2720 \1f
2721 File: libunistring.info,  Node: Word break property,  Prev: Word breaks in a string,  Up: uniwbrk.h
2722
2723 10.2 Word break property
2724 ========================
2725
2726    This is a more low-level API.  The word break property is a property
2727 defined in Unicode Standard Annex #29, section "Word Boundaries", see
2728 `http://www.unicode.org/reports/tr29/#Word_Boundaries'.  It is used for
2729 determining the word breaks in a string.
2730
2731    The following are the possible values of the word break property.
2732 More values may be added in the future.
2733
2734  -- Constant: int WBP_OTHER
2735  -- Constant: int WBP_CR
2736  -- Constant: int WBP_LF
2737  -- Constant: int WBP_NEWLINE
2738  -- Constant: int WBP_EXTEND
2739  -- Constant: int WBP_FORMAT
2740  -- Constant: int WBP_KATAKANA
2741  -- Constant: int WBP_ALETTER
2742  -- Constant: int WBP_MIDNUMLET
2743  -- Constant: int WBP_MIDLETTER
2744  -- Constant: int WBP_MIDNUM
2745  -- Constant: int WBP_NUMERIC
2746  -- Constant: int WBP_EXTENDNUMLET
2747
2748    The following function looks up the word break property of a
2749 character.
2750
2751  -- Function: int uc_wordbreak_property (ucs4_t UC)
2752      Returns the Word_Break property of a Unicode character.
2753
2754 \1f
2755 File: libunistring.info,  Node: unilbrk.h,  Next: uninorm.h,  Prev: uniwbrk.h,  Up: Top
2756
2757 11 Line breaking `<unilbrk.h>'
2758 ******************************
2759
2760    This include file declares functions for determining where in a
2761 string line breaks could or should be introduced, in order to make the
2762 displayed string fit into a column of given width.
2763
2764    These functions are locale dependent.  The ENCODING argument
2765 identifies the encoding (e.g. `"ISO-8859-2"' for Polish).
2766
2767    The following enumerated values indicate whether, at a given
2768 position, a line break is possible or not.  Given a string S as an
2769 array `S[0..N-1]' and a position I, the values have the following
2770 meanings:
2771
2772  -- Constant: int UC_BREAK_MANDATORY
2773      This value indicates that `S[I]' is a line break character.
2774
2775  -- Constant: int UC_BREAK_POSSIBLE
2776      This value indicates that a line break may be inserted between
2777      `S[I-1]' and `S[I]'.
2778
2779  -- Constant: int UC_BREAK_HYPHENATION
2780      This value indicates that a hyphen and a line break may be
2781      inserted between `S[I-1]' and `S[I]'.  But beware of language
2782      dependent hyphenation rules.
2783
2784  -- Constant: int UC_BREAK_PROHIBITED
2785      This value indicates that `S[I-1]' and `S[I]' must not be
2786      separated.
2787
2788  -- Constant: int UC_BREAK_UNDEFINED
2789      This value is not used as a return value; rather, in the
2790      overriding argument of the `u*_width_linebreaks' functions, it
2791      indicates the absence of an override.
2792
2793    The following functions determine the positions at which line breaks
2794 are possible.
2795
2796  -- Function: void u8_possible_linebreaks (const uint8_t *S, size_t N,
2797           const char *ENCODING, char *P)
2798  -- Function: void u16_possible_linebreaks (const uint16_t *S, size_t
2799           N, const char *ENCODING, char *P)
2800  -- Function: void u32_possible_linebreaks (const uint32_t *S, size_t
2801           N, const char *ENCODING, char *P)
2802  -- Function: void ulc_possible_linebreaks (const char *S, size_t N,
2803           const char *ENCODING, char *P)
2804      Determines the line break points in S, and stores the result at
2805      `P[0..N-1]'.  Every `P[I]' is assigned one of the values
2806      `UC_BREAK_MANDATORY', `UC_BREAK_POSSIBLE', `UC_BREAK_HYPHENATION',
2807      `UC_BREAK_PROHIBITED'.
2808
2809    The following functions determine where line breaks should be
2810 inserted so that each line fits in a given width, when output to a
2811 device that uses non-proportional fonts.
2812
2813  -- Function: int u8_width_linebreaks (const uint8_t *S, size_t N, int
2814           WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
2815           *OVERRIDE, const char *ENCODING, char *P)
2816  -- Function: int u16_width_linebreaks (const uint16_t *S, size_t N,
2817           int WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
2818           *OVERRIDE, const char *ENCODING, char *P)
2819  -- Function: int u32_width_linebreaks (const uint32_t *S, size_t N,
2820           int WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
2821           *OVERRIDE, const char *ENCODING, char *P)
2822  -- Function: int ulc_width_linebreaks (const char *S, size_t N, int
2823           WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
2824           *OVERRIDE, const char *ENCODING, char *P)
2825      Chooses the best line breaks, assuming that every character
2826      occupies a width given by the `uc_width' function (see *note
2827      uniwidth.h::).
2828
2829      The string is `S[0..N-1]'.
2830
2831      The maximum number of columns per line is given as WIDTH.  The
2832      starting column of the string is given as START_COLUMN.  If the
2833      algorithm shall keep room after the last piece, this amount of
2834      room can be given as AT_END_COLUMNS.
2835
2836      OVERRIDE is an optional override; if `OVERRIDE[I] !=
2837      UC_BREAK_UNDEFINED', `OVERRIDE[I]' takes precedence over `P[I]' as
2838      returned by the `u*_possible_linebreaks' function.
2839
2840      The given ENCODING is used for disambiguating widths in `uc_width'.
2841
2842      Returns the column after the end of the string, and stores the
2843      result at `P[0..N-1]'.  Every `P[I]' is assigned one of the values
2844      `UC_BREAK_MANDATORY', `UC_BREAK_POSSIBLE', `UC_BREAK_HYPHENATION',
2845      `UC_BREAK_PROHIBITED'.  Here the value `UC_BREAK_POSSIBLE'
2846      indicates that a line break _should_ be inserted.
2847
2848 \1f
2849 File: libunistring.info,  Node: uninorm.h,  Next: unicase.h,  Prev: unilbrk.h,  Up: Top
2850
2851 12 Normalization forms (composition and decomposition) `<uninorm.h>'
2852 ********************************************************************
2853
2854    This include file defines functions for transforming Unicode strings
2855 to one of the four normal forms, known as NFC, NFD, NFKC, NFKD.  These
2856 transformations involve decomposition and -- for NFC and NFKC --
2857 composition of Unicode characters.
2858
2859 * Menu:
2860
2861 * Decomposition of characters::
2862 * Composition of characters::
2863 * Normalization of strings::
2864 * Normalizing comparisons::
2865 * Normalization of streams::
2866
2867 \1f
2868 File: libunistring.info,  Node: Decomposition of characters,  Next: Composition of characters,  Up: uninorm.h
2869
2870 12.1 Decomposition of Unicode characters
2871 ========================================
2872
2873    The following enumerated values are the possible types of
2874 decomposition of a Unicode character.
2875
2876  -- Constant: int UC_DECOMP_CANONICAL
2877      Denotes canonical decomposition.
2878
2879  -- Constant: int UC_DECOMP_FONT
2880      UCD marker: `<font>'.  Denotes a font variant (e.g. a blackletter
2881      form).
2882
2883  -- Constant: int UC_DECOMP_NOBREAK
2884      UCD marker: `<noBreak>'.  Denotes a no-break version of a space or
2885      hyphen.
2886
2887  -- Constant: int UC_DECOMP_INITIAL
2888      UCD marker: `<initial>'.  Denotes an initial presentation form
2889      (Arabic).
2890
2891  -- Constant: int UC_DECOMP_MEDIAL
2892      UCD marker: `<medial>'.  Denotes a medial presentation form
2893      (Arabic).
2894
2895  -- Constant: int UC_DECOMP_FINAL
2896      UCD marker: `<final>'.  Denotes a final presentation form (Arabic).
2897
2898  -- Constant: int UC_DECOMP_ISOLATED
2899      UCD marker: `<isolated>'.  Denotes an isolated presentation form
2900      (Arabic).
2901
2902  -- Constant: int UC_DECOMP_CIRCLE
2903      UCD marker: `<circle>'.  Denotes an encircled form.
2904
2905  -- Constant: int UC_DECOMP_SUPER
2906      UCD marker: `<super>'.  Denotes a superscript form.
2907
2908  -- Constant: int UC_DECOMP_SUB
2909      UCD marker: `<sub>'.  Denotes a subscript form.
2910
2911  -- Constant: int UC_DECOMP_VERTICAL
2912      UCD marker: `<vertical>'.  Denotes a vertical layout presentation
2913      form.
2914
2915  -- Constant: int UC_DECOMP_WIDE
2916      UCD marker: `<wide>'.  Denotes a wide (or zenkaku) compatibility
2917      character.
2918
2919  -- Constant: int UC_DECOMP_NARROW
2920      UCD marker: `<narrow>'.  Denotes a narrow (or hankaku)
2921      compatibility character.
2922
2923  -- Constant: int UC_DECOMP_SMALL
2924      UCD marker: `<small>'.  Denotes a small variant form (CNS
2925      compatibility).
2926
2927  -- Constant: int UC_DECOMP_SQUARE
2928      UCD marker: `<square>'.  Denotes a CJK squared font variant.
2929
2930  -- Constant: int UC_DECOMP_FRACTION
2931      UCD marker: `<fraction>'.  Denotes a vulgar fraction form.
2932
2933  -- Constant: int UC_DECOMP_COMPAT
2934      UCD marker: `<compat>'.  Denotes an otherwise unspecified
2935      compatibility character.
2936
2937    The following constant denotes the maximum size of decomposition of
2938 a single Unicode character.
2939
2940  -- Macro: unsigned int UC_DECOMPOSITION_MAX_LENGTH
2941      This macro expands to a constant that is the required size of
2942      buffer passed to the `uc_decomposition' and
2943      `uc_canonical_decomposition' functions.
2944
2945    The following functions decompose a Unicode character.
2946
2947  -- Function: int uc_decomposition (ucs4_t UC, int *DECOMP_TAG, ucs4_t
2948           *DECOMPOSITION)
2949      Returns the character decomposition mapping of the Unicode
2950      character UC.  DECOMPOSITION must point to an array of at least
2951      `UC_DECOMPOSITION_MAX_LENGTH' `ucs4_t' elements.
2952
2953      When a decomposition exists, `DECOMPOSITION[0..N-1]' and
2954      `*DECOMP_TAG' are filled and N is returned.  Otherwise -1 is
2955      returned.
2956
2957  -- Function: int uc_canonical_decomposition (ucs4_t UC, ucs4_t
2958           *DECOMPOSITION)
2959      Returns the canonical character decomposition mapping of the
2960      Unicode character UC.  DECOMPOSITION must point to an array of at
2961      least `UC_DECOMPOSITION_MAX_LENGTH' `ucs4_t' elements.
2962
2963      When a decomposition exists, `DECOMPOSITION[0..N-1]' is filled and
2964      N is returned.  Otherwise -1 is returned.
2965
2966 \1f
2967 File: libunistring.info,  Node: Composition of characters,  Next: Normalization of strings,  Prev: Decomposition of characters,  Up: uninorm.h
2968
2969 12.2 Composition of Unicode characters
2970 ======================================
2971
2972    The following function composes a Unicode character from two Unicode
2973 characters.
2974
2975  -- Function: ucs4_t uc_composition (ucs4_t UC1, ucs4_t UC2)
2976      Attempts to combine the Unicode characters UC1, UC2.  UC1 is known
2977      to have canonical combining class 0.
2978
2979      Returns the combination of UC1 and UC2, if it exists.  Returns 0
2980      otherwise.
2981
2982      Not all decompositions can be recombined using this function.  See
2983      the Unicode file `CompositionExclusions.txt' for details.
2984
2985 \1f
2986 File: libunistring.info,  Node: Normalization of strings,  Next: Normalizing comparisons,  Prev: Composition of characters,  Up: uninorm.h
2987
2988 12.3 Normalization of strings
2989 =============================
2990
2991    The Unicode standard defines four normalization forms for Unicode
2992 strings.  The following type is used to denote a normalization form.
2993
2994  -- Type: uninorm_t
2995      An object of type `uninorm_t' denotes a Unicode normalization form.
2996      This is a scalar type; its values can be compared with `=='.
2997
2998    The following constants denote the four normalization forms.
2999
3000  -- Macro: uninorm_t UNINORM_NFD
3001      Denotes Normalization form D: canonical decomposition.
3002
3003  -- Macro: uninorm_t UNINORM_NFC
3004      Normalization form C: canonical decomposition, then canonical
3005      composition.
3006
3007  -- Macro: uninorm_t UNINORM_NFKD
3008      Normalization form KD: compatibility decomposition.
3009
3010  -- Macro: uninorm_t UNINORM_NFKC
3011      Normalization form KC: compatibility decomposition, then canonical
3012      composition.
3013
3014    The following functions operate on `uninorm_t' objects.
3015
3016  -- Function: bool uninorm_is_compat_decomposing (uninorm_t NF)
3017      Tests whether the normalization form NF does compatibility
3018      decomposition.
3019
3020  -- Function: bool uninorm_is_composing (uninorm_t NF)
3021      Tests whether the normalization form NF includes canonical
3022      composition.
3023
3024  -- Function: uninorm_t uninorm_decomposing_form (uninorm_t NF)
3025      Returns the decomposing variant of the normalization form NF.
3026      This maps NFC,NFD -> NFD and NFKC,NFKD -> NFKD.
3027
3028    The following functions apply a Unicode normalization form to a
3029 Unicode string.
3030
3031  -- Function: uint8_t * u8_normalize (uninorm_t NF, const uint8_t *S,
3032           size_t N, uint8_t *RESULTBUF, size_t *LENGTHP)
3033  -- Function: uint16_t * u16_normalize (uninorm_t NF, const uint16_t
3034           *S, size_t N, uint16_t *RESULTBUF, size_t *LENGTHP)
3035  -- Function: uint32_t * u32_normalize (uninorm_t NF, const uint32_t
3036           *S, size_t N, uint32_t *RESULTBUF, size_t *LENGTHP)
3037      Returns the specified normalization form of a string.
3038
3039 \1f
3040 File: libunistring.info,  Node: Normalizing comparisons,  Next: Normalization of streams,  Prev: Normalization of strings,  Up: uninorm.h
3041
3042 12.4 Normalizing comparisons
3043 ============================
3044
3045    The following functions compare Unicode strings, ignoring differences
3046 in normalization.
3047
3048  -- Function: int u8_normcmp (const uint8_t *S1, size_t N1, const
3049           uint8_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3050  -- Function: int u16_normcmp (const uint16_t *S1, size_t N1, const
3051           uint16_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3052  -- Function: int u32_normcmp (const uint32_t *S1, size_t N1, const
3053           uint32_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3054      Compares S1 and S2, ignoring differences in normalization.
3055
3056      NF must be either `UNINORM_NFD' or `UNINORM_NFKD'.
3057
3058      If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
3059      if S1 > S2, and returns 0.  Upon failure, returns -1 with `errno'
3060      set.
3061
3062  -- Function: char * u8_normxfrm (const uint8_t *S, size_t N, uninorm_t
3063           NF, char *RESULTBUF, size_t *LENGTHP)
3064  -- Function: char * u16_normxfrm (const uint16_t *S, size_t N,
3065           uninorm_t NF, char *RESULTBUF, size_t *LENGTHP)
3066  -- Function: char * u32_normxfrm (const uint32_t *S, size_t N,
3067           uninorm_t NF, char *RESULTBUF, size_t *LENGTHP)
3068      Converts the string S of length N to a NUL-terminated byte
3069      sequence, in such a way that comparing `u8_normxfrm (S1)' and
3070      `u8_normxfrm (S2)' with the `u8_cmp2' function is equivalent to
3071      comparing S1 and S2 with the `u8_normcoll' function.
3072
3073      NF must be either `UNINORM_NFC' or `UNINORM_NFKC'.
3074
3075  -- Function: int u8_normcoll (const uint8_t *S1, size_t N1, const
3076           uint8_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3077  -- Function: int u16_normcoll (const uint16_t *S1, size_t N1, const
3078           uint16_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3079  -- Function: int u32_normcoll (const uint32_t *S1, size_t N1, const
3080           uint32_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
3081      Compares S1 and S2, ignoring differences in normalization, using
3082      the collation rules of the current locale.
3083
3084      NF must be either `UNINORM_NFC' or `UNINORM_NFKC'.
3085
3086      If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
3087      if S1 > S2, and returns 0.  Upon failure, returns -1 with `errno'
3088      set.
3089
3090 \1f
3091 File: libunistring.info,  Node: Normalization of streams,  Prev: Normalizing comparisons,  Up: uninorm.h
3092
3093 12.5 Normalization of streams of Unicode characters
3094 ===================================================
3095
3096    A "stream of Unicode characters" is essentially a function that
3097 accepts an `ucs4_t' argument repeatedly, optionally combined with a
3098 function that "flushes" the stream.
3099
3100  -- Type: struct uninorm_filter
3101      This is the data type of a stream of Unicode characters that
3102      normalizes its input according to a given normalization form and
3103      passes the normalized character sequence to the encapsulated
3104      stream of Unicode characters.
3105
3106  -- Function: struct uninorm_filter * uninorm_filter_create (uninorm_t
3107           NF, int (*STREAM_FUNC) (void *STREAM_DATA, ucs4_t UC), void
3108           *STREAM_DATA)
3109      Creates and returns a normalization filter for Unicode characters.
3110
3111      The pair (STREAM_FUNC, STREAM_DATA) is the encapsulated stream.
3112      `STREAM_FUNC (STREAM_DATA, UC)' receives the Unicode character UC
3113      and returns 0 if successful, or -1 with `errno' set upon failure.
3114
3115      Returns the new filter, or NULL with `errno' set upon failure.
3116
3117  -- Function: int uninorm_filter_write (struct uninorm_filter *FILTER,
3118           ucs4_t UC)
3119      Stuffs a Unicode character into a normalizing filter.  Returns 0
3120      if successful, or -1 with `errno' set upon failure.
3121
3122  -- Function: int uninorm_filter_flush (struct uninorm_filter *FILTER)
3123      Brings data buffered in the filter to its destination, the
3124      encapsulated stream.
3125
3126      Returns 0 if successful, or -1 with `errno' set upon failure.
3127
3128      Note! If after calling this function, additional characters are
3129      written into the filter, the resulting character sequence in the
3130      encapsulated stream will not necessarily be normalized.
3131
3132  -- Function: int uninorm_filter_free (struct uninorm_filter *FILTER)
3133      Brings data buffered in the filter to its destination, the
3134      encapsulated stream, then closes and frees the filter.
3135
3136      Returns 0 if successful, or -1 with `errno' set upon failure.
3137
3138 \1f
3139 File: libunistring.info,  Node: unicase.h,  Next: uniregex.h,  Prev: uninorm.h,  Up: Top
3140
3141 13 Case mappings `<unicase.h>'
3142 ******************************
3143
3144    This include file defines functions for case mapping for Unicode
3145 strings and case insensitive comparison of Unicode strings and C
3146 strings.
3147
3148    These string functions fix the problems that were mentioned in *note
3149 char * strings::, namely, they handle the Croatian LETTER DZ WITH
3150 CARON, the German LATIN SMALL LETTER SHARP S, the Greek sigma and the
3151 Lithuanian i correctly.
3152
3153 * Menu:
3154
3155 * Case mappings of characters::
3156 * Case mappings of strings::
3157 * Case mappings of substrings::
3158 * Case insensitive comparison::
3159 * Case detection::
3160
3161 \1f
3162 File: libunistring.info,  Node: Case mappings of characters,  Next: Case mappings of strings,  Up: unicase.h
3163
3164 13.1 Case mappings of characters
3165 ================================
3166
3167    The following functions implement case mappings on Unicode
3168 characters -- for those cases only where the result of the mapping is
3169 again a single Unicode character.
3170
3171    These mappings are locale and context independent.
3172
3173    *WARNING!* These functions are not sufficient for languages such as
3174 German, Greek and Lithuanian.  It is better to use the functions below that
3175 treat an entire string at once and are language aware.
3176
3177  -- Function: ucs4_t uc_toupper (ucs4_t UC)
3178      Returns the uppercase mapping of the Unicode character UC.
3179
3180  -- Function: ucs4_t uc_tolower (ucs4_t UC)
3181      Returns the lowercase mapping of the Unicode character UC.
3182
3183  -- Function: ucs4_t uc_totitle (ucs4_t UC)
3184      Returns the titlecase mapping of the Unicode character UC.
3185
3186      The titlecase mapping of a character is to be used when the
3187      character should look like upper case and the following characters
3188      are lower cased.
3189
3190      For most characters, this is the same as the uppercase mapping.
3191      There are only a few characters where the title case variant and the
3192      upper case variant are different.  These characters occur in the
3193      Latin writing of the Croatian, Bosnian, and Serbian languages.
3194
3195      Lower case            Title case            Upper case
3196      ------------------------------------------------------------------
3197      LATIN SMALL LETTER LJ LATIN CAPITAL LETTER  LATIN CAPITAL LETTER
3198                            L WITH SMALL LETTER J LJ
3199      LATIN SMALL LETTER NJ LATIN CAPITAL LETTER  LATIN CAPITAL LETTER
3200                            N WITH SMALL LETTER J NJ
3201      LATIN SMALL LETTER DZ LATIN CAPITAL LETTER  LATIN CAPITAL LETTER
3202                            D WITH SMALL LETTER Z DZ
3203      LATIN SMALL LETTER    LATIN CAPITAL LETTER  LATIN CAPITAL LETTER
3204      DZ WITH CARON         D WITH SMALL LETTER   DZ WITH CARON
3205                            Z WITH CARON
3206
3207 \1f
3208 File: libunistring.info,  Node: Case mappings of strings,  Next: Case mappings of substrings,  Prev: Case mappings of characters,  Up: unicase.h
3209
3210 13.2 Case mappings of strings
3211 =============================
3212
3213    Case mapping should always be performed on entire strings, not on
3214 individual characters.  The functions in this section do so.
3215
3216    These functions can apply a normalization after the case
3217 mapping.  The reason is that if you want to treat `ä' and `Ä' the
3218 same, you most often also want to treat the composed and decomposed
3219 forms of such a character, U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS
3220 and U+0041 LATIN CAPITAL LETTER A U+0308 COMBINING DIAERESIS the same.
3221 The NF argument designates the normalization.
3222
3223    These functions are locale dependent.  The ISO639_LANGUAGE argument
3224 identifies the language (e.g. `"tr"' for Turkish).  NULL means to use
3225 locale independent case mappings.
3226
3227  -- Function: const char * uc_locale_language ()
3228      Returns the ISO 639 language code of the current locale.  Returns
3229      `""' if it is unknown, or in the "C" locale.
3230
3231  -- Function: uint8_t * u8_toupper (const uint8_t *S, size_t N, const
3232           char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
3233           size_t *LENGTHP)
3234  -- Function: uint16_t * u16_toupper (const uint16_t *S, size_t N,
3235           const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
3236           *RESULTBUF, size_t *LENGTHP)
3237  -- Function: uint32_t * u32_toupper (const uint32_t *S, size_t N,
3238           const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
3239           *RESULTBUF, size_t *LENGTHP)
3240      Returns the uppercase mapping of a string.
3241
3242      The NF argument identifies the normalization form to apply after
3243      the case-mapping.  It can also be NULL, for no normalization.
3244
3245  -- Function: uint8_t * u8_tolower (const uint8_t *S, size_t N, const
3246           char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
3247           size_t *LENGTHP)
3248  -- Function: uint16_t * u16_tolower (const uint16_t *S, size_t N,
3249           const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
3250           *RESULTBUF, size_t *LENGTHP)
3251  -- Function: uint32_t * u32_tolower (const uint32_t *S, size_t N,
3252           const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
3253           *RESULTBUF, size_t *LENGTHP)
3254      Returns the lowercase mapping of a string.
3255
3256      The NF argument identifies the normalization form to apply after
3257      the case-mapping.  It can also be NULL, for no normalization.
3258
3259  -- Function: uint8_t * u8_totitle (const uint8_t *S, size_t N, const
3260           char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
3261           size_t *LENGTHP)
3262  -- Function: uint16_t * u16_totitle (const uint16_t *S, size_t N,
3263           const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
3264           *RESULTBUF, size_t *LENGTHP)
3265  -- Function: uint32_t * u32_totitle (const uint32_t *S, size_t N,
3266           const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
3267           *RESULTBUF, size_t *LENGTHP)
3268      Returns the titlecase mapping of a string.
3269
3270      Mapping to title case means that, in each word, the first cased
3271      character is mapped to title case and the remaining characters of
3272      the word are mapped to lower case.
3273
3274      The NF argument identifies the normalization form to apply after
3275      the case-mapping.  It can also be NULL, for no normalization.
3276
3277 \1f
3278 File: libunistring.info,  Node: Case mappings of substrings,  Next: Case insensitive comparison,  Prev: Case mappings of strings,  Up: unicase.h
3279
3280 13.3 Case mappings of substrings
3281 ================================
3282
3283    Case mapping of a substring cannot simply be performed by extracting
3284 the substring and then applying the case mapping function to it.  This
3285 does not work because case mapping requires some information about the
3286 surrounding characters.  The following functions apply case
3287 mappings to substrings of a given string, while taking into account the
3288 characters that precede it (the "prefix") and the characters that
3289 follow it (the "suffix").
3290
3291  -- Type: casing_prefix_context_t
3292      This data type denotes the case-mapping context that is given by a
3293      prefix string.  It is an immediate type that can be copied by
3294      simple assignment, without involving memory allocation.  It is not
3295      an array type.
3296
3297  -- Constant: casing_prefix_context_t unicase_empty_prefix_context
3298      This constant is the case-mapping context that corresponds to an
3299      empty prefix string.
3300
3301    The following functions return `casing_prefix_context_t' objects:
3302
3303  -- Function: casing_prefix_context_t u8_casing_prefix_context (const
3304           uint8_t *S, size_t N)
3305  -- Function: casing_prefix_context_t u16_casing_prefix_context (const
3306           uint16_t *S, size_t N)
3307  -- Function: casing_prefix_context_t u32_casing_prefix_context (const
3308           uint32_t *S, size_t N)
3309      Returns the case-mapping context of a given prefix string.
3310
3311  -- Function: casing_prefix_context_t u8_casing_prefixes_context (const
3312           uint8_t *S, size_t N, casing_prefix_context_t A_CONTEXT)
3313  -- Function: casing_prefix_context_t u16_casing_prefixes_context
3314           (const uint16_t *S, size_t N, casing_prefix_context_t
3315           A_CONTEXT)
3316  -- Function: casing_prefix_context_t u32_casing_prefixes_context
3317           (const uint32_t *S, size_t N, casing_prefix_context_t
3318           A_CONTEXT)
3319      Returns the case-mapping context of the prefix concat(A, S), given
3320      the case-mapping context of the prefix A.
3321
3322  -- Type: casing_suffix_context_t
3323      This data type denotes the case-mapping context that is given by a
3324      suffix string.  It is an immediate type that can be copied by
3325      simple assignment, without involving memory allocation.  It is not
3326      an array type.
3327
3328  -- Constant: casing_suffix_context_t unicase_empty_suffix_context
3329      This constant is the case-mapping context that corresponds to an
3330      empty suffix string.
3331
3332    The following functions return `casing_suffix_context_t' objects:
3333
3334  -- Function: casing_suffix_context_t u8_casing_suffix_context (const
3335           uint8_t *S, size_t N)
3336  -- Function: casing_suffix_context_t u16_casing_suffix_context (const
3337           uint16_t *S, size_t N)
3338  -- Function: casing_suffix_context_t u32_casing_suffix_context (const
3339           uint32_t *S, size_t N)
3340      Returns the case-mapping context of a given suffix string.
3341
3342  -- Function: casing_suffix_context_t u8_casing_suffixes_context (const
3343           uint8_t *S, size_t N, casing_suffix_context_t A_CONTEXT)
3344  -- Function: casing_suffix_context_t u16_casing_suffixes_context
3345           (const uint16_t *S, size_t N, casing_suffix_context_t
3346           A_CONTEXT)
3347  -- Function: casing_suffix_context_t u32_casing_suffixes_context
3348           (const uint32_t *S, size_t N, casing_suffix_context_t
3349           A_CONTEXT)
3350      Returns the case-mapping context of the suffix concat(S, A), given
3351      the case-mapping context of the suffix A.
3352
3353    The following functions perform a case mapping, considering the
3354 prefix context and the suffix context.
3355
3356  -- Function: uint8_t * u8_ct_toupper (const uint8_t *S, size_t N,
3357           casing_prefix_context_t PREFIX_CONTEXT,
3358           casing_suffix_context_t SUFFIX_CONTEXT, const char
3359           *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
3360           *LENGTHP)
3361  -- Function: uint16_t * u16_ct_toupper (const uint16_t *S, size_t N,
3362           casing_prefix_context_t PREFIX_CONTEXT,
3363           casing_suffix_context_t SUFFIX_CONTEXT, const char
3364           *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
3365           *LENGTHP)
3366  -- Function: uint32_t * u32_ct_toupper (const uint32_t *S, size_t N,
3367           casing_prefix_context_t PREFIX_CONTEXT,
3368           casing_suffix_context_t SUFFIX_CONTEXT, const char
3369           *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
3370           *LENGTHP)
3371      Returns the uppercase mapping of a string that is surrounded by a
3372      prefix and a suffix.
3373
3374  -- Function: uint8_t * u8_ct_tolower (const uint8_t *S, size_t N,
3375           casing_prefix_context_t PREFIX_CONTEXT,
3376           casing_suffix_context_t SUFFIX_CONTEXT, const char
3377           *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
3378           *LENGTHP)
3379  -- Function: uint16_t * u16_ct_tolower (const uint16_t *S, size_t N,
3380           casing_prefix_context_t PREFIX_CONTEXT,
3381           casing_suffix_context_t SUFFIX_CONTEXT, const char
3382           *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
3383           *LENGTHP)
3384  -- Function: uint32_t * u32_ct_tolower (const uint32_t *S, size_t N,
3385           casing_prefix_context_t PREFIX_CONTEXT,
3386           casing_suffix_context_t SUFFIX_CONTEXT, const char
3387           *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
3388           *LENGTHP)
3389      Returns the lowercase mapping of a string that is surrounded by a
3390      prefix and a suffix.
3391
3392  -- Function: uint8_t * u8_ct_totitle (const uint8_t *S, size_t N,
3393           casing_prefix_context_t PREFIX_CONTEXT,
3394           casing_suffix_context_t SUFFIX_CONTEXT, const char
3395           *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
3396           *LENGTHP)
3397  -- Function: uint16_t * u16_ct_totitle (const uint16_t *S, size_t N,
3398           casing_prefix_context_t PREFIX_CONTEXT,
3399           casing_suffix_context_t SUFFIX_CONTEXT, const char
3400           *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
3401           *LENGTHP)
3402  -- Function: uint32_t * u32_ct_totitle (const uint32_t *S, size_t N,
3403           casing_prefix_context_t PREFIX_CONTEXT,
3404           casing_suffix_context_t SUFFIX_CONTEXT, const char
3405           *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
3406           *LENGTHP)
3407      Returns the titlecase mapping of a string that is surrounded by a
3408      prefix and a suffix.
3409
3410    For example, to uppercase the UTF-8 substring between `s +
3411 start_index' and `s + end_index' of a string that extends from `s' to
3412 `s + u8_strlen (s)', you can use the statements
3413
3414      size_t result_length;
3415      uint8_t *result =
3416        u8_ct_toupper (s + start_index, end_index - start_index,
3417                       u8_casing_prefix_context (s, start_index),
3418                       u8_casing_suffix_context (s + end_index,
3419                                                 u8_strlen (s) - end_index),
3420                       iso639_language, NULL, NULL, &result_length);
3421
3422 \1f
3423 File: libunistring.info,  Node: Case insensitive comparison,  Next: Case detection,  Prev: Case mappings of substrings,  Up: unicase.h
3424
3425 13.4 Case insensitive comparison
3426 ================================
3427
3428    The following functions implement comparison that ignores
3429 differences in case and normalization.
3430
3431  -- Function: uint8_t * u8_casefold (const uint8_t *S, size_t N, const
3432           char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
3433           size_t *LENGTHP)
3434  -- Function: uint16_t * u16_casefold (const uint16_t *S, size_t N,
3435           const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
3436           *RESULTBUF, size_t *LENGTHP)
3437  -- Function: uint32_t * u32_casefold (const uint32_t *S, size_t N,
3438           const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
3439           *RESULTBUF, size_t *LENGTHP)
3440      Returns the case folded string.
3441
3442      Comparing `u8_casefold (S1)' and `u8_casefold (S2)' with the
3443      `u8_cmp2' function is equivalent to comparing S1 and S2 with
3444      `u8_casecmp'.
3445
3446      The NF argument identifies the normalization form to apply after
3447      the case-mapping.  It can also be NULL, for no normalization.
3448
3449  -- Function: uint8_t * u8_ct_casefold (const uint8_t *S, size_t N,
3450           casing_prefix_context_t PREFIX_CONTEXT,
3451           casing_suffix_context_t SUFFIX_CONTEXT, const char
3452           *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
3453           *LENGTHP)
3454  -- Function: uint16_t * u16_ct_casefold (const uint16_t *S, size_t N,
3455           casing_prefix_context_t PREFIX_CONTEXT,
3456           casing_suffix_context_t SUFFIX_CONTEXT, const char
3457           *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
3458           *LENGTHP)
3459  -- Function: uint32_t * u32_ct_casefold (const uint32_t *S, size_t N,
3460           casing_prefix_context_t PREFIX_CONTEXT,
3461           casing_suffix_context_t SUFFIX_CONTEXT, const char
3462           *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
3463           *LENGTHP)
3464      Returns the case folded string.  The case folding takes into
3465      account the case mapping contexts of the prefix and suffix strings.
3466
3467  -- Function: int u8_casecmp (const uint8_t *S1, size_t N1, const
3468           uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3469           uninorm_t NF, int *RESULTP)
3470  -- Function: int u16_casecmp (const uint16_t *S1, size_t N1, const
3471           uint16_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3472           uninorm_t NF, int *RESULTP)
3473  -- Function: int u32_casecmp (const uint32_t *S1, size_t N1, const
3474           uint32_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3475           uninorm_t NF, int *RESULTP)
3476  -- Function: int ulc_casecmp (const char *S1, size_t N1, const char
3477           *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF,
3478           int *RESULTP)
3479      Compares S1 and S2, ignoring differences in case and normalization.
3480
3481      The NF argument identifies the normalization form to apply after
3482      the case-mapping.  It can also be NULL, for no normalization.
3483
3484      If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
3485      if S1 > S2, and returns 0.  Upon failure, returns -1 with `errno'
3486      set.
3487
3488    The following functions additionally take into account the sorting
3489 rules of the current locale.
3490
3491  -- Function: char * u8_casexfrm (const uint8_t *S, size_t N, const
3492           char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
3493           *LENGTHP)
3494  -- Function: char * u16_casexfrm (const uint16_t *S, size_t N, const
3495           char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
3496           *LENGTHP)
3497  -- Function: char * u32_casexfrm (const uint32_t *S, size_t N, const
3498           char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
3499           *LENGTHP)
3500  -- Function: char * ulc_casexfrm (const char *S, size_t N, const char
3501           *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
3502           *LENGTHP)
3503      Converts the string S of length N to a NUL-terminated byte
3504      sequence, in such a way that comparing `u8_casexfrm (S1)' and
3505      `u8_casexfrm (S2)' with the gnulib function `memcmp2' is
3506      equivalent to comparing S1 and S2 with `u8_casecoll'.
3507
3508      NF must be either `UNINORM_NFC', `UNINORM_NFKC', or NULL for no
3509      normalization.
3510
3511  -- Function: int u8_casecoll (const uint8_t *S1, size_t N1, const
3512           uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3513           uninorm_t NF, int *RESULTP)
3514  -- Function: int u16_casecoll (const uint16_t *S1, size_t N1, const
3515           uint16_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3516           uninorm_t NF, int *RESULTP)
3517  -- Function: int u32_casecoll (const uint32_t *S1, size_t N1, const
3518           uint32_t *S2, size_t N2, const char *ISO639_LANGUAGE,
3519           uninorm_t NF, int *RESULTP)
3520  -- Function: int ulc_casecoll (const char *S1, size_t N1, const char
3521           *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF,
3522           int *RESULTP)
3523      Compares S1 and S2, ignoring differences in case and normalization,
3524      using the collation rules of the current locale.
3525
3526      The NF argument identifies the normalization form to apply after
3527      the case-mapping.  It must be either `UNINORM_NFC' or
3528      `UNINORM_NFKC'.  It can also be NULL, for no normalization.
3529
3530      If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
3531      if S1 > S2, and returns 0.  Upon failure, returns -1 with `errno'
3532      set.
3533
3534 \1f
3535 File: libunistring.info,  Node: Case detection,  Prev: Case insensitive comparison,  Up: unicase.h
3536
3537 13.5 Case detection
3538 ===================
3539
3540    The following functions determine whether a Unicode string is
3541 entirely in upper case, or entirely in lower case, or entirely in title
3542 case, or already case-folded.
3543
3544  -- Function: int u8_is_uppercase (const uint8_t *S, size_t N, const
3545           char *ISO639_LANGUAGE, bool *RESULTP)
3546  -- Function: int u16_is_uppercase (const uint16_t *S, size_t N, const
3547           char *ISO639_LANGUAGE, bool *RESULTP)
3548  -- Function: int u32_is_uppercase (const uint32_t *S, size_t N, const
3549           char *ISO639_LANGUAGE, bool *RESULTP)
3550      Sets `*RESULTP' to true if mapping NFD(S) to upper case is a
3551      no-op, or to false otherwise, and returns 0.  Upon failure,
3552      returns -1 with `errno' set.
3553
3554  -- Function: int u8_is_lowercase (const uint8_t *S, size_t N, const
3555           char *ISO639_LANGUAGE, bool *RESULTP)
3556  -- Function: int u16_is_lowercase (const uint16_t *S, size_t N, const
3557           char *ISO639_LANGUAGE, bool *RESULTP)
3558  -- Function: int u32_is_lowercase (const uint32_t *S, size_t N, const
3559           char *ISO639_LANGUAGE, bool *RESULTP)
3560      Sets `*RESULTP' to true if mapping NFD(S) to lower case is a
3561      no-op, or to false otherwise, and returns 0.  Upon failure,
3562      returns -1 with `errno' set.
3563
3564  -- Function: int u8_is_titlecase (const uint8_t *S, size_t N, const
3565           char *ISO639_LANGUAGE, bool *RESULTP)
3566  -- Function: int u16_is_titlecase (const uint16_t *S, size_t N, const
3567           char *ISO639_LANGUAGE, bool *RESULTP)
3568  -- Function: int u32_is_titlecase (const uint32_t *S, size_t N, const
3569           char *ISO639_LANGUAGE, bool *RESULTP)
3570      Sets `*RESULTP' to true if mapping NFD(S) to title case is a
3571      no-op, or to false otherwise, and returns 0.  Upon failure,
3572      returns -1 with `errno' set.
3573
3574  -- Function: int u8_is_casefolded (const uint8_t *S, size_t N, const
3575           char *ISO639_LANGUAGE, bool *RESULTP)
3576  -- Function: int u16_is_casefolded (const uint16_t *S, size_t N, const
3577           char *ISO639_LANGUAGE, bool *RESULTP)
3578  -- Function: int u32_is_casefolded (const uint32_t *S, size_t N, const
3579           char *ISO639_LANGUAGE, bool *RESULTP)
3580      Sets `*RESULTP' to true if applying case folding to NFD(S) is a
3581      no-op, or to false otherwise, and returns 0.  Upon failure,
3582      returns -1 with `errno' set.
3583
3584    The following functions determine whether case mappings have any
3585 effect on a Unicode string.
3586
3587  -- Function: int u8_is_cased (const uint8_t *S, size_t N, const char
3588           *ISO639_LANGUAGE, bool *RESULTP)
3589  -- Function: int u16_is_cased (const uint16_t *S, size_t N, const char
3590           *ISO639_LANGUAGE, bool *RESULTP)
3591  -- Function: int u32_is_cased (const uint32_t *S, size_t N, const char
3592           *ISO639_LANGUAGE, bool *RESULTP)
3593      Sets `*RESULTP' to true if case matters for S, that is, if mapping
3594      NFD(S) to upper case, lower case, or title case is not a no-op.
3595      Sets `*RESULTP' to false if NFD(S) maps to itself under the upper
3596      case, lower case, and title case mappings; in other words, when
3597      NFD(S) consists entirely of caseless characters.  In both cases,
3598      returns 0.  Upon failure, returns -1 with `errno' set.
3599
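   The "no-op" wording of the definitions above can be modelled for
plain ASCII input, where NFD(S) is the same as S.  The following is
only such a model, under that assumption; it is not the library's
implementation, and the `model_' names are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only model: mapping to upper case is a no-op exactly when
   no byte changes under toupper.  */
bool
model_is_uppercase (const char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (toupper ((unsigned char) s[i]) != (unsigned char) s[i])
      return false;
  return true;
}

/* ASCII-only model: mapping to lower case is a no-op exactly when
   no byte changes under tolower.  */
bool
model_is_lowercase (const char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (tolower ((unsigned char) s[i]) != (unsigned char) s[i])
      return false;
  return true;
}

/* Case matters if at least one mapping changes the string.  The
   title case mapping is not modelled: a string that neither the
   upper case nor the lower case mapping changes contains no
   letters, so title-casing cannot change it either.  */
bool
model_is_cased (const char *s, size_t n)
{
  return !(model_is_uppercase (s, n) && model_is_lowercase (s, n));
}
```

   Under this model the string `123' counts as both upper case and
lower case, and `model_is_cased' reports false for it, whereas `ABC' is
upper case, not lower case, and cased.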
3600 \1f
3601 File: libunistring.info,  Node: uniregex.h,  Next: Using the library,  Prev: unicase.h,  Up: Top
3602
3603 14 Regular expressions `<uniregex.h>'
3604 *************************************
3605
3606    This include file is not yet implemented.
3607
3608 \1f
3609 File: libunistring.info,  Node: Using the library,  Next: More functionality,  Prev: uniregex.h,  Up: Top
3610
3611 15 Using the library
3612 ********************
3613
3614    This chapter explains some practical considerations regarding the
3615 installation and the compiler options that are needed in order to use
3616 this library.
3617
3618 * Menu:
3619
3620 * Installation::
3621 * Compiler options::
3622 * Include files::
3623 * Autoconf macro::
3624 * Reporting problems::
3625
3626 \1f
3627 File: libunistring.info,  Node: Installation,  Next: Compiler options,  Up: Using the library
3628
3629 15.1 Installation
3630 =================
3631
3632    Before you can use the library, it must be installed.  First, you
3633 have to make sure all dependencies are installed.  They are listed in
3634 the file `DEPENDENCIES'.
3635
3636    Then you can proceed to build and install the library, as described
3637 in the file `INSTALL'.  For installation on Windows systems, please
3638 refer to the file `README.woe32'.
3639
3640 \1f
3641 File: libunistring.info,  Node: Compiler options,  Next: Include files,  Prev: Installation,  Up: Using the library
3642
3643 15.2 Compiler options
3644 =====================
3645
3646    Let's denote as `LIBUNISTRING_PREFIX' the value of the `--prefix'
3647 option that you passed to `configure' while installing this package.
3648 If you didn't pass any `--prefix' option, then the package is installed
3649 in `/usr/local'.
3650
3651    Let's denote as `LIBUNISTRING_INCLUDEDIR' the directory where the
3652 include files were installed.  This is usually the same as
3653 `${LIBUNISTRING_PREFIX}/include'; however, if you passed an
3654 `--includedir' option to `configure', it is the value of that option.
3655
3656    Let's further denote as `LIBUNISTRING_LIBDIR' the directory where
3657 the library itself was installed.  This is the value that you passed
3658 with the `--libdir' option to `configure', or otherwise the same as
3659 `${LIBUNISTRING_PREFIX}/lib'.  Recall that when building in 64-bit mode
3660 on a 64-bit GNU/Linux system that supports executables in either 64-bit
3661 mode or 32-bit mode, you should have used the option
3662 `--libdir=${LIBUNISTRING_PREFIX}/lib64'.
3663
3664    So that the compiler finds the include files, you have to pass it the
3665 option `-I${LIBUNISTRING_INCLUDEDIR}'.
3666
3667    So that the compiler finds the library during its linking pass, you
3668 have to pass it the options `-L${LIBUNISTRING_LIBDIR} -lunistring'.  On
3669 some systems, in some configurations, you also have to pass options
3670 needed for linking with `libiconv'.  The autoconf macro
3671 `gl_LIBUNISTRING' (see *note Autoconf macro::) deals with this
3672 particularity.
3673
3674 \1f
3675 File: libunistring.info,  Node: Include files,  Next: Autoconf macro,  Prev: Compiler options,  Up: Using the library
3676
3677 15.3 Include files
3678 ==================
3679
3680    Most of the include files have been presented in the introduction,
3681 see *note Introduction::, and subsequent detailed chapters.
3682
3683    Another include file is `<unistring/version.h>'.  It contains the
3684 version number of the libunistring library.
3685
3686  -- Macro: int _LIBUNISTRING_VERSION
3687      This constant contains the version of libunistring that is being
3688      used at compile time.  It encodes the major and minor parts of the
3689      version number only.  These parts are encoded in the form
3690      `(major<<8) + minor'.
3691
3692  -- Constant: int _libunistring_version
3693      This constant contains the version of libunistring that is being
3694      used at run time.  It encodes the major and minor parts of the
3695      version number only.  These parts are encoded in the form
3696      `(major<<8) + minor'.
3697
3698    It is possible that `_libunistring_version' is greater than
3699 `_LIBUNISTRING_VERSION'.  This can happen when you use `libunistring'
3700 as a shared library, and a newer, binary backward-compatible version
3701 has been installed after your program that uses `libunistring' was
3702 installed.
3703
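   The `(major<<8) + minor' encoding can be taken apart with a shift
and a mask.  The sketch below works on plain `int' values so that it
does not depend on the header; the function names are hypothetical, and
it assumes that the minor part fits in 8 bits.

```c
#include <assert.h>

/* Major part of a value encoded as (major<<8) + minor.  */
int
version_major (int version)
{
  assert (version >= 0);
  return version >> 8;
}

/* Minor part of a value encoded as (major<<8) + minor.  */
int
version_minor (int version)
{
  assert (version >= 0);
  return version & 0xff;
}

/* Nonzero if the run-time version is at least as new as the
   version the program was compiled against.  */
int
runtime_is_current (int compile_time_version, int run_time_version)
{
  return run_time_version >= compile_time_version;
}
```

   In a program that includes `<unistring/version.h>', one would pass
`_LIBUNISTRING_VERSION' and `_libunistring_version' to these functions.
As an illustrative value, 0x0009 decodes to major 0 and minor 9.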
3704 \1f
3705 File: libunistring.info,  Node: Autoconf macro,  Next: Reporting problems,  Prev: Include files,  Up: Using the library
3706
3707 15.4 Autoconf macro
3708 ===================
3709
3710    GNU Gnulib provides an autoconf macro that tests for the availability
3711 of `libunistring'.  It is contained in the Gnulib module
3712 `libunistring', see
3713 `http://www.gnu.org/software/gnulib/MODULES.html#module=libunistring'.
3714
3715    The macro is called `gl_LIBUNISTRING'.  It searches for an installed
3716 libunistring.  If found, it sets and AC_SUBSTs `HAVE_LIBUNISTRING=yes'
3717 and the `LIBUNISTRING' and `LTLIBUNISTRING' variables and augments the
3718 `CPPFLAGS' variable, and defines the C macro `HAVE_LIBUNISTRING' to 1.
3719 Otherwise, it sets and AC_SUBSTs `HAVE_LIBUNISTRING=no' and
3720 `LIBUNISTRING' and `LTLIBUNISTRING' to empty.
3721
3722    The complexities that `gl_LIBUNISTRING' deals with are the following:
3723
3724    * On some operating systems, in some configurations, libunistring
3725      depends on `libiconv', and the options for linking with libiconv
3726      must be mentioned explicitly on the link command line.
3727
3728    * GNU `libunistring', if installed, is not necessarily already in the
3729      search path (`CPPFLAGS' for the include file search path,
3730      `LDFLAGS' for the library search path).
3731
3732    * GNU `libunistring', if installed, is not necessarily already in the
3733      run time library search path.  To avoid the need for setting an
3734      environment variable like `LD_LIBRARY_PATH', the macro adds the
3735      appropriate run time search path options to the `LIBUNISTRING'
3736      variable.  This works on most systems.
3737
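   On the C side, the `HAVE_LIBUNISTRING' macro defined by
`gl_LIBUNISTRING' can guard the code that uses the library.  The
following is a sketch only: the helper `all_uppercase' and its
ASCII-only fallback are hypothetical.  In a real package the macro
usually arrives through `config.h'; that include is left out here so
that the fragment compiles on its own.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#if HAVE_LIBUNISTRING
# include <stdint.h>
# include <unicase.h>
#endif

/* Hypothetical helper: true if the UTF-8 string s of n bytes is
   entirely in upper case.  */
bool
all_uppercase (const char *s, size_t n)
{
#if HAVE_LIBUNISTRING
  bool result;
  if (u8_is_uppercase ((const uint8_t *) s, n,
                       uc_locale_language (), &result) == 0)
    return result;
  return false;                 /* treat failure as "not upper case" */
#else
  /* Fallback without libunistring: looks at ASCII letters only.  */
  for (size_t i = 0; i < n; i++)
    if (islower ((unsigned char) s[i]))
      return false;
  return true;
#endif
}
```

   When the library is found, the link command also needs the value of
the `LIBUNISTRING' (or `LTLIBUNISTRING') variable described above.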
3738 \1f
3739 File: libunistring.info,  Node: Reporting problems,  Prev: Autoconf macro,  Up: Using the library
3740
3741 15.5 Reporting problems
3742 =======================
3743
3744    If you encounter any problem, please don't hesitate to send a
3745 detailed bug report to the `bug-libunistring@gnu.org' mailing list.
3746 Alternatively, you can use the bug tracker at the project page
3747 `https://savannah.gnu.org/projects/libunistring'.
3748
3749    Please always include the version number of this library, and a short
3750 description of your operating system and compilation environment with
3751 corresponding version numbers.
3752
3753    For problems that appear while building and installing
3754 `libunistring', for which you don't find the remedy in the `INSTALL'
3755 file, please include a description of the options that you passed to
3756 the `configure' script.
3757
3758 \1f
3759 File: libunistring.info,  Node: More functionality,  Next: Licenses,  Prev: Using the library,  Up: Top
3760
3761 16 More advanced functionality
3762 ******************************
3763
3764    For bidirectional reordering of strings, we recommend the GNU
3765 FriBidi library: `http://www.fribidi.org/'.
3766
3767    For the rendering of Unicode strings outside of the context of a
3768 given toolkit (KDE/Qt or GNOME/Gtk), we recommend the Pango library:
3769 `http://www.pango.org/'.
3770
3771 \1f
3772 File: libunistring.info,  Node: Licenses,  Next: Index,  Prev: More functionality,  Up: Top
3773
3774 Appendix A Licenses
3775 *******************
3776
3777    The files of this package are covered by the licenses indicated in
3778 each particular file or directory.  Here is a summary:
3779
3780    * The `libunistring' library is covered by the GNU Lesser General
3781      Public License (LGPL).  A copy of the license is included in *note
3782      GNU LGPL::.
3783
3784    * This manual is free documentation.  It is dually licensed under the
3785      GNU FDL and the GNU GPL.  This means that you can redistribute this
3786      manual under either of these two licenses, at your choice.
3787      This manual is covered by the GNU FDL.  Permission is granted to
3788      copy, distribute and/or modify this document under the terms of the
3789      GNU Free Documentation License (FDL), either version 1.2 of the
3790      License, or (at your option) any later version published by the
3791      Free Software Foundation (FSF); with no Invariant Sections, with no
3792      Front-Cover Text, and with no Back-Cover Texts.  A copy of the
3793      license is included in *note GNU FDL::.
3794      This manual is covered by the GNU GPL.  You can redistribute it
3795      and/or modify it under the terms of the GNU General Public License
3796      (GPL), either version 3 of the License, or (at your option) any
3797      later version published by the Free Software Foundation (FSF).  A
3798      copy of the license is included in *note GNU GPL::.
3799
3800 * Menu:
3801
3802 * GNU GPL::                     GNU General Public License
3803 * GNU LGPL::                    GNU Lesser General Public License
3804 * GNU FDL::                     GNU Free Documentation License
3805
3806 \1f
3807 File: libunistring.info,  Node: GNU GPL,  Next: GNU LGPL,  Up: Licenses
3808
3809 A.1 GNU GENERAL PUBLIC LICENSE
3810 ==============================
3811
3812                         Version 3, 29 June 2007
3813
3814      Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'
3815
3816      Everyone is permitted to copy and distribute verbatim copies of this
3817      license document, but changing it is not allowed.
3818
3819 Preamble
3820 ========
3821
3822    The GNU General Public License is a free, copyleft license for
3823 software and other kinds of works.
3824
3825    The licenses for most software and other practical works are designed
3826 to take away your freedom to share and change the works.  By contrast,
3827 the GNU General Public License is intended to guarantee your freedom to
3828 share and change all versions of a program--to make sure it remains
3829 free software for all its users.  We, the Free Software Foundation, use
3830 the GNU General Public License for most of our software; it applies
3831 also to any other work released this way by its authors.  You can apply
3832 it to your programs, too.
3833
3834    When we speak of free software, we are referring to freedom, not
3835 price.  Our General Public Licenses are designed to make sure that you
3836 have the freedom to distribute copies of free software (and charge for
3837 them if you wish), that you receive source code or can get it if you
3838 want it, that you can change the software or use pieces of it in new
3839 free programs, and that you know you can do these things.
3840
3841    To protect your rights, we need to prevent others from denying you
3842 these rights or asking you to surrender the rights.  Therefore, you
3843 have certain responsibilities if you distribute copies of the software,
3844 or if you modify it: responsibilities to respect the freedom of others.
3845
3846    For example, if you distribute copies of such a program, whether
3847 gratis or for a fee, you must pass on to the recipients the same
3848 freedoms that you received.  You must make sure that they, too, receive
3849 or can get the source code.  And you must show them these terms so they
3850 know their rights.
3851
3852    Developers that use the GNU GPL protect your rights with two steps:
3853 (1) assert copyright on the software, and (2) offer you this License
3854 giving you legal permission to copy, distribute and/or modify it.
3855
3856    For the developers' and authors' protection, the GPL clearly explains
3857 that there is no warranty for this free software.  For both users' and
3858 authors' sake, the GPL requires that modified versions be marked as
3859 changed, so that their problems will not be attributed erroneously to
3860 authors of previous versions.
3861
3862    Some devices are designed to deny users access to install or run
3863 modified versions of the software inside them, although the
3864 manufacturer can do so.  This is fundamentally incompatible with the
3865 aim of protecting users' freedom to change the software.  The
3866 systematic pattern of such abuse occurs in the area of products for
3867 individuals to use, which is precisely where it is most unacceptable.
3868 Therefore, we have designed this version of the GPL to prohibit the
3869 practice for those products.  If such problems arise substantially in
3870 other domains, we stand ready to extend this provision to those domains
3871 in future versions of the GPL, as needed to protect the freedom of
3872 users.
3873
3874    Finally, every program is threatened constantly by software patents.
3875 States should not allow patents to restrict development and use of
3876 software on general-purpose computers, but in those that do, we wish to
3877 avoid the special danger that patents applied to a free program could
3878 make it effectively proprietary.  To prevent this, the GPL assures that
3879 patents cannot be used to render the program non-free.
3880
3881    The precise terms and conditions for copying, distribution and
3882 modification follow.
3883
3884 TERMS AND CONDITIONS
3885 ====================
3886
3887   0. Definitions.
3888
3889      "This License" refers to version 3 of the GNU General Public
3890      License.
3891
3892      "Copyright" also means copyright-like laws that apply to other
3893      kinds of works, such as semiconductor masks.
3894
3895      "The Program" refers to any copyrightable work licensed under this
3896      License.  Each licensee is addressed as "you".  "Licensees" and
3897      "recipients" may be individuals or organizations.
3898
3899      To "modify" a work means to copy from or adapt all or part of the
3900      work in a fashion requiring copyright permission, other than the
3901      making of an exact copy.  The resulting work is called a "modified
3902      version" of the earlier work or a work "based on" the earlier work.
3903
3904      A "covered work" means either the unmodified Program or a work
3905      based on the Program.
3906
3907      To "propagate" a work means to do anything with it that, without
3908      permission, would make you directly or secondarily liable for
3909      infringement under applicable copyright law, except executing it
3910      on a computer or modifying a private copy.  Propagation includes
3911      copying, distribution (with or without modification), making
3912      available to the public, and in some countries other activities as
3913      well.
3914
3915      To "convey" a work means any kind of propagation that enables other
3916      parties to make or receive copies.  Mere interaction with a user
3917      through a computer network, with no transfer of a copy, is not
3918      conveying.
3919
3920      An interactive user interface displays "Appropriate Legal Notices"
3921      to the extent that it includes a convenient and prominently visible
3922      feature that (1) displays an appropriate copyright notice, and (2)
3923      tells the user that there is no warranty for the work (except to
3924      the extent that warranties are provided), that licensees may
3925      convey the work under this License, and how to view a copy of this
3926      License.  If the interface presents a list of user commands or
3927      options, such as a menu, a prominent item in the list meets this
3928      criterion.
3929
3930   1. Source Code.
3931
3932      The "source code" for a work means the preferred form of the work
3933      for making modifications to it.  "Object code" means any
3934      non-source form of a work.
3935
3936      A "Standard Interface" means an interface that either is an
3937      official standard defined by a recognized standards body, or, in
3938      the case of interfaces specified for a particular programming
3939      language, one that is widely used among developers working in that
3940      language.
3941
3942      The "System Libraries" of an executable work include anything,
3943      other than the work as a whole, that (a) is included in the normal
3944      form of packaging a Major Component, but which is not part of that
3945      Major Component, and (b) serves only to enable use of the work
3946      with that Major Component, or to implement a Standard Interface
3947      for which an implementation is available to the public in source
3948      code form.  A "Major Component", in this context, means a major
3949      essential component (kernel, window system, and so on) of the
3950      specific operating system (if any) on which the executable work
3951      runs, or a compiler used to produce the work, or an object code
3952      interpreter used to run it.
3953
3954      The "Corresponding Source" for a work in object code form means all
3955      the source code needed to generate, install, and (for an executable
3956      work) run the object code and to modify the work, including
3957      scripts to control those activities.  However, it does not include
3958      the work's System Libraries, or general-purpose tools or generally
3959      available free programs which are used unmodified in performing
3960      those activities but which are not part of the work.  For example,
3961      Corresponding Source includes interface definition files
3962      associated with source files for the work, and the source code for
3963      shared libraries and dynamically linked subprograms that the work
3964      is specifically designed to require, such as by intimate data
3965      communication or control flow between those subprograms and other
3966      parts of the work.
3967
3968      The Corresponding Source need not include anything that users can
3969      regenerate automatically from other parts of the Corresponding
3970      Source.
3971
3972      The Corresponding Source for a work in source code form is that
3973      same work.
3974
3975   2. Basic Permissions.
3976
3977      All rights granted under this License are granted for the term of
3978      copyright on the Program, and are irrevocable provided the stated
3979      conditions are met.  This License explicitly affirms your unlimited
3980      permission to run the unmodified Program.  The output from running
3981      a covered work is covered by this License only if the output,
3982      given its content, constitutes a covered work.  This License
3983      acknowledges your rights of fair use or other equivalent, as
3984      provided by copyright law.
3985
3986      You may make, run and propagate covered works that you do not
3987      convey, without conditions so long as your license otherwise
3988      remains in force.  You may convey covered works to others for the
3989      sole purpose of having them make modifications exclusively for
3990      you, or provide you with facilities for running those works,
3991      provided that you comply with the terms of this License in
3992      conveying all material for which you do not control copyright.
3993      Those thus making or running the covered works for you must do so
3994      exclusively on your behalf, under your direction and control, on
3995      terms that prohibit them from making any copies of your
3996      copyrighted material outside their relationship with you.
3997
3998      Conveying under any other circumstances is permitted solely under
3999      the conditions stated below.  Sublicensing is not allowed; section
4000      10 makes it unnecessary.
4001
4002   3. Protecting Users' Legal Rights From Anti-Circumvention Law.
4003
4004      No covered work shall be deemed part of an effective technological
4005      measure under any applicable law fulfilling obligations under
4006      article 11 of the WIPO copyright treaty adopted on 20 December
4007      1996, or similar laws prohibiting or restricting circumvention of
4008      such measures.
4009
4010      When you convey a covered work, you waive any legal power to forbid
4011      circumvention of technological measures to the extent such
4012      circumvention is effected by exercising rights under this License
4013      with respect to the covered work, and you disclaim any intention
4014      to limit operation or modification of the work as a means of
4015      enforcing, against the work's users, your or third parties' legal
4016      rights to forbid circumvention of technological measures.
4017
4018   4. Conveying Verbatim Copies.
4019
4020      You may convey verbatim copies of the Program's source code as you
4021      receive it, in any medium, provided that you conspicuously and
4022      appropriately publish on each copy an appropriate copyright notice;
4023      keep intact all notices stating that this License and any
4024      non-permissive terms added in accord with section 7 apply to the
4025      code; keep intact all notices of the absence of any warranty; and
4026      give all recipients a copy of this License along with the Program.
4027
4028      You may charge any price or no price for each copy that you convey,
4029      and you may offer support or warranty protection for a fee.
4030
4031   5. Conveying Modified Source Versions.
4032
4033      You may convey a work based on the Program, or the modifications to
4034      produce it from the Program, in the form of source code under the
4035      terms of section 4, provided that you also meet all of these
4036      conditions:
4037
4038        a. The work must carry prominent notices stating that you
4039           modified it, and giving a relevant date.
4040
4041        b. The work must carry prominent notices stating that it is
4042           released under this License and any conditions added under
4043           section 7.  This requirement modifies the requirement in
4044           section 4 to "keep intact all notices".
4045
4046        c. You must license the entire work, as a whole, under this
4047           License to anyone who comes into possession of a copy.  This
4048           License will therefore apply, along with any applicable
4049           section 7 additional terms, to the whole of the work, and all
4050           its parts, regardless of how they are packaged.  This License
4051           gives no permission to license the work in any other way, but
4052           it does not invalidate such permission if you have separately
4053           received it.
4054
4055        d. If the work has interactive user interfaces, each must display
4056           Appropriate Legal Notices; however, if the Program has
4057           interactive interfaces that do not display Appropriate Legal
4058           Notices, your work need not make them do so.
4059
4060      A compilation of a covered work with other separate and independent
4061      works, which are not by their nature extensions of the covered
4062      work, and which are not combined with it such as to form a larger
4063      program, in or on a volume of a storage or distribution medium, is
4064      called an "aggregate" if the compilation and its resulting
4065      copyright are not used to limit the access or legal rights of the
4066      compilation's users beyond what the individual works permit.
4067      Inclusion of a covered work in an aggregate does not cause this
4068      License to apply to the other parts of the aggregate.
4069
4070   6. Conveying Non-Source Forms.
4071
4072      You may convey a covered work in object code form under the terms
4073      of sections 4 and 5, provided that you also convey the
4074      machine-readable Corresponding Source under the terms of this
4075      License, in one of these ways:
4076
4077        a. Convey the object code in, or embodied in, a physical product
4078           (including a physical distribution medium), accompanied by the
4079           Corresponding Source fixed on a durable physical medium
4080           customarily used for software interchange.
4081
4082        b. Convey the object code in, or embodied in, a physical product
4083           (including a physical distribution medium), accompanied by a
4084           written offer, valid for at least three years and valid for
4085           as long as you offer spare parts or customer support for that
4086           product model, to give anyone who possesses the object code
4087           either (1) a copy of the Corresponding Source for all the
4088           software in the product that is covered by this License, on a
4089           durable physical medium customarily used for software
4090           interchange, for a price no more than your reasonable cost of
4091           physically performing this conveying of source, or (2) access
4092           to copy the Corresponding Source from a network server at no
4093           charge.
4094
4095        c. Convey individual copies of the object code with a copy of
4096           the written offer to provide the Corresponding Source.  This
4097           alternative is allowed only occasionally and noncommercially,
4098           and only if you received the object code with such an offer,
4099           in accord with subsection 6b.
4100
4101        d. Convey the object code by offering access from a designated
4102           place (gratis or for a charge), and offer equivalent access
4103           to the Corresponding Source in the same way through the same
4104           place at no further charge.  You need not require recipients
4105           to copy the Corresponding Source along with the object code.
4106           If the place to copy the object code is a network server, the
4107           Corresponding Source may be on a different server (operated
4108           by you or a third party) that supports equivalent copying
4109           facilities, provided you maintain clear directions next to
4110           the object code saying where to find the Corresponding Source.
4111           Regardless of what server hosts the Corresponding Source, you
4112           remain obligated to ensure that it is available for as long
4113           as needed to satisfy these requirements.
4114
4115        e. Convey the object code using peer-to-peer transmission,
4116           provided you inform other peers where the object code and
4117           Corresponding Source of the work are being offered to the
4118           general public at no charge under subsection 6d.
4119
4120
4121      A separable portion of the object code, whose source code is
4122      excluded from the Corresponding Source as a System Library, need
4123      not be included in conveying the object code work.
4124
4125      A "User Product" is either (1) a "consumer product", which means
4126      any tangible personal property which is normally used for personal,
4127      family, or household purposes, or (2) anything designed or sold for
4128      incorporation into a dwelling.  In determining whether a product
4129      is a consumer product, doubtful cases shall be resolved in favor of
4130      coverage.  For a particular product received by a particular user,
4131      "normally used" refers to a typical or common use of that class of
4132      product, regardless of the status of the particular user or of the
4133      way in which the particular user actually uses, or expects or is
4134      expected to use, the product.  A product is a consumer product
4135      regardless of whether the product has substantial commercial,
4136      industrial or non-consumer uses, unless such uses represent the
4137      only significant mode of use of the product.
4138
4139      "Installation Information" for a User Product means any methods,
4140      procedures, authorization keys, or other information required to
4141      install and execute modified versions of a covered work in that
4142      User Product from a modified version of its Corresponding Source.
4143      The information must suffice to ensure that the continued
4144      functioning of the modified object code is in no case prevented or
4145      interfered with solely because modification has been made.
4146
4147      If you convey an object code work under this section in, or with,
4148      or specifically for use in, a User Product, and the conveying
4149      occurs as part of a transaction in which the right of possession
4150      and use of the User Product is transferred to the recipient in
4151      perpetuity or for a fixed term (regardless of how the transaction
4152      is characterized), the Corresponding Source conveyed under this
4153      section must be accompanied by the Installation Information.  But
4154      this requirement does not apply if neither you nor any third party
4155      retains the ability to install modified object code on the User
4156      Product (for example, the work has been installed in ROM).
4157
4158      The requirement to provide Installation Information does not
4159      include a requirement to continue to provide support service,
4160      warranty, or updates for a work that has been modified or
4161      installed by the recipient, or for the User Product in which it
4162      has been modified or installed.  Access to a network may be denied
4163      when the modification itself materially and adversely affects the
4164      operation of the network or violates the rules and protocols for
4165      communication across the network.
4166
4167      Corresponding Source conveyed, and Installation Information
4168      provided, in accord with this section must be in a format that is
4169      publicly documented (and with an implementation available to the
4170      public in source code form), and must require no special password
4171      or key for unpacking, reading or copying.
4172
4173   7. Additional Terms.
4174
4175      "Additional permissions" are terms that supplement the terms of
4176      this License by making exceptions from one or more of its
4177      conditions.  Additional permissions that are applicable to the
4178      entire Program shall be treated as though they were included in
4179      this License, to the extent that they are valid under applicable
4180      law.  If additional permissions apply only to part of the Program,
4181      that part may be used separately under those permissions, but the
4182      entire Program remains governed by this License without regard to
4183      the additional permissions.
4184
4185      When you convey a copy of a covered work, you may at your option
4186      remove any additional permissions from that copy, or from any part
4187      of it.  (Additional permissions may be written to require their own
4188      removal in certain cases when you modify the work.)  You may place
4189      additional permissions on material, added by you to a covered work,
4190      for which you have or can give appropriate copyright permission.
4191
4192      Notwithstanding any other provision of this License, for material
4193      you add to a covered work, you may (if authorized by the copyright
4194      holders of that material) supplement the terms of this License
4195      with terms:
4196
4197        a. Disclaiming warranty or limiting liability differently from
4198           the terms of sections 15 and 16 of this License; or
4199
4200        b. Requiring preservation of specified reasonable legal notices
4201           or author attributions in that material or in the Appropriate
4202           Legal Notices displayed by works containing it; or
4203
4204        c. Prohibiting misrepresentation of the origin of that material,
4205           or requiring that modified versions of such material be
4206           marked in reasonable ways as different from the original
4207           version; or
4208
4209        d. Limiting the use for publicity purposes of names of licensors
4210           or authors of the material; or
4211
4212        e. Declining to grant rights under trademark law for use of some
4213           trade names, trademarks, or service marks; or
4214
4215        f. Requiring indemnification of licensors and authors of that
4216           material by anyone who conveys the material (or modified
4217           versions of it) with contractual assumptions of liability to
4218           the recipient, for any liability that these contractual
4219           assumptions directly impose on those licensors and authors.
4220
4221      All other non-permissive additional terms are considered "further
4222      restrictions" within the meaning of section 10.  If the Program as
4223      you received it, or any part of it, contains a notice stating that
4224      it is governed by this License along with a term that is a further
4225      restriction, you may remove that term.  If a license document
4226      contains a further restriction but permits relicensing or
4227      conveying under this License, you may add to a covered work
4228      material governed by the terms of that license document, provided
4229      that the further restriction does not survive such relicensing or
4230      conveying.
4231
4232      If you add terms to a covered work in accord with this section, you
4233      must place, in the relevant source files, a statement of the
4234      additional terms that apply to those files, or a notice indicating
4235      where to find the applicable terms.
4236
4237      Additional terms, permissive or non-permissive, may be stated in
4238      the form of a separately written license, or stated as exceptions;
4239      the above requirements apply either way.
4240
4241   8. Termination.
4242
4243      You may not propagate or modify a covered work except as expressly
4244      provided under this License.  Any attempt otherwise to propagate or
4245      modify it is void, and will automatically terminate your rights
4246      under this License (including any patent licenses granted under
4247      the third paragraph of section 11).
4248
4249      However, if you cease all violation of this License, then your
4250      license from a particular copyright holder is reinstated (a)
4251      provisionally, unless and until the copyright holder explicitly
4252      and finally terminates your license, and (b) permanently, if the
4253      copyright holder fails to notify you of the violation by some
4254      reasonable means prior to 60 days after the cessation.
4255
4256      Moreover, your license from a particular copyright holder is
4257      reinstated permanently if the copyright holder notifies you of the
4258      violation by some reasonable means, this is the first time you have
4259      received notice of violation of this License (for any work) from
4260      that copyright holder, and you cure the violation prior to 30 days
4261      after your receipt of the notice.
4262
4263      Termination of your rights under this section does not terminate
4264      the licenses of parties who have received copies or rights from
4265      you under this License.  If your rights have been terminated and
4266      not permanently reinstated, you do not qualify to receive new
4267      licenses for the same material under section 10.
4268
4269   9. Acceptance Not Required for Having Copies.
4270
4271      You are not required to accept this License in order to receive or
4272      run a copy of the Program.  Ancillary propagation of a covered work
4273      occurring solely as a consequence of using peer-to-peer
4274      transmission to receive a copy likewise does not require
4275      acceptance.  However, nothing other than this License grants you
4276      permission to propagate or modify any covered work.  These actions
4277      infringe copyright if you do not accept this License.  Therefore,
4278      by modifying or propagating a covered work, you indicate your
4279      acceptance of this License to do so.
4280
4281  10. Automatic Licensing of Downstream Recipients.
4282
4283      Each time you convey a covered work, the recipient automatically
4284      receives a license from the original licensors, to run, modify and
4285      propagate that work, subject to this License.  You are not
4286      responsible for enforcing compliance by third parties with this
4287      License.
4288
4289      An "entity transaction" is a transaction transferring control of an
4290      organization, or substantially all assets of one, or subdividing an
4291      organization, or merging organizations.  If propagation of a
4292      covered work results from an entity transaction, each party to that
4293      transaction who receives a copy of the work also receives whatever
4294      licenses to the work the party's predecessor in interest had or
4295      could give under the previous paragraph, plus a right to
4296      possession of the Corresponding Source of the work from the
4297      predecessor in interest, if the predecessor has it or can get it
4298      with reasonable efforts.
4299
4300      You may not impose any further restrictions on the exercise of the
4301      rights granted or affirmed under this License.  For example, you
4302      may not impose a license fee, royalty, or other charge for
4303      exercise of rights granted under this License, and you may not
4304      initiate litigation (including a cross-claim or counterclaim in a
4305      lawsuit) alleging that any patent claim is infringed by making,
4306      using, selling, offering for sale, or importing the Program or any
4307      portion of it.
4308
4309  11. Patents.
4310
4311      A "contributor" is a copyright holder who authorizes use under this
4312      License of the Program or a work on which the Program is based.
4313      The work thus licensed is called the contributor's "contributor
4314      version".
4315
4316      A contributor's "essential patent claims" are all patent claims
4317      owned or controlled by the contributor, whether already acquired or
4318      hereafter acquired, that would be infringed by some manner,
4319      permitted by this License, of making, using, or selling its
4320      contributor version, but do not include claims that would be
4321      infringed only as a consequence of further modification of the
4322      contributor version.  For purposes of this definition, "control"
4323      includes the right to grant patent sublicenses in a manner
4324      consistent with the requirements of this License.
4325
4326      Each contributor grants you a non-exclusive, worldwide,
4327      royalty-free patent license under the contributor's essential
4328      patent claims, to make, use, sell, offer for sale, import and
4329      otherwise run, modify and propagate the contents of its
4330      contributor version.
4331
4332      In the following three paragraphs, a "patent license" is any
4333      express agreement or commitment, however denominated, not to
4334      enforce a patent (such as an express permission to practice a
4335      patent or covenant not to sue for patent infringement).  To
4336      "grant" such a patent license to a party means to make such an
4337      agreement or commitment not to enforce a patent against the party.
4338
4339      If you convey a covered work, knowingly relying on a patent
4340      license, and the Corresponding Source of the work is not available
4341      for anyone to copy, free of charge and under the terms of this
4342      License, through a publicly available network server or other
4343      readily accessible means, then you must either (1) cause the
4344      Corresponding Source to be so available, or (2) arrange to deprive
4345      yourself of the benefit of the patent license for this particular
4346      work, or (3) arrange, in a manner consistent with the requirements
4347      of this License, to extend the patent license to downstream
4348      recipients.  "Knowingly relying" means you have actual knowledge
4349      that, but for the patent license, your conveying the covered work
4350      in a country, or your recipient's use of the covered work in a
4351      country, would infringe one or more identifiable patents in that
4352      country that you have reason to believe are valid.
4353
4354      If, pursuant to or in connection with a single transaction or
4355      arrangement, you convey, or propagate by procuring conveyance of, a
4356      covered work, and grant a patent license to some of the parties
4357      receiving the covered work authorizing them to use, propagate,
4358      modify or convey a specific copy of the covered work, then the
4359      patent license you grant is automatically extended to all
4360      recipients of the covered work and works based on it.
4361
4362      A patent license is "discriminatory" if it does not include within
4363      the scope of its coverage, prohibits the exercise of, or is
4364      conditioned on the non-exercise of one or more of the rights that
4365      are specifically granted under this License.  You may not convey a
4366      covered work if you are a party to an arrangement with a third
4367      party that is in the business of distributing software, under
4368      which you make payment to the third party based on the extent of
4369      your activity of conveying the work, and under which the third
4370      party grants, to any of the parties who would receive the covered
4371      work from you, a discriminatory patent license (a) in connection
4372      with copies of the covered work conveyed by you (or copies made
4373      from those copies), or (b) primarily for and in connection with
4374      specific products or compilations that contain the covered work,
4375      unless you entered into that arrangement, or that patent license
4376      was granted, prior to 28 March 2007.
4377
4378      Nothing in this License shall be construed as excluding or limiting
4379      any implied license or other defenses to infringement that may
4380      otherwise be available to you under applicable patent law.
4381
4382  12. No Surrender of Others' Freedom.
4383
4384      If conditions are imposed on you (whether by court order,
4385      agreement or otherwise) that contradict the conditions of this
4386      License, they do not excuse you from the conditions of this
4387      License.  If you cannot convey a covered work so as to satisfy
4388      simultaneously your obligations under this License and any other
4389      pertinent obligations, then as a consequence you may not convey it
4390      at all.  For example, if you agree to terms that obligate you to
4391      collect a royalty for further conveying from those to whom you
4392      convey the Program, the only way you could satisfy both those
4393      terms and this License would be to refrain entirely from conveying
4394      the Program.
4395
4396  13. Use with the GNU Affero General Public License.
4397
4398      Notwithstanding any other provision of this License, you have
4399      permission to link or combine any covered work with a work licensed
4400      under version 3 of the GNU Affero General Public License into a
4401      single combined work, and to convey the resulting work.  The terms
4402      of this License will continue to apply to the part which is the
4403      covered work, but the special requirements of the GNU Affero
4404      General Public License, section 13, concerning interaction through
4405      a network will apply to the combination as such.
4406
4407  14. Revised Versions of this License.
4408
4409      The Free Software Foundation may publish revised and/or new
4410      versions of the GNU General Public License from time to time.
4411      Such new versions will be similar in spirit to the present
4412      version, but may differ in detail to address new problems or
4413      concerns.
4414
4415      Each version is given a distinguishing version number.  If the
4416      Program specifies that a certain numbered version of the GNU
4417      General Public License "or any later version" applies to it, you
4418      have the option of following the terms and conditions either of
4419      that numbered version or of any later version published by the
4420      Free Software Foundation.  If the Program does not specify a
4421      version number of the GNU General Public License, you may choose
4422      any version ever published by the Free Software Foundation.
4423
4424      If the Program specifies that a proxy can decide which future
4425      versions of the GNU General Public License can be used, that
4426      proxy's public statement of acceptance of a version permanently
4427      authorizes you to choose that version for the Program.
4428
4429      Later license versions may give you additional or different
4430      permissions.  However, no additional obligations are imposed on any
4431      author or copyright holder as a result of your choosing to follow a
4432      later version.
4433
4434  15. Disclaimer of Warranty.
4435
4436      THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
4437      APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE
4438      COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
4439      WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED,
4440      INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4441      MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE
4442      RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
4443      SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
4444      NECESSARY SERVICING, REPAIR OR CORRECTION.
4445
4446  16. Limitation of Liability.
4447
4448      IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
4449      WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES
4450      AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU
4451      FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
4452      CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE
4453      THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA
4454      BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
4455      PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
4456      PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF
4457      THE POSSIBILITY OF SUCH DAMAGES.
4458
4459  17. Interpretation of Sections 15 and 16.
4460
4461      If the disclaimer of warranty and limitation of liability provided
4462      above cannot be given local legal effect according to their terms,
4463      reviewing courts shall apply local law that most closely
4464      approximates an absolute waiver of all civil liability in
4465      connection with the Program, unless a warranty or assumption of
4466      liability accompanies a copy of the Program in return for a fee.
4467
4468
4469 END OF TERMS AND CONDITIONS
4470 ===========================
4471
4472 How to Apply These Terms to Your New Programs
4473 =============================================
4474
4475    If you develop a new program, and you want it to be of the greatest
4476 possible use to the public, the best way to achieve this is to make it
4477 free software which everyone can redistribute and change under these
4478 terms.
4479
4480    To do so, attach the following notices to the program.  It is safest
4481 to attach them to the start of each source file to most effectively
4482 state the exclusion of warranty; and each file should have at least the
4483 "copyright" line and a pointer to where the full notice is found.
4484
4485      ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
4486      Copyright (C) YEAR NAME OF AUTHOR
4487
4488      This program is free software: you can redistribute it and/or modify
4489      it under the terms of the GNU General Public License as published by
4490      the Free Software Foundation, either version 3 of the License, or (at
4491      your option) any later version.
4492
4493      This program is distributed in the hope that it will be useful, but
4494      WITHOUT ANY WARRANTY; without even the implied warranty of
4495      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
4496      General Public License for more details.
4497
4498      You should have received a copy of the GNU General Public License
4499      along with this program.  If not, see `http://www.gnu.org/licenses/'.
4500
4501    Also add information on how to contact you by electronic and paper
4502 mail.
4503
4504    If the program does terminal interaction, make it output a short
4505 notice like this when it starts in an interactive mode:
4506
4507      PROGRAM Copyright (C) YEAR NAME OF AUTHOR
4508      This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4509      This is free software, and you are welcome to redistribute it
4510      under certain conditions; type `show c' for details.
4511
4512    The hypothetical commands `show w' and `show c' should show the
4513 appropriate parts of the General Public License.  Of course, your
4514 program's commands might be different; for a GUI interface, you would
4515 use an "about box".
4516
4517    You should also get your employer (if you work as a programmer) or
4518 school, if any, to sign a "copyright disclaimer" for the program, if
4519 necessary.  For more information on this, and how to apply and follow
4520 the GNU GPL, see `http://www.gnu.org/licenses/'.
4521
4522    The GNU General Public License does not permit incorporating your
4523 program into proprietary programs.  If your program is a subroutine
4524 library, you may consider it more useful to permit linking proprietary
4525 applications with the library.  If this is what you want to do, use the
4526 GNU Lesser General Public License instead of this License.  But first,
4527 please read `http://www.gnu.org/philosophy/why-not-lgpl.html'.
4528
4529 \1f
4530 File: libunistring.info,  Node: GNU LGPL,  Next: GNU FDL,  Prev: GNU GPL,  Up: Licenses
4531
4532 A.2 GNU LESSER GENERAL PUBLIC LICENSE
4533 =====================================
4534
4535                         Version 3, 29 June 2007
4536
4537      Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'
4538
4539      Everyone is permitted to copy and distribute verbatim copies of this
4540      license document, but changing it is not allowed.
4541
4542    This version of the GNU Lesser General Public License incorporates
4543 the terms and conditions of version 3 of the GNU General Public
4544 License, supplemented by the additional permissions listed below.
4545
4546   0. Additional Definitions.
4547
4548      As used herein, "this License" refers to version 3 of the GNU
4549      Lesser General Public License, and the "GNU GPL" refers to version
4550      3 of the GNU General Public License.
4551
4552      "The Library" refers to a covered work governed by this License,
4553      other than an Application or a Combined Work as defined below.
4554
4555      An "Application" is any work that makes use of an interface
4556      provided by the Library, but which is not otherwise based on the
4557      Library.  Defining a subclass of a class defined by the Library is
4558      deemed a mode of using an interface provided by the Library.
4559
4560      A "Combined Work" is a work produced by combining or linking an
4561      Application with the Library.  The particular version of the
4562      Library with which the Combined Work was made is also called the
4563      "Linked Version".
4564
4565      The "Minimal Corresponding Source" for a Combined Work means the
4566      Corresponding Source for the Combined Work, excluding any source
4567      code for portions of the Combined Work that, considered in
4568      isolation, are based on the Application, and not on the Linked
4569      Version.
4570
4571      The "Corresponding Application Code" for a Combined Work means the
4572      object code and/or source code for the Application, including any
4573      data and utility programs needed for reproducing the Combined Work
4574      from the Application, but excluding the System Libraries of the
4575      Combined Work.
4576
4577   1. Exception to Section 3 of the GNU GPL.
4578
4579      You may convey a covered work under sections 3 and 4 of this
4580      License without being bound by section 3 of the GNU GPL.
4581
4582   2. Conveying Modified Versions.
4583
4584      If you modify a copy of the Library, and, in your modifications, a
4585      facility refers to a function or data to be supplied by an
4586      Application that uses the facility (other than as an argument
4587      passed when the facility is invoked), then you may convey a copy
4588      of the modified version:
4589
4590        a. under this License, provided that you make a good faith
4591           effort to ensure that, in the event an Application does not
4592           supply the function or data, the facility still operates, and
4593           performs whatever part of its purpose remains meaningful, or
4594
4595        b. under the GNU GPL, with none of the additional permissions of
4596           this License applicable to that copy.
4597
4598   3. Object Code Incorporating Material from Library Header Files.
4599
4600      The object code form of an Application may incorporate material
4601      from a header file that is part of the Library.  You may convey
4602      such object code under terms of your choice, provided that, if the
4603      incorporated material is not limited to numerical parameters, data
4604      structure layouts and accessors, or small macros, inline functions
4605      and templates (ten or fewer lines in length), you do both of the
4606      following:
4607
4608        a. Give prominent notice with each copy of the object code that
4609           the Library is used in it and that the Library and its use are
4610           covered by this License.
4611
4612        b. Accompany the object code with a copy of the GNU GPL and this
4613           license document.
4614
4615   4. Combined Works.
4616
4617      You may convey a Combined Work under terms of your choice that,
4618      taken together, effectively do not restrict modification of the
4619      portions of the Library contained in the Combined Work and reverse
4620      engineering for debugging such modifications, if you also do each
4621      of the following:
4622
4623        a. Give prominent notice with each copy of the Combined Work that
4624           the Library is used in it and that the Library and its use are
4625           covered by this License.
4626
4627        b. Accompany the Combined Work with a copy of the GNU GPL and
4628           this license document.
4629
4630        c. For a Combined Work that displays copyright notices during
4631           execution, include the copyright notice for the Library among
4632           these notices, as well as a reference directing the user to
4633           the copies of the GNU GPL and this license document.
4634
4635        d. Do one of the following:
4636
4637             0. Convey the Minimal Corresponding Source under the terms
4638                of this License, and the Corresponding Application Code
4639                in a form suitable for, and under terms that permit, the
4640                user to recombine or relink the Application with a
4641                modified version of the Linked Version to produce a
4642                modified Combined Work, in the manner specified by
4643                section 6 of the GNU GPL for conveying Corresponding
4644                Source.
4645
4646             1. Use a suitable shared library mechanism for linking with
4647                the Library.  A suitable mechanism is one that (a) uses
4648                at run time a copy of the Library already present on the
4649                user's computer system, and (b) will operate properly
4650                with a modified version of the Library that is
4651                interface-compatible with the Linked Version.
4652
4653        e. Provide Installation Information, but only if you would
4654           otherwise be required to provide such information under
4655           section 6 of the GNU GPL, and only to the extent that such
4656           information is necessary to install and execute a modified
4657           version of the Combined Work produced by recombining or
4658           relinking the Application with a modified version of the
4659           Linked Version. (If you use option 4d0, the Installation
4660           Information must accompany the Minimal Corresponding Source
4661           and Corresponding Application Code. If you use option 4d1,
4662           you must provide the Installation Information in the manner
4663           specified by section 6 of the GNU GPL for conveying
4664           Corresponding Source.)
4665
4666   5. Combined Libraries.
4667
4668      You may place library facilities that are a work based on the
4669      Library side by side in a single library together with other
4670      library facilities that are not Applications and are not covered
4671      by this License, and convey such a combined library under terms of
4672      your choice, if you do both of the following:
4673
4674        a. Accompany the combined library with a copy of the same work
4675           based on the Library, uncombined with any other library
4676           facilities, conveyed under the terms of this License.
4677
4678        b. Give prominent notice with the combined library that part of
4679           it is a work based on the Library, and explaining where to
4680           find the accompanying uncombined form of the same work.
4681
4682   6. Revised Versions of the GNU Lesser General Public License.
4683
4684      The Free Software Foundation may publish revised and/or new
4685      versions of the GNU Lesser General Public License from time to
4686      time. Such new versions will be similar in spirit to the present
4687      version, but may differ in detail to address new problems or
4688      concerns.
4689
4690      Each version is given a distinguishing version number. If the
4691      Library as you received it specifies that a certain numbered
4692      version of the GNU Lesser General Public License "or any later
4693      version" applies to it, you have the option of following the terms
4694      and conditions either of that published version or of any later
4695      version published by the Free Software Foundation. If the Library
4696      as you received it does not specify a version number of the GNU
4697      Lesser General Public License, you may choose any version of the
4698      GNU Lesser General Public License ever published by the Free
4699      Software Foundation.
4700
4701      If the Library as you received it specifies that a proxy can decide
4702      whether future versions of the GNU Lesser General Public License
4703      shall apply, that proxy's public statement of acceptance of any
4704      version is permanent authorization for you to choose that version
4705      for the Library.
4706
4707
4708 \1f
4709 File: libunistring.info,  Node: GNU FDL,  Prev: GNU LGPL,  Up: Licenses
4710
4711 A.3 GNU Free Documentation License
4712 ==================================
4713
4714                      Version 1.3, 3 November 2008
4715
4716      Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
4717      `http://fsf.org/'
4718
4719      Everyone is permitted to copy and distribute verbatim copies
4720      of this license document, but changing it is not allowed.
4721
4722   0. PREAMBLE
4723
4724      The purpose of this License is to make a manual, textbook, or other
4725      functional and useful document "free" in the sense of freedom: to
4726      assure everyone the effective freedom to copy and redistribute it,
4727      with or without modifying it, either commercially or
4728      noncommercially.  Secondarily, this License preserves for the
4729      author and publisher a way to get credit for their work, while not
4730      being considered responsible for modifications made by others.
4731
4732      This License is a kind of "copyleft", which means that derivative
4733      works of the document must themselves be free in the same sense.
4734      It complements the GNU General Public License, which is a copyleft
4735      license designed for free software.
4736
4737      We have designed this License in order to use it for manuals for
4738      free software, because free software needs free documentation: a
4739      free program should come with manuals providing the same freedoms
4740      that the software does.  But this License is not limited to
4741      software manuals; it can be used for any textual work, regardless
4742      of subject matter or whether it is published as a printed book.
4743      We recommend this License principally for works whose purpose is
4744      instruction or reference.
4745
4746   1. APPLICABILITY AND DEFINITIONS
4747
4748      This License applies to any manual or other work, in any medium,
4749      that contains a notice placed by the copyright holder saying it
4750      can be distributed under the terms of this License.  Such a notice
4751      grants a world-wide, royalty-free license, unlimited in duration,
4752      to use that work under the conditions stated herein.  The
4753      "Document", below, refers to any such manual or work.  Any member
4754      of the public is a licensee, and is addressed as "you".  You
4755      accept the license if you copy, modify or distribute the work in a
4756      way requiring permission under copyright law.
4757
4758      A "Modified Version" of the Document means any work containing the
4759      Document or a portion of it, either copied verbatim, or with
4760      modifications and/or translated into another language.
4761
4762      A "Secondary Section" is a named appendix or a front-matter section
4763      of the Document that deals exclusively with the relationship of the
4764      publishers or authors of the Document to the Document's overall
4765      subject (or to related matters) and contains nothing that could
4766      fall directly within that overall subject.  (Thus, if the Document
4767      is in part a textbook of mathematics, a Secondary Section may not
4768      explain any mathematics.)  The relationship could be a matter of
4769      historical connection with the subject or with related matters, or
4770      of legal, commercial, philosophical, ethical or political position
4771      regarding them.
4772
4773      The "Invariant Sections" are certain Secondary Sections whose
4774      titles are designated, as being those of Invariant Sections, in
4775      the notice that says that the Document is released under this
4776      License.  If a section does not fit the above definition of
4777      Secondary then it is not allowed to be designated as Invariant.
4778      The Document may contain zero Invariant Sections.  If the Document
4779      does not identify any Invariant Sections then there are none.
4780
4781      The "Cover Texts" are certain short passages of text that are
4782      listed, as Front-Cover Texts or Back-Cover Texts, in the notice
4783      that says that the Document is released under this License.  A
4784      Front-Cover Text may be at most 5 words, and a Back-Cover Text may
4785      be at most 25 words.
4786
4787      A "Transparent" copy of the Document means a machine-readable copy,
4788      represented in a format whose specification is available to the
4789      general public, that is suitable for revising the document
4790      straightforwardly with generic text editors or (for images
4791      composed of pixels) generic paint programs or (for drawings) some
4792      widely available drawing editor, and that is suitable for input to
4793      text formatters or for automatic translation to a variety of
4794      formats suitable for input to text formatters.  A copy made in an
4795      otherwise Transparent file format whose markup, or absence of
4796      markup, has been arranged to thwart or discourage subsequent
4797      modification by readers is not Transparent.  An image format is
4798      not Transparent if used for any substantial amount of text.  A
4799      copy that is not "Transparent" is called "Opaque".
4800
4801      Examples of suitable formats for Transparent copies include plain
4802      ASCII without markup, Texinfo input format, LaTeX input format,
4803      SGML or XML using a publicly available DTD, and
4804      standard-conforming simple HTML, PostScript or PDF designed for
4805      human modification.  Examples of transparent image formats include
4806      PNG, XCF and JPG.  Opaque formats include proprietary formats that
4807      can be read and edited only by proprietary word processors, SGML or
4808      XML for which the DTD and/or processing tools are not generally
4809      available, and the machine-generated HTML, PostScript or PDF
4810      produced by some word processors for output purposes only.
4811
4812      The "Title Page" means, for a printed book, the title page itself,
4813      plus such following pages as are needed to hold, legibly, the
4814      material this License requires to appear in the title page.  For
4815      works in formats which do not have any title page as such, "Title
4816      Page" means the text near the most prominent appearance of the
4817      work's title, preceding the beginning of the body of the text.
4818
4819      The "publisher" means any person or entity that distributes copies
4820      of the Document to the public.
4821
4822      A section "Entitled XYZ" means a named subunit of the Document
4823      whose title either is precisely XYZ or contains XYZ in parentheses
4824      following text that translates XYZ in another language.  (Here XYZ
4825      stands for a specific section name mentioned below, such as
4826      "Acknowledgements", "Dedications", "Endorsements", or "History".)
4827      To "Preserve the Title" of such a section when you modify the
4828      Document means that it remains a section "Entitled XYZ" according
4829      to this definition.
4830
4831      The Document may include Warranty Disclaimers next to the notice
4832      which states that this License applies to the Document.  These
4833      Warranty Disclaimers are considered to be included by reference in
4834      this License, but only as regards disclaiming warranties: any other
4835      implication that these Warranty Disclaimers may have is void and
4836      has no effect on the meaning of this License.
4837
4838   2. VERBATIM COPYING
4839
4840      You may copy and distribute the Document in any medium, either
4841      commercially or noncommercially, provided that this License, the
4842      copyright notices, and the license notice saying this License
4843      applies to the Document are reproduced in all copies, and that you
4844      add no other conditions whatsoever to those of this License.  You
4845      may not use technical measures to obstruct or control the reading
4846      or further copying of the copies you make or distribute.  However,
4847      you may accept compensation in exchange for copies.  If you
4848      distribute a large enough number of copies you must also follow
4849      the conditions in section 3.
4850
4851      You may also lend copies, under the same conditions stated above,
4852      and you may publicly display copies.
4853
4854   3. COPYING IN QUANTITY
4855
4856      If you publish printed copies (or copies in media that commonly
4857      have printed covers) of the Document, numbering more than 100, and
4858      the Document's license notice requires Cover Texts, you must
4859      enclose the copies in covers that carry, clearly and legibly, all
4860      these Cover Texts: Front-Cover Texts on the front cover, and
4861      Back-Cover Texts on the back cover.  Both covers must also clearly
4862      and legibly identify you as the publisher of these copies.  The
4863      front cover must present the full title with all words of the
4864      title equally prominent and visible.  You may add other material
4865      on the covers in addition.  Copying with changes limited to the
4866      covers, as long as they preserve the title of the Document and
4867      satisfy these conditions, can be treated as verbatim copying in
4868      other respects.
4869
4870      If the required texts for either cover are too voluminous to fit
4871      legibly, you should put the first ones listed (as many as fit
4872      reasonably) on the actual cover, and continue the rest onto
4873      adjacent pages.
4874
4875      If you publish or distribute Opaque copies of the Document
4876      numbering more than 100, you must either include a
4877      machine-readable Transparent copy along with each Opaque copy, or
4878      state in or with each Opaque copy a computer-network location from
4879      which the general network-using public has access to download
4880      using public-standard network protocols a complete Transparent
4881      copy of the Document, free of added material.  If you use the
4882      latter option, you must take reasonably prudent steps, when you
4883      begin distribution of Opaque copies in quantity, to ensure that
4884      this Transparent copy will remain thus accessible at the stated
4885      location until at least one year after the last time you
4886      distribute an Opaque copy (directly or through your agents or
4887      retailers) of that edition to the public.
4888
4889      It is requested, but not required, that you contact the authors of
4890      the Document well before redistributing any large number of
4891      copies, to give them a chance to provide you with an updated
4892      version of the Document.
4893
4894   4. MODIFICATIONS
4895
4896      You may copy and distribute a Modified Version of the Document
4897      under the conditions of sections 2 and 3 above, provided that you
4898      release the Modified Version under precisely this License, with
4899      the Modified Version filling the role of the Document, thus
4900      licensing distribution and modification of the Modified Version to
4901      whoever possesses a copy of it.  In addition, you must do these
4902      things in the Modified Version:
4903
4904        A. Use in the Title Page (and on the covers, if any) a title
4905           distinct from that of the Document, and from those of
4906           previous versions (which should, if there were any, be listed
4907           in the History section of the Document).  You may use the
4908           same title as a previous version if the original publisher of
4909           that version gives permission.
4910
4911        B. List on the Title Page, as authors, one or more persons or
4912           entities responsible for authorship of the modifications in
4913           the Modified Version, together with at least five of the
4914           principal authors of the Document (all of its principal
4915           authors, if it has fewer than five), unless they release you
4916           from this requirement.
4917
4918        C. State on the Title page the name of the publisher of the
4919           Modified Version, as the publisher.
4920
4921        D. Preserve all the copyright notices of the Document.
4922
4923        E. Add an appropriate copyright notice for your modifications
4924           adjacent to the other copyright notices.
4925
4926        F. Include, immediately after the copyright notices, a license
4927           notice giving the public permission to use the Modified
4928           Version under the terms of this License, in the form shown in
4929           the Addendum below.
4930
4931        G. Preserve in that license notice the full lists of Invariant
4932           Sections and required Cover Texts given in the Document's
4933           license notice.
4934
4935        H. Include an unaltered copy of this License.
4936
4937        I. Preserve the section Entitled "History", Preserve its Title,
4938           and add to it an item stating at least the title, year, new
4939           authors, and publisher of the Modified Version as given on
4940           the Title Page.  If there is no section Entitled "History" in
4941           the Document, create one stating the title, year, authors,
4942           and publisher of the Document as given on its Title Page,
4943           then add an item describing the Modified Version as stated in
4944           the previous sentence.
4945
4946        J. Preserve the network location, if any, given in the Document
4947           for public access to a Transparent copy of the Document, and
4948           likewise the network locations given in the Document for
4949           previous versions it was based on.  These may be placed in
4950           the "History" section.  You may omit a network location for a
4951           work that was published at least four years before the
4952           Document itself, or if the original publisher of the version
4953           it refers to gives permission.
4954
4955        K. For any section Entitled "Acknowledgements" or "Dedications",
4956           Preserve the Title of the section, and preserve in the
4957           section all the substance and tone of each of the contributor
4958           acknowledgements and/or dedications given therein.
4959
4960        L. Preserve all the Invariant Sections of the Document,
4961           unaltered in their text and in their titles.  Section numbers
4962           or the equivalent are not considered part of the section
4963           titles.
4964
4965        M. Delete any section Entitled "Endorsements".  Such a section
4966           may not be included in the Modified Version.
4967
4968        N. Do not retitle any existing section to be Entitled
4969           "Endorsements" or to conflict in title with any Invariant
4970           Section.
4971
4972        O. Preserve any Warranty Disclaimers.
4973
4974      If the Modified Version includes new front-matter sections or
4975      appendices that qualify as Secondary Sections and contain no
4976      material copied from the Document, you may at your option
4977      designate some or all of these sections as invariant.  To do this,
4978      add their titles to the list of Invariant Sections in the Modified
4979      Version's license notice.  These titles must be distinct from any
4980      other section titles.
4981
4982      You may add a section Entitled "Endorsements", provided it contains
4983      nothing but endorsements of your Modified Version by various
4984      parties--for example, statements of peer review or that the text
4985      has been approved by an organization as the authoritative
4986      definition of a standard.
4987
4988      You may add a passage of up to five words as a Front-Cover Text,
4989      and a passage of up to 25 words as a Back-Cover Text, to the end
4990      of the list of Cover Texts in the Modified Version.  Only one
4991      passage of Front-Cover Text and one of Back-Cover Text may be
4992      added by (or through arrangements made by) any one entity.  If the
4993      Document already includes a cover text for the same cover,
4994      previously added by you or by arrangement made by the same entity
4995      you are acting on behalf of, you may not add another; but you may
4996      replace the old one, on explicit permission from the previous
4997      publisher that added the old one.
4998
4999      The author(s) and publisher(s) of the Document do not by this
5000      License give permission to use their names for publicity for or to
5001      assert or imply endorsement of any Modified Version.
5002
5003   5. COMBINING DOCUMENTS
5004
5005      You may combine the Document with other documents released under
5006      this License, under the terms defined in section 4 above for
5007      modified versions, provided that you include in the combination
5008      all of the Invariant Sections of all of the original documents,
5009      unmodified, and list them all as Invariant Sections of your
5010      combined work in its license notice, and that you preserve all
5011      their Warranty Disclaimers.
5012
5013      The combined work need only contain one copy of this License, and
5014      multiple identical Invariant Sections may be replaced with a single
5015      copy.  If there are multiple Invariant Sections with the same name
5016      but different contents, make the title of each such section unique
5017      by adding at the end of it, in parentheses, the name of the
5018      original author or publisher of that section if known, or else a
5019      unique number.  Make the same adjustment to the section titles in
5020      the list of Invariant Sections in the license notice of the
5021      combined work.
5022
5023      In the combination, you must combine any sections Entitled
5024      "History" in the various original documents, forming one section
5025      Entitled "History"; likewise combine any sections Entitled
5026      "Acknowledgements", and any sections Entitled "Dedications".  You
5027      must delete all sections Entitled "Endorsements."
5028
5029   6. COLLECTIONS OF DOCUMENTS
5030
5031      You may make a collection consisting of the Document and other
5032      documents released under this License, and replace the individual
5033      copies of this License in the various documents with a single copy
5034      that is included in the collection, provided that you follow the
5035      rules of this License for verbatim copying of each of the
5036      documents in all other respects.
5037
5038      You may extract a single document from such a collection, and
5039      distribute it individually under this License, provided you insert
5040      a copy of this License into the extracted document, and follow
5041      this License in all other respects regarding verbatim copying of
5042      that document.
5043
5044   7. AGGREGATION WITH INDEPENDENT WORKS
5045
5046      A compilation of the Document or its derivatives with other
5047      separate and independent documents or works, in or on a volume of
5048      a storage or distribution medium, is called an "aggregate" if the
5049      copyright resulting from the compilation is not used to limit the
5050      legal rights of the compilation's users beyond what the individual
5051      works permit.  When the Document is included in an aggregate, this
5052      License does not apply to the other works in the aggregate which
5053      are not themselves derivative works of the Document.
5054
5055      If the Cover Text requirement of section 3 is applicable to these
5056      copies of the Document, then if the Document is less than one half
5057      of the entire aggregate, the Document's Cover Texts may be placed
5058      on covers that bracket the Document within the aggregate, or the
5059      electronic equivalent of covers if the Document is in electronic
5060      form.  Otherwise they must appear on printed covers that bracket
5061      the whole aggregate.
5062
5063   8. TRANSLATION
5064
5065      Translation is considered a kind of modification, so you may
5066      distribute translations of the Document under the terms of section
5067      4.  Replacing Invariant Sections with translations requires special
5068      permission from their copyright holders, but you may include
5069      translations of some or all Invariant Sections in addition to the
5070      original versions of these Invariant Sections.  You may include a
5071      translation of this License, and all the license notices in the
5072      Document, and any Warranty Disclaimers, provided that you also
5073      include the original English version of this License and the
5074      original versions of those notices and disclaimers.  In case of a
5075      disagreement between the translation and the original version of
5076      this License or a notice or disclaimer, the original version will
5077      prevail.
5078
5079      If a section in the Document is Entitled "Acknowledgements",
5080      "Dedications", or "History", the requirement (section 4) to
5081      Preserve its Title (section 1) will typically require changing the
5082      actual title.
5083
5084   9. TERMINATION
5085
5086      You may not copy, modify, sublicense, or distribute the Document
5087      except as expressly provided under this License.  Any attempt
5088      otherwise to copy, modify, sublicense, or distribute it is void,
5089      and will automatically terminate your rights under this License.
5090
5091      However, if you cease all violation of this License, then your
5092      license from a particular copyright holder is reinstated (a)
5093      provisionally, unless and until the copyright holder explicitly
5094      and finally terminates your license, and (b) permanently, if the
5095      copyright holder fails to notify you of the violation by some
5096      reasonable means prior to 60 days after the cessation.
5097
5098      Moreover, your license from a particular copyright holder is
5099      reinstated permanently if the copyright holder notifies you of the
5100      violation by some reasonable means, this is the first time you have
5101      received notice of violation of this License (for any work) from
5102      that copyright holder, and you cure the violation prior to 30 days
5103      after your receipt of the notice.
5104
5105      Termination of your rights under this section does not terminate
5106      the licenses of parties who have received copies or rights from
5107      you under this License.  If your rights have been terminated and
5108      not permanently reinstated, receipt of a copy of some or all of
5109      the same material does not give you any rights to use it.
5110
5111  10. FUTURE REVISIONS OF THIS LICENSE
5112
5113      The Free Software Foundation may publish new, revised versions of
5114      the GNU Free Documentation License from time to time.  Such new
5115      versions will be similar in spirit to the present version, but may
5116      differ in detail to address new problems or concerns.  See
5117      `http://www.gnu.org/copyleft/'.
5118
5119      Each version of the License is given a distinguishing version
5120      number.  If the Document specifies that a particular numbered
5121      version of this License "or any later version" applies to it, you
5122      have the option of following the terms and conditions either of
5123      that specified version or of any later version that has been
5124      published (not as a draft) by the Free Software Foundation.  If
5125      the Document does not specify a version number of this License,
5126      you may choose any version ever published (not as a draft) by the
5127      Free Software Foundation.  If the Document specifies that a proxy
5128      can decide which future versions of this License can be used, that
5129      proxy's public statement of acceptance of a version permanently
5130      authorizes you to choose that version for the Document.
5131
5132  11. RELICENSING
5133
5134      "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
5135      World Wide Web server that publishes copyrightable works and also
5136      provides prominent facilities for anybody to edit those works.  A
5137      public wiki that anybody can edit is an example of such a server.
5138      A "Massive Multiauthor Collaboration" (or "MMC") contained in the
5139      site means any set of copyrightable works thus published on the MMC
5140      site.
5141
5142      "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
5143      license published by Creative Commons Corporation, a not-for-profit
5144      corporation with a principal place of business in San Francisco,
5145      California, as well as future copyleft versions of that license
5146      published by that same organization.
5147
5148      "Incorporate" means to publish or republish a Document, in whole or
5149      in part, as part of another Document.
5150
5151      An MMC is "eligible for relicensing" if it is licensed under this
5152      License, and if all works that were first published under this
5153      License somewhere other than this MMC, and subsequently
5154      incorporated in whole or in part into the MMC, (1) had no cover
5155      texts or invariant sections, and (2) were thus incorporated prior
5156      to November 1, 2008.
5157
5158      The operator of an MMC Site may republish an MMC contained in the
5159      site under CC-BY-SA on the same site at any time before August 1,
5160      2009, provided the MMC is eligible for relicensing.
5161
5162
5163 ADDENDUM: How to use this License for your documents
5164 ====================================================
5165
5166    To use this License in a document you have written, include a copy of
5167 the License in the document and put the following copyright and license
5168 notices just after the title page:
5169
5170        Copyright (C)  YEAR  YOUR NAME.
5171        Permission is granted to copy, distribute and/or modify this document
5172        under the terms of the GNU Free Documentation License, Version 1.3
5173        or any later version published by the Free Software Foundation;
5174        with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
5175        Texts.  A copy of the license is included in the section entitled ``GNU
5176        Free Documentation License''.
5177
5178    If you have Invariant Sections, Front-Cover Texts and Back-Cover
5179 Texts, replace the "with...Texts." line with this:
5180
5181          with the Invariant Sections being LIST THEIR TITLES, with
5182          the Front-Cover Texts being LIST, and with the Back-Cover Texts
5183          being LIST.
5184
5185    If you have Invariant Sections without Cover Texts, or some other
5186 combination of the three, merge those two alternatives to suit the
5187 situation.
5188
5189    If your document contains nontrivial examples of program code, we
5190 recommend releasing these examples in parallel under your choice of
5191 free software license, such as the GNU General Public License, to
5192 permit their use in free software.
5193
5194 \1f
5195 File: libunistring.info,  Node: Index,  Prev: Licenses,  Up: Top
5196
5197 Index
5198 *****
5199
5200 \0\b[index\0\b]
5201 * Menu:
5202
5203 * ambiguous width:                       uniwidth.h.          (line  10)
5204 * argument conventions:                  Conventions.         (line   9)
5205 * autoconf macro:                        Autoconf macro.      (line   6)
5206 * bidirectional category:                Bidirectional category.
5207                                                               (line   6)
5208 * bidirectional reordering:              More functionality.  (line   6)
5209 * block:                                 Blocks.              (line   6)
5210 * breaks, line:                          unilbrk.h.           (line   6)
5211 * breaks, word:                          uniwbrk.h.           (line   6)
5212 * bug reports:                           Reporting problems.  (line   6)
5213 * bug tracker:                           Reporting problems.  (line   6)
5214 * C string functions:                    char * strings.      (line   6)
5215 * C, programming language:               ISO C and Java syntax.
5216                                                               (line   6)
5217 * C-like API:                            Classifications like in ISO C.
5218                                                               (line   6)
5219 * canonical combining class:             Canonical combining class.
5220                                                               (line   6)
5221 * case detection:                        Case detection.      (line   6)
5222 * case mappings:                         Case mappings of strings.
5223                                                               (line   6)
5224 * casing_prefix_context_t:               Case mappings of substrings.
5225                                                               (line  15)
5226 * casing_suffix_context_t:               Case mappings of substrings.
5227                                                               (line  46)
5228 * char, type:                            char * strings.      (line  23)
5229 * combining, Unicode characters:         Composition of characters.
5230                                                               (line   6)
5231 * comparing <1>:                         Elementary string functions on NUL terminated strings.
5232                                                               (line 130)
5233 * comparing:                             Elementary string functions.
5234                                                               (line 108)
5235 * comparing, ignoring case:              Case insensitive comparison.
5236                                                               (line   6)
5237 * comparing, ignoring case, with collation rules: Case insensitive comparison.
5238                                                               (line  66)
5239 * comparing, ignoring normalization:     Normalizing comparisons.
5240                                                               (line   6)
5241 * comparing, ignoring normalization and case: Case insensitive comparison.
5242                                                               (line   6)
5243 * comparing, ignoring normalization and case, with collation rules: Case insensitive comparison.
5244                                                               (line  66)
5245 * comparing, ignoring normalization, with collation rules: Normalizing comparisons.
5246                                                               (line  23)
5247 * comparing, with collation rules:       Elementary string functions on NUL terminated strings.
5248                                                               (line 142)
5249 * comparing, with collation rules, ignoring case: Case insensitive comparison.
5250                                                               (line  66)
5251 * comparing, with collation rules, ignoring normalization: Normalizing comparisons.
5252                                                               (line  23)
5253 * comparing, with collation rules, ignoring normalization and case: Case insensitive comparison.
5254                                                               (line  66)
5255 * compiler options:                      Compiler options.    (line  24)
5256 * composing, Unicode characters:         Composition of characters.
5257                                                               (line   6)
5258 * converting <1>:                        uniconv.h.           (line  45)
5259 * converting:                            Elementary string conversions.
5260                                                               (line   6)
5261 * copying <1>:                           Elementary string functions on NUL terminated strings.
5262                                                               (line  61)
5263 * copying:                               Elementary string functions.
5264                                                               (line  72)
5265 * counting:                              Elementary string functions.
5266                                                               (line 153)
5267 * decomposing:                           Decomposition of characters.
5268                                                               (line   6)
5269 * dependencies:                          Installation.        (line   6)
5270 * detecting case:                        Case detection.      (line   6)
5271 * duplicating <1>:                       Elementary string functions on NUL terminated strings.
5272                                                               (line 168)
5273 * duplicating:                           Elementary string functions with memory allocation.
5274                                                               (line   6)
5275 * enum iconv_ilseq_handler:              uniconv.h.           (line  30)
5276 * FDL, GNU Free Documentation License:   GNU FDL.             (line   6)
5277 * formatted output:                      unistdio.h.          (line   6)
5278 * fullwidth:                             uniwidth.h.          (line  22)
5279 * general category:                      General category.    (line   6)
5280 * gl_LIBUNISTRING:                       Autoconf macro.      (line  11)
5281 * GPL, GNU General Public License:       GNU GPL.             (line   6)
5282 * halfwidth:                             uniwidth.h.          (line  22)
5283 * identifiers:                           ISO C and Java syntax.
5284                                                               (line   6)
5285 * installation:                          Installation.        (line  10)
5286 * internationalization:                  Unicode and i18n.    (line   6)
5287 * iterating <1>:                         Elementary string functions on NUL terminated strings.
5288                                                               (line  15)
5289 * iterating:                             Elementary string functions.
5290                                                               (line   6)
5291 * Java, programming language:            ISO C and Java syntax.
5292                                                               (line   6)
5293 * LGPL, GNU Lesser General Public License: GNU LGPL.          (line   6)
5294 * License, GNU FDL:                      GNU FDL.             (line   6)
5295 * License, GNU GPL:                      GNU GPL.             (line   6)
5296 * License, GNU LGPL:                     GNU LGPL.            (line   6)
5297 * Licenses:                              Licenses.            (line   6)
5298 * line breaks:                           unilbrk.h.           (line   6)
5299 * locale:                                Locale encodings.    (line   6)
5300 * locale categories:                     Locale encodings.    (line  10)
5301 * locale encoding <1>:                   uniconv.h.           (line  10)
5302 * locale encoding:                       Locale encodings.    (line  28)
5303 * locale language:                       Case mappings of strings.
5304                                                               (line  16)
5305 * locale, multibyte:                     char * strings.      (line  13)
5306 * locale_charset:                        uniconv.h.           (line  13)
5307 * lowercasing:                           Case mappings of strings.
5308                                                               (line   6)
5309 * mailing list:                          Reporting problems.  (line   6)
5310 * mirroring, of Unicode character:       Mirrored character.  (line   6)
5311 * normal forms:                          uninorm.h.           (line   6)
5312 * normalizing:                           uninorm.h.           (line   6)
5313 * output, formatted:                     unistdio.h.          (line   6)
5314 * properties, of Unicode character:      Properties.          (line   6)
5315 * regular expression:                    uniregex.h.          (line   6)
5316 * rendering:                             More functionality.  (line   9)
5317 * return value conventions:              Conventions.         (line  47)
5318 * scripts:                               Scripts.             (line   6)
5319 * searching, for a character <1>:        Elementary string functions on NUL terminated strings.
5320                                                               (line 178)
5321 * searching, for a character:            Elementary string functions.
5322                                                               (line 140)
5323 * searching, for a substring:            Elementary string functions on NUL terminated strings.
5324                                                               (line 234)
5325 * stream, normalizing a:                 Normalization of streams.
5326                                                               (line   6)
5327 * struct uninorm_filter:                 Normalization of streams.
5328                                                               (line  11)
5329 * titlecasing:                           Case mappings of strings.
5330                                                               (line   6)
5331 * u16_asnprintf:                         unistdio.h.          (line 132)
5332 * u16_asprintf:                          unistdio.h.          (line 129)
5333 * u16_casecmp:                           Case insensitive comparison.
5334                                                               (line  51)
5335 * u16_casecoll:                          Case insensitive comparison.
5336                                                               (line  95)
5337 * u16_casefold:                          Case insensitive comparison.
5338                                                               (line  15)
5339 * u16_casexfrm:                          Case insensitive comparison.
5340                                                               (line  75)
5341 * u16_casing_prefix_context:             Case mappings of substrings.
5342                                                               (line  30)
5343 * u16_casing_prefixes_context:           Case mappings of substrings.
5344                                                               (line  39)
5345 * u16_casing_suffix_context:             Case mappings of substrings.
5346                                                               (line  61)
5347 * u16_casing_suffixes_context:           Case mappings of substrings.
5348                                                               (line  70)
5349 * u16_check:                             Elementary string checks.
5350                                                               (line  11)
5351 * u16_chr:                               Elementary string functions.
5352                                                               (line 145)
5353 * u16_cmp:                               Elementary string functions.
5354                                                               (line 115)
5355 * u16_cmp2:                              Elementary string functions.
5356                                                               (line 131)
5357 * u16_conv_from_encoding:                uniconv.h.           (line  54)
5358 * u16_conv_to_encoding:                  uniconv.h.           (line  91)
5359 * u16_cpy:                               Elementary string functions.
5360                                                               (line  78)
5361 * u16_cpy_alloc:                         Elementary string functions with memory allocation.
5362                                                               (line  10)
5363 * u16_ct_casefold:                       Case insensitive comparison.
5364                                                               (line  37)
5365 * u16_ct_tolower:                        Case mappings of substrings.
5366                                                               (line 107)
5367 * u16_ct_totitle:                        Case mappings of substrings.
5368                                                               (line 125)
5369 * u16_ct_toupper:                        Case mappings of substrings.
5370                                                               (line  89)
5371 * u16_endswith:                          Elementary string functions on NUL terminated strings.
5372                                                               (line 260)
5373 * u16_is_cased:                          Case detection.      (line  57)
5374 * u16_is_casefolded:                     Case detection.      (line  44)
5375 * u16_is_lowercase:                      Case detection.      (line  24)
5376 * u16_is_titlecase:                      Case detection.      (line  34)
5377 * u16_is_uppercase:                      Case detection.      (line  14)
5378 * u16_mblen:                             Elementary string functions.
5379                                                               (line  11)
5380 * u16_mbsnlen:                           Elementary string functions.
5381                                                               (line 157)
5382 * u16_mbtouc:                            Elementary string functions.
5383                                                               (line  38)
5384 * u16_mbtouc_unsafe:                     Elementary string functions.
5385                                                               (line  23)
5386 * u16_mbtoucr:                           Elementary string functions.
5387                                                               (line  45)
5388 * u16_move:                              Elementary string functions.
5389                                                               (line  89)
5390 * u16_next:                              Elementary string functions on NUL terminated strings.
5391                                                               (line  24)
5392 * u16_normalize:                         Normalization of strings.
5393                                                               (line  50)
5394 * u16_normcmp:                           Normalizing comparisons.
5395                                                               (line  13)
5396 * u16_normcoll:                          Normalizing comparisons.
5397                                                               (line  40)
5398 * u16_normxfrm:                          Normalizing comparisons.
5399                                                               (line  27)
5400 * u16_possible_linebreaks:               unilbrk.h.           (line  46)
5401 * u16_prev:                              Elementary string functions on NUL terminated strings.
5402                                                               (line  36)
5403 * u16_set:                               Elementary string functions.
5404                                                               (line 101)
5405 * u16_snprintf:                          unistdio.h.          (line 126)
5406 * u16_sprintf:                           unistdio.h.          (line 123)
5407 * u16_startswith:                        Elementary string functions on NUL terminated strings.
5408                                                               (line 252)
5409 * u16_stpcpy:                            Elementary string functions on NUL terminated strings.
5410                                                               (line  76)
5411 * u16_stpncpy:                           Elementary string functions on NUL terminated strings.
5412                                                               (line  99)
5413 * u16_strcat:                            Elementary string functions on NUL terminated strings.
5414                                                               (line 112)
5415 * u16_strchr:                            Elementary string functions on NUL terminated strings.
5416                                                               (line 182)
5417 * u16_strcmp:                            Elementary string functions on NUL terminated strings.
5418                                                               (line 134)
5419 * u16_strcoll:                           Elementary string functions on NUL terminated strings.
5420                                                               (line 144)
5421 * u16_strconv_from_encoding:             uniconv.h.           (line 129)
5422 * u16_strconv_from_locale:               uniconv.h.           (line 157)
5423 * u16_strconv_to_encoding:               uniconv.h.           (line 142)
5424 * u16_strconv_to_locale:                 uniconv.h.           (line 167)
5425 * u16_strcpy:                            Elementary string functions on NUL terminated strings.
5426                                                               (line  66)
5427 * u16_strcspn:                           Elementary string functions on NUL terminated strings.
5428                                                               (line 203)
5429 * u16_strdup:                            Elementary string functions on NUL terminated strings.
5430                                                               (line 172)
5431 * u16_strlen:                            Elementary string functions on NUL terminated strings.
5432                                                               (line  47)
5433 * u16_strmblen:                          Elementary string functions on NUL terminated strings.
5434                                                               (line  11)
5435 * u16_strmbtouc:                         Elementary string functions on NUL terminated strings.
5436                                                               (line  17)
5437 * u16_strncat:                           Elementary string functions on NUL terminated strings.
5438                                                               (line 123)
5439 * u16_strncmp:                           Elementary string functions on NUL terminated strings.
5440                                                               (line 161)
5441 * u16_strncpy:                           Elementary string functions on NUL terminated strings.
5442                                                               (line  88)
5443 * u16_strnlen:                           Elementary string functions on NUL terminated strings.
5444                                                               (line  55)
5445 * u16_strpbrk:                           Elementary string functions on NUL terminated strings.
5446                                                               (line 227)
5447 * u16_strrchr:                           Elementary string functions on NUL terminated strings.
5448                                                               (line 190)
5449 * u16_strspn:                            Elementary string functions on NUL terminated strings.
5450                                                               (line 215)
5451 * u16_strstr:                            Elementary string functions on NUL terminated strings.
5452                                                               (line 241)
5453 * u16_strtok:                            Elementary string functions on NUL terminated strings.
5454                                                               (line 270)
5455 * u16_strwidth:                          uniwidth.h.          (line  39)
5456 * u16_to_u32:                            Elementary string conversions.
5457                                                               (line  23)
5458 * u16_to_u8:                             Elementary string conversions.
5459                                                               (line  19)
5460 * u16_tolower:                           Case mappings of strings.
5461                                                               (line  44)
5462 * u16_totitle:                           Case mappings of strings.
5463                                                               (line  58)
5464 * u16_toupper:                           Case mappings of strings.
5465                                                               (line  30)
5466 * u16_u16_asnprintf:                     unistdio.h.          (line 159)
5467 * u16_u16_asprintf:                      unistdio.h.          (line 156)
5468 * u16_u16_snprintf:                      unistdio.h.          (line 153)
5469 * u16_u16_sprintf:                       unistdio.h.          (line 150)
5470 * u16_u16_vasnprintf:                    unistdio.h.          (line 171)
5471 * u16_u16_vasprintf:                     unistdio.h.          (line 168)
5472 * u16_u16_vsnprintf:                     unistdio.h.          (line 165)
5473 * u16_u16_vsprintf:                      unistdio.h.          (line 162)
5474 * u16_uctomb:                            Elementary string functions.
5475                                                               (line  62)
5476 * u16_vasnprintf:                        unistdio.h.          (line 144)
5477 * u16_vasprintf:                         unistdio.h.          (line 141)
5478 * u16_vsnprintf:                         unistdio.h.          (line 138)
5479 * u16_vsprintf:                          unistdio.h.          (line 135)
5480 * u16_width:                             uniwidth.h.          (line  31)
5481 * u16_width_linebreaks:                  unilbrk.h.           (line  65)
5482 * u16_wordbreaks:                        Word breaks in a string.
5483                                                               (line  10)
5484 * u32_asnprintf:                         unistdio.h.          (line 185)
5485 * u32_asprintf:                          unistdio.h.          (line 182)
5486 * u32_casecmp:                           Case insensitive comparison.
5487                                                               (line  54)
5488 * u32_casecoll:                          Case insensitive comparison.
5489                                                               (line  98)
5490 * u32_casefold:                          Case insensitive comparison.
5491                                                               (line  18)
5492 * u32_casexfrm:                          Case insensitive comparison.
5493                                                               (line  78)
5494 * u32_casing_prefix_context:             Case mappings of substrings.
5495                                                               (line  32)
5496 * u32_casing_prefixes_context:           Case mappings of substrings.
5497                                                               (line  42)
5498 * u32_casing_suffix_context:             Case mappings of substrings.
5499                                                               (line  63)
5500 * u32_casing_suffixes_context:           Case mappings of substrings.
5501                                                               (line  73)
5502 * u32_check:                             Elementary string checks.
5503                                                               (line  12)
5504 * u32_chr:                               Elementary string functions.
5505                                                               (line 147)
5506 * u32_cmp:                               Elementary string functions.
5507                                                               (line 117)
5508 * u32_cmp2:                              Elementary string functions.
5509                                                               (line 133)
5510 * u32_conv_from_encoding:                uniconv.h.           (line  57)
5511 * u32_conv_to_encoding:                  uniconv.h.           (line  94)
5512 * u32_cpy:                               Elementary string functions.
5513                                                               (line  80)
5514 * u32_cpy_alloc:                         Elementary string functions with memory allocation.
5515                                                               (line  11)
5516 * u32_ct_casefold:                       Case insensitive comparison.
5517                                                               (line  42)
5518 * u32_ct_tolower:                        Case mappings of substrings.
5519                                                               (line 112)
5520 * u32_ct_totitle:                        Case mappings of substrings.
5521                                                               (line 130)
5522 * u32_ct_toupper:                        Case mappings of substrings.
5523                                                               (line  94)
5524 * u32_endswith:                          Elementary string functions on NUL terminated strings.
5525                                                               (line 262)
5526 * u32_is_cased:                          Case detection.      (line  59)
5527 * u32_is_casefolded:                     Case detection.      (line  46)
5528 * u32_is_lowercase:                      Case detection.      (line  26)
5529 * u32_is_titlecase:                      Case detection.      (line  36)
5530 * u32_is_uppercase:                      Case detection.      (line  16)
5531 * u32_mblen:                             Elementary string functions.
5532                                                               (line  12)
5533 * u32_mbsnlen:                           Elementary string functions.
5534                                                               (line 158)
5535 * u32_mbtouc:                            Elementary string functions.
5536                                                               (line  39)
5537 * u32_mbtouc_unsafe:                     Elementary string functions.
5538                                                               (line  25)
5539 * u32_mbtoucr:                           Elementary string functions.
5540                                                               (line  46)
5541 * u32_move:                              Elementary string functions.
5542                                                               (line  91)
5543 * u32_next:                              Elementary string functions on NUL terminated strings.
5544                                                               (line  25)
5545 * u32_normalize:                         Normalization of strings.
5546                                                               (line  52)
5547 * u32_normcmp:                           Normalizing comparisons.
5548                                                               (line  15)
5549 * u32_normcoll:                          Normalizing comparisons.
5550                                                               (line  42)
5551 * u32_normxfrm:                          Normalizing comparisons.
5552                                                               (line  29)
5553 * u32_possible_linebreaks:               unilbrk.h.           (line  48)
5554 * u32_prev:                              Elementary string functions on NUL terminated strings.
5555                                                               (line  38)
5556 * u32_set:                               Elementary string functions.
5557                                                               (line 102)
5558 * u32_snprintf:                          unistdio.h.          (line 179)
5559 * u32_sprintf:                           unistdio.h.          (line 176)
5560 * u32_startswith:                        Elementary string functions on NUL terminated strings.
5561                                                               (line 254)
5562 * u32_stpcpy:                            Elementary string functions on NUL terminated strings.
5563                                                               (line  78)
5564 * u32_stpncpy:                           Elementary string functions on NUL terminated strings.
5565                                                               (line 101)
5566 * u32_strcat:                            Elementary string functions on NUL terminated strings.
5567                                                               (line 114)
5568 * u32_strchr:                            Elementary string functions on NUL terminated strings.
5569                                                               (line 183)
5570 * u32_strcmp:                            Elementary string functions on NUL terminated strings.
5571                                                               (line 135)
5572 * u32_strcoll:                           Elementary string functions on NUL terminated strings.
5573                                                               (line 145)
5574 * u32_strconv_from_encoding:             uniconv.h.           (line 131)
5575 * u32_strconv_from_locale:               uniconv.h.           (line 158)
5576 * u32_strconv_to_encoding:               uniconv.h.           (line 144)
5577 * u32_strconv_to_locale:                 uniconv.h.           (line 168)
5578 * u32_strcpy:                            Elementary string functions on NUL terminated strings.
5579                                                               (line  68)
5580 * u32_strcspn:                           Elementary string functions on NUL terminated strings.
5581                                                               (line 205)
5582 * u32_strdup:                            Elementary string functions on NUL terminated strings.
5583                                                               (line 173)
5584 * u32_strlen:                            Elementary string functions on NUL terminated strings.
5585                                                               (line  48)
5586 * u32_strmblen:                          Elementary string functions on NUL terminated strings.
5587                                                               (line  12)
5588 * u32_strmbtouc:                         Elementary string functions on NUL terminated strings.
5589                                                               (line  18)
5590 * u32_strncat:                           Elementary string functions on NUL terminated strings.
5591                                                               (line 125)
5592 * u32_strncmp:                           Elementary string functions on NUL terminated strings.
5593                                                               (line 163)
5594 * u32_strncpy:                           Elementary string functions on NUL terminated strings.
5595                                                               (line  90)
5596 * u32_strnlen:                           Elementary string functions on NUL terminated strings.
5597                                                               (line  56)
5598 * u32_strpbrk:                           Elementary string functions on NUL terminated strings.
5599                                                               (line 229)
5600 * u32_strrchr:                           Elementary string functions on NUL terminated strings.
5601                                                               (line 191)
5602 * u32_strspn:                            Elementary string functions on NUL terminated strings.
5603                                                               (line 217)
5604 * u32_strstr:                            Elementary string functions on NUL terminated strings.
5605                                                               (line 243)
5606 * u32_strtok:                            Elementary string functions on NUL terminated strings.
5607                                                               (line 272)
5608 * u32_strwidth:                          uniwidth.h.          (line  40)
5609 * u32_to_u16:                            Elementary string conversions.
5610                                                               (line  31)
5611 * u32_to_u8:                             Elementary string conversions.
5612                                                               (line  27)
5613 * u32_tolower:                           Case mappings of strings.
5614                                                               (line  47)
5615 * u32_totitle:                           Case mappings of strings.
5616                                                               (line  61)
5617 * u32_toupper:                           Case mappings of strings.
5618                                                               (line  33)
5619 * u32_u32_asnprintf:                     unistdio.h.          (line 212)
5620 * u32_u32_asprintf:                      unistdio.h.          (line 209)
5621 * u32_u32_snprintf:                      unistdio.h.          (line 206)
5622 * u32_u32_sprintf:                       unistdio.h.          (line 203)
5623 * u32_u32_vasnprintf:                    unistdio.h.          (line 224)
5624 * u32_u32_vasprintf:                     unistdio.h.          (line 221)
5625 * u32_u32_vsnprintf:                     unistdio.h.          (line 218)
5626 * u32_u32_vsprintf:                      unistdio.h.          (line 215)
5627 * u32_uctomb:                            Elementary string functions.
5628                                                               (line  63)
5629 * u32_vasnprintf:                        unistdio.h.          (line 197)
5630 * u32_vasprintf:                         unistdio.h.          (line 194)
5631 * u32_vsnprintf:                         unistdio.h.          (line 191)
5632 * u32_vsprintf:                          unistdio.h.          (line 188)
5633 * u32_width:                             uniwidth.h.          (line  33)
5634 * u32_width_linebreaks:                  unilbrk.h.           (line  68)
5635 * u32_wordbreaks:                        Word breaks in a string.
5636                                                               (line  11)
5637 * u8_asnprintf:                          unistdio.h.          (line  79)
5638 * u8_asprintf:                           unistdio.h.          (line  76)
5639 * u8_casecmp:                            Case insensitive comparison.
5640                                                               (line  48)
5641 * u8_casecoll:                           Case insensitive comparison.
5642                                                               (line  92)
5643 * u8_casefold:                           Case insensitive comparison.
5644                                                               (line  12)
5645 * u8_casexfrm:                           Case insensitive comparison.
5646                                                               (line  72)
5647 * u8_casing_prefix_context:              Case mappings of substrings.
5648                                                               (line  28)
5649 * u8_casing_prefixes_context:            Case mappings of substrings.
5650                                                               (line  36)
5651 * u8_casing_suffix_context:              Case mappings of substrings.
5652                                                               (line  59)
5653 * u8_casing_suffixes_context:            Case mappings of substrings.
5654                                                               (line  67)
5655 * u8_check:                              Elementary string checks.
5656                                                               (line  10)
5657 * u8_chr:                                Elementary string functions.
5658                                                               (line 143)
5659 * u8_cmp:                                Elementary string functions.
5660                                                               (line 113)
5661 * u8_cmp2:                               Elementary string functions.
5662                                                               (line 129)
5663 * u8_conv_from_encoding:                 uniconv.h.           (line  51)
5664 * u8_conv_to_encoding:                   uniconv.h.           (line  88)
5665 * u8_cpy:                                Elementary string functions.
5666                                                               (line  76)
5667 * u8_cpy_alloc:                          Elementary string functions with memory allocation.
5668                                                               (line   9)
5669 * u8_ct_casefold:                        Case insensitive comparison.
5670                                                               (line  32)
5671 * u8_ct_tolower:                         Case mappings of substrings.
5672                                                               (line 102)
5673 * u8_ct_totitle:                         Case mappings of substrings.
5674                                                               (line 120)
5675 * u8_ct_toupper:                         Case mappings of substrings.
5676                                                               (line  84)
5677 * u8_endswith:                           Elementary string functions on NUL terminated strings.
5678                                                               (line 258)
5679 * u8_is_cased:                           Case detection.      (line  55)
5680 * u8_is_casefolded:                      Case detection.      (line  42)
5681 * u8_is_lowercase:                       Case detection.      (line  22)
5682 * u8_is_titlecase:                       Case detection.      (line  32)
5683 * u8_is_uppercase:                       Case detection.      (line  12)
5684 * u8_mblen:                              Elementary string functions.
5685                                                               (line  10)
5686 * u8_mbsnlen:                            Elementary string functions.
5687                                                               (line 156)
5688 * u8_mbtouc:                             Elementary string functions.
5689                                                               (line  37)
5690 * u8_mbtouc_unsafe:                      Elementary string functions.
5691                                                               (line  21)
5692 * u8_mbtoucr:                            Elementary string functions.
5693                                                               (line  44)
5694 * u8_move:                               Elementary string functions.
5695                                                               (line  87)
5696 * u8_next:                               Elementary string functions on NUL terminated strings.
5697                                                               (line  23)
5698 * u8_normalize:                          Normalization of strings.
5699                                                               (line  48)
5700 * u8_normcmp:                            Normalizing comparisons.
5701                                                               (line  11)
5702 * u8_normcoll:                           Normalizing comparisons.
5703                                                               (line  38)
5704 * u8_normxfrm:                           Normalizing comparisons.
5705                                                               (line  25)
5706 * u8_possible_linebreaks:                unilbrk.h.           (line  44)
5707 * u8_prev:                               Elementary string functions on NUL terminated strings.
5708                                                               (line  34)
5709 * u8_set:                                Elementary string functions.
5710                                                               (line 100)
5711 * u8_snprintf:                           unistdio.h.          (line  73)
5712 * u8_sprintf:                            unistdio.h.          (line  70)
5713 * u8_startswith:                         Elementary string functions on NUL terminated strings.
5714                                                               (line 250)
5715 * u8_stpcpy:                             Elementary string functions on NUL terminated strings.
5716                                                               (line  74)
5717 * u8_stpncpy:                            Elementary string functions on NUL terminated strings.
5718                                                               (line  97)
5719 * u8_strcat:                             Elementary string functions on NUL terminated strings.
5720                                                               (line 110)
5721 * u8_strchr:                             Elementary string functions on NUL terminated strings.
5722                                                               (line 181)
5723 * u8_strcmp:                             Elementary string functions on NUL terminated strings.
5724                                                               (line 133)
5725 * u8_strcoll:                            Elementary string functions on NUL terminated strings.
5726                                                               (line 143)
5727 * u8_strconv_from_encoding:              uniconv.h.           (line 127)
5728 * u8_strconv_from_locale:                uniconv.h.           (line 156)
5729 * u8_strconv_to_encoding:                uniconv.h.           (line 140)
5730 * u8_strconv_to_locale:                  uniconv.h.           (line 166)
5731 * u8_strcpy:                             Elementary string functions on NUL terminated strings.
5732                                                               (line  64)
5733 * u8_strcspn:                            Elementary string functions on NUL terminated strings.
5734                                                               (line 201)
5735 * u8_strdup:                             Elementary string functions on NUL terminated strings.
5736                                                               (line 171)
5737 * u8_strlen:                             Elementary string functions on NUL terminated strings.
5738                                                               (line  46)
5739 * u8_strmblen:                           Elementary string functions on NUL terminated strings.
5740                                                               (line  10)
5741 * u8_strmbtouc:                          Elementary string functions on NUL terminated strings.
5742                                                               (line  16)
5743 * u8_strncat:                            Elementary string functions on NUL terminated strings.
5744                                                               (line 121)
5745 * u8_strncmp:                            Elementary string functions on NUL terminated strings.
5746                                                               (line 159)
5747 * u8_strncpy:                            Elementary string functions on NUL terminated strings.
5748                                                               (line  86)
5749 * u8_strnlen:                            Elementary string functions on NUL terminated strings.
5750                                                               (line  54)
5751 * u8_strpbrk:                            Elementary string functions on NUL terminated strings.
5752                                                               (line 225)
5753 * u8_strrchr:                            Elementary string functions on NUL terminated strings.
5754                                                               (line 189)
5755 * u8_strspn:                             Elementary string functions on NUL terminated strings.
5756                                                               (line 213)
5757 * u8_strstr:                             Elementary string functions on NUL terminated strings.
5758                                                               (line 239)
5759 * u8_strtok:                             Elementary string functions on NUL terminated strings.
5760                                                               (line 268)
5761 * u8_strwidth:                           uniwidth.h.          (line  38)
5762 * u8_to_u16:                             Elementary string conversions.
5763                                                               (line  11)
5764 * u8_to_u32:                             Elementary string conversions.
5765                                                               (line  15)
5766 * u8_tolower:                            Case mappings of strings.
5767                                                               (line  41)
5768 * u8_totitle:                            Case mappings of strings.
5769                                                               (line  55)
5770 * u8_toupper:                            Case mappings of strings.
5771                                                               (line  27)
5772 * u8_u8_asnprintf:                       unistdio.h.          (line 106)
5773 * u8_u8_asprintf:                        unistdio.h.          (line 103)
5774 * u8_u8_snprintf:                        unistdio.h.          (line 100)
5775 * u8_u8_sprintf:                         unistdio.h.          (line  97)
5776 * u8_u8_vasnprintf:                      unistdio.h.          (line 118)
5777 * u8_u8_vasprintf:                       unistdio.h.          (line 115)
5778 * u8_u8_vsnprintf:                       unistdio.h.          (line 112)
5779 * u8_u8_vsprintf:                        unistdio.h.          (line 109)
5780 * u8_uctomb:                             Elementary string functions.
5781                                                               (line  61)
5782 * u8_vasnprintf:                         unistdio.h.          (line  91)
5783 * u8_vasprintf:                          unistdio.h.          (line  88)
5784 * u8_vsnprintf:                          unistdio.h.          (line  85)
5785 * u8_vsprintf:                           unistdio.h.          (line  82)
5786 * u8_width:                              uniwidth.h.          (line  29)
5787 * u8_width_linebreaks:                   unilbrk.h.           (line  62)
5788 * u8_wordbreaks:                         Word breaks in a string.
5789                                                               (line   9)
5790 * uc_all_blocks:                         Blocks.              (line  38)
5791 * uc_all_scripts:                        Scripts.             (line  37)
5792 * uc_bidi_category:                      Bidirectional category.
5793                                                               (line  88)
5794 * uc_bidi_category_byname:               Bidirectional category.
5795                                                               (line  82)
5796 * uc_bidi_category_name:                 Bidirectional category.
5797                                                               (line  79)
5798 * uc_block:                              Blocks.              (line  27)
5799 * uc_block_t:                            Blocks.              (line  12)
5800 * uc_c_ident_category:                   ISO C and Java syntax.
5801                                                               (line  39)
5802 * uc_canonical_decomposition:            Decomposition of characters.
5803                                                               (line  92)
5804 * uc_combining_class:                    Canonical combining class.
5805                                                               (line  89)
5806 * uc_composition:                        Composition of characters.
5807                                                               (line  10)
5808 * uc_decimal_value:                      Decimal digit value. (line  11)
5809 * uc_decomposition:                      Decomposition of characters.
5810                                                               (line  82)
5811 * uc_digit_value:                        Digit value.         (line  11)
5812 * uc_fraction_t:                         Numeric value.       (line  14)
5813 * uc_general_category:                   Object oriented API. (line 207)
5814 * uc_general_category_and:               Object oriented API. (line 179)
5815 * uc_general_category_and_not:           Object oriented API. (line 186)
5816 * uc_general_category_byname:            Object oriented API. (line 201)
5817 * uc_general_category_name:              Object oriented API. (line 195)
5818 * uc_general_category_or:                Object oriented API. (line 173)
5819 * uc_general_category_t:                 Object oriented API. (line   7)
5820 * uc_is_alnum:                           Classifications like in ISO C.
5821                                                               (line  14)
5822 * uc_is_alpha:                           Classifications like in ISO C.
5823                                                               (line  18)
5824 * uc_is_bidi_category:                   Bidirectional category.
5825                                                               (line  91)
5826 * uc_is_blank:                           Classifications like in ISO C.
5827                                                               (line  64)
5828 * uc_is_block:                           Blocks.              (line  32)
5829 * uc_is_c_whitespace:                    ISO C and Java syntax.
5830                                                               (line  10)
5831 * uc_is_cntrl:                           Classifications like in ISO C.
5832                                                               (line  24)
5833 * uc_is_digit:                           Classifications like in ISO C.
5834                                                               (line  27)
5835 * uc_is_general_category:                Object oriented API. (line 213)
5836 * uc_is_general_category_withtable:      Bit mask API.        (line  52)
5837 * uc_is_graph:                           Classifications like in ISO C.
5838                                                               (line  31)
5839 * uc_is_java_whitespace:                 ISO C and Java syntax.
5840                                                               (line  14)
5841 * uc_is_lower:                           Classifications like in ISO C.
5842                                                               (line  35)
5843 * uc_is_print:                           Classifications like in ISO C.
5844                                                               (line  41)
5845 * uc_is_property:                        Properties as objects.
5846                                                               (line 140)
5847 * uc_is_property_alphabetic:             Properties as functions.
5848                                                               (line  10)
5849 * uc_is_property_ascii_hex_digit:        Properties as functions.
5850                                                               (line  74)
5851 * uc_is_property_bidi_arabic_digit:      Properties as functions.
5852                                                               (line  60)
5853 * uc_is_property_bidi_arabic_right_to_left: Properties as functions.
5854                                                               (line  56)
5855 * uc_is_property_bidi_block_separator:   Properties as functions.
5856                                                               (line  62)
5857 * uc_is_property_bidi_boundary_neutral:  Properties as functions.
5858                                                               (line  66)
5859 * uc_is_property_bidi_common_separator:  Properties as functions.
5860                                                               (line  61)
5861 * uc_is_property_bidi_control:           Properties as functions.
5862                                                               (line  53)
5863 * uc_is_property_bidi_embedding_or_override: Properties as functions.
5864                                                               (line  68)
5865 * uc_is_property_bidi_eur_num_separator: Properties as functions.
5866                                                               (line  58)
5867 * uc_is_property_bidi_eur_num_terminator: Properties as functions.
5868                                                               (line  59)
5869 * uc_is_property_bidi_european_digit:    Properties as functions.
5870                                                               (line  57)
5871 * uc_is_property_bidi_hebrew_right_to_left: Properties as functions.
5872                                                               (line  55)
5873 * uc_is_property_bidi_left_to_right:     Properties as functions.
5874                                                               (line  54)
5875 * uc_is_property_bidi_non_spacing_mark:  Properties as functions.
5876                                                               (line  65)
5877 * uc_is_property_bidi_other_neutral:     Properties as functions.
5878                                                               (line  69)
5879 * uc_is_property_bidi_pdf:               Properties as functions.
5880                                                               (line  67)
5881 * uc_is_property_bidi_segment_separator: Properties as functions.
5882                                                               (line  63)
5883 * uc_is_property_bidi_whitespace:        Properties as functions.
5884                                                               (line  64)
5885 * uc_is_property_combining:              Properties as functions.
5886                                                               (line 104)
5887 * uc_is_property_composite:              Properties as functions.
5888                                                               (line 105)
5889 * uc_is_property_currency_symbol:        Properties as functions.
5890                                                               (line  99)
5891 * uc_is_property_dash:                   Properties as functions.
5892                                                               (line  91)
5893 * uc_is_property_decimal_digit:          Properties as functions.
5894                                                               (line 106)
5895 * uc_is_property_default_ignorable_code_point: Properties as functions.
5896                                                               (line  14)
5897 * uc_is_property_deprecated:             Properties as functions.
5898                                                               (line  17)
5899 * uc_is_property_diacritic:              Properties as functions.
5900                                                               (line 108)
5901 * uc_is_property_extender:               Properties as functions.
5902                                                               (line 109)
5903 * uc_is_property_format_control:         Properties as functions.
5904                                                               (line  90)
5905 * uc_is_property_grapheme_base:          Properties as functions.
5906                                                               (line  46)
5907 * uc_is_property_grapheme_extend:        Properties as functions.
5908                                                               (line  47)
5909 * uc_is_property_grapheme_link:          Properties as functions.
5910                                                               (line  49)
5911 * uc_is_property_hex_digit:              Properties as functions.
5912                                                               (line  73)
5913 * uc_is_property_hyphen:                 Properties as functions.
5914                                                               (line  92)
5915 * uc_is_property_id_continue:            Properties as functions.
5916                                                               (line  36)
5917 * uc_is_property_id_start:               Properties as functions.
5918                                                               (line  34)
5919 * uc_is_property_ideographic:            Properties as functions.
5920                                                               (line  78)
5921 * uc_is_property_ids_binary_operator:    Properties as functions.
5922                                                               (line  81)
5923 * uc_is_property_ids_trinary_operator:   Properties as functions.
5924                                                               (line  82)
5925 * uc_is_property_ignorable_control:      Properties as functions.
5926                                                               (line 110)
5927 * uc_is_property_iso_control:            Properties as functions.
5928                                                               (line  89)
5929 * uc_is_property_join_control:           Properties as functions.
5930                                                               (line  45)
5931 * uc_is_property_left_of_pair:           Properties as functions.
5932                                                               (line 103)
5933 * uc_is_property_line_separator:         Properties as functions.
5934                                                               (line  94)
5935 * uc_is_property_logical_order_exception: Properties as functions.
5936                                                               (line  18)
5937 * uc_is_property_lowercase:              Properties as functions.
5938                                                               (line  27)
5939 * uc_is_property_math:                   Properties as functions.
5940                                                               (line 100)
5941 * uc_is_property_non_break:              Properties as functions.
5942                                                               (line  88)
5943 * uc_is_property_not_a_character:        Properties as functions.
5944                                                               (line  12)
5945 * uc_is_property_numeric:                Properties as functions.
5946                                                               (line 107)
5947 * uc_is_property_other_alphabetic:       Properties as functions.
5948                                                               (line  11)
5949 * uc_is_property_other_default_ignorable_code_point: Properties as functions.
5950                                                               (line  16)
5951 * uc_is_property_other_grapheme_extend:  Properties as functions.
5952                                                               (line  48)
5953 * uc_is_property_other_id_continue:      Properties as functions.
5954                                                               (line  37)
5955 * uc_is_property_other_id_start:         Properties as functions.
5956                                                               (line  35)
5957 * uc_is_property_other_lowercase:        Properties as functions.
5958                                                               (line  28)
5959 * uc_is_property_other_math:             Properties as functions.
5960                                                               (line 101)
5961 * uc_is_property_other_uppercase:        Properties as functions.
5962                                                               (line  26)
5963 * uc_is_property_paired_punctuation:     Properties as functions.
5964                                                               (line 102)
5965 * uc_is_property_paragraph_separator:    Properties as functions.
5966                                                               (line  95)
5967 * uc_is_property_pattern_syntax:         Properties as functions.
5968                                                               (line  41)
5969 * uc_is_property_pattern_white_space:    Properties as functions.
5970                                                               (line  40)
5971 * uc_is_property_private_use:            Properties as functions.
5972                                                               (line  20)
5973 * uc_is_property_punctuation:            Properties as functions.
5974                                                               (line  93)
5975 * uc_is_property_quotation_mark:         Properties as functions.
5976                                                               (line  96)
5977 * uc_is_property_radical:                Properties as functions.
5978                                                               (line  80)
5979 * uc_is_property_sentence_terminal:      Properties as functions.
5980                                                               (line  97)
5981 * uc_is_property_soft_dotted:            Properties as functions.
5982                                                               (line  30)
5983 * uc_is_property_space:                  Properties as functions.
5984                                                               (line  87)
5985 * uc_is_property_terminal_punctuation:   Properties as functions.
5986                                                               (line  98)
5987 * uc_is_property_titlecase:              Properties as functions.
5988                                                               (line  29)
5989 * uc_is_property_unassigned_code_value:  Properties as functions.
5990                                                               (line  21)
5991 * uc_is_property_unified_ideograph:      Properties as functions.
5992                                                               (line  79)
5993 * uc_is_property_uppercase:              Properties as functions.
5994                                                               (line  25)
5995 * uc_is_property_variation_selector:     Properties as functions.
5996                                                               (line  19)
5997 * uc_is_property_white_space:            Properties as functions.
5998                                                               (line   9)
5999 * uc_is_property_xid_continue:           Properties as functions.
6000                                                               (line  39)
6001 * uc_is_property_xid_start:              Properties as functions.
6002                                                               (line  38)
6003 * uc_is_property_zero_width:             Properties as functions.
6004                                                               (line  86)
6005 * uc_is_punct:                           Classifications like in ISO C.
6006                                                               (line  44)
6007 * uc_is_script:                          Scripts.             (line  31)
6008 * uc_is_space:                           Classifications like in ISO C.
6009                                                               (line  49)
6010 * uc_is_upper:                           Classifications like in ISO C.
6011                                                               (line  54)
6012 * uc_is_xdigit:                          Classifications like in ISO C.
6013                                                               (line  60)
6014 * uc_java_ident_category:                ISO C and Java syntax.
6015                                                               (line  43)
6016 * uc_locale_language:                    Case mappings of strings.
6017                                                               (line  21)
6018 * uc_mirror_char:                        Mirrored character.  (line  14)
6019 * uc_numeric_value:                      Numeric value.       (line  23)
6020 * uc_property_byname:                    Properties as objects.
6021                                                               (line 123)
6022 * uc_property_is_valid:                  Properties as objects.
6023                                                               (line 133)
6024 * uc_property_t:                         Properties as objects.
6025                                                               (line   9)
6026 * uc_script:                             Scripts.             (line  20)
6027 * uc_script_byname:                      Scripts.             (line  25)
6028 * uc_script_t:                           Scripts.             (line  11)
6029 * uc_tolower:                            Case mappings of characters.
6030                                                               (line  20)
6031 * uc_totitle:                            Case mappings of characters.
6032                                                               (line  23)
6033 * uc_toupper:                            Case mappings of characters.
6034                                                               (line  17)
6035 * uc_width:                              uniwidth.h.          (line  23)
6036 * uc_wordbreak_property:                 Word break property. (line  32)
6037 * UCS-4:                                 Unicode.             (line  14)
6038 * ucs4_t:                                unitypes.h.          (line  16)
6039 * uint16_t:                              unitypes.h.          (line  10)
6040 * uint32_t:                              unitypes.h.          (line  11)
6041 * uint8_t:                               unitypes.h.          (line   9)
6042 * ulc_asnprintf:                         unistdio.h.          (line  53)
6043 * ulc_asprintf:                          unistdio.h.          (line  50)
6044 * ulc_casecmp:                           Case insensitive comparison.
6045                                                               (line  57)
6046 * ulc_casecoll:                          Case insensitive comparison.
6047                                                               (line 101)
6048 * ulc_casexfrm:                          Case insensitive comparison.
6049                                                               (line  81)
6050 * ulc_fprintf:                           unistdio.h.          (line 229)
6051 * ulc_possible_linebreaks:               unilbrk.h.           (line  50)
6052 * ulc_snprintf:                          unistdio.h.          (line  48)
6053 * ulc_sprintf:                           unistdio.h.          (line  45)
6054 * ulc_vasnprintf:                        unistdio.h.          (line  65)
6055 * ulc_vasprintf:                         unistdio.h.          (line  62)
6056 * ulc_vfprintf:                          unistdio.h.          (line 232)
6057 * ulc_vsnprintf:                         unistdio.h.          (line  59)
6058 * ulc_vsprintf:                          unistdio.h.          (line  56)
6059 * ulc_width_linebreaks:                  unilbrk.h.           (line  71)
6060 * ulc_wordbreaks:                        Word breaks in a string.
6061                                                               (line  12)
6062 * Unicode:                               Unicode.             (line   6)
6063 * Unicode character, bidirectional category: Bidirectional category.
6064                                                               (line   6)
6065 * Unicode character, block:              Blocks.              (line  24)
6066 * Unicode character, canonical combining class: Canonical combining class.
6067                                                               (line   6)
6068 * Unicode character, case mappings:      Case mappings of characters.
6069                                                               (line   6)
6070 * Unicode character, classification:     General category.    (line   6)
6071 * Unicode character, classification like in C: Classifications like in ISO C.
6072                                                               (line   6)
6073 * Unicode character, general category:   General category.    (line   6)
6074 * Unicode character, mirroring:          Mirrored character.  (line   6)
6075 * Unicode character, name:               uniname.h.           (line   6)
6076 * Unicode character, properties:         Properties.          (line   6)
6077 * Unicode character, script:             Scripts.             (line  17)
6078 * Unicode character, validity in C identifiers: ISO C and Java syntax.
6079                                                               (line  38)
6080 * Unicode character, validity in Java identifiers: ISO C and Java syntax.
6081                                                               (line  42)
6082 * Unicode character, value <1>:          Numeric value.       (line   6)
6083 * Unicode character, value <2>:          Digit value.         (line   6)
6084 * Unicode character, value:              Decimal digit value. (line   6)
6085 * Unicode character, width:              uniwidth.h.          (line  22)
6086 * unicode_character_name:                uniname.h.           (line  19)
6087 * unicode_name_character:                uniname.h.           (line  25)
6088 * uninorm_decomposing_form:              Normalization of strings.
6089                                                               (line  40)
6090 * uninorm_filter_create:                 Normalization of streams.
6091                                                               (line  19)
6092 * uninorm_filter_flush:                  Normalization of streams.
6093                                                               (line  33)
6094 * uninorm_filter_free:                   Normalization of streams.
6095                                                               (line  43)
6096 * uninorm_filter_write:                  Normalization of streams.
6097                                                               (line  29)
6098 * uninorm_is_compat_decomposing:         Normalization of strings.
6099                                                               (line  32)
6100 * uninorm_is_composing:                  Normalization of strings.
6101                                                               (line  36)
6102 * uninorm_t:                             Normalization of strings.
6103                                                               (line  10)
6104 * uppercasing:                           Case mappings of strings.
6105                                                               (line   6)
6106 * use cases:                             Introduction.        (line  44)
6107 * UTF-16:                                Unicode.             (line  14)
6108 * UTF-16, strings:                       Unicode strings.     (line   6)
6109 * UTF-32:                                Unicode.             (line  14)
6110 * UTF-32, strings:                       Unicode strings.     (line   6)
6111 * UTF-8:                                 Unicode.             (line  14)
6112 * UTF-8, strings:                        Unicode strings.     (line   6)
6113 * validity:                              Elementary string checks.
6114                                                               (line   6)
6115 * value, of libunistring:                Introduction.        (line  44)
6116 * value, of Unicode character <1>:       Numeric value.       (line   6)
6117 * value, of Unicode character <2>:       Digit value.         (line   6)
6118 * value, of Unicode character:           Decimal digit value. (line   6)
6119 * verification:                          Elementary string checks.
6120                                                               (line   6)
6121 * wchar_t, type:                         The wchar_t mess.    (line   6)
6122 * width:                                 uniwidth.h.          (line   6)
6123 * word breaks:                           uniwbrk.h.           (line   6)
6124 * wrapping:                              unilbrk.h.           (line   6)
6125
6126
6127 \1f
6128 Tag Table:
6129 Node: Top\7f270
6130 Node: Introduction\7f3239
6131 Node: Unicode\7f5236
6132 Node: Unicode and i18n\7f7116
6133 Node: Locale encodings\7f8579
6134 Node: In-memory representation\7f10787
6135 Node: char * strings\7f11896
6136 Node: The wchar_t mess\7f17153
6137 Node: Unicode strings\7f19357
6138 Node: Conventions\7f20508
6139 Node: unitypes.h\7f22708
6140 Node: unistr.h\7f23280
6141 Node: Elementary string checks\7f23837
6142 Node: Elementary string conversions\7f24459
6143 Node: Elementary string functions\7f25761
6144 Node: Elementary string functions with memory allocation\7f32732
6145 Node: Elementary string functions on NUL terminated strings\7f33354
6146 Node: uniconv.h\7f45258
6147 Node: unistdio.h\7f52969
6148 Node: uniname.h\7f61172
6149 Node: unictype.h\7f62505
6150 Node: General category\7f63414
6151 Node: Object oriented API\7f64457
6152 Node: Bit mask API\7f72919
6153 Node: Canonical combining class\7f75173
6154 Node: Bidirectional category\7f78387
6155 Node: Decimal digit value\7f81444
6156 Node: Digit value\7f82005
6157 Node: Numeric value\7f82566
6158 Node: Mirrored character\7f83457
6159 Node: Properties\7f84130
6160 Node: Properties as objects\7f84821
6161 Node: Properties as functions\7f91199
6162 Node: Scripts\7f96750
6163 Node: Blocks\7f98136
6164 Node: ISO C and Java syntax\7f99459
6165 Node: Classifications like in ISO C\7f101169
6166 Node: uniwidth.h\7f103873
6167 Node: uniwbrk.h\7f105910
6168 Node: Word breaks in a string\7f106437
6169 Node: Word break property\7f107488
6170 Node: unilbrk.h\7f108584
6171 Node: uninorm.h\7f112755
6172 Node: Decomposition of characters\7f113387
6173 Node: Composition of characters\7f116763
6174 Node: Normalization of strings\7f117472
6175 Node: Normalizing comparisons\7f119534
6176 Node: Normalization of streams\7f121890
6177 Node: unicase.h\7f123978
6178 Node: Case mappings of characters\7f124663
6179 Node: Case mappings of strings\7f126710
6180 Node: Case mappings of substrings\7f130043
6181 Node: Case insensitive comparison\7f136973
6182 Node: Case detection\7f142324
6183 Node: uniregex.h\7f145592
6184 Node: Using the library\7f145815
6185 Node: Installation\7f146226
6186 Node: Compiler options\7f146699
6187 Node: Include files\7f148258
6188 Node: Autoconf macro\7f149482
6189 Node: Reporting problems\7f151040
6190 Node: More functionality\7f151837
6191 Node: Licenses\7f152280
6192 Node: GNU GPL\7f153915
6193 Node: GNU LGPL\7f191460
6194 Node: GNU FDL\7f199906
6195 Node: Index\7f225031
6196 \1f
6197 End Tag Table
6198
6199 \1f
6200 Local Variables:
6201 coding: utf-8
6202 End: