It also serves as a reference for the free Translation Project.
@copying
-Copyright (C) 1995-1998, 2001-2012 Free Software Foundation, Inc.
+Copyright (C) 1995-1998, 2001-2015 Free Software Foundation, Inc.
This manual is free documentation. It is dually licensed under the
GNU FDL and the GNU GPL. This means that you can redistribute this
* Setting the POSIX Locale:: How to Specify the Locale According to POSIX
* Installing Localizations:: How to Install Additional Translations
-Setting the POSIX Locale
+Setting the Locale through Environment Variables
* Locale Names:: How a Locale Specification Looks Like
-* Locale Environment Variables:: Which Environment Variable Specfies What
+* Locale Environment Variables:: Which Environment Variable Specifies What
* The LANGUAGE variable:: How to Specify a Priority List of Languages
Preparing Program Sources
* gettextize Invocation:: Invoking the @code{gettextize} Program
* Adjusting Files:: Files You Must Create or Alter
* autoconf macros:: Autoconf macros for use in @file{configure.ac}
-* CVS Issues:: Integrating with CVS
+* Version Control Issues::
* Release Management:: Creating a Distribution Tarball
Files You Must Create or Alter
* AM_GNU_GETTEXT_NEED:: AM_GNU_GETTEXT_NEED in @file{gettext.m4}
* AM_GNU_GETTEXT_INTL_SUBDIR:: AM_GNU_GETTEXT_INTL_SUBDIR in @file{intldir.m4}
* AM_PO_SUBDIRS:: AM_PO_SUBDIRS in @file{po.m4}
+* AM_XGETTEXT_OPTION:: AM_XGETTEXT_OPTION in @file{po.m4}
* AM_ICONV:: AM_ICONV in @file{iconv.m4}
-Integrating with CVS
+Integrating with Version Control Systems
-* Distributed CVS:: Avoiding version mismatch in distributed development
-* Files under CVS:: Files to put under CVS version control
+* Distributed Development:: Avoiding version mismatch in distributed development
+* Files under Version Control:: Files to put under version control
+* Translations under Version Control:: Put PO Files under Version Control
* autopoint Invocation:: Invoking the @code{autopoint} Program
Other Programming Languages
* GCC-source:: GNU Compiler Collection sources
* Lua:: Lua
* JavaScript:: JavaScript
+* Vala:: Vala
sh - Shell Script
* POT:: POT - Portable Object Template
* RST:: Resource String Table
* Glade:: Glade - GNOME user interface description
+* GSettings:: GSettings - GNOME user configuration schema
+* AppData:: AppData - freedesktop.org application description
+* Preparing ITS Rules:: Preparing Rules for XML Internationalization
Concluding Remarks
@menu
* Locale Names:: How a Locale Specification Looks Like
-* Locale Environment Variables:: Which Environment Variable Specfies What
+* Locale Environment Variables:: Which Environment Variable Specifies What
* The LANGUAGE variable:: How to Specify a Priority List of Languages
@end menu
This will most probably lead to problems because now the length of the
string is regarded as the address.
-To prevent errors at runtime caused by translations the @code{msgfmt}
+To prevent errors at runtime caused by translations, the @code{msgfmt}
tool can check statically whether the arguments in the original and the
translation string match in type and number. If this is not the case
and the @samp{-c} option has been passed to @code{msgfmt}, @code{msgfmt}
-will give an error and refuse to produce a MO file. Thus consequent
+will give an error and refuse to produce a MO file. Thus consistent
use of @samp{msgfmt -c} will catch the error, so that it cannot cause
-cause problems at runtime.
+problems at runtime.
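
The kind of static check described above can be sketched in a few lines
of Python.  This is only an illustration under simplifying assumptions,
not the algorithm @code{msgfmt} actually uses: the regular expression
covers common @code{printf} directives only, and the helper names
@code{argument_types} and @code{formats_match} are hypothetical.

```python
import re

# Simplified pattern for printf directives: optional "N$" position, flags,
# width, precision, length modifier, and a conversion character.
_DIRECTIVE = re.compile(
    r"%(?:(\d+)\$)?[-+ #0']*(?:\*|\d+)?(?:\.(?:\*|\d+)?)?"
    r"(hh|h|ll|l|L|j|z|t)?([diouxXeEfgGaAcspn%])"
)


def argument_types(fmt):
    """Map each argument index to its length modifier plus conversion."""
    types = {}
    next_index = 0
    for match in _DIRECTIVE.finditer(fmt):
        position, length, conversion = match.groups()
        if conversion == "%":
            continue  # "%%" is a literal percent sign and consumes no argument
        if position is not None:
            index = int(position) - 1  # "%2$d" refers to the second argument
        else:
            index = next_index
            next_index += 1
        types[index] = (length or "") + conversion
    return types


def formats_match(msgid, msgstr):
    """Return True if both strings expect the same arguments by type and number."""
    return argument_types(msgid) == argument_types(msgstr)


print(formats_match("%s has %d files", "%2$d Dateien in %1$s"))
print(formats_match("%d files", "%s Dateien"))
```

A translation that reorders arguments with positional directives passes
the check, while one that changes a conversion from @code{%d} to
@code{%s} is rejected.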
@noindent
If the word order in the above German translation would be correct one
@noindent
The routines in @code{msgfmt} know about this special notation.
-Because not all strings in a program must be format strings it is not
+Because not all strings in a program will be format strings, it is not
useful for @code{msgfmt} to test all the strings in the @file{.po} file.
This might cause problems because the string might contain what looks
like a format specifier, but the string is not used in @code{printf}.
-Therefore the @code{xgettext} adds a special tag to those messages it
+Therefore @code{xgettext} adds a special tag to those messages it
thinks might be a format string. There is no absolute rule for this,
only a heuristic. In the @file{.po} file the entry is marked using the
@code{c-format} flag in the @code{#,} comment line (@pxref{PO Files}).
@{
static const char *messages[] = @{
- gettext_noop ("some very meaningful message",
+ gettext_noop ("some very meaningful message"),
gettext_noop ("and another one")
@};
const char *string;
* Customizing less:: Customizing @code{less} for viewing PO files
@end menu
-@node The --color option, The TERM variable, , Colorizing
+@node The --color option, The TERM variable, Colorizing, Colorizing
@subsection The @code{--color} option
@opindex --color@r{, @code{msgcat} option}
are listed. But this does not necessarily mean the information can be
generalized for the whole family (as can be easily seen in the table
below).@footnote{Additions are welcome. Send appropriate information to
-@email{bug-gnu-gettext@@gnu.org} and @email{bug-glibc-manual@@gnu.org}.}
+@email{bug-gnu-gettext@@gnu.org} and @email{bug-glibc-manual@@gnu.org}.
+The Unicode CLDR Project (@uref{http://cldr.unicode.org}) provides a
+comprehensive set of plural forms in a different format. The
+@code{msginit} program has preliminary support for the format so you can
+use it as a baseline (@pxref{msginit Invocation}).}
@table @asis
@item Only one form:
Japanese, @c 122.1 million speakers
Vietnamese, @c 68.6 million speakers
Korean @c 66.3 million speakers
+@item Tai-Kadai family
+Thai @c 20.4 million speakers
@end table
@item Two forms, singular used for one only
Estonian @c 1.0 million speakers
@item Semitic family
Hebrew @c 5.3 million speakers
+@item Austronesian family
+Bahasa Indonesian @c 23.2 million speakers
@item Artificial
Esperanto @c 2 million speakers
@end table
@item Slavic family
Slovenian @c 1.9 million speakers
@end table
+
+@item Six forms, special cases for zero, one, two, all numbers ending in 03, 04, @dots{} 10, all numbers ending in 11 @dots{} 99, and others
+The header entry would look like this:
+
+@smallexample
+Plural-Forms: nplurals=6; \
+ plural=n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 \
+ : n%100>=11 ? 4 : 5;
+@end smallexample
+
+@noindent
+Languages with this property include:
+
+@table @asis
+@item Afroasiatic family
+Arabic @c 246.0 million speakers
+@end table
@end table
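
To see how the six-form expression above selects a form, it can be
transcribed into Python.  The following is a sketch for experimentation
only; the function name @code{arabic_plural_index} is hypothetical, and
the body is a direct translation of the @code{plural=} expression from
the header entry.

```python
def arabic_plural_index(n):
    """Return the plural form index (0..5) for a non-negative count n."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if n % 100 >= 11:
        return 4
    return 5


for count in (0, 1, 2, 3, 10, 11, 99, 100, 102, 103):
    print(count, arabic_plural_index(count))
```

Note that counts such as 100, 101 and 102 fall through to the last form,
because only the values 0, 1 and 2 themselves are special-cased.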
You might now ask, @code{ngettext} handles only numbers @var{n} of type
* gettextize Invocation:: Invoking the @code{gettextize} Program
* Adjusting Files:: Files You Must Create or Alter
* autoconf macros:: Autoconf macros for use in @file{configure.ac}
-* CVS Issues:: Integrating with CVS
+* Version Control Issues::
* Release Management:: Creating a Distribution Tarball
@end menu
So, here comes a list of files, each one followed by a description of
all alterations it needs. Many examples are taken out from the GNU
@code{gettext} @value{VERSION} distribution itself, or from the GNU
-@code{hello} distribution (@uref{http://www.franken.de/users/gnu/ke/hello}
-or @uref{http://www.gnu.franken.de/ke/hello/}) You may indeed
-refer to the source code of the GNU @code{gettext} and GNU @code{hello}
-packages, as they are intended to be good examples for using GNU
-gettext functionality.
+@code{hello} distribution (@uref{http://www.gnu.org/software/hello}).
+You may indeed refer to the source code of the GNU @code{gettext} and
+GNU @code{hello} packages, as they are intended to be good examples for
+using GNU gettext functionality.
@menu
* po/POTFILES.in:: @file{POTFILES.in} in @file{po/}
Do not install the @code{gettext.h} file in public locations. Every
package that needs it should contain a copy of it on its own.
-@node autoconf macros, CVS Issues, Adjusting Files, Maintainers
+@node autoconf macros, Version Control Issues, Adjusting Files, Maintainers
@section Autoconf macros for use in @file{configure.ac}
@cindex autoconf macros for @code{gettext}
the GNU gettext infrastructure that is used by the package.
The use of this macro is optional; only the @code{autopoint} program makes
-use of it (@pxref{CVS Issues}).
+use of it (@pxref{Version Control Issues}).
@node AM_GNU_GETTEXT_NEED, AM_GNU_GETTEXT_INTL_SUBDIR, AM_GNU_GETTEXT_VERSION, autoconf macros
@subsection AM_GNU_GETTEXT_NEED in @file{gettext.m4}
@file{iconv.m4} is distributed with the GNU gettext package because
@file{gettext.m4} relies on it.
-@node CVS Issues, Release Management, autoconf macros, Maintainers
-@section Integrating with CVS
+@node Version Control Issues, Release Management, autoconf macros, Maintainers
+@section Integrating with Version Control Systems
-Many projects use CVS for distributed development, version control and
-source backup. This section gives some advice how to manage the uses
-of @code{cvs}, @code{gettextize}, @code{autopoint} and @code{autoconf}.
+Many projects use version control systems for distributed development
+and source backup. This section gives some advice on how to manage the
+use of @code{gettextize}, @code{autopoint} and @code{autoconf} with
+version-controlled files.
@menu
-* Distributed CVS:: Avoiding version mismatch in distributed development
-* Files under CVS:: Files to put under CVS version control
+* Distributed Development:: Avoiding version mismatch in distributed development
+* Files under Version Control:: Files to put under version control
+* Translations under Version Control:: Put PO Files under Version Control
* autopoint Invocation:: Invoking the @code{autopoint} Program
@end menu
-@node Distributed CVS, Files under CVS, CVS Issues, CVS Issues
+@node Distributed Development, Files under Version Control, Version Control Issues, Version Control Issues
@subsection Avoiding version mismatch in distributed development
-In a project development with multiple developers, using CVS, there
-should be a single developer who occasionally - when there is desire to
-upgrade to a new @code{gettext} version - runs @code{gettextize} and
-performs the changes listed in @ref{Adjusting Files}, and then commits
-his changes to the CVS.
+In a project with multiple developers, there should be a single
+developer who occasionally (when there is a desire to upgrade to a new
+@code{gettext} version) runs @code{gettextize}, performs the changes
+listed in @ref{Adjusting Files}, and then commits his changes to the
+repository.
It is highly recommended that all developers on a project use the same
version of GNU @code{gettext} in the package. In other words, if a
developer runs @code{gettextize}, he should go the whole way, make the
-necessary remaining changes and commit his changes to the CVS.
+necessary remaining changes and commit his changes to the repository.
Otherwise the following damages will likely occur:
@itemize @bullet
undiscovered due to this constellation.
@end itemize
-@node Files under CVS, autopoint Invocation, Distributed CVS, CVS Issues
-@subsection Files to put under CVS version control
+@node Files under Version Control, Translations under Version Control, Distributed Development, Version Control Issues
+@subsection Files to put under version control
There are basically three ways to deal with generated files in the
-context of a CVS repository, such as @file{configure} generated from
-@file{configure.ac}, @code{@var{parser}.c} generated from
-@code{@var{parser}.y}, or @code{po/Makefile.in.in} autoinstalled by
-@code{gettextize} or @code{autopoint}.
+context of a version controlled repository, such as @file{configure}
+generated from @file{configure.ac}, @code{@var{parser}.c} generated
+from @code{@var{parser}.y}, or @code{po/Makefile.in.in} autoinstalled
+by @code{gettextize} or @code{autopoint}.
@enumerate
@item
@enumerate
@item
-The advantage is that anyone can check out the CVS at any moment and
+The advantage is that anyone can check out the source at any moment and
gets a working build. The drawbacks are: 1a. It requires some frequent
-"cvs commit" actions by the maintainers. 1b. The repository grows in size
+"push" actions by the maintainers. 1b. The repository grows in size
quite fast.
@item
-The advantage is that anyone can check out the CVS, and the usual
-"./configure; make" will work. The drawbacks are: 2a. The one who
-checks out the repository needs tools like GNU @code{automake},
-GNU @code{autoconf}, GNU @code{m4} installed in his PATH; sometimes
-he even needs particular versions of them. 2b. When a release is made
+The advantage is that anyone can check out the source, and the usual
+"./configure; make" will work. The drawbacks are: 2a. The one who
+checks out the repository needs tools like GNU @code{automake}, GNU
+@code{autoconf}, GNU @code{m4} installed in his PATH; sometimes he
+even needs particular versions of them. 2b. When a release is made
and a commit is made on the generated files, the other developers get
-conflicts on the generated files after doing "cvs update". Although
-these conflicts are easy to resolve, they are annoying.
+conflicts on the generated files when merging the local work back to
+the repository. Although these conflicts are easy to resolve, they
+are annoying.
@item
The advantage is less work for the maintainers. The drawback is that
-anyone who checks out the CVS not only needs tools like GNU @code{automake},
-GNU @code{autoconf}, GNU @code{m4} installed in his PATH, but also that
-he needs to perform a package specific pre-build step before being able
-to "./configure; make".
+anyone who checks out the source not only needs tools like GNU
+@code{automake}, GNU @code{autoconf}, GNU @code{m4} installed in his
+PATH, but also that he needs to perform a package specific pre-build
+step before being able to "./configure; make".
@end enumerate
For the first and second approach, all files modified or brought in
by the occasional @code{gettextize} invocation and update should be
-committed into the CVS.
+committed into the repository.
-For the third approach, the maintainer can omit from the CVS repository
+For the third approach, the maintainer can omit from the repository
all the files that @code{gettextize} mentions as "copy". Instead, he
adds to the @file{configure.ac} or @file{configure.in} a line of the
form
@example
-AM_GNU_GETTEXT_VERSION(@value{VERSION})
+AM_GNU_GETTEXT_VERSION(@value{ARCHIVE-VERSION})
@end example
@noindent
and adds to the package's pre-build script an invocation of
-@samp{autopoint}. For everyone who checks out the CVS, this
+@samp{autopoint}. For everyone who checks out the source, this
@code{autopoint} invocation will copy into the right place the
-@code{gettext} infrastructure files that have been omitted from the CVS.
+@code{gettext} infrastructure files that have been omitted from the repository.
The version number used as argument to @code{AM_GNU_GETTEXT_VERSION} is
the version of the @code{gettext} infrastructure that the package wants
use the CVS will henceforth need to have GNU @code{gettext} 0.12.1 or newer
installed.
-@node autopoint Invocation, , Files under CVS, CVS Issues
+@node Translations under Version Control, autopoint Invocation, Files under Version Control, Version Control Issues
+@subsection Put PO Files under Version Control
+
+Since translations are valuable assets, just like the source code, it
+makes sense to put them under version control. The GNU gettext
+infrastructure supports two ways to deal with translations in the
+context of a version controlled repository.
+
+@enumerate
+@item
+Both the POT file and the PO files are committed into the repository.
+
+@item
+Only the PO files are committed into the repository.
+
+@end enumerate
+
+If a POT file is absent when building, it will be generated by
+scanning the source files with @code{xgettext}, and then the PO files
+are regenerated as a dependency. On the other hand, some maintainers
+want to keep the POT file unchanged during the development phase. So,
+even if a POT file is present and older than the source code, it won't
+be updated automatically. You can manually update it with @code{make
+$(DOMAIN).pot-update}, and commit it at a certain point.
+
+Special advice for particular version control systems:
+
+@itemize @bullet
+@item
+Recent version control systems, Git for instance, do not preserve file
+timestamps. In that case, PO files can be accidentally updated even if
+the POT file has not been updated. To prevent this, you can set the
+@samp{PO_DEPENDS_ON_POT} variable to @code{no} in the @file{Makevars}
+file and run @code{make update-po} manually.
+
+@item
+Location comments such as @code{#: lib/error.c:116} are sometimes
+annoying, since these comments are volatile and may introduce unwanted
+changes to the working copy when building. To mitigate this, you can
+decide to omit those comments from the PO files in the repository.
+
+This is possible with the @code{--no-location} option of the
+@code{msgmerge} command@footnote{You can also use it through the
+@samp{MSGMERGE_OPTIONS} variable in @file{Makevars}.}. The drawback is
+that, if the location information is needed, translators have to
+recover the location comments by running @code{msgmerge} again.
+
+@end itemize
+
+@node autopoint Invocation, , Translations under Version Control, Version Control Issues
@subsection Invoking the @code{autopoint} Program
@include autopoint.texi
-@node Release Management, , CVS Issues, Maintainers
+@node Release Management, , Version Control Issues, Maintainers
@section Creating a Distribution Tarball
@cindex release
@cindex setting up @code{gettext} at build time
By default, packages fully using GNU @code{gettext}, internally,
-are installed in such a way that they to allow translation of
+are installed in such a way as to allow translation of
messages. At @emph{configuration} time, those packages should
automatically detect whether the underlying host system already provides
the GNU @code{gettext} functions. If not,
* qt-format:: Qt Format Strings
* qt-plural-format:: Qt Plural Format Strings
* kde-format:: KDE Format Strings
+* kde-kuit-format:: KUIT Format Strings
* boost-format:: Boost Format Strings
* lua-format:: Lua Format Strings
* javascript-format:: JavaScript Format Strings
Python @code{%} format strings are described in
@w{Python Library reference} /
-@w{2. Built-in Types, Exceptions and Functions} /
-@w{2.2. Built-in Types} /
-@w{2.2.6. Sequence Types} /
-@w{2.2.6.2. String Formatting Operations}.
-@uref{http://www.python.org/doc/2.2.1/lib/typesseq-strings.html}.
+@w{5. Built-in Types} /
+@w{5.6. Sequence Types} /
+@w{5.6.2. String Formatting Operations}.
+@uref{http://docs.python.org/2/library/stdtypes.html#string-formatting-operations}.
Python brace format strings are described in @w{PEP 3101 -- Advanced
String Formatting}, @uref{http://www.python.org/dev/peps/pep-3101/}.
@uref{file:/usr/lib/qt-4.3.0/doc/html/qobject.html}.
In summary, the only allowed directive is @samp{%n}.
-@node kde-format, boost-format, qt-plural-format, Translators for other Languages
+@node kde-format, kde-kuit-format, qt-plural-format, Translators for other Languages
@subsection KDE Format Strings
KDE 4 format strings are defined as follows:
If a @samp{%n} occurs in a format strings, all of @samp{%1}, ..., @samp{%(n-1)}
must occur as well, except possibly one of them.
-@node boost-format, lua-format, kde-format, Translators for other Languages
+@node kde-kuit-format, boost-format, kde-format, Translators for other Languages
+@subsection KUIT Format Strings
+
+KUIT (KDE User Interface Text) is compatible with KDE 4 format strings,
+while it also allows programmers to add semantic information to a format
+string, through XML markup tags. For example, if the first format
+directive in a string is a filename, programmers could indicate that
+with a @samp{filename} tag, like @samp{<filename>%1</filename>}.
+
+KUIT format strings are described in
+@uref{http://api.kde.org/frameworks-api/frameworks5-apidocs/ki18n/html/prg_guide.html#kuit_markup}.
+
+@node boost-format, lua-format, kde-kuit-format, Translators for other Languages
@subsection Boost Format Strings
Boost format strings are described in the documentation of the
* GCC-source:: GNU Compiler Collection sources
* Lua:: Lua
* JavaScript:: JavaScript
+* Vala:: Vala
@end menu
@node C, sh, List of Programming Languages, List of Programming Languages
@code{"abc"}
@item gettext shorthand
-@code{(_ "abc")}
+@code{(_ "abc")}, @code{_"abc"} (GIMP script-fu extension)
@item gettext/ngettext functions
@code{gettext}, @code{ngettext}
gawk 3.1 or newer
@item File extension
-@code{awk}
+@code{awk}, @code{gawk}, @code{twjr}.
+The file extension @code{twjr} is used by TexiWeb Jr
+(@uref{https://github.com/arnoldrobbins/texiwebjr}).
@item String syntax
@code{"abc"}
* Perl Pitfalls:: Bugs, Pitfalls, and Things That Do Not Work
@end menu
-@node General Problems, Default Keywords, , Perl
+@node General Problems, Default Keywords, Perl, Perl
@subsubsection General Problems Parsing Perl Code
It is often heard that only Perl can parse Perl. This is not true.
---
@end table
-@node JavaScript
+@node JavaScript, Vala, Lua, List of Programming Languages
@subsection JavaScript
@table @asis
---
@end table
+@node Vala, , JavaScript, List of Programming Languages
+@subsection Vala
+
+@table @asis
+@item RPMs
+vala
+
+@item File extension
+@code{vala}
+
+@item String syntax
+@itemize @bullet
+
+@item @code{"abc"}
+
+@item @code{"""abc"""}
+
+@end itemize
+
+@item gettext shorthand
+@code{_("abc")}
+
+@item gettext/ngettext functions
+@code{gettext}, @code{dgettext}, @code{dcgettext}, @code{ngettext},
+@code{dngettext}, @code{dpgettext}, @code{dpgettext2}
+
+@item textdomain
+@code{textdomain} function, defined under the @code{Intl} namespace
+
+@item bindtextdomain
+@code{bindtextdomain} function, defined under the @code{Intl} namespace
+
+@item setlocale
+Programmer must call @code{Intl.setlocale (LocaleCategory.ALL, "")}
+
+@item Prerequisite
+---
+
+@item Use or emulate GNU gettext
+Use
+
+@item Extractor
+@code{xgettext}
+
+@item Formatting with positions
+Same as for the C language.
+
+@item Portability
+autoconf (gettext.m4) and #if ENABLE_NLS
+
+@item po-mode marking
+yes
+@end table
+
@c This is the template for new languages.
@ignore
* POT:: POT - Portable Object Template
* RST:: Resource String Table
* Glade:: Glade - GNOME user interface description
+* GSettings:: GSettings - GNOME user configuration schema
+* AppData:: AppData - freedesktop.org application description
+* Preparing ITS Rules:: Preparing Rules for XML Internationalization
@end menu
@node POT, RST, List of Data Formats, List of Data Formats
@code{xgettext}, @code{rstconv}
@end table
-@node Glade, , RST, List of Data Formats
+@node Glade, GSettings, RST, List of Data Formats
@subsection Glade - GNOME user interface description
@table @asis
@code{xgettext}, @code{libglade-xgettext}, @code{xml-i18n-extract}, @code{intltool-extract}
@end table
+@node GSettings, AppData, Glade, List of Data Formats
+@subsection GSettings - GNOME user configuration schema
+
+@table @asis
+@item RPMs
+glib2
+
+@item File extension
+@code{gschema.xml}
+
+@item Extractor
+@code{xgettext}, @code{intltool-extract}
+@end table
+
+@node AppData, Preparing ITS Rules, GSettings, List of Data Formats
+@subsection AppData - freedesktop.org application description
+
+@table @asis
+@item RPMs
+appdata-tools, appstream, libappstream-glib, libappstream-glib-builder
+
+@item File extension
+@code{appdata.xml}
+
+@item Extractor
+@code{xgettext}, @code{intltool-extract}, @code{itstool}
+@end table
+
+
+@node Preparing ITS Rules, , AppData, List of Data Formats
+@subsection Preparing Rules for XML Internationalization
+@cindex preparing rules for XML translation
+
+Marking translatable strings in an XML file is done through a separate
+"rule" file, making use of the Internationalization Tag Set standard
+(ITS, @uref{http://www.w3.org/TR/its20/}). The currently supported ITS
+data categories are: @samp{Translate}, @samp{Localization Note},
+@samp{Elements Within Text}, and @samp{Preserve Space}. In addition to
+them, @code{xgettext} also recognizes the following extended data
+categories:
+
+@table @samp
+@item Context
+
+This data category associates a @code{msgctxt} with the extracted text. In
+the global rule, the @code{contextRule} element contains the following:
+
+@itemize
+@item
+A required @code{selector} attribute. It contains an absolute selector
+that selects the nodes to which this rule applies.
+
+@item
+A required @code{contextPointer} attribute that contains a relative
+selector pointing to a node that holds the @code{msgctxt} value.
+
+@item
+An optional @code{textPointer} attribute that contains a relative
+selector pointing to a node that holds the @code{msgid} value.
+@end itemize
+
+@item Escape Special Characters
+
+This data category indicates whether the special XML characters
+(@code{<}, @code{>}, @code{&}, @code{"}) are escaped with entity
+references. In the global rule, the @code{escapeRule} element contains
+the following:
+
+@itemize
+@item
+A required @code{selector} attribute. It contains an absolute selector
+that selects the nodes to which this rule applies.
+
+@item
+A required @code{escape} attribute with the value @code{yes} or @code{no}.
+@end itemize
+
+@item Extended Preserve Space
+
+This data category extends the standard @samp{Preserve Space} data
+category with the additional value @samp{trim}. This value means that
+leading and trailing whitespace of the content is removed, but
+whitespace in the middle is not normalized. In the global rule, the
+@code{preserveSpaceRule} element contains the following:
+
+@itemize
+@item
+A required @code{selector} attribute. It contains an absolute selector
+that selects the nodes to which this rule applies.
+
+@item
+A required @code{space} attribute with the value @code{default},
+@code{preserve}, or @code{trim}.
+@end itemize
+
+@end table
+
+All of these extended data categories can only be expressed with global
+rules, and the rule elements must be in the
+@code{https://www.gnu.org/s/gettext/ns/its/extensions/1.0} namespace.
+
+Given the following XML document in a file @file{messages.xml}:
+
+@example
+<?xml version="1.0"?>
+<messages>
+ <message>
+ <p>A translatable string</p>
+ </message>
+ <message>
+ <p translatable="no">A non-translatable string</p>
+ </message>
+</messages>
+@end example
+
+To extract the first text content ("A translatable string"), but not the
+second ("A non-translatable string"), the following ITS rules can be used:
+
+@example
+<?xml version="1.0"?>
+<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
+ <its:translateRule selector="/messages" translate="no"/>
+ <its:translateRule selector="//message/p" translate="yes"/>
+
+ <!-- If 'p' has an attribute 'translatable' with the value 'no', then
+ the content is not translatable. -->
+ <its:translateRule selector="//message/p[@@translatable = 'no']"
+ translate="no"/>
+</its:rules>
+@end example
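
The effect of these three rules can be approximated with a short Python
sketch.  It is an illustration under stated assumptions, not the way
@code{xgettext} evaluates ITS rules: the selectors are rewritten by hand
into the XPath subset supported by @code{xml.etree.ElementTree}, later
rules override earlier ones, and the @code{translate} value is inherited
by child elements.  The names @code{RULES} and
@code{translatable_texts} are hypothetical.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<messages>
  <message>
    <p>A translatable string</p>
  </message>
  <message>
    <p translatable="no">A non-translatable string</p>
  </message>
</messages>"""

# (selector, translate) pairs mirroring the three translateRule elements,
# with selectors expressed relative to the root element.
RULES = [
    (".", False),                                  # /messages
    (".//message/p", True),                        # //message/p
    (".//message/p[@translatable='no']", False),   # //message/p[@translatable = 'no']
]


def translatable_texts(xml_text, rules=RULES):
    """Return the text of every element whose effective translate value is yes."""
    root = ET.fromstring(xml_text)
    explicit = {}
    for selector, translate in rules:
        for node in root.findall(selector):
            explicit[node] = translate  # a later rule overrides an earlier one

    result = []

    def walk(node, inherited):
        value = explicit.get(node, inherited)
        text = (node.text or "").strip()
        if value and text:
            result.append(text)
        for child in node:
            walk(child, value)

    walk(root, True)  # elements are translatable by default
    return result


print(translatable_texts(DOCUMENT))
```

Only the first @code{p} element is reported, because the third rule
overrides the second one for the element that carries
@code{translatable="no"}.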
+
+@code{xgettext} needs another file, called a "locating rule" file, to
+associate an ITS rule file with an XML file. If the above ITS file is
+saved as @file{messages.its}, the locating rule would look like:
+
+@example
+<?xml version="1.0"?>
+<locatingRules>
+ <locatingRule name="Messages" pattern="*.xml">
+ <documentRule localName="messages" target="messages.its"/>
+ </locatingRule>
+ <locatingRule name="Messages" pattern="*.msg" target="messages.its"/>
+</locatingRules>
+@end example
+
+The @code{locatingRule} element must have a @code{pattern} attribute,
+which denotes either a literal file name or a wildcard pattern of the
+XML file. The @code{locatingRule} element can have a child
+@code{documentRule} element, which adds checks on the content of the XML
+file.
+
+The first rule matches any file with the @file{.xml} file extension, but
+it only applies to XML files whose root element is @samp{<messages>}.
+
+The second rule indicates that the same ITS rule file is also
+applicable to any file with the @file{.msg} file extension. The
+optional @code{name} attribute of @code{locatingRule} allows choosing
+rules by name, typically with @code{xgettext}'s @code{-L} option.
+
+The associated ITS rule file is indicated by the @code{target} attribute
+of @code{locatingRule} or @code{documentRule}. If it is specified in a
+@code{documentRule} element, the parent @code{locatingRule} shouldn't
+have the @code{target} attribute.
+
+Locating rule files must have the @file{.loc} file extension. Both ITS
+rule files and locating rule files must be installed in the
+@file{$prefix/share/gettext/its} directory. Once those files are
+properly installed, @code{xgettext} can extract translatable strings
+from the matching XML files.
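
The lookup described above can be sketched in Python as follows.  This
is a simplified model, not the implementation in @code{xgettext}: it
matches the file name with @code{fnmatch}, takes the root element name
as a parameter instead of parsing the XML file, and ignores the
@code{name} attribute.  The function name @code{find_its_target} is
hypothetical.

```python
import fnmatch
import xml.etree.ElementTree as ET

LOCATING_RULES = """<?xml version="1.0"?>
<locatingRules>
  <locatingRule name="Messages" pattern="*.xml">
    <documentRule localName="messages" target="messages.its"/>
  </locatingRule>
  <locatingRule name="Messages" pattern="*.msg" target="messages.its"/>
</locatingRules>"""


def find_its_target(file_name, root_local_name, rules_xml=LOCATING_RULES):
    """Return the ITS rule file for file_name, or None if no rule applies."""
    for rule in ET.fromstring(rules_xml).findall("locatingRule"):
        if not fnmatch.fnmatchcase(file_name, rule.get("pattern")):
            continue
        # A target on locatingRule applies to every file matching the pattern.
        if rule.get("target") is not None:
            return rule.get("target")
        # Otherwise each documentRule adds a check on the root element.
        for document_rule in rule.findall("documentRule"):
            if document_rule.get("localName") == root_local_name:
                return document_rule.get("target")
    return None


print(find_its_target("messages.xml", "messages"))
print(find_its_target("other.xml", "html"))
print(find_its_target("strings.msg", "anything"))
```

The first and third calls resolve to @file{messages.its}; the second
finds no rule, because the root element is not @samp{<messages>}.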
+
+@subsubsection Two Use-cases of Translated Strings in XML
+
+For XML, there are two use-cases of translated strings. One is the case
+where the translated strings are directly consumed by programs, and the
+other is the case where the translated strings are merged back to the
+original XML document. In the former case, special characters in the
+extracted strings shouldn't be escaped, while they should in the latter
+case. To control whether to escape special characters, the @samp{Escape
+Special Characters} data category can be used.
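
The difference between the two use-cases can be illustrated with a
small Python sketch.  It relies on the standard
@code{xml.sax.saxutils.escape} function and is not how @code{xgettext}
implements the data category; the helper name @code{escape_for_xml} is
hypothetical.

```python
from xml.sax.saxutils import escape


def escape_for_xml(text):
    """Escape <, >, & and " so that text can be merged back into XML."""
    return escape(text, {'"': "&quot;"})


raw = 'Press "Save" if a < b & c > d'

# Form suitable for a program that consumes the string directly.
print(raw)
# Form suitable for merging back into an XML document.
print(escape_for_xml(raw))
```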
+
+To merge the translations, the @samp{msgfmt} program can be used with
+the option @code{--xml}. @xref{msgfmt Invocation}, for more details
+about how one calls the @samp{msgfmt} program. @samp{msgfmt}'s
+@code{--xml} option doesn't perform character escaping, so translated
+strings can have arbitrary XML constructs, such as elements for markup.
+
@c This is the template for new data formats.
@ignore
@end menu
@page
+@node GNU GPL, GNU LGPL, Licenses, Licenses
+@appendixsec GNU GENERAL PUBLIC LICENSE
+@cindex GPL, GNU General Public License
+@cindex License, GNU GPL
@include gpl.texi
@page
+@node GNU LGPL, GNU FDL, GNU GPL, Licenses
+@appendixsec GNU LESSER GENERAL PUBLIC LICENSE
+@cindex LGPL, GNU Lesser General Public License
+@cindex License, GNU LGPL
@include lgpl.texi
@page
+@node GNU FDL, , GNU LGPL, Licenses
+@appendixsec GNU Free Documentation License
+@cindex FDL, GNU Free Documentation License
+@cindex License, GNU FDL
@include fdl.texi
@node Program Index, Option Index, Licenses, Top