<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="AUTHOR" content="bkoz@redhat.com (Benjamin Kosnik)" />
- <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
+ <meta name="KEYWORDS" content="HOWTO, libstdc++, locale name LC_ALL" />
<meta name="DESCRIPTION" content="Notes on the locale implementation." />
<title>Notes on the locale implementation.</title>
<link rel="StyleSheet" href="../lib3styles.css" />
Notes on the locale implementation.
</h1>
<em>
-prepared by Benjamin Kosnik (bkoz@redhat.com) on August 8, 2001
+prepared by Benjamin Kosnik (bkoz@redhat.com) on October 14, 2002
</em>
<h2>
-1. Abstract Describes the basic locale object, including nested
-classes id, facet, and the reference-counted implementation object,
-class _Impl.
+1. Abstract
</h2>
<p>
+Describes the basic locale object, including nested
+classes id, facet, and the reference-counted implementation object,
+class _Impl.
</p>
<h2>
2. What the standard says
</h2>
-See Chapter 22 of the standard.
-
-
-<h2>
-3. Problems with "C" locales : global locales, termination.
-</h2>
-
-<p>
-The major problem is fitting an object-orientated and non-global locale
-design ontop of POSIX and other relevant stanards, which include the
-Single Unix (nee X/Open.)
-
-Because POSIX falls down so completely, portibility is an issue.
-</p>
-
-<h2>
-4. Design
-</h2>
-Class locale in non-templatized and has three distinct types nested
+Class locale is non-templatized and has two distinct types nested
inside of it:
+<blockquote>
+<em>
class facet
22.1.1.1.2 Class locale::facet
+</em>
+</blockquote>
+<p>
Facets actually implement locale functionality. For instance, a facet
called numpunct is the data object that can be used to query for the
thousands separator in the German locale.
+</p>
Strictly speaking, a facet is defined as a class either:
- - containing
-public:
- static locale::id id;
-
-- or derived from another facet
-
-The only other thing of interest in this class is the memory
-management of facets. Each constructor of a facet class takes a
-std::size_t __refs argument: if __refs == 0, the facet is deleted when
-the locale containing it is destroyed. If __refs == 1, the facet is
-not destroyed, even when it is no longer referenced.
+<ul>
+ <li>containing the following public data member:
+ <p>
+ <code>static locale::id id;</code>
+ </p>
+ </li>
+
+ <li>derived from another facet:
+ <p>
+ <code> class gnu_codecvt: public std::ctype&lt;user-defined-type&gt;</code>
+ </p>
+ </li>
+</ul>
+<p>
+Of interest in this class are the memory management options explicitly
+specified as an argument to facet's constructor. Each constructor of a
+facet class takes a std::size_t __refs argument: if __refs == 0, the
+facet is deleted when the locale containing it is destroyed. If __refs
+== 1, the facet is not destroyed, even when it is no longer
+referenced.
+</p>
+<blockquote>
+<em>
class id
+22.1.1.1.3 - Class locale::id
+</em>
+</blockquote>
+
+<p>
Provides an index for looking up specific facets.
+</p>
-class _Impl
-The internal representation of the std::locale object.
<h2>
-5. Relationship to traditional "C" locales.
+3. Interacting with "C" locales.
</h2>
+<p>
+Some help on determining the underlying support for locales on a system.
+Note that this is specific to Linux (and glibc-2.3.x).
+</p>
+
+<ul>
+ <li> <code>`locale -a`</code> displays available locales.
+<blockquote>
+<pre>
+af_ZA
+ar_AE
+ar_AE.utf8
+ar_BH
+ar_BH.utf8
+ar_DZ
+ar_DZ.utf8
+ar_EG
+ar_EG.utf8
+ar_IN
+ar_IQ
+ar_IQ.utf8
+ar_JO
+ar_JO.utf8
+ar_KW
+ar_KW.utf8
+ar_LB
+ar_LB.utf8
+ar_LY
+ar_LY.utf8
+ar_MA
+ar_MA.utf8
+ar_OM
+ar_OM.utf8
+ar_QA
+ar_QA.utf8
+ar_SA
+ar_SA.utf8
+ar_SD
+ar_SD.utf8
+ar_SY
+ar_SY.utf8
+ar_TN
+ar_TN.utf8
+ar_YE
+ar_YE.utf8
+be_BY
+be_BY.utf8
+bg_BG
+bg_BG.utf8
+br_FR
+bs_BA
+C
+ca_ES
+ca_ES@euro
+ca_ES.utf8
+ca_ES.utf8@euro
+cs_CZ
+cs_CZ.utf8
+cy_GB
+da_DK
+da_DK.iso885915
+da_DK.utf8
+de_AT
+de_AT@euro
+de_AT.utf8
+de_AT.utf8@euro
+de_BE
+de_BE@euro
+de_BE.utf8
+de_BE.utf8@euro
+de_CH
+de_CH.utf8
+de_DE
+de_DE@euro
+de_DE.utf8
+de_DE.utf8@euro
+de_LU
+de_LU@euro
+de_LU.utf8
+de_LU.utf8@euro
+el_GR
+el_GR.utf8
+en_AU
+en_AU.utf8
+en_BW
+en_BW.utf8
+en_CA
+en_CA.utf8
+en_DK
+en_DK.utf8
+en_GB
+en_GB.iso885915
+en_GB.utf8
+en_HK
+en_HK.utf8
+en_IE
+en_IE@euro
+en_IE.utf8
+en_IE.utf8@euro
+en_IN
+en_NZ
+en_NZ.utf8
+en_PH
+en_PH.utf8
+en_SG
+en_SG.utf8
+en_US
+en_US.iso885915
+en_US.utf8
+en_ZA
+en_ZA.utf8
+en_ZW
+en_ZW.utf8
+es_AR
+es_AR.utf8
+es_BO
+es_BO.utf8
+es_CL
+es_CL.utf8
+es_CO
+es_CO.utf8
+es_CR
+es_CR.utf8
+es_DO
+es_DO.utf8
+es_EC
+es_EC.utf8
+es_ES
+es_ES@euro
+es_ES.utf8
+es_ES.utf8@euro
+es_GT
+es_GT.utf8
+es_HN
+es_HN.utf8
+es_MX
+es_MX.utf8
+es_NI
+es_NI.utf8
+es_PA
+es_PA.utf8
+es_PE
+es_PE.utf8
+es_PR
+es_PR.utf8
+es_PY
+es_PY.utf8
+es_SV
+es_SV.utf8
+es_US
+es_US.utf8
+es_UY
+es_UY.utf8
+es_VE
+es_VE.utf8
+et_EE
+et_EE.utf8
+eu_ES
+eu_ES@euro
+eu_ES.utf8
+eu_ES.utf8@euro
+fa_IR
+fi_FI
+fi_FI@euro
+fi_FI.utf8
+fi_FI.utf8@euro
+fo_FO
+fo_FO.utf8
+fr_BE
+fr_BE@euro
+fr_BE.utf8
+fr_BE.utf8@euro
+fr_CA
+fr_CA.utf8
+fr_CH
+fr_CH.utf8
+fr_FR
+fr_FR@euro
+fr_FR.utf8
+fr_FR.utf8@euro
+fr_LU
+fr_LU@euro
+fr_LU.utf8
+fr_LU.utf8@euro
+ga_IE
+ga_IE@euro
+ga_IE.utf8
+ga_IE.utf8@euro
+gl_ES
+gl_ES@euro
+gl_ES.utf8
+gl_ES.utf8@euro
+gv_GB
+gv_GB.utf8
+he_IL
+he_IL.utf8
+hi_IN
+hr_HR
+hr_HR.utf8
+hu_HU
+hu_HU.utf8
+id_ID
+id_ID.utf8
+is_IS
+is_IS.utf8
+it_CH
+it_CH.utf8
+it_IT
+it_IT@euro
+it_IT.utf8
+it_IT.utf8@euro
+iw_IL
+iw_IL.utf8
+ja_JP.eucjp
+ja_JP.utf8
+ka_GE
+kl_GL
+kl_GL.utf8
+ko_KR.euckr
+ko_KR.utf8
+kw_GB
+kw_GB.utf8
+lt_LT
+lt_LT.utf8
+lv_LV
+lv_LV.utf8
+mi_NZ
+mk_MK
+mk_MK.utf8
+mr_IN
+ms_MY
+ms_MY.utf8
+mt_MT
+mt_MT.utf8
+nl_BE
+nl_BE@euro
+nl_BE.utf8
+nl_BE.utf8@euro
+nl_NL
+nl_NL@euro
+nl_NL.utf8
+nl_NL.utf8@euro
+nn_NO
+nn_NO.utf8
+no_NO
+no_NO.utf8
+oc_FR
+pl_PL
+pl_PL.utf8
+POSIX
+pt_BR
+pt_BR.utf8
+pt_PT
+pt_PT@euro
+pt_PT.utf8
+pt_PT.utf8@euro
+ro_RO
+ro_RO.utf8
+ru_RU
+ru_RU.koi8r
+ru_RU.utf8
+ru_UA
+ru_UA.utf8
+se_NO
+sk_SK
+sk_SK.utf8
+sl_SI
+sl_SI.utf8
+sq_AL
+sq_AL.utf8
+sr_YU
+sr_YU@cyrillic
+sr_YU.utf8
+sr_YU.utf8@cyrillic
+sv_FI
+sv_FI@euro
+sv_FI.utf8
+sv_FI.utf8@euro
+sv_SE
+sv_SE.iso885915
+sv_SE.utf8
+ta_IN
+te_IN
+tg_TJ
+th_TH
+th_TH.utf8
+tl_PH
+tr_TR
+tr_TR.utf8
+uk_UA
+uk_UA.utf8
+ur_PK
+uz_UZ
+vi_VN
+vi_VN.tcvn
+wa_BE
+wa_BE@euro
+yi_US
+zh_CN
+zh_CN.gb18030
+zh_CN.gbk
+zh_CN.utf8
+zh_HK
+zh_HK.utf8
+zh_TW
+zh_TW.euctw
+zh_TW.utf8
+</pre>
+</blockquote>
+</li>
+
+ <li> <code>`locale`</code> displays the environment variables
+ that affect how locale("") will be deduced.
+
+<blockquote>
+<pre>
+LANG=en_US
+LC_CTYPE="en_US"
+LC_NUMERIC="en_US"
+LC_TIME="en_US"
+LC_COLLATE="en_US"
+LC_MONETARY="en_US"
+LC_MESSAGES="en_US"
+LC_PAPER="en_US"
+LC_NAME="en_US"
+LC_ADDRESS="en_US"
+LC_TELEPHONE="en_US"
+LC_MEASUREMENT="en_US"
+LC_IDENTIFICATION="en_US"
+LC_ALL=
+</pre>
+</blockquote>
+</li>
+</ul>
+
+<p>
Josuttis, p. 697-698, says that "there is only *one*
relation (of the C++ locale mechanism) to the C locale mechanism: the
global C locale is modified if a named C++ locale object is set as the
global locale" (emphasis Paolo), that is:
+</p>
+ <code>std::locale::global(std::locale(""));</code>
- std::locale::global(std::locale(""));
-
-affects the C functions as if the following call was made:
+<p>affects the C functions as if the following call was made:</p>
- std::setlocale(LC_ALL, "");
+ <code>std::setlocale(LC_ALL, "");</code>
+<p>
On the other hand, there is *no* vice versa, that is, calling setlocale
has *no* effect whatsoever on the C++ locale mechanism, in particular on the
working of locale(""), which constructs the locale object from the
environment of the running program, that is, in practice, the set of
LC_ALL, LANG, etc. variables of the shell.
+</p>
+
<h2>
-5. Examples
+4. Design
</h2>
-<pre>
- typedef __locale_t locale;
-</pre>
+
+<p>
+The major design challenge is fitting an object-oriented and
+non-global locale design on top of POSIX and other relevant standards,
+which include the Single Unix (nee X/Open).
+</p>
+
+<p>
+Because POSIX falls down so completely, portability is an issue.
+</p>
+
+class _Impl
+The internal representation of the std::locale object.
+
+
+<h2>
+5. Examples
+</h2>
More information can be found in the following testcases:
<ul>
</h2>
<ul>
- <li> locale -a displays available locales on linux </li>
-
<li> locale initialization: at what point does _S_classic,
_S_global get initialized? Can named locales assume this
initialization has already taken place? </li>
{
bool test = true;
#ifdef _GLIBCPP_HAVE_SETENV
- const char* oldLC_ALL = getenv("LC_ALL");
+ const char* LC_ALL_orig = getenv("LC_ALL");
if (!setenv("LC_ALL", "it_IT", 1))
{
std::locale loc("");
VERIFY( loc.name() == "it_IT" );
- setenv("LC_ALL", oldLC_ALL ? oldLC_ALL : "", 1);
+ setenv("LC_ALL", LC_ALL_orig ? LC_ALL_orig : "", 1);
}
#endif
}
-// More tests for Posix locale::name.
+// More tests for locale("") == POSIX locale::name.
void test04()
{
bool test = true;
+ using namespace std;
+
#ifdef _GLIBCPP_HAVE_SETENV
- const char* oldLC_ALL = getenv("LC_ALL") ? strdup(getenv("LC_ALL")) : "";
- const char* oldLANG = getenv("LANG") ? strdup(getenv("LANG")) : "";
+ const char* LANG_orig = getenv("LANG") ? strdup(getenv("LANG")) : "";
+ const char* LC_ALL_orig = getenv("LC_ALL") ? strdup(getenv("LC_ALL")) : "";
+ const char* LC_CTYPE_orig =
+ getenv("LC_CTYPE") ? strdup(getenv("LC_CTYPE")) : "";
+ const char* LC_NUMERIC_orig =
+ getenv("LC_NUMERIC") ? strdup(getenv("LC_NUMERIC")) : "";
+ const char* LC_COLLATE_orig =
+ getenv("LC_COLLATE") ? strdup(getenv("LC_COLLATE")) : "";
+ const char* LC_TIME_orig =
+ getenv("LC_TIME") ? strdup(getenv("LC_TIME")) : "";
+ const char* LC_MONETARY_orig =
+ getenv("LC_MONETARY") ? strdup(getenv("LC_MONETARY")) : "";
+ const char* LC_MESSAGES_orig =
+ getenv("LC_MESSAGES") ? strdup(getenv("LC_MESSAGES")) : "";
+#if _GLIBCPP_NUM_CATEGORIES
+ const char* LC_PAPER_orig =
+ getenv("LC_PAPER") ? strdup(getenv("LC_PAPER")) : "";
+ const char* LC_NAME_orig =
+ getenv("LC_NAME") ? strdup(getenv("LC_NAME")) : "";
+ const char* LC_ADDRESS_orig =
+ getenv("LC_ADDRESS") ? strdup(getenv("LC_ADDRESS")) : "";
+ const char* LC_TELEPHONE_orig =
+ getenv("LC_TELEPHONE") ? strdup(getenv("LC_TELEPHONE")) : "";
+ const char* LC_MEASUREMENT_orig =
+ getenv("LC_MEASUREMENT") ? strdup(getenv("LC_MEASUREMENT")) : "";
+ const char* LC_IDENTIFICATION_orig =
+ getenv("LC_IDENTIFICATION") ? strdup(getenv("LC_IDENTIFICATION")) : "";
+#endif
// Check that a "POSIX" LC_ALL is equivalent to "C".
if (!setenv("LC_ALL", "POSIX", 1))
{
- std::locale loc("");
+ locale loc("");
VERIFY( loc.name() == "C" );
}
+ setenv("LC_ALL", "", 1);
+
+ // Check that a "en_PH" LC_ALL is equivalent to "en_PH".
+ if (!setenv("LC_ALL", "en_PH", 1))
+ {
+ locale loc("");
+ VERIFY( loc.name() == "en_PH" );
+ }
+ setenv("LC_ALL", "", 1);
+
+ // Explicit check that LC_ALL sets regardless of LC_* and LANG.
+ if (!setenv("LANG", "es_MX", 1) && !setenv("LC_COLLATE", "de_DE", 1))
+ {
+ if (!setenv("LC_ALL", "en_PH", 1))
+ {
+ locale loc("");
+ VERIFY( loc.name() == "en_PH" );
+ }
+ setenv("LC_ALL", "", 1);
+ setenv("LANG", LANG_orig ? LANG_orig : "", 1);
+ setenv("LC_COLLATE", LC_COLLATE_orig ? LC_COLLATE_orig : "", 1);
+ }
+
+ // NB: LANG is only the default for LC_* variables that are not set.
+ // As such, all LC_* variables must be cleared for these tests, and
+ // then restored.
+ setenv("LC_ALL", "", 1);
+ setenv("LC_CTYPE", "", 1);
+ setenv("LC_NUMERIC", "", 1);
+ setenv("LC_COLLATE", "", 1);
+ setenv("LC_TIME", "", 1);
+ setenv("LC_MONETARY", "", 1);
+ setenv("LC_MESSAGES", "", 1);
+#if _GLIBCPP_NUM_CATEGORIES
+ setenv("LC_PAPER", "", 1);
+ setenv("LC_NAME", "", 1);
+ setenv("LC_ADDRESS", "", 1);
+ setenv("LC_TELEPHONE", "", 1);
+ setenv("LC_MEASUREMENT", "", 1);
+ setenv("LC_IDENTIFICATION", "", 1);
+#endif
// Check the default set by LANG.
- if (!setenv("LC_ALL", "", 1) && !setenv("LANG", "fr_FR", 1))
+ if (!setenv("LANG", "fr_FR", 1))
{
- std::locale loc("");
+ locale loc("");
VERIFY( loc.name() == "fr_FR" );
}
// Check that a "POSIX" LANG is equivalent to "C".
if (!setenv("LANG", "POSIX", 1))
{
- std::locale loc("");
+ locale loc("");
VERIFY( loc.name() == "C" );
}
// Setting a category in the "C" default.
- const char* oldLC_COLLATE =
- getenv("LC_COLLATE") ? strdup(getenv("LC_COLLATE")) : "";
if (!setenv("LC_COLLATE", "de_DE", 1))
{
- std::locale loc("");
+ locale loc("");
#if _GLIBCPP_NUM_CATEGORIES
VERIFY( loc.name() == "LC_CTYPE=C;LC_NUMERIC=C;LC_COLLATE=de_DE;"
// Changing the LANG default while LC_COLLATE is set.
if (!setenv("LANG", "fr_FR", 1))
{
- std::locale loc("");
+ locale loc("");
#if _GLIBCPP_NUM_CATEGORIES
VERIFY( loc.name() == "LC_CTYPE=fr_FR;LC_NUMERIC=fr_FR;"
"LC_COLLATE=de_DE;LC_TIME=fr_FR;LC_MONETARY=fr_FR;"
}
// Changing another (C only) category.
- const char* oldLC_IDENTIFICATION =
- getenv("LC_IDENTIFICATION") ? strdup(getenv("LC_IDENTIFICATION")) : "";
#if _GLIBCPP_NUM_CATEGORIES
if (!setenv("LC_IDENTIFICATION", "it_IT", 1))
{
- std::locale loc("");
+ locale loc("");
VERIFY( loc.name() == "LC_CTYPE=fr_FR;LC_NUMERIC=fr_FR;"
"LC_COLLATE=de_DE;LC_TIME=fr_FR;LC_MONETARY=fr_FR;"
"LC_MESSAGES=fr_FR;LC_PAPER=fr_FR;LC_NAME=fr_FR;"
#endif
// Restore the environment.
- setenv("LC_ALL", oldLC_ALL ? oldLC_ALL : "", 1);
- setenv("LANG", oldLANG ? oldLANG : "", 1);
- setenv("LC_COLLATE", oldLC_COLLATE ? oldLC_COLLATE : "", 1);
- setenv("LC_IDENTIFICATION",
- oldLC_IDENTIFICATION ? oldLC_IDENTIFICATION : "", 1);
+ setenv("LANG", LANG_orig ? LANG_orig : "", 1);
+ setenv("LC_ALL", LC_ALL_orig ? LC_ALL_orig : "", 1);
+ setenv("LC_CTYPE", LC_CTYPE_orig ? LC_CTYPE_orig : "", 1);
+ setenv("LC_NUMERIC", LC_NUMERIC_orig ? LC_NUMERIC_orig : "", 1);
+ setenv("LC_COLLATE", LC_COLLATE_orig ? LC_COLLATE_orig : "", 1);
+ setenv("LC_TIME", LC_TIME_orig ? LC_TIME_orig : "", 1);
+ setenv("LC_MONETARY", LC_MONETARY_orig ? LC_MONETARY_orig : "", 1);
+ setenv("LC_MESSAGES", LC_MESSAGES_orig ? LC_MESSAGES_orig : "", 1);
+#if _GLIBCPP_NUM_CATEGORIES
+ setenv("LC_PAPER", LC_PAPER_orig ? LC_PAPER_orig : "", 1);
+ setenv("LC_NAME", LC_NAME_orig ? LC_NAME_orig : "", 1);
+ setenv("LC_ADDRESS", LC_ADDRESS_orig ? LC_ADDRESS_orig : "", 1);
+ setenv("LC_TELEPHONE", LC_TELEPHONE_orig ? LC_TELEPHONE_orig : "", 1);
+ setenv("LC_MEASUREMENT", LC_MEASUREMENT_orig ? LC_MEASUREMENT_orig : "", 1);
+ setenv("LC_IDENTIFICATION",
+ LC_IDENTIFICATION_orig ? LC_IDENTIFICATION_orig : "", 1);
+#endif
+
#endif
}