<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<TITLE>OpenSP - SGML declaration</TITLE>
<H1>Handling of the SGML declaration in OpenSP</H1>
<H2>Extended Naming Rules</H2>
OpenSP supports the Extended Naming Rules as specified in Annex J
of ISO 8879:1986 (added by the 1996 technical corrigendum).
<H2>Web SGML Adaptations</H2>
OpenSP supports most of the Web SGML Adaptations as specified in
Annex K of ISO 8879:1986 (added by the second technical corrigendum, 1998).
<H2>Default SGML declaration</H2>
If the SGML declaration is omitted
and there is no applicable
<A HREF="catalog.htm#sgmldecl"><SAMP>SGMLDECL</SAMP></A>
or <A HREF="catalog.htm#dtddecl"><SAMP>DTDDECL</SAMP></A>
catalog entry,
the following declaration will be implied:
<!SGML "ISO 8879:1986"
BASESET "ISO 646-1983//CHARSET
International Reference Version (IRV)//ESC 2/5 4/0"
CAPACITY PUBLIC "ISO 8879:1986//CAPACITY Reference//EN"
SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
BASESET "ISO 646-1983//CHARSET International Reference Version
with the exception that all characters that are neither significant
nor shunned will be assigned to DATACHAR.
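<P>
As a minimal sketch of how the implied declaration could be overridden, a catalog might name an explicit declaration with an <SAMP>SGMLDECL</SAMP> entry, or tie one to a particular DTD with a <SAMP>DTDDECL</SAMP> entry. The file names and the public identifier below are hypothetical placeholders, not files shipped with OpenSP:
</P>
<PRE>
  -- hypothetical catalog entries; names are placeholders --
SGMLDECL "custom.dcl"
DTDDECL "-//Example//DTD Sample//EN" "sample.dcl"
</PRE>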
<H2><A NAME="charset">Character sets</A></H2>
A character in a base character set is described either by giving its
number in a <i>universal</i> character set, or by specifying a minimum
literal.
The first 65536 character numbers in the <i>universal</i> character
set are assumed to be the same as in Unicode 2.0 (ISO/IEC 10646).
The remaining character numbers can be assigned in any way convenient.
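<P>
To illustrate describing characters by their universal character numbers, a description of ISO 8859-1 might look like the following sketch. Each line gives a described character number, a count of characters, and the corresponding number in the universal character set; <SAMP>UNUSED</SAMP> is the standard SGML keyword for positions with no assigned character. The layout assumes the usual character description syntax of an SGML declaration and is an illustration only:
</P>
<PRE>
  0 128   0
128  32 UNUSED
160  96 160
</PRE>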
The public identifier of a base character set can be associated
with an entity that describes it by means of an
entry in the catalog entry file.
The entity must be a fragment
of an SGML declaration
consisting of the
portion of a character set description
following the DESCSET keyword;
that is, it must be a sequence of character descriptions,
where each character description specifies a described character
number, the number of characters and
either a character number in the universal character set, a minimum literal
Character numbers in the universal character set can be as big as
In addition, OpenSP has built-in knowledge of many character sets.
These are identified using the designating sequence in the
public identifier. The following designating sequences are
recognized:
<SAMP>ESC 2/5 4/0</SAMP>
The full set of ISO 646 IRV.
This is not a registered character set,
but is recommended by ISO 8879 (clause 10.2.2.4).
<SAMP>ESC 2/8 4/0</SAMP>
G0 set of ISO 646 IRV,
ISO Registration Number 2.
<SAMP>ESC 2/8 4/2</SAMP>
ISO Registration Number 6.
<SAMP>ESC 2/1 4/0</SAMP>
ISO Registration Number 1.
<SAMP>ESC 2/13 4/1</SAMP>
<SAMP>ESC 2/13 4/2</SAMP>
<SAMP>ESC 2/13 4/3</SAMP>
<SAMP>ESC 2/13 4/4</SAMP>
<SAMP>ESC 2/13 4/12</SAMP>
<SAMP>ESC 2/13 4/7</SAMP>
<SAMP>ESC 2/13 4/6</SAMP>
<SAMP>ESC 2/13 4/8</SAMP>
<SAMP>ESC 2/13 4/13</SAMP>
<SAMP>ESC 2/8 4/10</SAMP>
Roman set from JIS X 0201.
JIS version of ISO 646.
ISO Registration Number 14.
<SAMP>ESC 2/8 4/9</SAMP>
Katakana set from JIS X 0201.
ISO Registration Number 13.
<SAMP>ESC 2/4 4/2</SAMP>
<SAMP>ESC 2/6 4/0 ESC 2/4 4/2</SAMP>
ISO Registration Numbers 87 and 168.
<SAMP>ESC 2/4 2/8 4/4</SAMP>
ISO Registration Number 159.
<SAMP>ESC 2/4 4/1</SAMP>
ISO Registration Number 58.
<SAMP>ESC 2/4 2/8 4/3</SAMP>
ISO Registration Number 149.
<SAMP>ESC 2/5 2/15 4/0</SAMP>
<SAMP>ESC 2/5 2/15 4/3</SAMP>
<SAMP>ESC 2/5 2/15 4/5</SAMP>
<SAMP>ESC 2/5 2/15 4/1</SAMP>
<SAMP>ESC 2/5 2/15 4/4</SAMP>
<SAMP>ESC 2/5 2/15 4/6</SAMP>
<H2>Concrete syntaxes</H2>
The public identifier for a public concrete syntax can be associated
with an entity that describes it by means of an
entry in the catalog entry file.
The entity must be a fragment of an SGML declaration
consisting of a concrete syntax description
starting with the
<SAMP>SHUNCHAR</SAMP>
keyword,
as in an SGML declaration.
The entity can also make use of the following extensions:
The Extended Naming Rules extensions can be used regardless of the minimum
literal used in the SGML declaration.
An
<I>added function</I>
can be expressed as a parameter literal
instead of a name.
The replacement for a reference reserved name
can be expressed as a parameter literal instead of a name.
The total number of characters specified for
<SAMP>UCNMCHAR</SAMP>
or
<SAMP>UCNMSTRT</SAMP>
may exceed the total number of characters specified for
<SAMP>LCNMCHAR</SAMP>
or
<SAMP>LCNMSTRT</SAMP>
respectively.
Each character in
<SAMP>UCNMCHAR</SAMP>
or
<SAMP>UCNMSTRT</SAMP>
that does not have a corresponding character in the same position in
<SAMP>LCNMCHAR</SAMP>
or
<SAMP>LCNMSTRT</SAMP>
is simply assigned to <SAMP>UCNMCHAR</SAMP> or <SAMP>UCNMSTRT</SAMP>
without making it the upper-case form of any character.
Within the specification of the short reference delimiters,
a parameter literal containing exactly one character
may be followed by the delimiter <SAMP>-</SAMP>
and another parameter literal containing exactly one character.
This has the same meaning as a sequence of parameter literals,
one for each character number that is greater than or equal
to the number of the character in the first parameter literal
and less than or equal to the number of the character in the
second parameter literal.
A number may be used as a delimiter in the
section with the same meaning as a parameter literal
containing just a numeric character reference with that number.
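<P>
As an illustration of the range extension for short reference delimiters, the two specifications below would be expected to mean the same thing, assuming the characters <SAMP>*</SAMP>, <SAMP>+</SAMP> and <SAMP>,</SAMP> have the consecutive numbers 42, 43 and 44 as in ISO 646. The use of <SAMP>SHORTREF NONE</SAMP> as the starting point is chosen only for the example:
</P>
<PRE>
SHORTREF NONE "*" - ","

SHORTREF NONE "*" "+" ","
</PRE>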
<H2>Capacity sets</H2>
The public identifier for a public capacity set can be associated
with an entity that describes it by means of an
entry in the catalog entry file.
The entity must be a fragment of an SGML declaration
consisting of a sequence of capacity names and numbers.
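<P>
A capacity set entity of this form might look like the following sketch. The capacity names are standard names from ISO 8879; the values are arbitrary and chosen only for illustration:
</P>
<PRE>
TOTALCAP 200000
ENTCAP   100000
ATTCAP   100000
GRPCAP   150000
</PRE>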