These notes attempt to explain how to use the ASN.1 infrastructure to
add new ASN.1 types. ASN.1 is complicated and easy to get wrong, so
it is best to verify your results against another tool (such as asn1c)
if at all possible. These notes are up to date as of 2012-02-13.

If you are trying to debug a problem that shows up in the ASN.1
encoder or decoder, skip to the last section.

For the moment, a developer must hand-translate the ASN.1 module into
macro invocations that generate data structures used by the encoder
and decoder. Ideally we would have a tool to compile an ASN.1 module
(and probably some additional information about C identifier mappings)
and generate the macro invocations.

Currently the ASN.1 infrastructure is not visible to applications or
plugins. For plugin modules shipped as part of the krb5 tree, the
types can be added to asn1_k_encode.c and exported from libkrb5.
Plugin modules built separately from the krb5 tree must use another
tool (such as asn1c) for now if they need to do ASN.1 encoding or
decoding.

Before you start writing macro invocations, it is important to
understand a little bit about ASN.1 tags. You will most commonly see
tag notation in a sequence definition, like:

    TypeName ::= SEQUENCE {
        field-name [0] IMPLICIT OCTET STRING OPTIONAL
    }

Contrary to intuition, the tag notation "[0] IMPLICIT" is not a
property of the sequence field; instead, it specifies a type that
wraps the type to the right (OCTET STRING). The right way to think
about the above definition is:

    TypeName is defined as a sequence type
      which has an optional field named field-name
        whose type is a tagged type
          the tag's class is context-specific (by default)
          the tag's number is 0
          it is an implicit tag
          the tagged type wraps OCTET STRING

The other place you are likely to see tag notation is something like:

    AS-REQ ::= [APPLICATION 10] KDC-REQ

This example defines AS-REQ to be a tagged type whose class is
application, whose tag number is 10, and whose base type is KDC-REQ.
The tag may be implicit or explicit depending on the module's tag
environment, which we will get to in a moment.

Tags can have one of four classes: universal, application, private,
and context-specific. Universal tags are used for built-in ASN.1
types. Application and context-specific tags are the most common to
see in ASN.1 modules; private is rarely used. If no tag class is
specified, the default is context-specific.

Tags can be explicit or implicit, and the distinction is important to
the wire encoding. If a tag's closing bracket is followed by the word
IMPLICIT or EXPLICIT, then it is clear which kind of tag it is, but
usually there will be no such annotation. If not, the default depends
on the header of the ASN.1 module. Look at the top of the module for
the word DEFINITIONS. It may be followed by one of three phrases:

* EXPLICIT TAGS -- in this case, tags default to explicit
* IMPLICIT TAGS -- in this case, tags default to implicit (usually)
* AUTOMATIC TAGS -- tags default to implicit (usually) and are also
  automatically added to sequence fields (usually)

If none of those phrases appear, the default is explicit tags.

Even if a module defaults to implicit tags, a tag defaults to explicit
if its base type is a choice type or ANY type (or the information
object equivalent of an ANY type).

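To make the wire-level difference between implicit and explicit tags
concrete, here is a minimal standalone sketch. It is not part of the
krb5 infrastructure; the helper names (put_tlv, encode_octet_string,
encode_implicit, encode_explicit) are made up for illustration, and
the sketch only handles tag numbers below 31 and contents shorter
than 128 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The tag class occupies the top two bits of the identifier octet. */
enum { CLS_UNIVERSAL = 0x00, CLS_APPLICATION = 0x40,
       CLS_CONTEXT = 0x80, CLS_PRIVATE = 0xC0 };
#define CONSTRUCTED 0x20

/* Write identifier, short-form length, and contents; return bytes used. */
static size_t
put_tlv(unsigned char *out, unsigned char ident,
        const unsigned char *val, size_t len)
{
    assert(len < 128);
    out[0] = ident;
    out[1] = (unsigned char)len;
    memcpy(out + 2, val, len);
    return len + 2;
}

/* Plain OCTET STRING: universal class, primitive, tag number 4. */
static size_t
encode_octet_string(unsigned char *out, const unsigned char *s, size_t len)
{
    return put_tlv(out, CLS_UNIVERSAL | 4, s, len);
}

/* [tagnum] IMPLICIT OCTET STRING: the new tag replaces the base tag. */
static size_t
encode_implicit(unsigned char *out, unsigned int tagnum,
                const unsigned char *s, size_t len)
{
    assert(tagnum < 31);
    return put_tlv(out, (unsigned char)(CLS_CONTEXT | tagnum), s, len);
}

/* [tagnum] EXPLICIT OCTET STRING: a constructed outer tag wraps the
 * complete encoding of the base type. */
static size_t
encode_explicit(unsigned char *out, unsigned int tagnum,
                const unsigned char *s, size_t len)
{
    unsigned char inner[128];
    size_t inner_len;

    assert(tagnum < 31 && len < 126);
    inner_len = encode_octet_string(inner, s, len);
    return put_tlv(out, (unsigned char)(CLS_CONTEXT | CONSTRUCTED | tagnum),
                   inner, inner_len);
}
```

For the two-byte string "hi" and tag number 0, the implicit form is
80 02 68 69 and the explicit form is A0 04 04 02 68 69.
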
If the module's default is AUTOMATIC TAGS, sequence and set fields
should have ascending context-specific tags wrapped around the field
types, starting from 0, unless one of the fields of the sequence or
set is already a tagged type. See ITU X.680 section 24.2 for details,
particularly if COMPONENTS OF is used in the sequence definition.

In our infrastructure, a type descriptor specifies a mapping between
an ASN.1 type and a C type. The first step is to ensure that type
descriptors are defined for the basic types used by your ASN.1 module,
as mapped to the C types used in your structures, in asn1_k_encode.c.
If not, you will need to create it. For a BOOLEAN or INTEGER ASN.1
type, you will use one of these macros:

    DEFBOOLTYPE(descname, ctype)
    DEFINTTYPE(descname, ctype)
    DEFUINTTYPE(descname, ctype)

where "descname" is an identifier you make up and "ctype" is the
integer type of the C object you want to map the ASN.1 value to. For
integers, use DEFINTTYPE if the C type is a signed integer type and
DEFUINTTYPE if it is an unsigned type. (For booleans, the distinction
is unimportant since all integer types can hold the values 0 and 1.)
We don't generally define integer mappings for every typedef name of
an integer type. For example, we use the type descriptor int32, which
maps an ASN.1 INTEGER to a krb5_int32, for krb5_enctype values.

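The signed/unsigned distinction matters because DER encodes an
INTEGER as a minimal two's-complement value, so the same 32-bit
pattern produces different encodings depending on how it is
interpreted. The following standalone sketch illustrates this; the
function names der_put_int and der_put_uint are hypothetical and are
not the primitives used by our encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode val as a DER INTEGER into out (room for 10 bytes needed).
 * Returns the number of bytes written. */
static size_t
der_put_int(unsigned char *out, int64_t val)
{
    unsigned char be[8];
    uint64_t u = (uint64_t)val;
    size_t start = 0, len, i;

    /* Lay the value out as eight big-endian bytes. */
    for (i = 0; i < 8; i++)
        be[i] = (unsigned char)(u >> (8 * (7 - i)));

    /* Drop leading bytes which only repeat the sign bit. */
    while (start < 7 &&
           ((be[start] == 0x00 && !(be[start + 1] & 0x80)) ||
            (be[start] == 0xFF && (be[start + 1] & 0x80))))
        start++;

    len = 8 - start;
    out[0] = 0x02;              /* universal tag number for INTEGER */
    out[1] = (unsigned char)len;
    memcpy(out + 2, be + start, len);
    return len + 2;
}

/* An unsigned 32-bit value is widened first, so it is never negative. */
static size_t
der_put_uint(unsigned char *out, uint32_t val)
{
    return der_put_int(out, (int64_t)val);
}
```

With these helpers, the signed value -1 encodes as 02 01 FF, while
the unsigned value 0xFFFFFFFF encodes as 02 05 00 FF FF FF FF.
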
String types are a little more complicated. Our practice is to store
strings in a krb5_data structure (rather than a zero-terminated C
string), so our infrastructure currently assumes that all strings are
represented as "counted types", meaning the C representation is a
combination of a pointer and an integer type. So, first you must
declare a counted type descriptor (we will describe those in more
detail later) with something like:

    DEFCOUNTEDSTRINGTYPE(generalstring, char *, unsigned int,
                         k5_asn1_encode_bytestring, k5_asn1_decode_bytestring,

The first parameter is an identifier you make up. The second and
third parameters are the C types of the pointer and integer holding
the string; for a krb5_data object, those should be the types in the
example. The pointer type must be char * or unsigned char *. The
fourth and fifth parameters reference primitive encoder and decoder
functions; these should almost always be the ones in the example,
unless the ASN.1 type is BIT STRING. The sixth parameter is the
universal tag number of the ASN.1 type, as defined in krbasn1.h.

Once you have defined the counted type, you can define a normal type
descriptor to wrap it in a krb5_data structure with something like:

    DEFCOUNTEDTYPE(gstring_data, krb5_data, data, length, generalstring);

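Conceptually, wrapping a counted type amounts to recording where the
pointer and the count live inside the containing structure. The
sketch below illustrates that idea only; struct my_data stands in for
krb5_data, and struct counted_info, MY_COUNTED, and
encode_counted_string are hypothetical names which do not correspond
to the real macro expansions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for krb5_data. */
struct my_data {
    unsigned int length;
    char *data;
};

/* Records where the pointer and the count are found in a structure. */
struct counted_info {
    size_t dataoff;
    size_t lenoff;
};

#define MY_COUNTED(stype, datafield, lenfield) \
    { offsetof(stype, datafield), offsetof(stype, lenfield) }

static const struct counted_info my_data_info =
    MY_COUNTED(struct my_data, data, length);

/* Encode the counted string found inside obj as an OCTET STRING.
 * Only contents shorter than 128 bytes are handled. */
static size_t
encode_counted_string(const struct counted_info *c, const void *obj,
                      unsigned char *out)
{
    const char *base = obj;
    char *ptr;
    unsigned int len;

    /* Fetch both C objects using the recorded offsets. */
    memcpy(&ptr, base + c->dataoff, sizeof(ptr));
    memcpy(&len, base + c->lenoff, sizeof(len));
    assert(len < 128);

    out[0] = 0x04;              /* universal tag number for OCTET STRING */
    out[1] = (unsigned char)len;
    memcpy(out + 2, ptr, len);
    return (size_t)len + 2;
}
```

The encoding function never needs to know the structure type; the
two offsets are enough to find both the pointer and the length.
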
In our infrastructure, we model ASN.1 sequences using an array of
normal type descriptors. Each type descriptor is applied in turn to
the C object to generate (or consume) an encoding of an ASN.1 value.

Of course, each value needs to be stored in a different place within
the C object, or they would just overwrite each other. To address
this, you must create an offset type wrapper for each sequence field:

    DEFOFFSETTYPE(descname, structuretype, fieldname, basedesc)

where "descname" is an identifier you make up, "structuretype" and
"fieldname" are used to compute the offset and type-check the
structure field, and "basedesc" is the type descriptor of the ASN.1
object to be stored at that offset.

If your C structure contains a pointer to another C object, you will
need to first define a pointer wrapper, which is very simple:

    DEFPTRTYPE(descname, basedesc)

Then wrap the defined pointer type in an offset type as described
above. Once a pointer descriptor is defined for a base descriptor, it
can be reused many times, so pointer descriptors are usually defined
right after the types they wrap. When decoding, pointer wrappers
cause a pointer to be allocated with a block of memory equal to the
size of the C type corresponding to the base type. (For offset types,
the corresponding C type is the structure type inside which the offset
is computed.) It is okay for several fields of a sequence to
reference the same pointer field within a structure, as long as the
pointer types all wrap base types with the same corresponding C type.

If the sequence field has a context tag attached to its type, you will
also need to create a tag wrapper for it:

    DEFCTAGGEDTYPE(descname, tagnum, basedesc)
    DEFCTAGGEDTYPE_IMPLICIT(descname, tagnum, basedesc)

Use the first macro for explicit context tags and the second for
implicit context tags. "tagnum" is the number of the context-specific
tag, and "basedesc" is the name you chose for the offset type above.

You don't actually need to separately write out DEFOFFSETTYPE and
DEFCTAGGEDTYPE for each field. The combination of offset and context
tag is so common that we have a macro to combine them:

    DEFFIELD(descname, structuretype, fieldname, tagnum, basedesc)
    DEFFIELD_IMPLICIT(descname, structuretype, fieldname, tagnum, basedesc)

Once you have defined tag and offset wrappers for each sequence field,
combine them together in an array and use the DEFSEQTYPE macro to
define the sequence type descriptor:

    static const struct atype_info *my_sequence_fields[] = {
        &k5_atype_my_sequence_0, &k5_atype_my_sequence_1,
    };
    DEFSEQTYPE(my_sequence, structuretype, my_sequence_fields)

Each field name must be prefixed by "&k5_atype_" to get a pointer to
the actual variable used to hold the type descriptor.

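The idea of applying an array of offset-and-tag descriptors in turn
can be sketched in standalone C as follows. This is an illustration
under simplifying assumptions, not the real encoder: struct
field_info, struct my_seq, and encode_seq are hypothetical, every
field is an int between 0 and 127, and every field carries an
explicit context tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per sequence field: where the value lives in the C
 * structure and which explicit context tag wraps it. */
struct field_info {
    size_t offset;
    unsigned int tagnum;
};

struct my_seq {
    int first;
    int second;
};

static const struct field_info my_seq_fields[] = {
    { offsetof(struct my_seq, first), 0 },
    { offsetof(struct my_seq, second), 1 },
};

/* Apply each field descriptor in turn to obj, then prepend the
 * SEQUENCE header.  Returns the total encoded length. */
static size_t
encode_seq(const void *obj, const struct field_info *fields, size_t nfields,
           unsigned char *out)
{
    size_t i, pos = 2;

    for (i = 0; i < nfields; i++) {
        int v;

        memcpy(&v, (const char *)obj + fields[i].offset, sizeof(v));
        assert(v >= 0 && v < 128 && fields[i].tagnum < 31);
        out[pos++] = (unsigned char)(0xA0 | fields[i].tagnum);
        out[pos++] = 3;                 /* length of the wrapped INTEGER */
        out[pos++] = 0x02;              /* INTEGER */
        out[pos++] = 1;
        out[pos++] = (unsigned char)v;
    }
    assert(pos - 2 < 128);
    out[0] = 0x30;                      /* SEQUENCE, constructed */
    out[1] = (unsigned char)(pos - 2);
    return pos;
}
```

For a structure holding the values 5 and 7, this produces
30 0A A0 03 02 01 05 A1 03 02 01 07.
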
ASN.1 sequence types may or may not be defined to be extensible, and
may group extensions together in blocks which must appear together.
Our model does not distinguish these cases. Our decoder treats all
sequence types as extensible. Extension blocks must be modeled by
making all of the extension fields optional, and the decoder will not
enforce that they appear together.

If your ASN.1 sequence contains optional fields, keep reading.


Optional sequence fields
------------------------

ASN.1 sequence fields can be annotated with OPTIONAL or, less
commonly, with DEFAULT VALUE. (Be aware that if DEFAULT VALUE is
specified for a sequence field, DER mandates that fields with that
value not be encoded within the sequence. Most standards in the
Kerberos ecosystem avoid the use of DEFAULT VALUE for this reason.)
Although optionality is a property of sequence or set fields, not
types, we still model optional sequence fields using type wrappers.
Optional type wrappers must only be used as members of a sequence,
although they can be nested in offset or pointer wrappers first.

The simplest way to represent an optional value in a C structure is
with a pointer which takes the value NULL if the field is not present.
In this case, you can just use DEFOPTIONALZEROTYPE to wrap the pointer
type:

    DEFPTRTYPE(ptr_basetype, basetype);
    DEFOPTIONALZEROTYPE(opt_ptr_basetype, ptr_basetype);

and then use opt_ptr_basetype in the DEFFIELD invocation for the
sequence field. DEFOPTIONALZEROTYPE can also be used for integer
types, if it is okay for the value 0 to represent that the
corresponding ASN.1 value is omitted. Optional-zero wrappers, like
pointer wrappers, are usually defined just after the types they wrap.

For null-terminated sequences, you can use a wrapper like this:

    DEFOPTIONALEMPTYTYPE(opt_seqof_basetype, seqof_basetype)

to omit the sequence if it is either NULL or of zero length.

A more general way to wrap optional types is:

    DEFOPTIONALTYPE(descname, predicatefn, initfn, basedesc);

where "predicatefn" has the signature "int (*fn)(const void *p)" and
is used by the encoder to test whether the ASN.1 value is present in
the C object. "initfn" has the signature "void (*fn)(void *p)" and is
used by the decoder to initialize the C object field if the
corresponding ASN.1 value is omitted in the wire encoding. "initfn"
can be NULL, in which case the C object will simply be left alone.
All C objects are initialized to zero-filled memory when they are
allocated by the decoder.

An optional string type, represented in a krb5_data structure, can be
wrapped using the nonempty_data function already defined in
asn1_k_encode.c, like so:

    DEFOPTIONALTYPE(opt_ostring_data, nonempty_data, NULL, ostring_data);

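As a rough illustration of how a predicate and an init function of
the signatures given above divide the work between encoder and
decoder, consider the following standalone sketch. Everything in it
is hypothetical: struct my_data stands in for krb5_data,
my_data_present is written in the spirit of a "non-empty" predicate
but is not the real nonempty_data, and should_encode and
field_omitted only mimic what an encoder and decoder might do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for krb5_data. */
struct my_data {
    unsigned int length;
    char *data;
};

/* The two callbacks attached to an optional wrapper. */
struct opt_info {
    int (*is_present)(const void *p);   /* consulted by the encoder */
    void (*init)(void *p);              /* used by the decoder; may be NULL */
};

static int
my_data_present(const void *p)
{
    const struct my_data *d = p;

    return d->data != NULL && d->length > 0;
}

static void
my_data_init(void *p)
{
    struct my_data *d = p;

    d->length = 0;
    d->data = NULL;
}

static const struct opt_info opt_my_data = { my_data_present, my_data_init };

/* Encoder side: emit the field only if the predicate says so. */
static int
should_encode(const struct opt_info *o, const void *field)
{
    return o->is_present(field);
}

/* Decoder side: the field was absent from the wire encoding. */
static void
field_omitted(const struct opt_info *o, void *field)
{
    if (o->init != NULL)
        o->init(field);
}
```

Because decoded objects start out zero-filled, an init function is
only needed when "absent" should be represented by something other
than zeros.
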
ASN.1 sequence-of types can be represented as C types in two ways.
The simplest is to use an array of pointers terminated in a null
pointer. A descriptor for a sequence-of represented this way is
defined in three steps:

    DEFPTRTYPE(ptr_basetype, basetype);
    DEFNULLTERMSEQOFTYPE(seqof_basetype, ptr_basetype);
    DEFPTRTYPE(ptr_seqof_basetype, seqof_basetype);

If the C type corresponding to basetype is "ctype", then the C type
corresponding to ptr_seqof_basetype will be "ctype **". The middle
type sort of corresponds to "ctype *", but not exactly, as it
describes an object of variable size.

You can also use DEFNONEMPTYNULLTERMSEQOFTYPE in the second step. In
this case, the encoder will throw an error if the sequence is empty.
For historical reasons, the decoder will *not* throw an error if the
sequence is empty, so the calling code must check before assuming a
first element is present.

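The check that calling code needs to make on a decoded
null-terminated list can be sketched as below. The helpers
count_elements and first_or are hypothetical, and "int" stands in for
whatever "ctype" the sequence elements actually have.

```c
#include <assert.h>
#include <stddef.h>

/* Count the elements of a null-terminated pointer array (a value of
 * type "ctype **").  A NULL list is treated as empty. */
static size_t
count_elements(int *const *list)
{
    size_t n = 0;

    if (list == NULL)
        return 0;
    while (list[n] != NULL)
        n++;
    return n;
}

/* Return the first element's value, or fallback if there is none.
 * This is the check a caller must make after decoding. */
static int
first_or(int *const *list, int fallback)
{
    return (count_elements(list) > 0) ? *list[0] : fallback;
}
```
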
The other way of representing sequences is through a combination of
pointer and count. This pattern is most often used for compactness
when the base type is an integer type. A descriptor for a sequence-of
represented this way is defined using a counted type descriptor:

    DEFCOUNTEDSEQOFTYPE(descname, lentype, basedesc)

where "lentype" is the C type of the length and "basedesc" is a
pointer wrapper for the sequence element type (*not* the element type
itself). For example, an array of 32-bit signed integers is defined
as:

    DEFINTTYPE(int32, krb5_int32);
    DEFPTRTYPE(int32_ptr, int32);
    DEFCOUNTEDSEQOFTYPE(cseqof_int32, krb5_int32, int32_ptr);

To use a counted sequence-of type in a sequence, use DEFCOUNTEDTYPE:

    DEFCOUNTEDTYPE(descname, structuretype, ptrfield, lenfield, cdesc)

where "structuretype", "ptrfield", and "lenfield" are used to compute
the field offsets and type-check the structure fields, and "cdesc" is
the name of the counted type descriptor.

The combination of DEFCOUNTEDTYPE and DEFCTAGGEDTYPE can be
abbreviated using DEFCNFIELD:

    DEFCNFIELD(descname, structuretype, ptrfield, lenfield, tagnum, cdesc)

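For comparison with the null-terminated form, here is a standalone
sketch of the pointer-and-count representation being turned into a
SEQUENCE OF INTEGER. The names struct int_list and encode_int_list
are hypothetical, and the sketch assumes every element is between 0
and 127 so that each INTEGER occupies exactly three bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer-and-count representation of a list of integers. */
struct int_list {
    int32_t *vals;
    int32_t nvals;
};

/* Encode the list as SEQUENCE OF INTEGER; returns the total length.
 * At most 42 elements fit in a short-form length (42 * 3 = 126). */
static size_t
encode_int_list(const struct int_list *l, unsigned char *out)
{
    size_t pos = 2;
    int32_t i;

    assert(l->nvals >= 0 && l->nvals <= 42);
    for (i = 0; i < l->nvals; i++) {
        assert(l->vals[i] >= 0 && l->vals[i] < 128);
        out[pos++] = 0x02;              /* INTEGER */
        out[pos++] = 1;
        out[pos++] = (unsigned char)l->vals[i];
    }
    out[0] = 0x30;                      /* SEQUENCE OF, constructed */
    out[1] = (unsigned char)(pos - 2);
    return pos;
}
```

No terminator and no per-element pointer are needed, which is why
this form is the more compact one for integer elements.
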
We've previously covered DEFCTAGGEDTYPE and DEFCTAGGEDTYPE_IMPLICIT,
which are used to define context-specific tag wrappers. There are
two other macros for creating tag wrappers. The first is:

    DEFAPPTAGGEDTYPE(descname, tagnum, basedesc)

Use this macro to model an "[APPLICATION tagnum]" tag wrapper in an
ASN.1 module.

There is also a general tag wrapper macro:

    DEFTAGGEDTYPE(descname, class, construction, tag, implicit, basedesc)

where "class" is one of UNIVERSAL, APPLICATION, CONTEXT_SPECIFIC, or
PRIVATE, "construction" is one of PRIMITIVE or CONSTRUCTED, "tag" is
the tag number, "implicit" is 1 for an implicit tag and 0 for an
explicit tag, and "basedesc" is the wrapped type. Note that
primitive vs. constructed is not a concept within the abstract ASN.1
type model, but is instead a concept used in DER. In general, all
explicit tags should be constructed (but see the section on "Dirty
tricks" below). The construction parameter is ignored for implicit
tags.

ASN.1 CHOICE types are represented in C using a signed integer
distinguisher and a union. Modeling a choice type happens in three
steps:

1. Define type descriptors for each alternative of the choice,
typically using DEFCTAGGEDTYPE to create a tag wrapper for an existing
type. There is no need to create offset type wrappers, as union
fields always have an offset of 0. For example:

    DEFCTAGGEDTYPE(my_choice_0, 0, firstbasedesc);
    DEFCTAGGEDTYPE(my_choice_1, 1, secondbasedesc);

2. Assemble them into an array, similar to how you would for a
sequence, and use DEFCHOICETYPE to create a counted type descriptor:

    static const struct atype_info *my_choice_alternatives[] = {
        &k5_atype_my_choice_0, &k5_atype_my_choice_1
    };
    DEFCHOICETYPE(my_choice, union my_choice_choices, enum my_choice_selector,
                  my_choice_alternatives);

The second and third parameters to DEFCHOICETYPE are the C types of
the union and distinguisher fields.

3. Wrap the counted type descriptor in a type descriptor for the
structure containing the distinguisher and union:

    DEFCOUNTEDTYPE_SIGNED(descname, structuretype, u, choice, my_choice);

The third and fourth parameters to DEFCOUNTEDTYPE_SIGNED are the field
names of the union and distinguisher fields within structuretype.

ASN.1 choice types may be defined to be extensible, or may not be.
Our model does not distinguish between the two cases. Our decoder
treats all choice types as extensible.

Our encoder will throw an error if the distinguisher is not within the
range of valid offsets of the alternatives array. Our decoder will
set the distinguisher to -1 if the tag of the ASN.1 value is not
matched by any of the alternatives, and will leave the union
zero-filled in that case.

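The distinguisher behavior described above can be sketched in
standalone C as follows. The type and function names are
hypothetical (the union and enum names merely echo the example
above), the two alternatives are assumed to carry explicit context
tags 0 and 1, and the lookup is a simplification rather than the real
decoder logic.

```c
#include <assert.h>
#include <stddef.h>

/* Signed distinguisher; -1 marks an unrecognized alternative. */
enum my_choice_selector {
    MY_CHOICE_UNKNOWN = -1,
    MY_CHOICE_NUMBER = 0,
    MY_CHOICE_FLAG = 1
};

union my_choice_choices {
    int number;
    int flag;
};

struct my_choice {
    enum my_choice_selector choice;
    union my_choice_choices u;
};

/* Identifier octets of the alternatives, indexed by distinguisher:
 * explicit (constructed) context tags 0 and 1. */
static const unsigned char alt_idents[] = { 0xA0, 0xA1 };
#define N_ALTS (sizeof(alt_idents) / sizeof(alt_idents[0]))

/* Decoder side: map the identifier octet seen on the wire to a
 * distinguisher, or to -1 if no alternative matches. */
static enum my_choice_selector
select_alternative(unsigned char ident)
{
    size_t i;

    for (i = 0; i < N_ALTS; i++) {
        if (alt_idents[i] == ident)
            return (enum my_choice_selector)i;
    }
    return MY_CHOICE_UNKNOWN;
}

/* Encoder side: the distinguisher must index a real alternative. */
static int
selector_is_encodable(const struct my_choice *c)
{
    return c->choice >= 0 && (size_t)c->choice < N_ALTS;
}
```

A value decoded with an unknown alternative therefore cannot be
re-encoded without first being given a valid distinguisher.
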


Counted type descriptors
------------------------

Several times in earlier sections we've referred to the notion of
"counted type descriptors" without defining what they are. Counted
type descriptors live in a separate namespace from normal type
descriptors, and specify a mapping between an ASN.1 type and two C
objects, one of them having integer type. There are four kinds of
counted type descriptors, defined using the following macros:

    DEFCOUNTEDSTRINGTYPE(descname, ptrtype, lentype, encfn, decfn, tagnum)
    DEFCOUNTEDDERTYPE(descname, ptrtype, lentype)
    DEFCOUNTEDSEQOFTYPE(descname, lentype, baseptrdesc)
    DEFCHOICETYPE(descname, uniontype, distinguishertype, fields)

DEFCOUNTEDDERTYPE is described in the "Dirty tricks" section below.
The other three kinds of counted types have been covered previously.

Counted types are always used by wrapping them in a normal type
descriptor with one of these macros:

    DEFCOUNTEDTYPE(descname, structuretype, datafield, countfield, cdesc)
    DEFCOUNTEDTYPE_SIGNED(descname, structuretype, datafield, countfield, cdesc)

These macros are similar in concept to an offset type, only with two
offsets. Use DEFCOUNTEDTYPE if the count field is unsigned,
DEFCOUNTEDTYPE_SIGNED if it is signed.


Defining encoder and decoder functions
--------------------------------------

After you have created a type descriptor for your types, you need to
create encoder or decoder functions for the ones you want calling code
to be able to process. Do this with one of the following macros:

    MAKE_ENCODER(funcname, desc)
    MAKE_DECODER(funcname, desc)
    MAKE_CODEC(typename, desc)

MAKE_ENCODER and MAKE_DECODER allow you to choose function names.
MAKE_CODEC defines encoder and decoder functions with the names
"encode_typename" and "decode_typename".

If you are defining functions for a null-terminated sequence, use the
descriptor created with DEFNULLTERMSEQOFTYPE or
DEFNONEMPTYNULLTERMSEQOFTYPE, rather than the pointer to it. This is
because encoder and decoder functions implicitly traffic in pointers
to the C object being encoded or decoded.

Encoder and decoder functions must be prototyped separately, either in
k5-int.h or in a subsidiary included by it. Encoder functions have
the prototype:

    krb5_error_code encode_typename(const ctype *rep, krb5_data **code_out);

where "ctype" is the C type corresponding to desc. Decoder functions
have the prototype:

    krb5_error_code decode_typename(const krb5_data *code, ctype **rep_out);

Decoder functions allocate a container for the C type of the object
being decoded and return a pointer to it in *rep_out.

New ASN.1 types in libkrb5 will typically only be accepted with test
cases. Our current test framework lives in src/tests/asn.1. Adding
new types to this framework involves the following steps:

1. Define an initializer for a sample value of the type in ktest.c,
named ktest_make_sample_typename(). Also define a contents-destructor
for it, named ktest_empty_typename(). Prototype these functions in
the corresponding header file.

2. Define an equality test for the type in ktest_equal.c. Prototype
this in ktest_equal.h. (This step is not necessary if the type has no
decoder.)

3. Add a test case to krb5_encode_test.c, following the examples of
existing test cases there. Update reference_encode.out and
trval_reference.out to contain the output generated by your test case.

4. Add a test case to krb5_decode_test.c, following the examples of
existing test cases there, and using the output generated by your
encode test case.

5. Add a test case to krb5_decode_leak.c, following the examples of
existing test cases there.

Following these steps will not ensure the correctness of your
translation of the ASN.1 module to macro invocations; it only lets us
detect unintentional changes to the encodings after they are defined.
To ensure that your translations are correct, you should extend
tests/asn.1/make-vectors.c and use "make test-vectors" to create
test vectors that can be checked against another tool.



Dirty tricks
------------

In rare cases you may want to represent the raw DER encoding of a
value in the C structure. If so, you can use DEFCOUNTEDDERTYPE (or
more likely, the existing der_data type descriptor). The encoder and
decoder will throw errors if the wire encoding doesn't have a valid
outermost tag, so be sure to use valid DER encodings in your test
cases (see ktest_make_sample_algorithm_identifier for an example).

Conversely, the ASN.1 module may define an OCTET STRING wrapper around
a DER encoding which you want to represent as the decoded value. (The
existing example of this is in PKINIT hash agility, where the
PartyUInfo and PartyVInfo fields of OtherInfo are defined as octet
strings which contain the DER encodings of KRB5PrincipalName values.)
In this case you can use a DEFTAGGEDTYPE wrapper like so:

    DEFTAGGEDTYPE(descname, UNIVERSAL, PRIMITIVE, ASN1_OCTETSTRING, 0,
                  basedesc);

We cannot currently encode or decode SET or SET OF types.

We cannot model self-referential types (like "MATHSET ::= SET OF
MATHSET").

If a sequence uses an optional field that is a choice field (without
a context tag wrapper), or an optional field that uses a stored DER
encoding (again, without a context tag wrapper), our decoder may
assign a value to the choice or stored-DER field when the correct
behavior is to skip that field and assign the value to a subsequent
field. It should be very rare for ASN.1 modules to use choice or open
types this way.

For historical interoperability reasons, our decoder accepts the
indefinite length form for constructed tags, which is allowed by BER
but not DER. We still require the primitive forms of basic scalar
types, however, so we do not accept all BER encodings of ASN.1 values.

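For readers unfamiliar with the length forms involved, the following
standalone sketch parses the length octets that follow an identifier
octet and reports whether the indefinite form was used. The function
parse_length is hypothetical and is not our decoder's code; a strict
DER parser would also reject non-minimal long forms, which this
sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Parse BER length octets at p (avail bytes readable).  On success,
 * return 0 and set *len_out, *indef_out, and *used_out (the number of
 * length octets consumed).  Return -1 on malformed or oversized input. */
static int
parse_length(const unsigned char *p, size_t avail, size_t *len_out,
             int *indef_out, size_t *used_out)
{
    size_t i, n, len = 0;

    if (avail < 1)
        return -1;
    *indef_out = 0;
    *len_out = 0;
    *used_out = 1;

    if (p[0] < 0x80) {          /* short definite form */
        *len_out = p[0];
        return 0;
    }
    if (p[0] == 0x80) {         /* indefinite form: BER only, not DER */
        *indef_out = 1;
        return 0;
    }

    /* Long definite form: the low seven bits count the length octets. */
    n = p[0] & 0x7F;
    if (n > sizeof(size_t) || n + 1 > avail)
        return -1;
    for (i = 1; i <= n; i++)
        len = (len << 8) | p[i];
    *len_out = len;
    *used_out = n + 1;
    return 0;
}
```

With the indefinite form, the contents run until an end-of-contents
marker (two zero octets) rather than for a stated number of bytes.
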
If you are looking at a stack trace with a bunch of ASN.1 encoder or
decoder calls at the top, here are some notes that might help with
debugging.

1. You may have noticed that the entry point into the encoder is
defined by a macro like MAKE_CODEC. Don't worry about this; those
macros just define thin wrappers around k5_asn1_full_encode and
k5_asn1_full_decode. If you are stepping through code and hit a
wrapper function, just enter "step" to get into the actual encoder or
decoder function.

2. If you are in the encoder, look for stack frames in
encode_sequence(), and print the value of i within those stack frames.
You should be able to subtract 1 from those values and match them up
with the sequence field offsets in asn1_k_encode.c for the type being
encoded. For example, if an as-req is being encoded and the i values
(starting with the one closest to encode_krb5_as_req) are 4, 2, and 2,
you could match those up as follows:

* as_req_encode wraps untagged_as_req, whose field at offset 3 is the
  descriptor for kdc_req_4, which wraps kdc_req_body.

* kdc_req_body is a function wrapper around kdc_req_hack, whose field
  at offset 1 is the descriptor for req_body_1, which wraps
  opt_principal.

* opt_principal wraps principal, which wraps principal_data, whose
  field at offset 1 is the descriptor for princname_1.

* princname_1 is a sequence of general strings represented in the data
  and length fields of the krb5_principal_data structure.

So the problem would likely be in the data components of the client
principal in the kdc_req structure.

3. If you are in the decoder, look for stack frames in
decode_sequence(), and again print the values of i. You can match
these up just as above, except without subtracting 1 from the i
values.