////
Copyright 2011-2016 Beman Dawes

Distributed under the Boost Software License, Version 1.0.
(http://www.boost.org/LICENSE_1_0.txt)
////

[#buffers]
# Endian Buffer Types
:idprefix: buffers_

## Introduction

The internal byte order of arithmetic types is traditionally called
*endianness*. See http://en.wikipedia.org/wiki/Endian[Wikipedia] for a full
exploration of *endianness*, including definitions of *big endian* and *little
endian*.

Header `boost/endian/buffers.hpp` provides `endian_buffer`, a portable endian
integer binary buffer class template with control over byte order, value type,
size, and alignment independent of the platform's native endianness. Typedefs
provide easy-to-use names for common configurations.

Use cases primarily involve data portability, either via files or network
connections, but these byte-holders may also be used to reduce memory use, file
size, or network activity since they provide binary numeric sizes not otherwise
available.

Class `endian_buffer` is aimed at users who want explicit control over when
endianness conversions occur. It also serves as the base class for the
<<arithmetic,endian_arithmetic>> class template, which is aimed at users who
want fully automatic endianness conversion and direct support for all normal
arithmetic operations.

## Example

The `example/endian_example.cpp` program writes a binary file containing
four-byte, big-endian and little-endian integers:

```
#include <iostream>
#include <cstdio>
#include <boost/endian/buffers.hpp>  // see Synopsis below
#include <boost/static_assert.hpp>

using namespace boost::endian;

namespace
{
  //  This is an extract from a very widely used GIS file format.
  //  Why the designer decided to mix big and little endians in
  //  the same file is not known. But this is a real-world format
  //  and users wishing to write low level code manipulating these
  //  files have to deal with the mixed endianness.

  struct header
  {
    big_int32_buf_t     file_code;
    big_int32_buf_t     file_length;
    little_int32_buf_t  version;
    little_int32_buf_t  shape_type;
  };

  const char* filename = "test.dat";
}

int main(int, char* [])
{
  header h;

  BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

  h.file_code   = 0x01020304;
  h.file_length = sizeof(header);
  h.version     = 1;
  h.shape_type  = 0x01020304;

  //  Low-level I/O such as POSIX read/write or <cstdio>
  //  fread/fwrite is sometimes used for binary file operations
  //  when ultimate efficiency is important. Such I/O is often
  //  performed in some C++ wrapper class, but to drive home the
  //  point that endian integers are often used in fairly
  //  low-level code that does bulk I/O operations, <cstdio>
  //  fopen/fwrite is used for I/O in this example.

  std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

  if (!fi)
  {
    std::cout << "could not open " << filename << '\n';
    return 1;
  }

  if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
  {
    std::cout << "write failure for " << filename << '\n';
    return 1;
  }

  std::fclose(fi);

  std::cout << "created file " << filename << '\n';

  return 0;
}
```

After compiling and executing `example/endian_example.cpp`, a hex dump of
`test.dat` shows:

```
01020304 00000010 01000000 04030201
```

Notice that the first two 32-bit integers are big endian while the second two
are little endian, even though the machine this was compiled and run on was
little endian.

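The byte layout in the dump above can be reproduced by hand. The following is
a minimal sketch in standard {cpp} only. It does not use Boost.Endian, it
assumes 8-bit bytes, and the helper names `store_big32`, `store_little32`,
`make_header_bytes`, and `check_header_bytes` are hypothetical names introduced
for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of Boost.Endian.

// Write v most significant byte first (big endian).
inline void store_big32(unsigned char* p, std::uint32_t v)
{
  p[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
  p[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
  p[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
  p[3] = static_cast<unsigned char>(v & 0xFF);
}

// Write v least significant byte first (little endian).
inline void store_little32(unsigned char* p, std::uint32_t v)
{
  p[0] = static_cast<unsigned char>(v & 0xFF);
  p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
  p[2] = static_cast<unsigned char>((v >> 16) & 0xFF);
  p[3] = static_cast<unsigned char>((v >> 24) & 0xFF);
}

// Build the 16 bytes corresponding to the four fields of the example header.
inline std::array<unsigned char, 16> make_header_bytes()
{
  std::array<unsigned char, 16> b{};
  store_big32(b.data() + 0, 0x01020304u);      // file_code
  store_big32(b.data() + 4, 16u);              // file_length (16 bytes)
  store_little32(b.data() + 8, 1u);            // version
  store_little32(b.data() + 12, 0x01020304u);  // shape_type
  return b;
}

// Compare against the dump: 01020304 00000010 01000000 04030201
inline void check_header_bytes()
{
  const std::array<unsigned char, 16> expected = {{
    0x01, 0x02, 0x03, 0x04,  0x00, 0x00, 0x00, 0x10,
    0x01, 0x00, 0x00, 0x00,  0x04, 0x03, 0x02, 0x01 }};
  assert(make_header_bytes() == expected);
}
```

Because the helpers shift and mask values rather than reinterpret memory, the
resulting bytes are the same whatever the native byte order of the machine.
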
## Limitations

Requires `<climits>` and `CHAR_BIT == 8`. If `CHAR_BIT` is some other value,
compilation will result in an `#error`. This restriction is in place because the
design, implementation, testing, and documentation have only considered issues
related to 8-bit bytes, and there have been no real-world use cases presented
for other sizes.

In {cpp}03, `endian_buffer` does not meet the requirements for POD types because
it has constructors and a private data member. This means that
common use cases are relying on unspecified behavior in that the {cpp} Standard
does not guarantee memory layout for non-POD types. This has not been a problem
in practice since all known {cpp} compilers lay out memory as if `endian_buffer`
were a POD type. In {cpp}11, it is possible to specify the default constructor
as trivial, and private data members and base classes no longer disqualify a
type from being a POD type. Thus under {cpp}11, `endian_buffer` will no longer
be relying on unspecified behavior.

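The {cpp}11 rules described above can be checked at compile time. The sketch
below uses a hypothetical stand-in class, `sketch_buffer`, rather than
`endian_buffer` itself, so it illustrates the language rules and makes no claim
about the library's actual definition. In {cpp}11 a POD type is one that is
both trivial and standard-layout, which is what the assertions test.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical stand-in: a defaulted default constructor, one extra
// constructor, and a private byte array.
class sketch_buffer
{
public:
  sketch_buffer() noexcept = default;  // trivial because it is defaulted here

  explicit sketch_buffer(unsigned v) noexcept
  {
    bytes_[0] = static_cast<unsigned char>(v & 0xFF);
    bytes_[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
  }

  unsigned value() const noexcept
  {
    return static_cast<unsigned>(bytes_[0])
         | (static_cast<unsigned>(bytes_[1]) << 8);
  }

private:
  unsigned char bytes_[2];
};

// Neither the extra constructor nor the private member prevents these.
static_assert(std::is_trivial<sketch_buffer>::value,
  "sketch_buffer should be trivial");
static_assert(std::is_standard_layout<sketch_buffer>::value,
  "sketch_buffer should be standard-layout");

inline void check_sketch_buffer()
{
  const sketch_buffer b(0x1234u);
  assert(b.value() == 0x1234u);
}
```
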
## Feature set

* Big endian | little endian | native endian byte ordering
* Signed | unsigned
* Unaligned | aligned
* 1-8 byte (unaligned) | 1, 2, 4, 8 byte (aligned)
* Choice of value type

## Enums and typedefs

Two scoped enums are provided:

```
enum class order { big, little, native };

enum class align { no, yes };
```

One class template is provided:

```
template <order Order, typename T, std::size_t Nbits,
  align Align = align::no>
class endian_buffer;
```

Typedefs, such as `big_int32_buf_t`, provide convenient naming conventions for
common use cases:

[%header,cols=5*]
|===
|Name |Alignment |Endianness |Sign |Sizes in bits (N)
|`big_intN_buf_t` |no |big |signed |8,16,24,32,40,48,56,64
|`big_uintN_buf_t` |no |big |unsigned |8,16,24,32,40,48,56,64
|`little_intN_buf_t` |no |little |signed |8,16,24,32,40,48,56,64
|`little_uintN_buf_t` |no |little |unsigned |8,16,24,32,40,48,56,64
|`native_intN_buf_t` |no |native |signed |8,16,24,32,40,48,56,64
|`native_uintN_buf_t` |no |native |unsigned |8,16,24,32,40,48,56,64
|`big_intN_buf_at` |yes |big |signed |8,16,32,64
|`big_uintN_buf_at` |yes |big |unsigned |8,16,32,64
|`little_intN_buf_at` |yes |little |signed |8,16,32,64
|`little_uintN_buf_at` |yes |little |unsigned |8,16,32,64
|===

The unaligned types do not cause compilers to insert padding bytes in classes
and structs. This is an important characteristic that can be exploited to
minimize wasted space in memory, files, and network transmissions.

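The effect on padding can be illustrated without the library. The sketch
below compares two hypothetical record types: `native_record`, which uses
built-in integer members, and `packed_record`, which holds the same fields as
byte arrays. It assumes, as holds on mainstream compilers, that a struct
consisting only of `unsigned char` arrays receives no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record using native integer members. The compiler is
// free to insert padding, typically so that `length` is aligned.
struct native_record
{
  std::uint8_t  tag;
  std::uint32_t length;
};

// The same fields held as raw bytes, so no member needs alignment.
struct packed_record
{
  unsigned char tag[1];
  unsigned char length[4];
};

static_assert(sizeof(packed_record) == 5,
  "byte-array members are expected to need no padding");
static_assert(sizeof(native_record) >= sizeof(packed_record),
  "the native layout can only be the same size or larger");

// Number of bytes the native layout spends on padding.
inline std::size_t padding_bytes()
{
  return sizeof(native_record) - sizeof(packed_record);
}

inline void check_record_sizes()
{
  assert(sizeof(packed_record) + padding_bytes() == sizeof(native_record));
}
```
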
CAUTION: Code that uses aligned types is possibly non-portable because alignment
requirements vary between hardware architectures and because alignment may be
affected by compiler switches or pragmas. For example, alignment of a 64-bit
integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary
on a 64-bit machine. Furthermore, aligned types are only available on
architectures with 8, 16, 32, and 64-bit integer types.

TIP: Prefer unaligned buffer types.

TIP: Protect yourself against alignment ills. For example:
[none]
{blank}::
+
```
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
```

NOTE: One-byte big and little buffer types have identical layout on all
platforms, so they never actually reverse endianness. They are provided to
enable generic code, and to improve code readability and searchability.

## Class template `endian_buffer`

An `endian_buffer` is a byte-holder for arithmetic types with
user-specified endianness, value type, size, and alignment.

### Synopsis

```
namespace boost
{
  namespace endian
  {
    //  C++11 features emulated if not available

    enum class align { no, yes };

    template <order Order, class T, std::size_t Nbits,
      align Align = align::no>
    class endian_buffer
    {
    public:

      typedef T value_type;

      endian_buffer() noexcept = default;
      explicit endian_buffer(T v) noexcept;

      endian_buffer& operator=(T v) noexcept;
      value_type value() const noexcept;
      unsigned char* data() noexcept;
      unsigned char const* data() const noexcept;

    private:

      unsigned char value_[Nbits / CHAR_BIT]; // exposition only
    };

    //  stream inserter
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_ostream<charT, traits>&
      operator<<(std::basic_ostream<charT, traits>& os,
        const endian_buffer<Order, T, n_bits, Align>& x);

    //  stream extractor
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_istream<charT, traits>&
      operator>>(std::basic_istream<charT, traits>& is,
        endian_buffer<Order, T, n_bits, Align>& x);

    // typedefs

    // unaligned big endian signed integer buffers
    typedef endian_buffer<order::big, int_least8_t, 8>        big_int8_buf_t;
    typedef endian_buffer<order::big, int_least16_t, 16>      big_int16_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 24>      big_int24_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 32>      big_int32_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 40>      big_int40_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 48>      big_int48_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 56>      big_int56_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 64>      big_int64_buf_t;

    // unaligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint_least8_t, 8>       big_uint8_buf_t;
    typedef endian_buffer<order::big, uint_least16_t, 16>     big_uint16_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 24>     big_uint24_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 32>     big_uint32_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 40>     big_uint40_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 48>     big_uint48_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 56>     big_uint56_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 64>     big_uint64_buf_t;

    // unaligned big endian floating point buffers
    typedef endian_buffer<order::big, float, 32>              big_float32_buf_t;
    typedef endian_buffer<order::big, double, 64>             big_float64_buf_t;

    // unaligned little endian signed integer buffers
    typedef endian_buffer<order::little, int_least8_t, 8>     little_int8_buf_t;
    typedef endian_buffer<order::little, int_least16_t, 16>   little_int16_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 24>   little_int24_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 32>   little_int32_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 40>   little_int40_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 48>   little_int48_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 56>   little_int56_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 64>   little_int64_buf_t;

    // unaligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint_least8_t, 8>    little_uint8_buf_t;
    typedef endian_buffer<order::little, uint_least16_t, 16>  little_uint16_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 24>  little_uint24_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 32>  little_uint32_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 40>  little_uint40_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 48>  little_uint48_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 56>  little_uint56_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 64>  little_uint64_buf_t;

    // unaligned little endian floating point buffers
    typedef endian_buffer<order::little, float, 32>           little_float32_buf_t;
    typedef endian_buffer<order::little, double, 64>          little_float64_buf_t;

    // unaligned native endian signed integer types
    typedef implementation-defined_int8_buf_t   native_int8_buf_t;
    typedef implementation-defined_int16_buf_t  native_int16_buf_t;
    typedef implementation-defined_int24_buf_t  native_int24_buf_t;
    typedef implementation-defined_int32_buf_t  native_int32_buf_t;
    typedef implementation-defined_int40_buf_t  native_int40_buf_t;
    typedef implementation-defined_int48_buf_t  native_int48_buf_t;
    typedef implementation-defined_int56_buf_t  native_int56_buf_t;
    typedef implementation-defined_int64_buf_t  native_int64_buf_t;

    // unaligned native endian unsigned integer types
    typedef implementation-defined_uint8_buf_t   native_uint8_buf_t;
    typedef implementation-defined_uint16_buf_t  native_uint16_buf_t;
    typedef implementation-defined_uint24_buf_t  native_uint24_buf_t;
    typedef implementation-defined_uint32_buf_t  native_uint32_buf_t;
    typedef implementation-defined_uint40_buf_t  native_uint40_buf_t;
    typedef implementation-defined_uint48_buf_t  native_uint48_buf_t;
    typedef implementation-defined_uint56_buf_t  native_uint56_buf_t;
    typedef implementation-defined_uint64_buf_t  native_uint64_buf_t;

    // unaligned native endian floating point types
    typedef implementation-defined_float32_buf_t  native_float32_buf_t;
    typedef implementation-defined_float64_buf_t  native_float64_buf_t;

    // aligned big endian signed integer buffers
    typedef endian_buffer<order::big, int8_t, 8, align::yes>       big_int8_buf_at;
    typedef endian_buffer<order::big, int16_t, 16, align::yes>     big_int16_buf_at;
    typedef endian_buffer<order::big, int32_t, 32, align::yes>     big_int32_buf_at;
    typedef endian_buffer<order::big, int64_t, 64, align::yes>     big_int64_buf_at;

    // aligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint8_t, 8, align::yes>      big_uint8_buf_at;
    typedef endian_buffer<order::big, uint16_t, 16, align::yes>    big_uint16_buf_at;
    typedef endian_buffer<order::big, uint32_t, 32, align::yes>    big_uint32_buf_at;
    typedef endian_buffer<order::big, uint64_t, 64, align::yes>    big_uint64_buf_at;

    // aligned big endian floating point buffers
    typedef endian_buffer<order::big, float, 32, align::yes>       big_float32_buf_at;
    typedef endian_buffer<order::big, double, 64, align::yes>      big_float64_buf_at;

    // aligned little endian signed integer buffers
    typedef endian_buffer<order::little, int8_t, 8, align::yes>    little_int8_buf_at;
    typedef endian_buffer<order::little, int16_t, 16, align::yes>  little_int16_buf_at;
    typedef endian_buffer<order::little, int32_t, 32, align::yes>  little_int32_buf_at;
    typedef endian_buffer<order::little, int64_t, 64, align::yes>  little_int64_buf_at;

    // aligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint8_t, 8, align::yes>   little_uint8_buf_at;
    typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at;
    typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at;
    typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at;

    // aligned little endian floating point buffers
    typedef endian_buffer<order::little, float, 32, align::yes>    little_float32_buf_at;
    typedef endian_buffer<order::little, double, 64, align::yes>   little_float64_buf_at;

    // aligned native endian typedefs are not provided because
    // <cstdint> types are superior for this use case

  } // namespace endian
} // namespace boost
```

The `implementation-defined` text in typedefs above is either `big` or `little`
according to the native endianness of the platform.

The expository data member `value_` stores the current value of the
`endian_buffer` object as a sequence of bytes ordered as specified by the
`Order` template parameter. The `CHAR_BIT` macro is defined in `<climits>`.
The only supported value of `CHAR_BIT` is 8.

The valid values of `Nbits` are as follows:

* When `sizeof(T)` is 1, `Nbits` shall be 8;
* When `sizeof(T)` is 2, `Nbits` shall be 16;
* When `sizeof(T)` is 4, `Nbits` shall be 24 or 32;
* When `sizeof(T)` is 8, `Nbits` shall be 40, 48, 56, or 64.

Other values of `sizeof(T)` are not supported.

When `Nbits` is equal to `sizeof(T)*8`, `T` must be a trivially copyable type
(such as `float`) that is assumed to have the same endianness as `uintNbits_t`.

When `Nbits` is less than `sizeof(T)*8`, `T` must be either a standard integral
type ({cpp}std, [basic.fundamental]) or an `enum`.

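The list of valid `Nbits` values given earlier in this section can be written
as a compile-time predicate. This is only a sketch: `is_valid_nbits` is a
hypothetical helper that is not part of Boost.Endian, and it encodes just the
size rules, not the additional requirements on `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when `nbits` is listed as valid for a
// value type whose sizeof is `size`. Written as a single return
// statement so that it is a valid C++11 constexpr function.
constexpr bool is_valid_nbits(std::size_t size, std::size_t nbits)
{
  return size == 1 ? nbits == 8
       : size == 2 ? nbits == 16
       : size == 4 ? (nbits == 24 || nbits == 32)
       : size == 8 ? (nbits == 40 || nbits == 48 || nbits == 56 || nbits == 64)
       : false;  // other values of sizeof(T) are not supported
}

static_assert(is_valid_nbits(sizeof(std::uint32_t), 24), "24 bits fits a 4-byte T");
static_assert(is_valid_nbits(sizeof(std::uint64_t), 40), "40 bits fits an 8-byte T");
static_assert(!is_valid_nbits(sizeof(std::uint32_t), 16), "16 bits needs a 2-byte T");
static_assert(!is_valid_nbits(3, 24), "a 3-byte T is not supported");

inline void check_is_valid_nbits()
{
  assert(is_valid_nbits(1, 8));
  assert(!is_valid_nbits(8, 32));
}
```
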
### Members

```
endian_buffer() noexcept = default;
```
[none]
* {blank}
+
Effects:: Constructs an uninitialized object.

```
explicit endian_buffer(T v) noexcept;
```
[none]
* {blank}
+
Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`.

```
endian_buffer& operator=(T v) noexcept;
```
[none]
* {blank}
+
Effects:: `endian_store<T, Nbits/8, Order>( value_, v )`.
Returns:: `*this`.

```
value_type value() const noexcept;
```
[none]
* {blank}
+
Returns:: `endian_load<T, Nbits/8, Order>( value_ )`.

```
unsigned char* data() noexcept;
```
```
unsigned char const* data() const noexcept;
```
[none]
* {blank}
+
Returns::
  A pointer to the first byte of `value_`.

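To make the member semantics concrete, here is a deliberately simplified
stand-in. `sketch_big_buffer` is a hypothetical class, limited to big-endian
order and unsigned integer value types, and it replaces `endian_store` and
`endian_load` with plain shift loops. It is not the library's implementation,
only a sketch of the store-on-assignment and load-on-`value()` behavior
specified above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a big-endian buffer of
// Nbytes bytes holding an unsigned integer type T.
template <class T, std::size_t Nbytes>
class sketch_big_buffer
{
public:
  typedef T value_type;

  sketch_big_buffer() noexcept = default;  // bytes are left uninitialized
  explicit sketch_big_buffer(T v) noexcept { store(v); }

  sketch_big_buffer& operator=(T v) noexcept { store(v); return *this; }

  // Reassemble the value, most significant byte first.
  value_type value() const noexcept
  {
    T v = 0;
    for (std::size_t i = 0; i < Nbytes; ++i)
      v = static_cast<T>((v << 8) | value_[i]);
    return v;
  }

  unsigned char* data() noexcept { return value_; }
  unsigned char const* data() const noexcept { return value_; }

private:
  // Split v into bytes; the last array element gets the low byte.
  void store(T v) noexcept
  {
    for (std::size_t i = 0; i < Nbytes; ++i)
      value_[Nbytes - 1 - i] =
        static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
  }

  unsigned char value_[Nbytes];
};

inline void check_sketch_big_buffer()
{
  sketch_big_buffer<std::uint32_t, 3> b(0x010203u);  // 24-bit buffer
  assert(b.data()[0] == 0x01 && b.data()[1] == 0x02 && b.data()[2] == 0x03);

  b = 0xABCDEFu;                                     // store via assignment
  assert(b.value() == 0xABCDEFu);                    // load via value()
}
```
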
### Non-member functions

```
template <class charT, class traits, order Order, class T,
  std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os,
  const endian_buffer<Order, T, n_bits, Align>& x);
```
[none]
* {blank}
+
Returns:: `os << x.value()`.

```
template <class charT, class traits, order Order, class T,
  std::size_t n_bits, align Align>
std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is,
  endian_buffer<Order, T, n_bits, Align>& x);
```
[none]
* {blank}
+
Effects:: As if:
+
```
T i;
if (is >> i)
  x = i;
```
Returns:: `is`.

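One consequence of the "as if" wording for the extractor is that a failed
extraction leaves the buffer's previous value in place. The sketch below
demonstrates this with a hypothetical stand-in type, `sketch_holder`, which
only mimics the assignment and `value()` shape of `endian_buffer` and is
restricted to `int` and narrow streams.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical stand-in with the same assignment/value() shape.
struct sketch_holder
{
  int stored;
  sketch_holder& operator=(int v) { stored = v; return *this; }
  int value() const { return stored; }
};

// Extractor written to follow the "as if" specification:
// assign only when the underlying extraction succeeds.
inline std::istream& operator>>(std::istream& is, sketch_holder& x)
{
  int i;
  if (is >> i)
    x = i;
  return is;
}

inline void check_extractor()
{
  sketch_holder h;
  h = 7;

  std::istringstream good("42");
  good >> h;
  assert(h.value() == 42);   // successful extraction stores the value

  std::istringstream bad("not a number");
  bad >> h;
  assert(bad.fail());        // the stream reports the failure
  assert(h.value() == 42);   // the holder keeps its previous value
}
```
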
## FAQ

See the <<overview_faq,Overview FAQ>> for a library-wide FAQ.

Why not just use Boost.Serialization?::
Serialization involves a conversion for every object involved in I/O. Endian
integers require no conversion or copying. They are already in the desired
format for binary I/O. Thus they can be read or written in bulk.

Are endian types PODs?::
Yes for {cpp}11. No for {cpp}03, although several
<<buffers_compilation,macros>> are available to force PODness in all cases.

What are the implications of endian integer types not being PODs with {cpp}03 compilers?::
They can't be used in unions. Also, compilers aren't required to align or lay
out storage in portable ways, although this potential problem hasn't prevented
use of Boost.Endian with real compilers.

What good is native endianness?::
It provides alignment and size guarantees not available from the built-in
types. It eases generic programming.

Why bother with the aligned endian types?::
Aligned integer operations may be faster (as much as 10 to 20 times faster) if
the endianness and alignment of the type match the endianness and alignment
requirements of the machine. The code, however, is likely to be somewhat less
portable than with the unaligned types.

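The bulk I/O point made in the first answer can be sketched without the
library. The hypothetical `sketch_record` below stores its fields as byte
arrays in their external byte order, so a whole run of records can be copied
with one `memcpy` (or one `fread`), and individual fields are decoded only
when needed. The names `load_records`, `id_of`, and `length_of` are
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical external record made only of byte arrays.
struct sketch_record
{
  unsigned char id[2];      // 16-bit value, big endian
  unsigned char length[4];  // 32-bit value, little endian
};

static_assert(sizeof(sketch_record) == 6, "unexpected padding in sketch_record");

// Copy `count` records out of a raw buffer in a single call.
inline std::vector<sketch_record> load_records(const unsigned char* raw,
                                               std::size_t count)
{
  std::vector<sketch_record> out(count);
  if (count != 0)
    std::memcpy(out.data(), raw, count * sizeof(sketch_record));
  return out;
}

// Decode a big-endian 16-bit field.
inline unsigned id_of(const sketch_record& r)
{
  return (static_cast<unsigned>(r.id[0]) << 8) | r.id[1];
}

// Decode a little-endian 32-bit field.
inline unsigned long length_of(const sketch_record& r)
{
  return  static_cast<unsigned long>(r.length[0])
       | (static_cast<unsigned long>(r.length[1]) << 8)
       | (static_cast<unsigned long>(r.length[2]) << 16)
       | (static_cast<unsigned long>(r.length[3]) << 24);
}

inline void check_load_records()
{
  const unsigned char raw[12] = {
    0x00, 0x01,  0x10, 0x00, 0x00, 0x00,   // id 1, length 16
    0x01, 0x02,  0x00, 0x01, 0x00, 0x00    // id 0x0102, length 256
  };
  const std::vector<sketch_record> recs = load_records(raw, 2);
  assert(id_of(recs[0]) == 1u && length_of(recs[0]) == 16ul);
  assert(id_of(recs[1]) == 0x0102u && length_of(recs[1]) == 256ul);
}
```
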
## Design considerations for Boost.Endian buffers

* Must be suitable for I/O - in other words, must be memcpyable.
* Must provide exactly the size and internal byte ordering specified.
* Must work correctly when the internal integer representation has more bits
than the sum of the bits in the external byte representation. Sign extension
must work correctly when the internal integer representation type has more
bits than the sum of the bits in the external bytes. For example, using
a 64-bit integer internally to represent 40-bit (5 byte) numbers must work for
both positive and negative values.
* Must work correctly (including using the same defined external
representation) regardless of whether a compiler treats char as signed or
unsigned.
* Unaligned types must not cause compilers to insert padding bytes.
* The implementation should supply optimizations with great care. Experience
has shown that optimizations of endian integers often become pessimizations
when changing machines or compilers. Pessimizations can also happen when
changing compiler switches, compiler versions, or CPU models of the same
architecture.

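The 40-bit case mentioned in the list above can be sketched as follows. The
helpers `store_big_int40` and `load_big_int40` are hypothetical and show one
way to sign-extend, not the approach Boost.Endian takes. The sketch assumes a
two's complement representation, and it works on `unsigned char` so that the
signedness of plain `char` does not matter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: write the low 40 bits of v, most significant
// byte first. Assumes two's complement for negative values.
inline void store_big_int40(unsigned char* p, std::int64_t v)
{
  const std::uint64_t u = static_cast<std::uint64_t>(v);
  for (int i = 0; i < 5; ++i)
    p[4 - i] = static_cast<unsigned char>((u >> (8 * i)) & 0xFF);
}

// Hypothetical helper: read 5 big-endian bytes and sign-extend to 64 bits.
inline std::int64_t load_big_int40(const unsigned char* p)
{
  std::uint64_t u = 0;
  for (int i = 0; i < 5; ++i)
    u = (u << 8) | p[i];

  // Bit 39 is the sign bit of the 40-bit value. When it is set,
  // fill bits 40..63 with ones so the 64-bit value is also negative.
  if (u & (std::uint64_t(1) << 39))
    u |= ~std::uint64_t(0) << 40;

  // Conversion of an out-of-range value is implementation-defined
  // before C++20; mainstream compilers wrap as two's complement.
  return static_cast<std::int64_t>(u);
}

inline void check_int40_round_trip()
{
  unsigned char b[5];

  store_big_int40(b, 0x0102030405LL);
  assert(b[0] == 0x01 && b[4] == 0x05);
  assert(load_big_int40(b) == 0x0102030405LL);

  store_big_int40(b, -2);
  assert(b[0] == 0xFF && b[4] == 0xFE);
  assert(load_big_int40(b) == -2);
}
```
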
## {cpp}11

The availability of the {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] feature is detected automatically, and will be used if present to
ensure that objects of `class endian_buffer` are trivial, and thus
PODs.

## Compilation

Boost.Endian is implemented entirely within headers, with no need to link to
any Boost object libraries.

Several macros allow user control over features:

* `BOOST_ENDIAN_NO_CTORS` causes `class endian_buffer` to have no
constructors. The intended use is for compiling user code that must be
portable between compilers regardless of {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions] support. Use of constructors will always fail.
* `BOOST_ENDIAN_FORCE_PODNESS` causes `BOOST_ENDIAN_NO_CTORS` to be defined if
the compiler does not support {cpp}11
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm[Defaulted
Functions]. This ensures that objects of `class endian_buffer` are PODs, and
so can be used in {cpp}03 unions. In {cpp}11, `class endian_buffer` objects are
PODs, even though they have constructors, so can always be used in unions.
 543 so can be used in {cpp}03 unions. In {cpp}11, `class endian_buffer` objects are
 544 PODs, even though they have constructors, so can always be used in unions.