bfd/doc/bfdint.texi

   1 \input texinfo
   2 @setfilename bfdint.info
   3
   4 @settitle BFD Internals
   5 @iftex
   6 @title{BFD Internals}
   7 @author{Ian Lance Taylor}
   8 @author{Cygnus Solutions}
   9 @end iftex
  10
  11 @node Top
  12 @top BFD Internals
  13 @raisesections
  14 @cindex bfd internals
  15
  16 This document describes some BFD internal information which may be
  17 helpful when working on BFD.  It is very incomplete.
  18
  19 This document is not updated regularly, and may be out of date.  It was
  20 last modified on $Date$.
  21
  22 The initial version of this document was written by Ian Lance Taylor
  23 @email{ian@@cygnus.com}.
  24
  25 @menu
  26 * BFD glossary::                BFD glossary
  27 * BFD guidelines::              BFD programming guidelines
  28 * BFD target vector::           BFD target vector
  29 * BFD generated files::         BFD generated files
  30 * BFD multiple compilations::   Files compiled multiple times in BFD
  31 * BFD relocation handling::     BFD relocation handling
  32 * BFD ELF support::             BFD ELF support
  33 * Index::                       Index
  34 @end menu
  35
  36 @node BFD glossary
  37 @section BFD glossary
  38 @cindex glossary for bfd
  39 @cindex bfd glossary
  40
  41 This is a short glossary of some BFD terms.
  42
  43 @table @asis
  44 @item a.out
  45 The a.out object file format.  The original Unix object file format.
  46 Still used on SunOS, though not Solaris.  Supports only three sections.
  47
  48 @item archive
  49 A collection of object files produced and manipulated by the @samp{ar}
  50 program.
  51
  52 @item BFD
  53 The BFD library itself.  Also, each object file, archive, or executable
  54 opened by the BFD library has the type @samp{bfd *}, and is sometimes
  55 referred to as a bfd.
  56
  57 @item COFF
  58 The Common Object File Format.  Used on Unix SVR3.  Used by some
  59 embedded targets, although ELF is normally better.
  60
  61 @item DLL
  62 A shared library on Windows.
  63
  64 @item dynamic linker
  65 When a program linked against a shared library is run, the dynamic
  66 linker will locate the appropriate shared library and arrange to somehow
  67 include it in the running image.
  68
  69 @item dynamic object
  70 Another name for an ELF shared library.
  71
  72 @item ECOFF
  73 The Extended Common Object File Format.  Used on Alpha Digital Unix
  74 (formerly OSF/1), as well as Ultrix and Irix 4.  A variant of COFF.
  75
  76 @item ELF
  77 The Executable and Linking Format.  The object file format used on most
  78 modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4.  Also
  79 used on many embedded systems.
  80
  81 @item executable
  82 A program, with instructions and symbols, and perhaps dynamic linking
  83 information.  Normally produced by a linker.
  84
  85 @item NLM
  86 NetWare Loadable Module.  Used to describe the format of an object which
  87 may be loaded into NetWare, which is some kind of PC based network server
  88 program.
  89
  90 @item object file
  91 A binary file including machine instructions, symbols, and relocation
  92 information.  Normally produced by an assembler.
  93
  94 @item object file format
  95 The format of an object file.  Typically object files and executables
  96 for a particular system are in the same format, although executables
  97 will not contain any relocation information.
  98
  99 @item PE
 100 The Portable Executable format.  This is the object file format used for
 101 Windows (specifically, Win32) object files.  It is based closely on
 102 COFF, but has a few significant differences.
 103
 104 @item PEI
 105 The Portable Executable Image format.  This is the object file format
 106 used for Windows (specifically, Win32) executables.  It is very similar
 107 to PE, but includes some additional header information.
 108
 109 @item relocations
 110 Information used by the linker to adjust section contents.  Also called
 111 relocs.
 112
 113 @item section
 114 Object files and executables are composed of sections.  Sections have
 115 optional data and optional relocation information.
 116
 117 @item shared library
 118 A library of functions which may be used by many executables without
 119 actually being linked into each executable.  There are several different
 120 implementations of shared libraries, each having slightly different
 121 features.
 122
 123 @item symbol
 124 Each object file and executable may have a list of symbols, often
 125 referred to as the symbol table.  A symbol is basically a name and an
 126 address.  There may also be some additional information like the type of
 127 symbol, although the type of a symbol is normally something simple like
 128 function or object, and should not be confused with the more complex C
 129 notion of type.  Typically every global function and variable in a C
 130 program will have an associated symbol.
 131
 132 @item Win32
 133 The current Windows API, implemented by Windows 95 and later and Windows
 134 NT 3.51 and later, but not by Windows 3.1.
 135
 136 @item XCOFF
 137 The eXtended Common Object File Format.  Used on AIX.  A variant of
 138 COFF, with a completely different symbol table implementation.
 139 @end table
 140
 141 @node BFD guidelines
 142 @section BFD programming guidelines
 143 @cindex bfd programming guidelines
 144 @cindex programming guidelines for bfd
 145 @cindex guidelines, bfd programming
 146
 147 There is a lot of poorly written and confusing code in BFD.  New BFD
 148 code should be written to a higher standard.  Merely because some BFD
 149 code is written in a particular manner does not mean that you should
 150 emulate it.
 151
 152 Here are some general BFD programming guidelines:
 153
 154 @itemize @bullet
 155 @item
 156 Follow the GNU coding standards.
 157
 158 @item
 159 Avoid global variables.  We ideally want BFD to be fully reentrant, so
 160 that it can be used in multiple threads.  All uses of global or static
 161 variables interfere with that.  Initialized constant variables are OK,
 162 and they should be explicitly marked with const.  Instead of global
 163 variables, use data attached to a BFD or to a linker hash table.
 164
 165 @item
 166 All externally visible functions should have names which start with
 167 @samp{bfd_}.  All such functions should be declared in some header file,
 168 typically @file{bfd.h}.  See, for example, the various declarations near
 169 the end of @file{bfd-in.h}, which mostly declare functions required by
 170 specific linker emulations.
 171
 172 @item
 173 All functions which need to be visible from one file to another within
 174 BFD, but should not be visible outside of BFD, should start with
 175 @samp{_bfd_}.  Although external names beginning with @samp{_} are
 176 prohibited by the ANSI standard, in practice this usage will always
 177 work, and it is required by the GNU coding standards.
 178
 179 @item
 180 Always remember that people can compile using @samp{--enable-targets} to build
 181 several, or all, targets at once.  It must be possible to link together
 182 the files for all targets.
 183
 184 @item
 185 BFD code should compile with few or no warnings using @samp{gcc -Wall}.
 186 Some warnings are OK, like the absence of certain function declarations
 187 which may or may not be declared in system header files.  Warnings about
 188 ambiguous expressions and the like should always be fixed.
 189 @end itemize
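
As a rough illustration of the guideline about global variables, here is
a sketch of state that is reached through an argument rather than through
a static variable.  The structure and function names are hypothetical and
do not exist in BFD; in real BFD code the data would be attached to a BFD
or to a linker hash table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file state.  In BFD this kind of data would hang off
   the bfd itself or a linker hash table, not a static variable.  */
struct toy_file_state
{
  unsigned long symbols_seen;
};

/* Reentrant: all mutable state is reached through the argument, so two
   threads working on different files never share anything.  */
static void
toy_note_symbol (struct toy_file_state *state)
{
  assert (state != NULL);
  state->symbols_seen++;
}
```

Two callers that each own a @samp{toy_file_state} can use this function
at the same time without interfering with each other, which is what a
static counter would prevent.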
 190
 191 @node BFD target vector
 192 @section BFD target vector
 193 @cindex bfd target vector
 194 @cindex target vector in bfd
 195
 196 BFD supports multiple object file formats by using the @dfn{target
 197 vector}.  This is simply a set of function pointers which implement
 198 behaviour that is specific to a particular object file format.
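
To make the idea concrete, here is a deliberately tiny sketch of a target
vector.  It is not the real @samp{bfd_target} structure: every name
beginning with @samp{toy_} is invented for this example, and the real
vector has many more entries.

```c
#include <assert.h>
#include <string.h>

/* A toy "target vector": one function pointer per format specific
   operation.  The names are illustrative, not BFD's.  */
struct toy_target
{
  const char *name;
  /* Return nonzero if the first bytes look like this format.  */
  int (*recognize) (const unsigned char *buf, size_t len);
};

static int
toy_elf_recognize (const unsigned char *buf, size_t len)
{
  return len >= 4 && memcmp (buf, "\177ELF", 4) == 0;
}

static int
toy_srec_recognize (const unsigned char *buf, size_t len)
{
  return len >= 2 && buf[0] == 'S' && buf[1] >= '0' && buf[1] <= '9';
}

static const struct toy_target toy_targets[] =
{
  { "toy-elf", toy_elf_recognize },
  { "toy-srec", toy_srec_recognize },
};

/* Generic code: try each vector in turn, knowing nothing about the
   formats themselves.  */
static const struct toy_target *
toy_find_target (const unsigned char *buf, size_t len)
{
  size_t i;

  assert (buf != NULL);
  for (i = 0; i < sizeof toy_targets / sizeof toy_targets[0]; i++)
    if (toy_targets[i].recognize (buf, len))
      return &toy_targets[i];
  return NULL;
}
```

The generic function only ever calls through the pointers, so adding a
format means adding a vector, not changing the generic code.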
 199
 200 In this section I list all of the entries in the target vector and
 201 describe what they do.
 202
 203 @menu
 204 * BFD target vector miscellaneous::     Miscellaneous constants
 205 * BFD target vector swap::              Swapping functions
 206 * BFD target vector format::            Format type dependent functions
 207 * BFD_JUMP_TABLE macros::               BFD_JUMP_TABLE macros
 208 * BFD target vector generic::           Generic functions
 209 * BFD target vector copy::              Copy functions
 210 * BFD target vector core::              Core file support functions
 211 * BFD target vector archive::           Archive functions
 212 * BFD target vector symbols::           Symbol table functions
 213 * BFD target vector relocs::            Relocation support
 214 * BFD target vector write::             Output functions
 215 * BFD target vector link::              Linker functions
 216 * BFD target vector dynamic::           Dynamic linking information functions
 217 @end menu
 218
 219 @node BFD target vector miscellaneous
 220 @subsection Miscellaneous constants
 221
 222 The target vector starts with a set of constants.
 223
 224 @table @samp
 225 @item name
 226 The name of the target vector.  This is an arbitrary string.  This is
 227 how the target vector is named in command line options for tools which
 228 use BFD, such as the @samp{-oformat} linker option.
 229
 230 @item flavour
 231 A general description of the type of target.  The following flavours are
 232 currently defined:
 233 @table @samp
 234 @item bfd_target_unknown_flavour
 235 Undefined or unknown.
 236 @item bfd_target_aout_flavour
 237 a.out.
 238 @item bfd_target_coff_flavour
 239 COFF.
 240 @item bfd_target_ecoff_flavour
 241 ECOFF.
 242 @item bfd_target_elf_flavour
 243 ELF.
 244 @item bfd_target_ieee_flavour
 245 IEEE-695.
 246 @item bfd_target_nlm_flavour
 247 NLM.
 248 @item bfd_target_oasys_flavour
 249 OASYS.
 250 @item bfd_target_tekhex_flavour
 251 Tektronix hex format.
 252 @item bfd_target_srec_flavour
 253 Motorola S-record format.
 254 @item bfd_target_ihex_flavour
 255 Intel hex format.
 256 @item bfd_target_som_flavour
 257 SOM (used on HP/UX).
 258 @item bfd_target_os9k_flavour
 259 os9000.
 260 @item bfd_target_versados_flavour
 261 VERSAdos.
 262 @item bfd_target_msdos_flavour
 263 MS-DOS.
 264 @item bfd_target_evax_flavour
 265 OpenVMS.
 266 @end table
 267
 268 @item byteorder
 269 The byte order of data in the object file.  One of
 270 @samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
 271 @samp{BFD_ENDIAN_UNKNOWN}.  The last would be used for a format such
 272 as S-records which do not record the architecture of the data.
 273
 274 @item header_byteorder
 275 The byte order of header information in the object file.  Normally the
 276 same as the @samp{byteorder} field, but there are certain cases where it
 277 may be different.
 278
 279 @item object_flags
 280 Flags which may appear in the @samp{flags} field of a BFD with this
 281 format.
 282
 283 @item section_flags
 284 Flags which may appear in the @samp{flags} field of a section within a
 285 BFD with this format.
 286
 287 @item symbol_leading_char
 288 A character which the C compiler normally puts before a symbol.  For
 289 example, an a.out compiler will typically generate the symbol
 290 @samp{_foo} for a function named @samp{foo} in the C source, in which
 291 case this field would be @samp{_}.  If there is no such character, this
 292 field will be @samp{0}.
 293
 294 @item ar_pad_char
 295 The padding character to use at the end of an archive name.  Normally
 296 @samp{/}.
 297
 298 @item ar_max_namelen
 299 The maximum length of a short name in an archive.  Normally @samp{14}.
 300
 301 @item backend_data
 302 A pointer to constant backend data.  This is used by backends to store
 303 whatever additional information they need to distinguish similar target
 304 vectors which use the same sets of functions.
 305 @end table
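
As an example of how one of these constants might be used, the following
sketch maps a symbol name from the object file back to the name written
in the C source, using a value like @samp{symbol_leading_char}.  The
helper is hypothetical; it is not a BFD function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: skip the target's leading character, if any.
   A leading_char of 0 means the target adds no prefix.  */
static const char *
toy_strip_leading_char (const char *name, char leading_char)
{
  assert (name != NULL);
  if (leading_char != 0 && name[0] == leading_char)
    return name + 1;
  return name;
}
```

For an a.out style target the call would pass @samp{'_'}, turning
@samp{_foo} into @samp{foo}; for a target whose field is @samp{0} the
name is returned unchanged.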
 306
 307 @node BFD target vector swap
 308 @subsection Swapping functions
 309
 310 Every target vector has function pointers used for swapping information
 311 in and out of the target representation.  There are two sets of
 312 functions: one for data information, and one for header information.
 313 Each set has three sizes: 64-bit, 32-bit, and 16-bit.  Each size has
 314 three actual functions: put, get unsigned, and get signed.
 315
 316 These 18 functions are used to convert data between the host and target
 317 representations.
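
The following sketch shows what the 32-bit members of one such set could
look like.  The function names are invented for this example and the
code is not taken from BFD; it only illustrates the put, get unsigned,
and get signed trio, plus a little-endian get for contrast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Get unsigned: read a 32-bit value stored big-endian at ADDR.  */
static uint32_t
toy_getb32 (const unsigned char *addr)
{
  assert (addr != NULL);
  return ((uint32_t) addr[0] << 24) | ((uint32_t) addr[1] << 16)
	 | ((uint32_t) addr[2] << 8) | (uint32_t) addr[3];
}

/* The little-endian counterpart of toy_getb32.  */
static uint32_t
toy_getl32 (const unsigned char *addr)
{
  assert (addr != NULL);
  return ((uint32_t) addr[3] << 24) | ((uint32_t) addr[2] << 16)
	 | ((uint32_t) addr[1] << 8) | (uint32_t) addr[0];
}

/* Get signed: sign extend in a wider type so the result is portable.  */
static int32_t
toy_getb_signed_32 (const unsigned char *addr)
{
  const int64_t sign_bit = 0x80000000;
  int64_t v = (int64_t) toy_getb32 (addr);

  return (int32_t) ((v ^ sign_bit) - sign_bit);
}

/* Put: store VAL big-endian at ADDR.  */
static void
toy_putb32 (uint32_t val, unsigned char *addr)
{
  assert (addr != NULL);
  addr[0] = (unsigned char) (val >> 24);
  addr[1] = (unsigned char) (val >> 16);
  addr[2] = (unsigned char) (val >> 8);
  addr[3] = (unsigned char) val;
}
```

Because the functions work one byte at a time, they give the same result
whatever the byte order of the host.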
 318
 319 @node BFD target vector format
 320 @subsection Format type dependent functions
 321
 322 Every target vector has three arrays of function pointers which are
 323 indexed by the BFD format type.  The BFD format types are as follows:
 324 @table @samp
 325 @item bfd_unknown
 326 Unknown format.  Not used for anything useful.
 327 @item bfd_object
 328 Object file.
 329 @item bfd_archive
 330 Archive file.
 331 @item bfd_core
 332 Core file.
 333 @end table
 334
 335 The three arrays of function pointers are as follows:
 336 @table @samp
 337 @item bfd_check_format
 338 Check whether the BFD is of a particular format (object file, archive
 339 file, or core file) corresponding to this target vector.  This is called
 340 by the @samp{bfd_check_format} function when examining an existing BFD.
 341 If the BFD matches the desired format, this function will initialize any
 342 format specific information such as the @samp{tdata} field of the BFD.
 343 This function must be called before any other BFD target vector function
 344 on a file opened for reading.
 345
 346 @item bfd_set_format
 347 Set the format of a BFD which was created for output.  This is called by
 348 the @samp{bfd_set_format} function after creating the BFD with a
 349 function such as @samp{bfd_openw}.  This function will initialize the
 350 format specific information required to write out a file of the given
 351 format.  This function must be called before any other BFD
 352 target vector function on a file opened for writing.
 353
 354 @item bfd_write_contents
 355 Write out the contents of the BFD in the given format.  This is called
 356 by the @samp{bfd_close} function for a BFD opened for writing.  This really
 357 should not be an array selected by format type, as the
 358 @samp{bfd_set_format} function provides all the required information.
 359 In fact, BFD will fail if a different format is used when calling
 360 through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
 361 arrays; fortunately, since @samp{bfd_close} gets it right, this is a
 362 difficult error to make.
 363 @end table
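
The indexing scheme can be sketched as follows.  This is a toy model, not
BFD code: the @samp{toy_} names are invented, and the
@samp{toy_type_end} enumerator is an assumption added only to size the
array.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the four format types listed above.  */
enum toy_format
{
  toy_unknown, toy_object, toy_archive, toy_core, toy_type_end
};

struct toy_file
{
  enum toy_format format;
  int tdata_ready;	/* Stands in for format specific setup.  */
};

typedef int (*toy_check_fn) (struct toy_file *);

static int
toy_reject (struct toy_file *f)
{
  (void) f;
  return 0;
}

static int
toy_object_p (struct toy_file *f)
{
  f->format = toy_object;
  f->tdata_ready = 1;
  return 1;
}

/* This toy target understands object files only.  */
static const toy_check_fn toy_check_format_table[toy_type_end] =
{
  toy_reject,	/* toy_unknown */
  toy_object_p,	/* toy_object */
  toy_reject,	/* toy_archive */
  toy_reject	/* toy_core */
};

/* Generic entry point: pick the function by format type and call it.  */
static int
toy_check_format (struct toy_file *f, enum toy_format format)
{
  assert (f != NULL && format < toy_type_end);
  return toy_check_format_table[format] (f);
}
```

A target without support for a given format type simply puts a function
that fails in the corresponding slot.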
 364
 365 @node BFD_JUMP_TABLE macros
 366 @subsection @samp{BFD_JUMP_TABLE} macros
 367 @cindex @samp{BFD_JUMP_TABLE}
 368
 369 Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
 370 These macros take a single argument, which is a prefix applied to a set
 371 of functions.  The macros are then used to initialize the fields in the
 372 target vector.
 373
 374 For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
 375 functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
 376 and @samp{_bfd_reloc_type_lookup}.  A reference like
 377 @samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
 378 prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc.  The
 379 @samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
 380 functions initialize the appropriate fields in the BFD target vector.
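
Here is a sketch of the mechanism using a toy structure and a toy macro.
The real macros and the real function signatures differ; the
@samp{toy_} and @samp{TOY_} names, and the simplified argument lists,
are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy slice of a target vector holding the three reloc entries.  */
struct toy_reloc_vector
{
  long (*get_reloc_upper_bound) (long nrelocs);
  long (*canonicalize_reloc) (long nrelocs);
  const char *(*reloc_type_lookup) (int code);
};

/* Paste PREFIX onto each function name, in the order in which the
   fields appear in the structure.  */
#define TOY_JUMP_TABLE_RELOCS(PREFIX) \
  PREFIX##_get_reloc_upper_bound, \
  PREFIX##_canonicalize_reloc, \
  PREFIX##_bfd_reloc_type_lookup

static long
foo_get_reloc_upper_bound (long nrelocs)
{
  return (nrelocs + 1) * (long) sizeof (void *);
}

static long
foo_canonicalize_reloc (long nrelocs)
{
  return nrelocs;
}

static const char *
foo_bfd_reloc_type_lookup (int code)
{
  return code == 0 ? "FOO_32" : NULL;
}

static const struct toy_reloc_vector foo_vec =
{
  TOY_JUMP_TABLE_RELOCS (foo)
};

/* Generic code calls through the vector and never names foo_*.  */
static long
toy_reloc_count (const struct toy_reloc_vector *vec, long nrelocs)
{
  assert (vec != NULL && vec->canonicalize_reloc != NULL);
  return vec->canonicalize_reloc (nrelocs);
}
```

A second target that shares these functions would write
@samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own vector, which is the kind
of sharing the macros exist to make easy.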
 381
 382 This is done because it turns out that many different target vectors can
 383 share certain classes of functions.  For example, archives are similar
 384 on most platforms, so most target vectors can use the same archive
 385 functions.  Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
 386 with the same argument, calling a set of functions which is defined in
 387 @file{archive.c}.
 388
 389 Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
 390 the description of the function pointers which it defines.  The function
 391 pointers will be described using the name without the prefix which the
 392 @samp{BFD_JUMP_TABLE} macro defines.  This name is normally the same as
 393 the name of the field in the target vector structure.  Any differences
 394 will be noted.
 395
 396 @node BFD target vector generic
 397 @subsection Generic functions
 398 @cindex @samp{BFD_JUMP_TABLE_GENERIC}
 399
 400 The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch-all
 401 functions which don't easily fit into other categories.
 402
 403 @table @samp
 404 @item _close_and_cleanup
 405 Free any target specific information associated with the BFD.  This is
 406 called when any BFD is closed (the @samp{bfd_write_contents} function
 407 mentioned earlier is only called for a BFD opened for writing).  Most
 408 targets use @samp{bfd_alloc} to allocate all target specific
 409 information, and therefore don't have to do anything in this function.
 410 This function pointer is typically set to
 411 @samp{_bfd_generic_close_and_cleanup}, which simply returns true.
 412
 413 @item _bfd_free_cached_info
 414 Free any cached information associated with the BFD which can be
 415 recreated later if necessary.  This is used to reduce the memory
 416 consumption required by programs using BFD.  This is normally called via
 417 the @samp{bfd_free_cached_info} macro.  It is used by the default
 418 archive routines when computing the archive map.  Most targets do not
 419 do anything special for this entry point, and just set it to
 420 @samp{_bfd_generic_free_cached_info}, which simply returns true.
 421
 422 @item _new_section_hook
 423 This is called from @samp{bfd_make_section_anyway} whenever a new
 424 section is created.  Most targets use it to initialize section specific
 425 information.  This function is called whether or not the section
 426 corresponds to an actual section in an actual BFD.
 427
 428 @item _get_section_contents
 429 Get the contents of a section.  This is called from
 430 @samp{bfd_get_section_contents}.  Most targets set this to
 431 @samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
 432 based on the section's @samp{filepos} field and a @samp{bfd_read}.  The
 433 corresponding field in the target vector is named
 434 @samp{_bfd_get_section_contents}.
 435
 436 @item _get_section_contents_in_window
 437 Set a @samp{bfd_window} to hold the contents of a section.  This is
 438 called from @samp{bfd_get_section_contents_in_window}.  The
 439 @samp{bfd_window} idea never really caught on, and I don't think this is
 440 ever called.  Pretty much all targets implement this as
 441 @samp{bfd_generic_get_section_contents_in_window}, which uses
 442 @samp{bfd_get_section_contents} to do the right thing.  The
 443 corresponding field in the target vector is named
 444 @samp{_bfd_get_section_contents_in_window}.
 445 @end table
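
The seek and read approach described for @samp{_get_section_contents}
can be sketched with standard I/O.  This is only an approximation under
stated assumptions: the section structure is invented and holds just a
file position and a size, and @samp{fseek} and @samp{fread} stand in for
@samp{bfd_seek} and @samp{bfd_read}.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical section descriptor holding only what this sketch needs.  */
struct toy_section
{
  long filepos;		/* Offset of the section contents in the file.  */
  size_t size;		/* Size of the section contents in bytes.  */
};

/* Copy COUNT bytes, starting OFFSET bytes into SEC, to BUF.  Return 1
   on success and 0 on failure.  */
static int
toy_get_section_contents (FILE *f, const struct toy_section *sec,
			  void *buf, size_t offset, size_t count)
{
  assert (f != NULL && sec != NULL && buf != NULL);
  /* Refuse requests which run past the end of the section.  */
  if (offset > sec->size || count > sec->size - offset)
    return 0;
  if (fseek (f, sec->filepos + (long) offset, SEEK_SET) != 0)
    return 0;
  return fread (buf, 1, count, f) == count;
}
```

The bounds check comes first so that a bad request fails without moving
the file position.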
 446
 447 @node BFD target vector copy
 448 @subsection Copy functions
 449 @cindex @samp{BFD_JUMP_TABLE_COPY}
 450
 451 The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
 452 called when copying BFDs, and for a couple of functions which deal with
 453 internal BFD information.
 454
 455 @table @samp
 456 @item _bfd_copy_private_bfd_data
 457 This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
 458 If the input and output BFDs have the same format, this will copy any
 459 private information over.  This is called after all the section contents
 460 have been written to the output file.  Only a few targets do anything in
 461 this function.
 462
 463 @item _bfd_merge_private_bfd_data
 464 This is called when linking, via @samp{bfd_merge_private_bfd_data}.  It
 465 gives the backend linker code a chance to set any special flags in the
 466 output file based on the contents of the input file.  Only a few targets
 467 do anything in this function.
 468
 469 @item _bfd_copy_private_section_data
 470 This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
 471 for each section, via @samp{bfd_copy_private_section_data}.  This
 472 function is called before any section contents have been written.  Only
 473 a few targets do anything in this function.
 474
 475 @item _bfd_copy_private_symbol_data
 476 This is called via @samp{bfd_copy_private_symbol_data}, but I don't
 477 think anything actually calls it.  If it were defined, it could be used
 478 to copy private symbol data from one BFD to another.  However, most BFDs
 479 store extra symbol information by allocating space which is larger than
 480 the @samp{asymbol} structure and storing private information in the
 481 extra space.  Since @samp{objcopy} and other programs copy symbol
 482 information by copying pointers to @samp{asymbol} structures, the
 483 private symbol information is automatically copied as well.  Most
 484 targets do not do anything in this function.
 485
 486 @item _bfd_set_private_flags
 487 This is called via @samp{bfd_set_private_flags}.  It is basically a hook
 488 for the assembler to set magic information.  For example, the PowerPC
 489 ELF assembler uses it to set flags which appear in the e_flags field of
 490 the ELF header.  Most targets do not do anything in this function.
 491
 492 @item _bfd_print_private_bfd_data
 493 This is called by @samp{objdump} when the @samp{-p} option is used.  It
 494 is called via @samp{bfd_print_private_data}.  It prints any interesting
 495 information about the BFD which cannot be otherwise represented by BFD
 496 and thus cannot be printed by @samp{objdump}.  Most targets do not do
 497 anything in this function.
 498 @end table
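
The remark about private symbol data may be easier to follow with a
sketch.  The structures below are stand-ins invented for this example
(they are not @samp{asymbol} or any real backend structure); they show
why copying a pointer to the generic part carries the private part along.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the generic symbol structure.  */
struct toy_symbol
{
  const char *name;
  unsigned long value;
};

/* Hypothetical backend symbol.  The generic part comes first, so a
   pointer to the whole structure is also a valid pointer to its first
   member, and the private data follows it.  */
struct toy_backend_symbol
{
  struct toy_symbol symbol;
  unsigned char other;	/* Private, format specific data.  */
};

/* Allocate a backend symbol but hand back the generic view.  */
static struct toy_symbol *
toy_make_empty_symbol (void)
{
  struct toy_backend_symbol *sym = calloc (1, sizeof *sym);

  return sym != NULL ? &sym->symbol : NULL;
}

/* Recover the private data from a generic pointer.  Only valid for
   symbols created by toy_make_empty_symbol.  */
static struct toy_backend_symbol *
toy_backend_symbol_from (struct toy_symbol *sym)
{
  assert (sym != NULL);
  return (struct toy_backend_symbol *) sym;
}
```

A program that copies the @samp{toy_symbol} pointer from one table to
another never touches the private field, yet the backend can still find
it through the copied pointer.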
 499
 500 @node BFD target vector core
 501 @subsection Core file support functions
 502 @cindex @samp{BFD_JUMP_TABLE_CORE}
 503
 504 The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
 505 with core files.  Obviously, these functions only do something
 506 interesting for targets which have core file support.
 507
 508 @table @samp
 509 @item _core_file_failing_command
 510 Given a core file, this returns the command which was run to produce the
 511 core file.
 512
 513 @item _core_file_failing_signal
 514 Given a core file, this returns the signal number which produced the
 515 core file.
 516
 517 @item _core_file_matches_executable_p
 518 Given a core file and a BFD for an executable, this returns whether the
 519 core file was generated by the executable.
 520 @end table
 521
 522 @node BFD target vector archive
 523 @subsection Archive functions
 524 @cindex @samp{BFD_JUMP_TABLE_ARCHIVE}
 525
 526 The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
 527 with archive files.  Most targets use COFF style archive files
 528 (including ELF targets), and these use @samp{_bfd_archive_coff} as the
 529 argument to @samp{BFD_JUMP_TABLE_ARCHIVE}.  Some targets use BSD/a.out
 530 style archives, and these use @samp{_bfd_archive_bsd}.  (The main
 531 difference between BSD and COFF archives is the format of the archive
 532 symbol table).  Targets with no archive support use
 533 @samp{_bfd_noarchive}.  Finally, a few targets have unusual archive
 534 handling.
 535
 536 @table @samp
 537 @item _slurp_armap
 538 Read in the archive symbol table, storing it in private BFD data.  This
 539 is normally called from the archive @samp{check_format} routine.  The
 540 corresponding field in the target vector is named
 541 @samp{_bfd_slurp_armap}.
 542
 543 @item _slurp_extended_name_table
 544 Read in the extended name table from the archive, if there is one,
 545 storing it in private BFD data.  This is normally called from the
 546 archive @samp{check_format} routine.  The corresponding field in the
 547 target vector is named @samp{_bfd_slurp_extended_name_table}.
 548
 549 @item _construct_extended_name_table
 550 Build and return an extended name table if one is needed to write out
 551 the archive.  This also adjusts the archive headers to refer to the
 552 extended name table appropriately.  This is normally called from the
 553 archive @samp{write_contents} routine.  The corresponding field in the
 554 target vector is named @samp{_bfd_construct_extended_name_table}.
 555
 556 @item _truncate_arname
 557 This copies a file name into an archive header, truncating it as
 558 required.  It is normally called from the archive @samp{write_contents}
 559 routine.  This function is more interesting in targets which do not
 560 support extended name tables, but I think the GNU @samp{ar} program
 561 always uses extended name tables anyhow.  The corresponding field in the
 562 target vector is named @samp{_bfd_truncate_arname}.
 563
 564 @item _write_armap
 565 Write out the archive symbol table using calls to @samp{bfd_write}.
 566 This is normally called from the archive @samp{write_contents} routine.
 567 The corresponding field in the target vector is named @samp{write_armap}
 568 (no leading underscore).
 569
 570 @item _read_ar_hdr
 571 Read and parse an archive header.  This handles expanding the archive
 572 header name into the real file name using the extended name table.  This
 573 is called by routines which read the archive symbol table or the archive
 574 itself.  The corresponding field in the target vector is named
 575 @samp{_bfd_read_ar_hdr_fn}.
 576
 577 @item _openr_next_archived_file
 578 Given an archive and a BFD representing a file stored within the
 579 archive, return a BFD for the next file in the archive.  This is called
 580 via @samp{bfd_openr_next_archived_file}.  The corresponding field in the
 581 target vector is named @samp{openr_next_archived_file} (no leading
 582 underscore).
 583
 584 @item _get_elt_at_index
 585 Given an archive and an index, return a BFD for the file in the archive
 586 corresponding to that entry in the archive symbol table.  This is called
 587 via @samp{bfd_get_elt_at_index}.  The corresponding field in the target
 588 vector is named @samp{_bfd_get_elt_at_index}.
 589
 590 @item _generic_stat_arch_elt
 591 Do a stat on an element of an archive, returning information read from
 592 the archive header (modification time, uid, gid, file mode, size).  This
 593 is called via @samp{bfd_stat_arch_elt}.  The corresponding field in the
 594 target vector is named @samp{_bfd_stat_arch_elt}.
 595
 596 @item _update_armap_timestamp
 597 After the entire contents of an archive have been written out, update
 598 the timestamp of the archive symbol table to be newer than that of the
 599 file.  This is required for a.out style archives.  This is normally
 600 called by the archive @samp{write_contents} routine.  The corresponding
 601 field in the target vector is named @samp{_bfd_update_armap_timestamp}.
 602 @end table
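
To give a feel for what @samp{_truncate_arname} has to do, here is a
much simplified sketch.  It assumes a 16 byte, space filled name field
and does not reproduce the behaviour of any particular BFD routine (for
instance, it does not strip directory names).  The arguments play the
roles of @samp{ar_max_namelen} and @samp{ar_pad_char}.

```c
#include <assert.h>
#include <string.h>

#define TOY_AR_NAME_FIELD 16	/* Assumed width of the header name field.  */

/* Copy NAME into the fixed width, space filled FIELD, keeping at most
   MAXLEN characters and then appending PAD_CHAR.  FIELD is not NUL
   terminated.  */
static void
toy_truncate_arname (const char *name, char *field, size_t maxlen,
		     char pad_char)
{
  size_t len;

  assert (name != NULL && field != NULL);
  assert (maxlen < TOY_AR_NAME_FIELD);
  len = strlen (name);
  if (len > maxlen)
    len = maxlen;
  memset (field, ' ', TOY_AR_NAME_FIELD);
  memcpy (field, name, len);
  field[len] = pad_char;
}
```

With a maximum length of 14 and a pad character of @samp{/}, a short
name such as @samp{a.o} is stored whole, while a long name loses
everything after its fourteenth character.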
 603
 604 @node BFD target vector symbols
 605 @subsection Symbol table functions
 606 @cindex @samp{BFD_JUMP_TABLE_SYMBOLS}
 607
 608 The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
 609 with symbols.
 610
 611 @table @samp
 612 @item _get_symtab_upper_bound
 613 Return a sensible upper bound on the amount of memory which will be
 614 required to read the symbol table.  In practice most targets return the
 615 amount of memory required to hold @samp{asymbol} pointers for all the
 616 symbols plus a trailing @samp{NULL} entry, and store the actual symbol
 617 information in BFD private data.  This is called via
 618 @samp{bfd_get_symtab_upper_bound}.  The corresponding field in the
 619 target vector is named @samp{_bfd_get_symtab_upper_bound}.
 620
 621 @item _get_symtab
 622 Read in the symbol table.  This is called via
 623 @samp{bfd_canonicalize_symtab}.  The corresponding field in the target
 624 vector is named @samp{_bfd_canonicalize_symtab}.
 625
 626 @item _make_empty_symbol
 627 Create an empty symbol for the BFD.  This is needed because most targets
 628 store extra information with each symbol by allocating a structure
 629 larger than an @samp{asymbol} and storing the extra information at the
 630 end.  This function will allocate the right amount of memory, and return
 631 what looks like a pointer to an empty @samp{asymbol}.  This is called
 632 via @samp{bfd_make_empty_symbol}.  The corresponding field in the target
 633 vector is named @samp{_bfd_make_empty_symbol}.
 634
 635 @item _print_symbol
 636 Print information about the symbol.  This is called via
 637 @samp{bfd_print_symbol}.  One of the arguments indicates what sort of
 638 information should be printed:
 639 @table @samp
 640 @item bfd_print_symbol_name
 641 Just print the symbol name.
 642 @item bfd_print_symbol_more
 643 Print the symbol name and some interesting flags.  I don't think
 644 anything actually uses this.
 645 @item bfd_print_symbol_all
 646 Print all information about the symbol.  This is used by @samp{objdump}
 647 when run with the @samp{-t} option.
 648 @end table
 649 The corresponding field in the target vector is named
 650 @samp{_bfd_print_symbol}.
 651
 652 @item _get_symbol_info
 653 Return a standard set of information about the symbol.  This is called
 654 via @samp{bfd_symbol_info}.  The corresponding field in the target
 655 vector is named @samp{_bfd_get_symbol_info}.
 656
 657 @item _bfd_is_local_label_name
 658 Return whether the given string would normally represent the name of a
 659 local label.  This is called via @samp{bfd_is_local_label} and
 660 @samp{bfd_is_local_label_name}.  Local labels are normally discarded by
 661 the assembler.  In the linker, this defines the difference between the
 662 @samp{-x} and @samp{-X} options.
 663
 664 @item _get_lineno
 665 Return line number information for a symbol.  This is only meaningful
 666 for a COFF target.  This is called when writing out COFF line numbers.
 667
 668 @item _find_nearest_line
 669 Given an address within a section, use the debugging information to find
 670 the matching file name, function name, and line number, if any.  This is
 671 called via @samp{bfd_find_nearest_line}.  The corresponding field in the
 672 target vector is named @samp{_bfd_find_nearest_line}.
 673
 674 @item _bfd_make_debug_symbol
 675 Make a debugging symbol.  This is only meaningful for a COFF target,
 676 where it simply returns a symbol which will be placed in the
 677 @samp{N_DEBUG} section when it is written out.  This is called via
 678 @samp{bfd_make_debug_symbol}.
 679
 680 @item _read_minisymbols
 681 Minisymbols are used to reduce the memory requirements of programs like
 682 @samp{nm}.  A minisymbol is a cookie pointing to internal symbol
 683 information which the caller can use to extract complete symbol
 684 information.  This permits BFD to not convert all the symbols into
 685 generic form, but to instead convert them one at a time.  This is called
 686 via @samp{bfd_read_minisymbols}.  Most targets do not implement this,
 687 and just use generic support which is based on using standard
 688 @samp{asymbol} structures.
 689
 690 @item _minisymbol_to_symbol
 691 Convert a minisymbol to a standard @samp{asymbol}.  This is called via
 692 @samp{bfd_minisymbol_to_symbol}.
 693 @end table
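
The relationship between the upper bound and the table of symbol
pointers can be sketched like this.  The types and functions are
invented for the example and are simpler than the real ones; in
particular the symbols are already in memory here, whereas a real
backend would read them from the file.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the generic symbol structure.  */
struct toy_sym
{
  const char *name;
};

/* Bytes needed for pointers to COUNT symbols plus the trailing NULL.  */
static long
toy_get_symtab_upper_bound (long count)
{
  assert (count >= 0);
  return (count + 1) * (long) sizeof (struct toy_sym *);
}

/* Fill TABLE with pointers to the COUNT symbols in SYMS, add the
   terminator, and return the number of symbols stored.  */
static long
toy_canonicalize_symtab (struct toy_sym *syms, long count,
			 struct toy_sym **table)
{
  long i;

  assert (syms != NULL && table != NULL);
  for (i = 0; i < count; i++)
    table[i] = &syms[i];
  table[count] = NULL;
  return count;
}
```

The caller asks for the bound, allocates that many bytes, and then
passes the buffer to the second function, so the bound must never be
smaller than what the second function writes.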
 694
 695 @node BFD target vector relocs
 696 @subsection Relocation support
 697 @cindex @samp{BFD_JUMP_TABLE_RELOCS}
 698
 699 The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
 700 with relocations.
 701
 702 @table @samp
 703 @item _get_reloc_upper_bound
 704 Return a sensible upper bound on the amount of memory which will be
 705 required to read the relocations for a section.  In practice most
 706 targets return the amount of memory required to hold @samp{arelent}
 707 pointers for all the relocations plus a trailing @samp{NULL} entry, and
 708 store the actual relocation information in BFD private data.  This is
 709 called via @samp{bfd_get_reloc_upper_bound}.
 710
 711 @item _canonicalize_reloc
 712 Return the relocation information for a section.  This is called via
 713 @samp{bfd_canonicalize_reloc}.  The corresponding field in the target
 714 vector is named @samp{_bfd_canonicalize_reloc}.
 715
 716 @item _bfd_reloc_type_lookup
 717 Given a relocation code, return the corresponding howto structure
 718 (@pxref{BFD relocation codes}).  This is called via
 719 @samp{bfd_reloc_type_lookup}.  The corresponding field in the target
 720 vector is named @samp{reloc_type_lookup}.
 721 @end table
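
The upper bound and canonicalize pair follows a simple pattern, which
can be sketched as below.  This is a standalone illustration, not BFD
code: the @samp{toy_} types and functions are hypothetical stand-ins,
and the sketch assumes the common case described above, in which the
backend keeps the relocation data itself and the caller only receives
an array of pointers terminated by @samp{NULL}.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a relocation entry and a section.  */
struct toy_reloc
{
  unsigned long address;
  long addend;
};

struct toy_section
{
  struct toy_reloc *relocs;   /* Owned by the "backend".  */
  size_t reloc_count;
};

/* Bytes needed for one pointer per relocation plus the trailing NULL.  */
long
toy_get_reloc_upper_bound (const struct toy_section *sec)
{
  return (long) ((sec->reloc_count + 1) * sizeof (struct toy_reloc *));
}

/* Fill the caller's array with pointers into the section's own
   storage, terminate it with NULL, and return the number of entries.  */
long
toy_canonicalize_reloc (struct toy_section *sec, struct toy_reloc **out)
{
  size_t i;

  for (i = 0; i < sec->reloc_count; i++)
    out[i] = &sec->relocs[i];
  out[i] = NULL;
  return (long) sec->reloc_count;
}
```

A caller would allocate the number of bytes returned by the first
function and then pass that buffer to the second.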
 722
 723 @node BFD target vector write
 724 @subsection Output functions
 725 @cindex @samp{BFD_JUMP_TABLE_WRITE}
 726
 727 The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
 728 with writing out a BFD.
 729
 730 @table @samp
 731 @item _set_arch_mach
 732 Set the architecture and machine number for a BFD.  This is called via
 733 @samp{bfd_set_arch_mach}.  Most targets implement this by calling
 734 @samp{bfd_default_set_arch_mach}.  The corresponding field in the target
 735 vector is named @samp{_bfd_set_arch_mach}.
 736
 737 @item _set_section_contents
 738 Write out the contents of a section.  This is called via
 739 @samp{bfd_set_section_contents}.  The corresponding field in the target
 740 vector is named @samp{_bfd_set_section_contents}.
 741 @end table
 742
 743 @node BFD target vector link
 744 @subsection Linker functions
 745 @cindex @samp{BFD_JUMP_TABLE_LINK}
 746
 747 The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
 748 linker.
 749
 750 @table @samp
 751 @item _sizeof_headers
 752 Return the size of the header information required for a BFD.  This is
 753 used to implement the @samp{SIZEOF_HEADERS} linker script function.  It
 754 is normally used to align the first section at an efficient position on
 755 the page.  This is called via @samp{bfd_sizeof_headers}.  The
 756 corresponding field in the target vector is named
 757 @samp{_bfd_sizeof_headers}.
 758
 759 @item _bfd_get_relocated_section_contents
 760 Read the contents of a section and apply the relocation information.
 761 This handles both a final link and a relocatable link; in the latter
 762 case, it adjusts the relocation information as well.  This is called via
 763 @samp{bfd_get_relocated_section_contents}.  Most targets implement it by
 764 calling @samp{bfd_generic_get_relocated_section_contents}.
 765
 766 @item _bfd_relax_section
 767 Try to use relaxation to shrink the size of a section.  This is called
 768 by the linker when the @samp{-relax} option is used.  This is called via
 769 @samp{bfd_relax_section}.  Most targets do not support any sort of
 770 relaxation.
 771
 772 @item _bfd_link_hash_table_create
 773 Create the symbol hash table to use for the linker.  This linker hook
 774 permits the backend to control the size and information of the elements
 775 in the linker symbol hash table.  This is called via
 776 @samp{bfd_link_hash_table_create}.
 777
 778 @item _bfd_link_add_symbols
 779 Given an object file or an archive, add all symbols into the linker
 780 symbol hash table.  Use callbacks to the linker to include archive
 781 elements in the link.  This is called via @samp{bfd_link_add_symbols}.
 782
 783 @item _bfd_final_link
 784 Finish the linking process.  The linker calls this hook after all of the
 785 input files have been read, when it is ready to finish the link and
 786 generate the output file.  This is called via @samp{bfd_final_link}.
 787
 788 @item _bfd_link_split_section
 789 I don't know what this is for.  Nothing seems to call it.  The only
 790 non-trivial definition is in @file{som.c}.
 791 @end table
 792
 793 @node BFD target vector dynamic
 794 @subsection Dynamic linking information functions
 795 @cindex @samp{BFD_JUMP_TABLE_DYNAMIC}
 796
 797 The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
 798 dynamic linking information.
 799
 800 @table @samp
 801 @item _get_dynamic_symtab_upper_bound
 802 Return a sensible upper bound on the amount of memory which will be
 803 required to read the dynamic symbol table.  In practice most targets
 804 return the amount of memory required to hold @samp{asymbol} pointers for
 805 all the symbols plus a trailing @samp{NULL} entry, and store the actual
 806 symbol information in BFD private data.  This is called via
 807 @samp{bfd_get_dynamic_symtab_upper_bound}.  The corresponding field in
 808 the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.
 809
 810 @item _canonicalize_dynamic_symtab
 811 Read the dynamic symbol table.  This is called via
 812 @samp{bfd_canonicalize_dynamic_symtab}.  The corresponding field in the
 813 target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.
 814
 815 @item _get_dynamic_reloc_upper_bound
 816 Return a sensible upper bound on the amount of memory which will be
 817 required to read the dynamic relocations.  In practice most targets
 818 return the amount of memory required to hold @samp{arelent} pointers for
 819 all the relocations plus a trailing @samp{NULL} entry, and store the
 820 actual relocation information in BFD private data.  This is called via
 821 @samp{bfd_get_dynamic_reloc_upper_bound}.  The corresponding field in
 822 the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.
 823
 824 @item _canonicalize_dynamic_reloc
 825 Read the dynamic relocations.  This is called via
 826 @samp{bfd_canonicalize_dynamic_reloc}.  The corresponding field in the
 827 target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
 828 @end table
 829
 830 @node BFD generated files
 831 @section BFD generated files
 832 @cindex generated files in bfd
 833 @cindex bfd generated files
 834
 835 BFD contains several automatically generated files.  This section
 836 describes them.  Some files are created at configure time, when you
 837 configure BFD.  Some files are created at make time, when you build
 838 BFD.  Some files are automatically rebuilt at make time, but only if
 839 you configure with the @samp{--enable-maintainer-mode} option.  Some
 840 files live in the object directory---the directory from which you run
 841 configure---and some live in the source directory.  All files that live
 842 in the source directory are checked into the CVS repository.
 843
 844 @table @file
 845 @item bfd.h
 846 @cindex @file{bfd.h}
 847 @cindex @file{bfd-in3.h}
 848 Lives in the object directory.  Created at make time from
 849 @file{bfd-in2.h} via @file{bfd-in3.h}.  @file{bfd-in3.h} is created at
 850 configure time from @file{bfd-in2.h}.  There are automatic dependencies
 851 to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
 852 changes, so you can normally ignore @file{bfd-in3.h}, and just think
 853 about @file{bfd-in2.h} and @file{bfd.h}.
 854
 855 @file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
 856 To see them, search for @samp{@@} in @file{bfd-in2.h}.  They mainly
 857 control whether BFD is built for a 32 bit target or a 64 bit target.
 858
 859 @item bfd-in2.h
 860 @cindex @file{bfd-in2.h}
 861 Lives in the source directory.  Created from @file{bfd-in.h} and several
 862 other BFD source files.  If you configure with the
 863 @samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
 864 automatically when a source file changes.
 865
 866 @item elf32-target.h
 867 @itemx elf64-target.h
 868 @cindex @file{elf32-target.h}
 869 @cindex @file{elf64-target.h}
 870 Live in the object directory.  Created from @file{elfxx-target.h}.
 871 These files are versions of @file{elfxx-target.h} customized for either
 872 a 32 bit ELF target or a 64 bit ELF target.
 873
 874 @item libbfd.h
 875 @cindex @file{libbfd.h}
 876 Lives in the source directory.  Created from @file{libbfd-in.h} and
 877 several other BFD source files.  If you configure with the
 878 @samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
 879 automatically when a source file changes.
 880
 881 @item libcoff.h
 882 @cindex @file{libcoff.h}
 883 Lives in the source directory.  Created from @file{libcoff-in.h} and
 884 @file{coffcode.h}.  If you configure with the
 885 @samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
 886 automatically when a source file changes.
 887
 888 @item targmatch.h
 889 @cindex @file{targmatch.h}
 890 Lives in the object directory.  Created at make time from
 891 @file{config.bfd}.  This file is used to map configuration triplets into
 892 BFD target vector variable names at run time.
 893 @end table
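
One plausible way to map a configuration triplet to a target vector name
is a table of glob patterns searched in order.  The sketch below is an
assumption for illustration only: the table entries, the @samp{toy_}
names, and the use of the POSIX @samp{fnmatch} function are not taken
from @file{targmatch.h} or @file{config.bfd}.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical table entry: a glob pattern and a vector name.  */
struct toy_targmatch
{
  const char *triplet;
  const char *vector_name;
};

/* Hypothetical entries; not copied from config.bfd.  */
const struct toy_targmatch toy_table[] =
{
  { "i[3-7]86-*-linux*", "toy_elf32_i386_vec" },
  { "sparc-*-sunos*", "toy_sunos_big_vec" },
  { NULL, NULL }
};

/* Return the vector name of the first pattern matching TRIPLET,
   or NULL when no pattern matches.  */
const char *
toy_lookup_vector (const char *triplet)
{
  const struct toy_targmatch *m;

  for (m = toy_table; m->triplet != NULL; m++)
    if (fnmatch (m->triplet, triplet, 0) == 0)
      return m->vector_name;
  return NULL;
}
```

Because the first match wins, more specific patterns would have to be
listed before more general ones in such a table.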
 894
 895 @node BFD multiple compilations
 896 @section Files compiled multiple times in BFD
 897 Several files in BFD are compiled multiple times.  By this I mean that
 898 there are header files which contain function definitions.  These header
 899 files are included by other files, and thus the functions are compiled
 900 once per file which includes them.
 901
 902 Preprocessor macros are used to control the compilation, so that each
 903 time the files are compiled the resulting functions are slightly
 904 different.  Naturally, if they weren't different, there would be no
 905 reason to compile them multiple times.
 906
 907 This is not a particularly good programming technique, and future BFD
 908 work should avoid it.
 909
 910 @itemize @bullet
 911 @item
 912 Since this technique is rarely used, even experienced C programmers find
 913 it confusing.
 914
 915 @item
 916 It is difficult to debug programs which use BFD, since there is no way
 917 to describe which version of a particular function you are looking at.
 918
 919 @item
 920 Programs which use BFD wind up incorporating two or more slightly
 921 different versions of the same function, which wastes space in the
 922 executable.
 923
 924 @item
 925 This technique is never required nor is it especially efficient.  It is
 926 always possible to use statically initialized structures holding
 927 function pointers and magic constants instead.
 928 @end itemize
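
The alternative named in the last point, a statically initialized
structure holding function pointers and magic constants, might look
roughly like this.  It is a minimal sketch with hypothetical
@samp{toy_} names and made up constants; it does not describe any
existing BFD backend.

```c
#include <stddef.h>

/* Hypothetical per-target description: constants plus a hook.  */
struct toy_aout_backend
{
  unsigned long text_start;   /* Plays the role of an N_TXTADDR-like macro.  */
  unsigned long page_size;
  unsigned long (*round_up) (const struct toy_aout_backend *, unsigned long);
};

/* A hook implementation, compiled once and shared by both targets.  */
unsigned long
toy_round_to_page (const struct toy_aout_backend *be, unsigned long addr)
{
  return (addr + be->page_size - 1) / be->page_size * be->page_size;
}

/* Shared code that would otherwise be recompiled for each target.  */
unsigned long
toy_data_start (const struct toy_aout_backend *be, unsigned long text_size)
{
  return be->round_up (be, be->text_start + text_size);
}

/* Two hypothetical targets which differ only in their static data.  */
const struct toy_aout_backend toy_target_a =
  { 0x2000, 0x2000, toy_round_to_page };
const struct toy_aout_backend toy_target_b =
  { 0x1000, 0x1000, toy_round_to_page };
```

There is exactly one copy of each function in the executable, and a
debugger can name it unambiguously; the per-target differences live in
the data.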
 929
 930 The following is a list of the files which are compiled multiple times.
 931
 932 @table @file
 933 @item aout-target.h
 934 @cindex @file{aout-target.h}
 935 Describes a few functions and the target vector for a.out targets.  This
 936 is used by individual a.out targets with different definitions of
 937 @samp{N_TXTADDR} and similar a.out macros.
 938
 939 @item aoutf1.h
 940 @cindex @file{aoutf1.h}
 941 Implements standard SunOS a.out files.  In principle it supports 64 bit
 942 a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
 943 since all known a.out targets are 32 bits, this code may or may not
 944 work.  This file is only included by a few other files, and it is
 945 difficult to justify its existence.
 946
 947 @item aoutx.h
 948 @cindex @file{aoutx.h}
 949 Implements basic a.out support routines.  This file can be compiled for
 950 either 32 or 64 bit support.  Since all known a.out targets are 32 bits,
 951 the 64 bit support may or may not work.  I believe the original
 952 intention was that this file would only be included by @samp{aout32.c}
 953 and @samp{aout64.c}, and that other a.out targets would simply refer to
 954 the functions it defined.  Unfortunately, some other a.out targets
 955 started including it directly, leading to a somewhat confused state of
 956 affairs.
 957
 958 @item coffcode.h
 959 @cindex @file{coffcode.h}
 960 Implements basic COFF support routines.  This file is included by every
 961 COFF target.  It implements code which handles COFF magic numbers as
 962 well as various hook functions called by the generic COFF functions in
 963 @file{coffgen.c}.  This file is controlled by a number of different
 964 macros, and more are added regularly.
 965
 966 @item coffswap.h
 967 @cindex @file{coffswap.h}
 968 Implements COFF swapping routines.  This file is included by
 969 @file{coffcode.h}, and thus by every COFF target.  It implements the
 970 routines which swap COFF structures between internal and external
 971 format.  The main control for this file is the external structure
 972 definitions in the files in the @file{include/coff} directory.  A COFF
 973 target file will include one of those files before including
 974 @file{coffcode.h} and thus @file{coffswap.h}.  There are a few other
 975 macros which affect @file{coffswap.h} as well, mostly describing whether
 976 certain fields are present in the external structures.
 977
 978 @item ecoffswap.h
 979 @cindex @file{ecoffswap.h}
 980 Implements ECOFF swapping routines.  This is like @file{coffswap.h}, but
 981 for ECOFF.  It is included by the ECOFF target files (of which there are
 982 only two).  The control is the preprocessor macro @samp{ECOFF_32} or
 983 @samp{ECOFF_64}.
 984
 985 @item elfcode.h
 986 @cindex @file{elfcode.h}
 987 Implements ELF functions that use external structure definitions.  This
 988 file is included by two other files: @file{elf32.c} and @file{elf64.c}.
 989 It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
 990 @samp{32} or @samp{64} before including it.  The @samp{NAME} macro is
 991 used internally to give the functions different names for the two target
 992 sizes.
 993
 994 @item elfcore.h
 995 @cindex @file{elfcore.h}
 996 Like @file{elfcode.h}, but for functions that are specific to ELF core
 997 files.  This is included only by @file{elfcode.h}.
 998
 999 @item elflink.h
1000 @cindex @file{elflink.h}
1001 Like @file{elfcode.h}, but for functions used by the ELF linker.  This
1002 is included only by @file{elfcode.h}.
1003
1004 @item elfxx-target.h
1005 @cindex @file{elfxx-target.h}
1006 This file is the source for the generated files @file{elf32-target.h}
1007 and @file{elf64-target.h}, one of which is included by every ELF target.
1008 It defines the ELF target vector.
1009
1010 @item freebsd.h
1011 @cindex @file{freebsd.h}
1012 Presumably intended to be included by all FreeBSD targets, but in fact
1013 there is only one such target, @samp{i386-freebsd}.  This defines a
1014 function used to set the right magic number for FreeBSD, as well as
1015 various macros, and includes @file{aout-target.h}.
1016
1017 @item netbsd.h
1018 @cindex @file{netbsd.h}
1019 Like @file{freebsd.h}, except that there are several files which include
1020 it.
1021
1022 @item nlm-target.h
1023 @cindex @file{nlm-target.h}
1024 Defines the target vector for a standard NLM target.
1025
1026 @item nlmcode.h
1027 @cindex @file{nlmcode.h}
1028 Like @file{elfcode.h}, but for NLM targets.  This is only included by
1029 @file{nlm32.c} and @file{nlm64.c}, both of which define the macro
1030 @samp{ARCH_SIZE} to an appropriate value.  There are no 64 bit NLM
1031 targets anyhow, so this is sort of useless.
1032
1033 @item nlmswap.h
1034 @cindex @file{nlmswap.h}
1035 Like @file{coffswap.h}, but for NLM targets.  This is included by each
1036 NLM target, but I think it winds up compiling to the exact same code for
1037 every target, and as such is fairly useless.
1038
1039 @item peicode.h
1040 @cindex @file{peicode.h}
1041 Provides swapping routines and other hooks for PE targets.
1042 @file{coffcode.h} will include this rather than @file{coffswap.h} for a
1043 PE target.  This defines PE specific versions of the COFF swapping
1044 routines, and also defines some macros which control @file{coffcode.h}
1045 itself.
1046 @end table
1047
1048 @node BFD relocation handling
1049 @section BFD relocation handling
1050 @cindex bfd relocation handling
1051 @cindex relocations in bfd
1052
1053 The handling of relocations is one of the more confusing aspects of BFD.
1054 Relocation handling has been implemented in various different ways, all
1055 somewhat incompatible, none perfect.
1056
1057 @menu
1058 * BFD relocation concepts::     BFD relocation concepts
1059 * BFD relocation functions::    BFD relocation functions
1060 * BFD relocation codes::        BFD relocation codes
1061 * BFD relocation future::       BFD relocation future
1062 @end menu
1063
1064 @node BFD relocation concepts
1065 @subsection BFD relocation concepts
1066
1067 A relocation is an action which the linker must take when linking.  It
1068 describes a change to the contents of a section.  The change is normally
1069 based on the final value of one or more symbols.  Relocations are
1070 created by the assembler when it creates an object file.
1071
1072 Most relocations are simple.  A typical simple relocation is to set 32
1073 bits at a given offset in a section to the value of a symbol.  This type
1074 of relocation would be generated for code like @code{int *p = &i;} where
1075 @samp{p} and @samp{i} are global variables.  A relocation for the symbol
1076 @samp{i} would be generated such that the linker would initialize the
1077 area of memory which holds the value of @samp{p} to the value of the
1078 symbol @samp{i}.
1079
1080 Slightly more complex relocations may include an addend, which is a
1081 constant to add to the symbol value before using it.  In some cases a
1082 relocation will require adding the symbol value to the existing contents
1083 of the section in the object file.  In others the relocation will simply
1084 replace the contents of the section with the symbol value.  Some
1085 relocations are PC relative, so that the value to be stored in the
1086 section is the difference between the value of a symbol and the final
1087 address of the section contents.
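
The arithmetic for these simple cases can be sketched as follows.  The
function names are hypothetical, the values are restricted to 32 bits
for brevity, and the PC relative case assumes that the relevant address
is the section address plus the offset of the relocated location.

```c
#include <stdint.h>

/* Value stored by a simple absolute relocation: symbol plus addend.  */
uint32_t
toy_reloc_abs32 (uint32_t symbol, int32_t addend)
{
  return symbol + (uint32_t) addend;
}

/* Variant in which the addend lives in the section contents, so the
   symbol value is added to what is already there.  */
uint32_t
toy_reloc_add_in_place (uint32_t existing, uint32_t symbol)
{
  return existing + symbol;
}

/* PC relative variant: the distance from the relocated location
   (section address plus offset) to the symbol plus addend.  The
   result wraps modulo 2^32, so a backward reference appears as a
   large unsigned value.  */
uint32_t
toy_reloc_pcrel32 (uint32_t symbol, int32_t addend,
                   uint32_t section_addr, uint32_t offset)
{
  return symbol + (uint32_t) addend - (section_addr + offset);
}
```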
1088
1089 In general, relocations can be arbitrarily complex.  For
1090 example, relocations used in dynamic linking systems often require the
1091 linker to allocate space in a different section and use the offset
1092 within that section as the value to store.  In the IEEE object file
1093 format, relocations may involve arbitrary expressions.
1094
1095 When doing a relocatable link, the linker may or may not have to do
1096 anything with a relocation, depending upon the definition of the
1097 relocation.  Simple relocations generally do not require any special
1098 action.
1099
1100 @node BFD relocation functions
1101 @subsection BFD relocation functions
1102
1103 In BFD, each section has an array of @samp{arelent} structures.  Each
1104 structure has a pointer to a symbol, an address within the section, an
1105 addend, and a pointer to a @samp{reloc_howto_struct} structure.  The
1106 howto structure has a bunch of fields describing the reloc, including a
1107 type field.  The type field is specific to the object file format
1108 backend; none of the generic code in BFD examines it.
1109
1110 Originally, the function @samp{bfd_perform_relocation} was supposed to
1111 handle all relocations.  In theory, many relocations would be simple
1112 enough to be described by the fields in the howto structure.  For those
1113 that weren't, the howto structure included a @samp{special_function}
1114 field to use as an escape.
1115
1116 While this seems plausible, a look at @samp{bfd_perform_relocation}
1117 shows that it failed.  The function has odd special cases.  Some of the
1118 fields in the howto structure, such as @samp{pcrel_offset}, were not
1119 adequately documented.
1120
1121 The linker uses @samp{bfd_perform_relocation} to do all relocations when
1122 the input and output file have different formats (e.g., when generating
1123 S-records).  The generic linker code, which is used by all targets which
1124 do not define their own special purpose linker, uses
1125 @samp{bfd_get_relocated_section_contents}, which for most targets turns
1126 into a call to @samp{bfd_generic_get_relocated_section_contents}, which
1127 calls @samp{bfd_perform_relocation}.  So @samp{bfd_perform_relocation}
1128 is still widely used, which makes it difficult to change, since it is
1129 difficult to test all possible cases.
1130
1131 The assembler used @samp{bfd_perform_relocation} for a while.  This
1132 turned out to be the wrong thing to do, since
1133 @samp{bfd_perform_relocation} was written to handle relocations on an
1134 existing object file, while the assembler needed to create relocations
1135 in a new object file.  A new function, @samp{bfd_install_relocation},
1136 was created as a copy of @samp{bfd_perform_relocation}, and the
1137 assembler was changed to use it instead.
1138
1139 Unfortunately, the work did not progress any farther, so
1140 @samp{bfd_install_relocation} remains a simple copy of
1141 @samp{bfd_perform_relocation}, with all the odd special cases and
1142 confusing code.  This again is difficult to change, because again any
1143 change can affect any assembler target, and so is difficult to test.
1144
1145 The new linker, when using the same object file format for all input
1146 files and the output file, does not convert relocations into
1147 @samp{arelent} structures, so it can not use
1148 @samp{bfd_perform_relocation} at all.  Instead, users of the new linker
1149 are expected to write a @samp{relocate_section} function which will
1150 handle relocations in a target specific fashion.
1151
1152 There are two helper functions for target specific relocation:
1153 @samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
1154 These functions use a howto structure, but they @emph{do not} use the
1155 @samp{special_function} field.  Since the functions are normally called
1156 from target specific code, the @samp{special_function} field adds
1157 little; any relocations which require special handling can be handled
1158 without calling those functions.
1159
1160 So, if you want to add a new target, or add a new relocation to an
1161 existing target, you need to do the following:
1162 @itemize @bullet
1163 @item
1164 Make sure you clearly understand what the contents of the section should
1165 look like after assembly, after a relocatable link, and after a final
1166 link.  Make sure you clearly understand the operations the linker must
1167 perform during a relocatable link and during a final link.
1168
1169 @item
1170 Write a howto structure for the relocation.  The howto structure is
1171 flexible enough to represent any relocation which should be handled by
1172 setting a contiguous bitfield in the destination to the value of a
1173 symbol, possibly with an addend, possibly adding the symbol value to the
1174 value already present in the destination.
1175
1176 @item
1177 Change the assembler to generate your relocation.  The assembler will
1178 call @samp{bfd_install_relocation}, so your howto structure has to be
1179 able to handle that.  You may need to set the @samp{special_function}
1180 field to handle assembly correctly.  Be careful to ensure that any code
1181 you write to handle the assembler will also work correctly when doing a
1182 relocatable link.  For example, see @samp{bfd_elf_generic_reloc}.
1183
1184 @item
1185 Test the assembler.  Consider the cases of relocation against an
1186 undefined symbol, a common symbol, a symbol defined in the object file
1187 in the same section, and a symbol defined in the object file in a
1188 different section.  These cases may not all be applicable for your
1189 reloc.
1190
1191 @item
1192 If your target uses the new linker, which is recommended, add any
1193 required handling to the target specific relocation function.  In simple
1194 cases this will just involve a call to @samp{_bfd_final_link_relocate}
1195 or @samp{_bfd_relocate_contents}, depending upon the definition of the
1196 relocation and whether the link is relocatable or not.
1197
1198 @item
1199 Test the linker.  Test the case of a final link.  If the relocation can
1200 overflow, use a linker script to force an overflow and make sure the
1201 error is reported correctly.  Test a relocatable link, whether the
1202 symbol is defined or undefined in the relocatable output.  For both the
1203 final and relocatable link, test the case when the symbol is a common
1204 symbol, when the symbol looked like a common symbol but became a defined
1205 symbol, when the symbol is defined in a different object file, and when
1206 the symbol is defined in the same object file.
1207
1208 @item
1209 In order for linking to another object file format, such as S-records,
1210 to work correctly, @samp{bfd_perform_relocation} has to do the right
1211 thing for the relocation.  You may need to set the
1212 @samp{special_function} field to handle this correctly.  Test this by
1213 doing a link in which the output object file format is S-records.
1214
1215 @item
1216 Using the linker to generate relocatable output in a different object
1217 file format is impossible in the general case, so you generally don't
1218 have to worry about that.  Linking input files of different object file
1219 formats together is quite unusual, but if you're really dedicated you
1220 may want to consider testing this case, both when the output object file
1221 format is the same as your format, and when it is different.
1222 @end itemize
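
The kind of relocation which a howto structure can describe, setting a
contiguous bitfield from a symbol value with an optional in place
addend, can be sketched as follows.  The @samp{toy_howto} structure is a
hypothetical and much reduced stand-in; the real
@samp{reloc_howto_struct} is richer, and this sketch does no overflow
checking at all.

```c
#include <stdint.h>

/* Hypothetical, much reduced howto description.  */
struct toy_howto
{
  unsigned int rightshift;   /* Bits dropped from the value first.  */
  unsigned int bitpos;       /* Position of the field in the word.  */
  uint32_t mask;             /* Field bits, already shifted to BITPOS.  */
  int add_existing;          /* Nonzero: add the field's old contents.  */
};

/* Return WORD with the field described by HOWTO set from VALUE
   (the symbol value plus any addend).  Bits outside the mask are
   preserved.  No overflow checking is attempted in this sketch.  */
uint32_t
toy_apply_howto (const struct toy_howto *howto, uint32_t word, uint32_t value)
{
  uint32_t field = (value >> howto->rightshift) << howto->bitpos;

  if (howto->add_existing)
    field += word & howto->mask;
  return (word & ~howto->mask) | (field & howto->mask);
}
```

Anything which does not fit this shape, such as a value split across
two instructions, is the kind of case that would need special handling
outside the generic description.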
1223
1224 @node BFD relocation codes
1225 @subsection BFD relocation codes
1226
1227 BFD has another way of describing relocations besides the howto
1228 structures described above: the enum @samp{bfd_reloc_code_real_type}.
1229
1230 Every known relocation type can be described as a value in this
1231 enumeration.  The enumeration contains many target specific relocations,
1232 but where two or more targets have the same relocation, a single code is
1233 used.  For example, the single value @samp{BFD_RELOC_32} is used for all
1234 simple 32 bit relocation types.
1235
1236 The main purpose of this relocation code is to give the assembler some
1237 mechanism to create @samp{arelent} structures.  In order for the
1238 assembler to create an @samp{arelent} structure, it has to be able to
1239 obtain a howto structure.  The function @samp{bfd_reloc_type_lookup},
1240 which simply calls the target vector entry point
1241 @samp{reloc_type_lookup}, takes a relocation code and returns a howto
1242 structure.
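
The lookup from relocation code to howto structure can be pictured as a
small mapping function in each backend.  The sketch below uses
hypothetical @samp{toy_} names throughout; the enumeration stands in for
@samp{bfd_reloc_code_real_type}, and the table stands in for the howto
table of one imaginary target.

```c
#include <stddef.h>

/* Hypothetical generic relocation codes.  */
enum toy_reloc_code
{
  TOY_RELOC_32,
  TOY_RELOC_32_PCREL,
  TOY_RELOC_16
};

/* Hypothetical, reduced howto entry.  */
struct toy_howto_entry
{
  unsigned int type;   /* Format specific relocation number.  */
  const char *name;
  int pc_relative;
};

/* Howto table of one imaginary target.  */
const struct toy_howto_entry toy_howto_table[] =
{
  { 1, "R_TOY_32", 0 },
  { 2, "R_TOY_PC32", 1 }
};

/* Map a generic code to this target's howto entry.  Return NULL when
   the target has no such relocation.  */
const struct toy_howto_entry *
toy_reloc_type_lookup (enum toy_reloc_code code)
{
  switch (code)
    {
    case TOY_RELOC_32:
      return &toy_howto_table[0];
    case TOY_RELOC_32_PCREL:
      return &toy_howto_table[1];
    default:
      return NULL;
    }
}
```

A caller such as an assembler would have to treat a @samp{NULL} result
as an unsupported relocation and report an error.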
1243
1244 The function @samp{bfd_get_reloc_code_name} returns the name of a
1245 relocation code.  This is mainly used in error messages.
1246
1247 Using both howto structures and relocation codes can be somewhat
1248 confusing.  There are many processor specific relocation codes.
1249 However, the relocation is only fully defined by the howto structure.
1250 The same relocation code will map to different howto structures in
1251 different object file formats.  For example, the addend handling may be
1252 different.
1253
1254 Most of the relocation codes are not really general.  The assembler can
1255 not use them without already understanding what sorts of relocations can
1256 be used for a particular target.  It might be possible to replace the
1257 relocation codes with something simpler.
1258
1259 @node BFD relocation future
1260 @subsection BFD relocation future
1261
1262 Clearly the current BFD relocation support is in bad shape.  A
1263 wholesale rewrite would be very difficult, because it would require
1264 thorough testing of every BFD target.  So some sort of incremental
1265 change is required.
1266
1267 My vague thoughts on this would involve defining a new, clearly defined,
1268 howto structure.  Some mechanism would be used to determine which type
1269 of howto structure was being used by a particular format.
1270
1271 The new howto structure would clearly define the relocation behaviour in
1272 the case of an assembly, a relocatable link, and a final link.  At
1273 least one special function would be defined as an escape, and it might
1274 make sense to define more.
1275
1276 One or more generic functions similar to @samp{bfd_perform_relocation}
1277 would be written to handle the new howto structure.
1278
1279 This should make it possible to write a generic version of the relocate
1280 section functions used by the new linker.  The target specific code
1281 would provide some mechanism (a function pointer or an initial
1282 conversion) to convert target specific relocations into howto
1283 structures.
1284
1285 Ideally it would be possible to use this generic relocate section
1286 function for the generic linker as well.  That is, it would replace the
1287 @samp{bfd_generic_get_relocated_section_contents} function which is
1288 currently normally used.
1289
1290 For the special case of ELF dynamic linking, more consideration needs to
1291 be given to writing ELF specific but ELF target generic code to handle
1292 special relocation types such as GOT and PLT.
1293
1294 @node BFD ELF support
1295 @section BFD ELF support
1296 @cindex elf support in bfd
1297 @cindex bfd elf support
1298
1299 The ELF object file format is defined in two parts: a generic ABI and a
1300 processor specific supplement.  The ELF support in BFD is split in a
1301 similar fashion.  The processor specific support is largely kept within
1302 a single file.  The generic support is provided by several other files.
1303 The processor specific support provides a set of function pointers and
1304 constants used by the generic support.
1305
1306 @menu
1307 * BFD ELF generic support::             BFD ELF generic support
1308 * BFD ELF processor specific support::  BFD ELF processor specific support
1309 * BFD ELF future::                      BFD ELF future
1310 @end menu
1311
1312 @node BFD ELF generic support
1313 @subsection BFD ELF generic support
1314
1315 In general, functions which do not read external data from the ELF file
1316 are found in @file{elf.c}.  They operate on the internal forms of the
1317 ELF structures, which are defined in @file{include/elf/internal.h}.  The
1318 internal structures are defined in terms of @samp{bfd_vma}, and so may
1319 be used for both 32 bit and 64 bit ELF targets.
1320
1321 The file @file{elfcode.h} contains functions which operate on the
1322 external data.  @file{elfcode.h} is compiled twice, once via
1323 @file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
1324 @file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
1325 @file{elfcode.h} includes functions to swap the ELF structures in and
1326 out of external form, as well as a few more complex functions.
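
The effect of compiling one body of code for two sizes can be imitated
in a single file with token pasting.  The sketch below is only a
stand-in for the real arrangement: instead of a header which is included
twice, a hypothetical macro @samp{TOY_DEFINE_GET_WORD} plays the role of
the shared body, and @samp{TOY_NAME} shows how a size specific function
name can be built.  None of these names exist in BFD.

```c
#include <stdint.h>

/* Build a size specific name, e.g. toy_elf32_get_word.  */
#define TOY_NAME(size, suffix) toy_elf##size##_##suffix

/* Stand-in for the body of a header that is included once per size.
   The generated function reads a little-endian word of SIZE bits.  */
#define TOY_DEFINE_GET_WORD(size, type)                       \
  type TOY_NAME (size, get_word) (const unsigned char *p)     \
  {                                                           \
    type v = 0;                                               \
    int i;                                                    \
    for (i = (size) / 8 - 1; i >= 0; i--)                     \
      v = (type) ((v << 8) | p[i]);                           \
    return v;                                                 \
  }

/* "Compile" the body twice, once for each size.  */
TOY_DEFINE_GET_WORD (32, uint32_t)
TOY_DEFINE_GET_WORD (64, uint64_t)
```

This produces two functions, @samp{toy_elf32_get_word} and
@samp{toy_elf64_get_word}, from one piece of source text.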
1327
1328 Linker support is found in @file{elflink.c} and @file{elflink.h}.  The
1329 latter file is compiled twice, for both 32 and 64 bit support.  The
1330 linker support is only used if the processor specific file defines
1331 @samp{elf_backend_relocate_section}, which is required to relocate the
1332 section contents.  If that macro is not defined, the generic linker code
1333 is used, and relocations are handled via @samp{bfd_perform_relocation}.
1334
1335 The core file support is in @file{elfcore.h}, which is compiled twice,
1336 for both 32 and 64 bit support.  The more interesting cases of core file
1337 support only work on a native system which has the @file{sys/procfs.h}
1338 header file.  Without that file, the core file support does little more
1339 than read the ELF program segments as BFD sections.
1340
1341 The BFD internal header file @file{elf-bfd.h} is used for communication
1342 among these files and the processor specific files.
1343
1344 The default entries for the BFD ELF target vector are found mainly in
1345 @file{elf.c}.  Some functions are found in @file{elfcode.h}.
1346
1347 The processor specific files may override particular entries in the
1348 target vector, but most do not, with one exception: the
1349 @samp{bfd_reloc_type_lookup} entry point is always processor specific.
1350
1351 @node BFD ELF processor specific support
1352 @subsection BFD ELF processor specific support
1353
1354 By convention, the processor specific support for a particular processor
1355 will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
1356 either 32 or 64, and @var{cpu} is the name of the processor.
1357
1358 @menu
1359 * BFD ELF processor required::  Required processor specific support
1360 * BFD ELF processor linker::    Processor specific linker support
1361 * BFD ELF processor other::     Other processor specific support options
1362 @end menu
1363
1364 @node BFD ELF processor required
1365 @subsubsection Required processor specific support
1366
1367 When writing an @file{elf@var{nn}-@var{cpu}.c} file, you must do the
1368 following:
1369 @itemize @bullet
1370 @item
1371 Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
1372 both, to a unique C name to use for the target vector.  This name should
1373 appear in the list of target vectors in @file{targets.c}, and will also
1374 have to appear in @file{config.bfd} and @file{configure.in}.  Define
1375 @samp{TARGET_BIG_SYM} for a big-endian processor,
1376 @samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
1377 for a bi-endian processor.
1378 @item
1379 Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
1380 both, to a string used as the name of the target vector.  This is the
1381 name which a user of the BFD tool would use to specify the object file
1382 format.  It would normally appear in a linker emulation parameters
1383 file.
1384 @item
1385 Define @samp{ELF_ARCH} to the BFD architecture (an element of the
1386 @samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
1387 @item
1388 Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
1389 in the @samp{e_machine} field of the ELF header.  As of this writing,
1390 these magic numbers are assigned by SCO; if you want to get a magic
1391 number for a particular processor, try sending a note to
1392 @email{registry@@sco.com}.  In the BFD sources, the magic numbers are
1393 found in @file{include/elf/common.h}; they have names beginning with
1394 @samp{EM_}.
1395 @item
1396 Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
1397 memory.  This can normally be found at the start of chapter 5 in the
1398 processor specific supplement.  For a processor which will only be used
1399 in an embedded system, or which has no memory management hardware, this
1400 can simply be @samp{1}.
1401 @item
1402 If the format should use @samp{Rel} rather than @samp{Rela} relocations,
1403 define @samp{USE_REL}.  This is normally defined in chapter 4 of the
1404 processor specific supplement.  In the absence of a supplement, it's
1405 usually easier to work with @samp{Rela} relocations, although they will
1406 require more space in object files (but not in executables, except when
1407 using dynamic linking).  It is possible, though somewhat awkward, to
1408 support both @samp{Rel} and @samp{Rela} relocations for a single target;
1409 @file{elf64-mips.c} does it by overriding the relocation reading and
1410 writing routines.
1411 @item
1412 Define howto structures for all the relocation types.
1413 @item
1414 Define a @samp{bfd_reloc_type_lookup} routine.  This must be named
1415 @samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
1416 function or a macro.  It must translate a BFD relocation code into a
1417 howto structure.  This is normally a table lookup or a simple switch.
1418 @item
1419 If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
1420 If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
1421 Either way, this is a macro defined as the name of a function which
1422 takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
1423 sets the @samp{howto} field of the @samp{arelent} based on the
1424 @samp{Rel} or @samp{Rela} structure.  This normally uses
1425 @samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
1426 an index into a table of howto structures.
1427 @end itemize
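
The last three items can be illustrated with a small self-contained
model.  This is only a sketch: the @samp{toy_} types stand in for the
much larger BFD structures, and the processor @samp{foo} with its
@samp{R_FOO_} relocation numbers is hypothetical.  The
@samp{TOY_R_TYPE} macro is assumed to mirror the usual 32 bit ELF
layout, in which the relocation type occupies the low 8 bits of
@samp{r_info}.

@verbatim
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for BFD relocation codes and howto structures.  */
enum toy_reloc_code { TOY_RELOC_NONE, TOY_RELOC_32, TOY_RELOC_32_PCREL };

struct toy_howto
{
  unsigned int type;      /* ELF relocation number.  */
  unsigned int bitsize;   /* Width of the relocated field.  */
  int pc_relative;        /* Nonzero if PC relative.  */
  const char *name;
};

struct toy_rela { uint32_t r_offset, r_info; int32_t r_addend; };
struct toy_arelent { const struct toy_howto *howto; };

/* Hypothetical ELF relocation numbers for the "foo" processor.  */
enum { R_FOO_NONE = 0, R_FOO_32 = 1, R_FOO_PC32 = 2, R_FOO_max };

/* Assumed split of r_info: relocation type in the low 8 bits.  */
#define TOY_R_TYPE(info) ((info) & 0xff)

/* Howto table, indexed by ELF relocation number.  */
static const struct toy_howto foo_howto_table[R_FOO_max] =
{
  { R_FOO_NONE, 0,  0, "R_FOO_NONE" },
  { R_FOO_32,   32, 0, "R_FOO_32" },
  { R_FOO_PC32, 32, 1, "R_FOO_PC32" },
};

/* Plays the role of the reloc_type_lookup routine:
   generic relocation code -> howto, via a simple switch.  */
static const struct toy_howto *
foo_reloc_type_lookup (enum toy_reloc_code code)
{
  switch (code)
    {
    case TOY_RELOC_NONE:     return &foo_howto_table[R_FOO_NONE];
    case TOY_RELOC_32:       return &foo_howto_table[R_FOO_32];
    case TOY_RELOC_32_PCREL: return &foo_howto_table[R_FOO_PC32];
    }
  return NULL;   /* Unsupported relocation code.  */
}

/* Plays the role of the info_to_howto routine:
   Rela entry -> howto field of the arelent, via a table index.  */
static void
foo_info_to_howto (struct toy_arelent *cache_ptr,
                   const struct toy_rela *dst)
{
  unsigned int r_type = TOY_R_TYPE (dst->r_info);

  assert (r_type < R_FOO_max);
  cache_ptr->howto = &foo_howto_table[r_type];
}
@end verbatim

Keeping the howto table indexed by the ELF relocation number lets both
routines share it: the lookup routine maps a generic code to an index,
and the info-to-howto routine uses the number from the file directly.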
1428
1429 You must also add the magic number for this processor to the
1430 @samp{prep_headers} function in @file{elf.c}.
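
Putting the required definitions together, the macro portion of a file
for a big-endian 32 bit processor might look roughly like the fragment
below.  It is a sketch, not a complete file: the processor name
@samp{foo} and the names @samp{bfd_elf32_foo_vec}, @samp{bfd_arch_foo},
@samp{EM_FOO}, @samp{foo_reloc_type_lookup} and
@samp{foo_info_to_howto} are all hypothetical, the page size is an
arbitrary example value, and the closing include of
@file{elf32-target.h} reflects how existing backends are usually laid
out rather than anything stated above.

@verbatim
/* Fragment of a hypothetical elf32-foo.c.  The howto table and the
   two foo_* functions are assumed to be defined earlier in the file.  */

#define TARGET_BIG_SYM    bfd_elf32_foo_vec   /* C name of the vector.  */
#define TARGET_BIG_NAME   "elf32-foo"         /* User visible name.  */
#define ELF_ARCH          bfd_arch_foo
#define ELF_MACHINE_CODE  EM_FOO              /* e_machine value.  */
#define ELF_MAXPAGESIZE   0x1000              /* Example value only.  */

/* This sketch uses Rela relocations, so USE_REL is not defined.  */
#define bfd_elf32_bfd_reloc_type_lookup  foo_reloc_type_lookup
#define elf_info_to_howto                foo_info_to_howto

#include "elf32-target.h"
@end verbatim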
1431
1432 @node BFD ELF processor linker
1433 @subsubsection Processor specific linker support
1434
1435 The linker will be much more efficient if you define a relocate section
1436 function.  This will permit BFD to use the ELF specific linker support.
1437
1438 If you do not define a relocate section function, BFD must use the
1439 generic linker support, which requires converting all symbols and
1440 relocations into BFD @samp{asymbol} and @samp{arelent} structures.  In
1441 this case, relocations will be handled by calling
1442 @samp{bfd_perform_relocation}, which will use the howto structures you
1443 have defined.  @xref{BFD relocation handling}.
1444
1445 In order to support linking into a different object file format, such as
1446 S-records, @samp{bfd_perform_relocation} must work correctly with your
1447 howto structures, so you can't skip that step.  However, if you define
1448 the relocate section function, then in the normal case of linking into
1449 an ELF file the linker will not need to convert symbols and relocations,
1450 and will be much more efficient.
1451
1452 To use a relocate section function, define the macro
1453 @samp{elf_backend_relocate_section} as the name of a function which will
1454 take the contents of a section, as well as relocation, symbol, and other
1455 information, and modify the section contents according to the relocation
1456 information.  In simple cases, this is little more than a loop over the
1457 relocations which computes the value of each relocation and calls
1458 @samp{_bfd_final_link_relocate}.  The function must check for a
1459 relocatable link, and in that case normally needs to do nothing other
1460 than adjust the addend for relocations against a section symbol.
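
The shape of the simple case can be sketched with a toy model.  None
of the types or names below are the real BFD ones, and the signature
is deliberately simplified; the real function takes considerably more
information.  The sketch assumes a single 32 bit absolute relocation
type and stores the value in host byte order, where real code would be
driven by the howto structure.

@verbatim
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins; these are not the real BFD structures.  */
struct toy_sym
{
  uint32_t value;              /* Final value of the symbol.  */
  int is_section;              /* Nonzero for a section symbol.  */
  uint32_t sec_output_offset;  /* Where the section lands in the output.  */
};

struct toy_rela { uint32_t r_offset; uint32_t sym_index; int32_t r_addend; };

/* Returns 1 on success, 0 if a relocation lies outside the section.  */
static int
toy_relocate_section (unsigned char *contents, size_t size,
                      struct toy_rela *relocs, size_t nrelocs,
                      const struct toy_sym *syms, int relocatable)
{
  size_t i;

  for (i = 0; i < nrelocs; i++)
    {
      struct toy_rela *rel = &relocs[i];
      const struct toy_sym *sym = &syms[rel->sym_index];
      uint32_t value;

      if (relocatable)
        {
          /* The output keeps its relocations, so only fold the new
             section offset into addends against section symbols.  */
          if (sym->is_section)
            rel->r_addend += (int32_t) sym->sec_output_offset;
          continue;
        }

      if (rel->r_offset > size || size - rel->r_offset < sizeof value)
        return 0;

      /* Final link: compute the value and patch the contents.  This
         store stands in for the call that applies the relocation.  */
      value = sym->value + (uint32_t) rel->r_addend;
      memcpy (contents + rel->r_offset, &value, sizeof value);
    }
  return 1;
}
@end verbatim

Note that the relocatable branch never touches the section contents;
with addends stored in the relocation entries, adjusting the entry is
enough.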
1461
1462 The complex cases generally have to do with dynamic linker support.  GOT
1463 and PLT relocations must be handled specially, and the linker normally
1464 arranges to set up the GOT and PLT sections while handling relocations.
1465 When generating a shared library, arbitrary relocations must normally be
1466 copied into the shared library, or converted to RELATIVE relocations
1467 when possible.
1468
1469 @node BFD ELF processor other
1470 @subsubsection Other processor specific support options
1471
1472 There are many other macros which may be defined in
1473 @file{elf@var{nn}-@var{cpu}.c}.  These macros may be found in
1474 @file{elfxx-target.h}.
1475
1476 Macros may be used to override some of the generic ELF target vector
1477 functions.
1478
1479 There are several processor specific hook functions which may be defined
1480 as macros.  These functions are found as function pointers in the
1481 @samp{elf_backend_data} structure defined in @file{elf-bfd.h}.  In
1482 general, a hook function is set by defining a macro
1483 @samp{elf_backend_@var{name}}.
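
The macro-to-function-pointer pattern can be sketched in isolation as
follows.  This is a toy model under stated assumptions: the
@samp{toy_} names are hypothetical, the structure has only two hooks,
and the convention that an undefined hook macro defaults to a null
pointer is my understanding of how the shared target header behaves,
not something this document specifies.

@verbatim
#include <assert.h>
#include <stddef.h>

/* Toy backend data structure holding hook function pointers.  */
struct toy_backend_data
{
  int (*relocate_section) (int);
  int (*add_symbol_hook) (int);
};

/* --- Part played by the processor specific file.  --- */
static int foo_relocate_section (int x) { return x + 1; }
#define toy_backend_relocate_section foo_relocate_section

/* --- Part played by the shared target header.  --- */
#ifndef toy_backend_relocate_section
#define toy_backend_relocate_section 0
#endif
#ifndef toy_backend_add_symbol_hook
#define toy_backend_add_symbol_hook 0   /* Not overridden above.  */
#endif

static const struct toy_backend_data toy_bed =
{
  toy_backend_relocate_section,
  toy_backend_add_symbol_hook
};

/* Generic code checks for a null hook before calling it.  */
static int
toy_call_add_symbol_hook (const struct toy_backend_data *bed, int x)
{
  return bed->add_symbol_hook ? bed->add_symbol_hook (x) : x;
}
@end verbatim

The processor specific file only names the hooks it cares about;
everything else falls back to the default without further code.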
1484
1485 There are a few processor specific constants which may also be defined.
1486 These are again found in the @samp{elf_backend_data} structure.
1487
1488 I will not define the various functions and constants here; see the
1489 comments in @file{elf-bfd.h}.
1490
1491 Normally any odd characteristic of a particular ELF processor is handled
1492 via a hook function.  For example, the special @samp{SHN_MIPS_SCOMMON}
1493 section number found in MIPS ELF is handled via the hooks
1494 @samp{section_from_bfd_section}, @samp{symbol_processing},
1495 @samp{add_symbol_hook}, and @samp{output_symbol_hook}.
1496
1497 Dynamic linking support, which involves processor specific relocations
1498 requiring special handling, is also implemented via hook functions.
1499
1500 @node BFD ELF future
1501 @subsection BFD ELF future
1502
1503 The current dynamic linking support has too much code duplication.
1504 While each processor has particular differences, much of the dynamic
1505 linking support is quite similar for each processor.  The GOT and PLT
1506 are handled in fairly similar ways, the details of @samp{-Bsymbolic} linking
1507 are generally similar, etc.  This code should be reworked to use more
1508 generic functions, eliminating the duplication.
1509
1510 Similarly, the relocation handling has too much duplication.  Many of
1511 the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
1512 quite similar.  The relocate section functions are also often quite
1513 similar, both in the standard linker handling and the dynamic linker
1514 handling.  Many of the COFF processor specific backends share a single
1515 relocate section function (@samp{_bfd_coff_generic_relocate_section}),
1516 and it should be possible to do something like this for the ELF targets
1517 as well.
1518
1519 The appearance of the processor specific magic number in
1520 @samp{prep_headers} in @file{elf.c} is somewhat bogus.  It should be
1521 possible to add support for a new processor without changing the generic
1522 support.
1523
1524 The processor function hooks and constants are ad hoc and need better
1525 documentation.
1526
1527 @node Index
1528 @unnumberedsec Index
1529 @printindex cp
1530
1531 @contents
1532 @bye