\input texinfo
@setfilename ldint.info
+@c Copyright (C) 1992-2014 Free Software Foundation, Inc.
-@ifinfo
-@format
-START-INFO-DIR-ENTRY
+@ifnottex
+@dircategory Software development
+@direntry
* Ld-Internals: (ldint). The GNU linker internals.
-END-INFO-DIR-ENTRY
-@end format
-@end ifinfo
+@end direntry
+@end ifnottex
-@ifinfo
+@copying
This file documents the internals of the GNU linker ld.
-Copyright (C) 1992, 93, 94, 95, 96, 97, 1998 Free Software Foundation, Inc.
+Copyright @copyright{} 1992-2014 Free Software Foundation, Inc.
Contributed by Cygnus Support.
-Permission is granted to make and distribute verbatim copies of
-this manual provided the copyright notice and this permission notice
-are preserved on all copies.
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``Funding
+Free Software'', the Front-Cover texts being (a) (see below), and with
+the Back-Cover Texts being (b) (see below). A copy of the license is
+included in the section entitled ``GNU Free Documentation License''.
-@ignore
-Permission is granted to process this file through Tex and print the
-results, provided the printed document carries copying permission
-notice identical to this one except for the removal of this paragraph
-(this paragraph not being relevant to the printed manual).
+(a) The FSF's Front-Cover Text is:
-@end ignore
-Permission is granted to copy or distribute modified versions of this
-manual under the terms of the GPL (for which purpose this text may be
-regarded as a program in the language TeX).
-@end ifinfo
+ A GNU Manual
+
+(b) The FSF's Back-Cover Text is:
+
+ You have freedom to copy and modify this GNU Manual, like GNU
+ software. Copies published by the Free Software Foundation raise
+ funds for GNU development.
+@end copying
@iftex
@finalout
@tex
\def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
-\xdef\manvers{\$Revision$} % For use in headers, footers too
+\xdef\manvers{2.10.91} % For use in headers, footers too
{\parskip=0pt
\hfill Cygnus Support\par
\hfill \manvers\par
@end tex
@vskip 0pt plus 1filll
-Copyright @copyright{} 1992, 93, 94, 95, 96, 97, 1998
-Free Software Foundation, Inc.
+Copyright @copyright{} 1992-2014 Free Software Foundation, Inc.
-Permission is granted to make and distribute verbatim copies of
-this manual provided the copyright notice and this permission notice
-are preserved on all copies.
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.3
+ or any later version published by the Free Software Foundation;
+ with no Invariant Sections, with no Front-Cover Texts, and with no
+ Back-Cover Texts. A copy of the license is included in the
+ section entitled "GNU Free Documentation License".
@end titlepage
@end iftex
Mostly, it is a repository into which you can put information about
GNU @code{ld} as you discover it (or as you design changes to @code{ld}).
+This document is distributed under the terms of the GNU Free
+Documentation License. A copy of the license is included in the
+section entitled "GNU Free Documentation License".
+
@menu
* README:: The README File
* Emulations:: How linker emulations are generated
* Emulation Walkthrough:: A Walkthrough of a Typical Emulation
+* Architecture Specific:: Some Architecture Specific Notes
+* GNU Free Documentation License:: GNU Free Documentation License
@end menu
@node README
@item SCRIPT_NAME
This is the name of the @file{scripttempl} script to use. If
@code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
-the script @file{scriptteml/@var{script}.sc}.
+the script @file{scripttempl/@var{script}.sc}.
@item TEMPLATE_NAME
-This is the name of the @file{emultemlp} script to use. If
+This is the name of the @file{emultempl} script to use. If
@code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
use the script @file{emultempl/@var{template}.em}. If this variable is
not set, the default value is @samp{generic}.
Some @file{scripttempl} scripts use this to set the start address of the
@samp{.text} section.
-@item NONPAGED_TEXT_START_ADDR
-If this is defined, the @file{genscripts.sh} script sets
-@code{TEXT_START_ADDR} to its value before running the
-@file{scripttempl} script for the @code{-n} and @code{-N} options
-(@pxref{linker scripts}).
-
@item SEGMENT_SIZE
The @file{genscripts.sh} script uses this to set the default value of
@code{DATA_ALIGNMENT} when running the @file{scripttempl} script.
invoke @file{scripttempl/@var{script}.sc}.
The @file{genscripts.sh} script will invoke the @file{scripttempl}
-script 5 or 6 times. Each time it will set the shell variable
+script 5 to 9 times. Each time it will set the shell variable
@code{LD_FLAG} to a different value. When the linker is run, the
options used will direct it to select a particular script. (Script
selection is controlled by the @code{get_script} emulation entry point;
this script at the appropriate time, normally when the linker is invoked
with the @code{-shared} option. The output has an extension of
@file{.xs}.
+@item c
+The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
+this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
+@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
+@file{emultempl} script must arrange to use this script at the appropriate
+time, normally when the linker is invoked with the @code{-z combreloc}
+option. The output has an extension of
+@file{.xc}.
+@item cshared
+The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
+this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
+@file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
+@code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
+The @file{emultempl} script must arrange to use this script at the
+appropriate time, normally when the linker is invoked with the @code{-shared
+-z combreloc} option. The output has an extension of @file{.xsc}.
+@item auto_import
+The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
+this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the
+@file{emulparams} file. The @file{emultempl} script must arrange to
+use this script at the appropriate time, normally when the linker is
+invoked with the @code{--enable-auto-import} option. The output has
+an extension of @file{.xa}.
@end table
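+
+As a minimal sketch of how these conditions combine, consider a
+hypothetical @file{emulparams} file. The file and the values chosen,
+in particular the @code{TEMPLATE_NAME} value, are assumptions made
+for illustration and are not taken from any particular target:
+
+@smallexample
+# Hypothetical emulparams fragment; all values are illustrative.
+SCRIPT_NAME=elf
+TEMPLATE_NAME=elf32
+GENERATE_SHLIB_SCRIPT=yes
+GENERATE_COMBRELOC_SCRIPT=yes
+@end smallexample
+
+Going by the conditions listed above, @file{genscripts.sh} would then
+also invoke the @file{scripttempl} script with @code{LD_FLAG} set to
+@samp{shared}, @samp{c}, and @samp{cshared}, producing outputs with the
+@file{.xs}, @file{.xc}, and @file{.xsc} extensions respectively.
+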
Besides the shell variables set by the @file{emulparams} script, and the
@item CREATE_SHLIB
This will be set to a non-empty string when generating a @code{-shared}
script.
+
+@item COMBRELOC
+When generating @code{-z combreloc} scripts, this will be set to the
+name of a temporary file (and hence to a non-empty string) which can be
+used during script generation.
@end table
The conventional way to write a @file{scripttempl} script is to first
variable substitution based on @code{RELOCATING}. For example, on many
targets special symbols such as @code{_end} should be defined when doing
a final link. Naturally, those symbols should not be defined when doing
-a relocateable link using @code{-r}. The @file{scripttempl} script
+a relocatable link using @code{-r}. The @file{scripttempl} script
could use a construct like this to define those symbols:
@smallexample
$@{RELOCATING+ _end = .;@}
@end itemize
+@node Architecture Specific
+@chapter Some Architecture Specific Notes
+
+This is the place for notes on the behavior of @code{ld} on
+specific platforms. Currently, only Intel x86 is documented (and
+of that, only the auto-import behavior for DLLs).
+
+@menu
+* ix86:: Intel x86
+@end menu
+
+@node ix86
+@section Intel x86
+
+@table @emph
+@code{ld} can create DLLs that operate with various runtimes available
+on a common x86 operating system. These runtimes include native (using
+the mingw "platform"), cygwin, and pw.
+
+@item auto-import from DLLs
+@enumerate
+@item
+With this feature on, DLL clients can import variables from a DLL
+without any effort on their part (for example, without any source
+code modifications). Auto-import can be enabled using the
+@code{--enable-auto-import} flag, or disabled via the
+@code{--disable-auto-import} flag. Auto-import is disabled by default.
+
+@item
+This is done completely within the bounds of the PE specification (to
+be fair, there's a minor violation of the spec at one point, but in
+practice auto-import works on all known variants of that common x86
+operating system). So, the resulting DLL can be used with any other PE
+compiler/linker.
+
+@item
+Auto-import is fully compatible with the standard import method, in which
+variables are decorated using attribute modifiers. Libraries of either
+type may be mixed together.
+
+@item
+Overhead (space): 8 bytes per imported symbol, plus 20 for each
+reference to it; Overhead (load time): negligible; Overhead
+(virtual/physical memory): should be less than the effect of DLL
+relocation.
+@end enumerate
+
+Motivation
+
+The obvious and only way to get rid of dllimport insanity is
+to make the client access the variable directly in the DLL, bypassing
+the extra dereference imposed by ordinary DLL runtime linking.
+I.e., whenever the client contains something like
+
+@code{mov dll_var,%eax},
+
+the address of dll_var in the instruction should be relocated to point
+into the loaded DLL. The aim is to make the OS loader do so, and then
+make ld help with that. The import section of a PE file is laid out in
+the following way: there's a vector of structures, each describing the
+imports from a particular DLL. Each such structure points to two other
+parallel vectors: one holding the imported names, and one which
+will hold the address of the corresponding imported name. So, the
+solution is to de-vectorize these structures, making the import
+locations sparse and pointing directly into code.
+
+Implementation
+
+For each reference to a data symbol to be imported from a DLL (a symbol
+named <sym> belongs to this set if __imp_<sym> is found in the implib),
+an import fixup entry is generated. That entry is of type
+IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3 subsection. Each
+fixup entry contains a pointer to the symbol's address within the .text
+section (marked with a __fuN_<sym> symbol, where N is an integer), a
+pointer to the DLL name (so the DLL name is referenced by multiple
+entries), and a pointer to the symbol name thunk. The symbol name
+thunk is a singleton vector (__nm_th_<symbol>) pointing to an
+IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing the
+imported name. Here comes that "on the edge" problem mentioned above:
+the PE specification states that the name vector (OriginalFirstThunk)
+should run in parallel with the address vector (FirstThunk), i.e. that
+they should have the same number of elements and be terminated with
+zero. We violate this, since FirstThunk points directly into machine
+code. But in practice, the OS loader is implemented the sane way: it
+goes through OriginalFirstThunk and puts addresses into FirstThunk, and
+does nothing else. It should be noted once again that the DLL and symbol
+name structures are reused across fixup entries and have to be there
+anyway to support the standard import mechanism, so the sustained
+overhead is 20 bytes per reference. Another question is whether having
+several IMAGE_IMPORT_DESCRIPTORS for the same DLL is possible. The
+answer is yes; it is done even by the native compiler/linker (libth32's
+functions are in fact resident in windows9x kernel32.dll, so if you use
+it, you have two IMAGE_IMPORT_DESCRIPTORS for kernel32.dll). Yet another
+question is whether referencing the same PE structures several times is
+valid. The answer is: why not? Prohibiting that (detecting the
+violation) would require more work on the part of the loader than not
+doing it.
+
+@end table
+
+@node GNU Free Documentation License
+@chapter GNU Free Documentation License
+
+@include fdl.texi
+
@contents
@bye