llvm/docs/LangRef.rst

==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.
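
As a minimal sketch of the promotion described above (the function names
``@f`` and ``@f.promoted`` are hypothetical and chosen only for
illustration), a front end might first emit a local variable as a stack
slot. Once an optimizer has shown that the slot's address never escapes,
it can rewrite the variable as a plain SSA value:

.. code-block:: llvm

    ; Before: the C automatic variable lives in a stack slot.
    define i32 @f(i32 %a) {
      %x = alloca i32
      store i32 %a, i32* %x
      %v = load i32, i32* %x
      ret i32 %v
    }

    ; After: the variable has been promoted to an SSA value.
    define i32 @f.promoted(i32 %a) {
      ret i32 %a
    }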

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
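
For contrast, here is a small sketch of a well-formed cyclic dependency
(the function ``@count_to`` and its value names are hypothetical). A value
may feed back into its own computation through a ``phi`` node, because the
use in the ``phi`` is considered to occur on the incoming edge from
``%loop``, which the definition of ``%i.next`` does dominate:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %i is 0 on entry and %i.next on every later iteration.
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %i.next
    }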

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).
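
The identifier formats above can be sketched with a few global
definitions. Every name other than ``@DivisionByZero`` is made up for this
illustration:

.. code-block:: llvm

    @DivisionByZero = global i32 0         ; plain named global
    @"name with spaces" = global i32 0     ; quoted name with other characters
    @"say \22hi\22" = global i32 0         ; \22 is the hex escape for '"'
    @"\01_exact_symbol" = global i32 0     ; "\01" prefix suppresses mangling
    @0 = global i32 0                      ; unnamed global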

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.
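
The numbering rule for unnamed values can be illustrated with a small
sketch (the function ``@twice`` is hypothetical). Because the parameter
and the entry block are both unnamed, they consume the first two numbers
before any instruction does:

.. code-block:: llvm

    define i32 @twice(i32 %0) {    ; the unnamed parameter takes number 0
                                   ; the unlabeled entry block takes number 1
      %2 = add i32 %0, %0          ; so the first unnamed result is %2
      ret i32 %2
    }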

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. Such
    a symbol doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which llvm
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.
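
The following sketch shows where the linkage keyword is written on
definitions and declarations. All symbol names are hypothetical, and the
bodies are placeholders rather than meaningful code:

.. code-block:: llvm

    ; Global variables: the linkage keyword follows the '='.
    @.msg = private unnamed_addr constant [3 x i8] c"hi\00"
    @counter = internal global i32 0
    @tentative = common global i32 0        ; zero initializer is required
    @optional_data = extern_weak global i32 ; declaration, null if never linked

    ; Functions: the linkage keyword follows 'define' or 'declare'.
    define linkonce_odr i32 @inline_candidate() {
      ret i32 1
    }

    define weak i32 @overridable() {
      ret i32 0
    }

    define available_externally i32 @known_body() {
      ret i32 7
    }

    declare extern_weak void @optional_hook()

Note that the two declarations use ``extern_weak``; any other explicit
linkage on a declaration would be rejected, as stated above.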

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* it only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64* it only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
    calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
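
As a brief sketch of how the conventions above appear in assembly (all
function names are hypothetical), the keyword is written on the function
declaration or definition and repeated at each call site so that caller
and callee agree:

.. code-block:: llvm

    declare fastcc i32 @fast_helper(i32)
    declare coldcc void @report_error()

    ; A call in tail position using the fast convention on both sides.
    define fastcc i32 @caller(i32 %x) {
      %r = tail call fastcc i32 @fast_helper(i32 %x)
      ret i32 %r
    }

    ; No keyword on the function means the default C convention (ccc),
    ; but the call site must still name the callee's convention.
    define void @slow_path() {
      call coldcc void @report_error()
      ret void
    }

    ; A convention given by number; 10 is the GHC convention.
    declare cc 10 void @ghc_entry(i64)

Omitting ``coldcc`` from the call in ``@slow_path`` would make the caller
and callee conventions differ, which the text above describes as undefined
behavior.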

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
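
A short sketch of the visibility keywords in use (symbol names are
hypothetical). The keyword follows the linkage, if any:

.. code-block:: llvm

    @public_counter = global i32 0       ; no keyword: default visibility
    @impl_detail = hidden global i32 0   ; kept out of the dynamic symbol table

    define protected void @api_entry() {
      ret void
    }

    ; Internal linkage, so the visibility must be left as default.
    @file_local = internal global i32 0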

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.
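
A minimal sketch of both storage classes follows. The symbol names are
hypothetical; per the text above, the import pointer for a symbol such as
``imported_fn`` would be named ``__imp_imported_fn`` on Microsoft Windows
targets:

.. code-block:: llvm

    ; Symbols provided by another DLL.
    declare dllimport void @imported_fn()
    @imported_var = external dllimport global i32

    ; Symbols this module makes available to other modules.
    define dllexport void @exported_fn() {
      ret void
    }

    @exported_var = dllexport global i32 0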

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated
TLS code.
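
The models above can be sketched as follows (variable names are
hypothetical). The model, when given, is written in parentheses after
``thread_local``:

.. code-block:: llvm

    @tls_gd = thread_local global i32 0                ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0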

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
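
A small sketch of the two specifiers (symbol names are hypothetical):

.. code-block:: llvm

    @local_state = dso_local global i32 0

    ; Defined in another compilation unit of the same linkage unit, so
    ; direct access may still be generated.
    declare dso_local void @same_unit_helper()

    define dso_preemptable void @replaceable_hook() {
      ret void
    }

    ; No specifier: assumed to be dso_preemptable.
    @implicit = global i32 0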

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
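
To contrast the three kinds of structure type mentioned above, here is a
short sketch with hypothetical type and variable names:

.. code-block:: llvm

    %list_node = type { %list_node*, i32 }   ; identified, may refer to itself
    %handle = type opaque                    ; opaque, body not yet available

    ; A literal structure type is written inline wherever it is used.
    @pair = global { i32, float } { i32 1, float 2.0 }

    ; An identified type is referred to by name.
    @head = global %list_node { %list_node* null, i32 0 }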
 589
 590 .. _nointptrtype:
 591
 592 Non-Integral Pointer Type
 593 -------------------------
 594
 595 Note: non-integral pointer types are a work in progress, and they should be
 596 considered experimental at this time.
 597
 598 LLVM IR optionally allows the frontend to denote pointers in certain address
 599 spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
 600 Non-integral pointer types represent pointers that have an *unspecified* bitwise
 601 representation; that is, the integral representation may be target dependent or
 602 unstable (not backed by a fixed integer).
 603
 604 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 605 integral (i.e. normal) pointers in that they convert integers to and from
 606 corresponding pointer types, but there are additional implications to be
 607 aware of.  Because the bit-representation of a non-integral pointer may
 608 not be stable, two identical casts of the same operand may or may not
 609 return the same value.  Said differently, the conversion to or from the
 610 non-integral type depends on environmental state in an implementation
 611 defined manner.
 612
 613 If the frontend wishes to observe a *particular* value following a cast, the
 614 generated IR must fence with the underlying environment in an implementation
 615 defined manner. (In practice, this tends to require ``noinline`` routines for
 616 such operations.)
 617
 618 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
 619 non-integral types are analogous to ones on integral types with one
 620 key exception: the optimizer may not, in general, insert new dynamic
 621 occurrences of such casts.  If a new cast is inserted, the optimizer would
 622 need to either ensure that a) all possible values are valid, or b)
 623 appropriate fencing is inserted.  Since the appropriate fencing is
 624 implementation defined, the optimizer can't do the latter.  The former is
 625 challenging as many commonly expected properties, such as
 626 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
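
A minimal sketch of that last point, assuming a module whose datalayout marks
address space 1 as non-integral (the function name ``@diff`` is hypothetical):

.. code-block:: llvm

    ; Assumption: address space 1 is declared non-integral.
    target datalayout = "e-ni:1"

    define i64 @diff(i8 addrspace(1)* %v) {
      %a = ptrtoint i8 addrspace(1)* %v to i64
      %b = ptrtoint i8 addrspace(1)* %v to i64
      ; For an integral pointer this difference is always 0. For a
      ; non-integral pointer the two casts may yield different values.
      %d = sub i64 %a, %b
      ret i64 %d
    }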
 627
 628 .. _globalvars:
 629
 630 Global Variables
 631 ----------------
 632
 633 Global variables define regions of memory allocated at compilation time
 634 instead of run-time.
 635
 636 Global variable definitions must be initialized.
 637
 638 Global variables in other translation units can also be declared, in which
 639 case they don't have an initializer.
 640
 641 Global variables can optionally specify a :ref:`linkage type <linkage>`.
 642
 643 Both global variable definitions and declarations may specify an explicit
 644 section in which to be placed and an optional explicit alignment. If there
 645 is a mismatch between the explicit or inferred section information for the
 646 variable declaration and its definition, the resulting behavior is undefined.
 647
 648 A variable may be defined as a global ``constant``, which indicates that
 649 the contents of the variable will **never** be modified (enabling better
 650 optimization, allowing the global data to be placed in the read-only
 651 section of an executable, etc). Note that variables that need runtime
 652 initialization cannot be marked ``constant`` as there is a store to the
 653 variable.
 654
 655 LLVM explicitly allows *declarations* of global variables to be marked
 656 constant, even if the final definition of the global is not. This
 657 capability can be used to enable slightly better optimization of the
 658 program, but requires the language definition to guarantee that
 659 optimizations based on the 'constantness' are valid for the translation
 660 units that do not include the definition.
 661
 662 As SSA values, global variables define pointer values that are in scope in
 663 (i.e. they dominate) all basic blocks in the program. Global variables
 664 always define a pointer to their "content" type because they describe a
 665 region of memory, and all memory objects in LLVM are accessed through
 666 pointers.
 667
 668 Global variables can be marked with ``unnamed_addr`` which indicates
 669 that the address is not significant, only the content. Constants marked
 670 like this can be merged with other constants if they have the same
 671 initializer. Note that a constant with significant address *can* be
 672 merged with an ``unnamed_addr`` constant, the result being a constant
 673 whose address is significant.
 674
 675 If the ``local_unnamed_addr`` attribute is given, the address is known to
 676 not be significant within the module.
 677
 678 A global variable may be declared to reside in a target-specific
 679 numbered address space. For targets that support them, address spaces
 680 may affect how optimizations are performed and/or what target
 681 instructions are used to access the variable. The default address space
 682 is zero. The address space qualifier must precede any other attributes.
 683
 684 LLVM allows an explicit section to be specified for globals. If the
 685 target supports it, it will emit globals to the section specified.
 686 Additionally, the global can be placed in a comdat if the target has the necessary
 687 support.
 688
 689 External declarations may have an explicit section specified. Section
 690 information is retained in LLVM IR for targets that make use of this
 691 information. Attaching section information to an external declaration is an
 692 assertion that its definition is located in the specified section. If the
 693 definition is located in a different section, the behavior is undefined.
 694
 695 By default, global initializers are optimized by assuming that global
 696 variables defined within the module are not modified from their
 697 initial values before the start of the global initializer. This is
 698 true even for variables potentially accessible from outside the
 699 module, including those with external linkage or appearing in
 700 ``@llvm.used`` or dllexported variables. This assumption may be suppressed
 701 by marking the variable with ``externally_initialized``.
 702
 703 An explicit alignment may be specified for a global, which must be a
 704 power of 2. If not present, or if the alignment is set to zero, the
 705 alignment of the global is set by the target to whatever it feels
 706 convenient. If an explicit alignment is specified, the global is forced
 707 to have exactly that alignment. Targets and optimizers are not allowed
 708 to over-align the global if the global has an assigned section. In this
 709 case, the extra alignment could be observable: for example, code could
 710 assume that the globals are densely packed in their section and try to
 711 iterate over them as an array; alignment padding would break this
 712 iteration. The maximum alignment is ``1 << 32``.
 713
 714 For global variable declarations, as well as definitions that may be
 715 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
 716 linkage types), LLVM makes no assumptions about the allocation size of the
 717 variables, except that they may not overlap. The alignment of a global variable
 718 declaration or replaceable definition must not be greater than the alignment of
 719 the definition it resolves to.
 720
 721 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
 722 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
 723 optional :ref:`global attributes <glattrs>` and
 724 an optional list of attached :ref:`metadata <metadata>`.
 725
 726 Variables and aliases can have a
 727 :ref:`Thread Local Storage Model <tls_model>`.
 728
 729 :ref:`Scalable vectors <t_vector>` cannot be global variables or members of
 730 arrays because their size is unknown at compile time. They are allowed in
 731 structs to facilitate intrinsics returning multiple values. Structs containing
 732 scalable vectors cannot be used in loads, stores, allocas, or GEPs.
 733
 734 Syntax::
 735
 736       @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
 737                          [DLLStorageClass] [ThreadLocal]
 738                          [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
 739                          [ExternallyInitialized]
 740                          <global | constant> <Type> [<InitializerConstant>]
 741                          [, section "name"] [, comdat [($name)]]
 742                          [, align <Alignment>] (, !name !N)*
 743
 744 For example, the following defines a global in a numbered address space
 745 with an initializer, section, and alignment:
 746
 747 .. code-block:: llvm
 748
 749     @G = addrspace(5) constant float 1.0, section "foo", align 4
 750
 751 The following example just declares a global variable:
 752
 753 .. code-block:: llvm
 754
 755    @G = external global i32
 756
 757 The following example defines a thread-local global with the
 758 ``initialexec`` TLS model:
 759
 760 .. code-block:: llvm
 761
 762     @G = thread_local(initialexec) global i32 0, align 4
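
A few further sketches of the attributes described above; the global names
(``@.str``, ``@table``, ``@patched``) are made up for illustration:

.. code-block:: llvm

    ; Address is not significant, so this may be merged with other
    ; constants that have the same initializer.
    @.str = private unnamed_addr constant [6 x i8] c"hello\00", align 1

    ; Address is not significant within this module only.
    @table = local_unnamed_addr constant [2 x i32] [i32 1, i32 2]

    ; Suppresses the assumption that the initial value is unmodified
    ; before the global initializers start.
    @patched = externally_initialized global i32 0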
 763
 764 .. _functionstructure:
 765
 766 Functions
 767 ---------
 768
 769 LLVM function definitions consist of the "``define``" keyword, an
 770 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
 771 specifier <runtime_preemption_model>`,  an optional :ref:`visibility
 772 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
 773 an optional :ref:`calling convention <callingconv>`,
 774 an optional ``unnamed_addr`` attribute, a return type, an optional
 775 :ref:`parameter attribute <paramattrs>` for the return type, a function
 776 name, a (possibly empty) argument list (each with optional :ref:`parameter
 777 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
 778 an optional address space, an optional section, an optional alignment,
 779 an optional :ref:`comdat <langref_comdats>`,
 780 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
 781 an optional :ref:`prologue <prologuedata>`,
 782 an optional :ref:`personality <personalityfn>`,
 783 an optional list of attached :ref:`metadata <metadata>`,
 784 an opening curly brace, a list of basic blocks, and a closing curly brace.
 785
 786 LLVM function declarations consist of the "``declare``" keyword, an
 787 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
 788 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
 789 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
 790 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
 791 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
 792 empty list of arguments, an optional alignment, an optional :ref:`garbage
 793 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
 794 :ref:`prologue <prologuedata>`.
 795
 796 A function definition contains a list of basic blocks, forming the CFG (Control
 797 Flow Graph) for the function. Each basic block may optionally start with a label
 798 (giving the basic block a symbol table entry), contains a list of instructions,
 799 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
 800 function return). If an explicit label name is not provided, a block is assigned
 801 an implicit numbered label, using the next value from the same counter as used
 802 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
 803 function entry block does not have an explicit label, it will be assigned label
 804 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
 805 numeric label is explicitly specified, it must match the numeric label that
 806 would be used implicitly.
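
The numbering rule can be sketched as follows (``@inc`` is a hypothetical
function whose only argument is named, so the counter starts at the entry
block):

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; No label here: the entry block is implicitly %0.
      %1 = add i32 %x, 1
      br label %2

    2:  ; An explicit numeric label must be the next number in sequence.
      ret i32 %1
    }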
 807
 808 The first basic block in a function is special in two ways: it is
 809 immediately executed on entrance to the function, and it is not allowed
 810 to have predecessor basic blocks (i.e. there cannot be any branches to
 811 the entry block of a function). Because the block can have no
 812 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
 813
 814 LLVM allows an explicit section to be specified for functions. If the
 815 target supports it, it will emit functions to the section specified.
 816 Additionally, the function can be placed in a COMDAT.
 817
 818 An explicit alignment may be specified for a function. If not present,
 819 or if the alignment is set to zero, the alignment of the function is set
 820 by the target to whatever it feels convenient. If an explicit alignment
 821 is specified, the function is forced to have at least that much
 822 alignment. All alignments must be a power of 2.
 823
 824 If the ``unnamed_addr`` attribute is given, the address is known to not
 825 be significant and two identical functions can be merged.
 826
 827 If the ``local_unnamed_addr`` attribute is given, the address is known to
 828 not be significant within the module.
 829
 830 If an explicit address space is not given, it will default to the program
 831 address space from the :ref:`datalayout string<langref_datalayout>`.
 832
 833 Syntax::
 834
 835     define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
 836            [cconv] [ret attrs]
 837            <ResultType> @<FunctionName> ([argument list])
 838            [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
 839            [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
 840            [prologue Constant] [personality Constant] (!name !N)* { ... }
 841
 842 The argument list is a comma separated sequence of arguments where each
 843 argument is of the following form:
 844
 845 Syntax::
 846
 847    <type> [parameter Attrs] [name]
 848
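As an illustrative sketch of the syntax above (the function name
``@add_one`` and the section name ``.text.hot`` are assumptions, not
required names), a definition combining several of the optional parts
might look like this:

.. code-block:: llvm

    define internal i32 @add_one(i32 %x) unnamed_addr nounwind section ".text.hot" align 16 {
    entry:
      %r = add i32 %x, 1
      ret i32 %r
    }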
 849
 850 .. _langref_aliases:
 851
 852 Aliases
 853 -------
 854
 855 Aliases, unlike functions or variables, don't create any new data. They
 856 are just a new symbol and metadata for an existing position.
 857
 858 Aliases have a name and an aliasee that is either a global value or a
 859 constant expression.
 860
 861 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
 862 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
 863 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
 864 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
 865
 866 Syntax::
 867
 868     @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
 869
 870 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
 871 ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
 872 might not correctly handle dropping a weak symbol that is aliased.
 873
 874 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
 875 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
 876 to the same content.
 877
 878 If the ``local_unnamed_addr`` attribute is given, the address is known to
 879 not be significant within the module.
 880
 881 Since aliases are only a second name, some restrictions apply, of which
 882 some can only be checked when producing an object file:
 883
 884 * The expression defining the aliasee must be computable at assembly
 885   time. Since it is just a name, no relocations can be used.
 886
 887 * No alias in the expression can be weak as the possibility of the
 888   intermediate alias being overridden cannot be represented in an
 889   object file.
 890
 891 * No global value in the expression can be a declaration, since that
 892   would require a relocation, which is not possible.
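
A short sketch of the forms an alias can take; all names here (``@var``,
``@impl``, ``@arr`` and the aliases) are invented for illustration:

.. code-block:: llvm

    @var = global i32 0
    @arr = global [4 x i32] zeroinitializer

    define void @impl() {
      ret void
    }

    ; Alias to a global variable.
    @var_alias = alias i32, i32* @var
    ; Weak alias to a function defined in this module.
    @impl_weak = weak alias void (), void ()* @impl
    ; The aliasee may also be a constant expression, here the second
    ; element of @arr.
    @second = alias i32, getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 1)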
 893
 894 .. _langref_ifunc:
 895
 896 IFuncs
 897 ------
 898
 899 IFuncs, like aliases, don't create any new data or functions. They are just a new
 900 symbol that the dynamic linker resolves at runtime by calling a resolver function.
 901
 902 IFuncs have a name and a resolver, which is a function called by the dynamic
 903 linker that returns the address of another function associated with the name.
 904
 905 IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
 906 :ref:`visibility style <visibility>`.
 907
 908 Syntax::
 909
 910     @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
 911
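A minimal sketch of an IFunc, assuming a target whose dynamic linker
supports them; ``@foo``, ``@foo_resolver``, and ``@foo_impl`` are
hypothetical names, and a real resolver would typically choose between
several implementations:

.. code-block:: llvm

    @foo = ifunc i32 (i32), i32 (i32)* ()* @foo_resolver

    ; Called by the dynamic linker; returns the function to bind to @foo.
    define internal i32 (i32)* @foo_resolver() {
      ret i32 (i32)* @foo_impl
    }

    define internal i32 @foo_impl(i32 %x) {
      ret i32 %x
    }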
 912
 913 .. _langref_comdats:
 914
 915 Comdats
 916 -------
 917
 918 Comdat IR provides access to object file COMDAT/section group functionality
 919 which represents interrelated sections.
 920
 921 Comdats have a name which represents the COMDAT key and a selection kind to
 922 provide input on how the linker deduplicates comdats with the same key in two
 923 different object files. A comdat must be included or omitted as a unit.
 924 Discarding the whole comdat is allowed but discarding a subset is not.
 925
 926 A global object may be a member of at most one comdat. Aliases are placed in the
 927 same COMDAT that their aliasee computes to, if any.
 928
 929 Syntax::
 930
 931     $<Name> = comdat SelectionKind
 932
 933 For selection kinds other than ``nodeduplicate``, only one of the duplicate
 934 comdats may be retained by the linker and the members of the remaining comdats
 935 must be discarded. The following selection kinds are supported:
 936
 937 ``any``
 938     The linker may choose any COMDAT key; the choice is arbitrary.
 939 ``exactmatch``
 940     The linker may choose any COMDAT key but the sections must contain the
 941     same data.
 942 ``largest``
 943     The linker will choose the section containing the largest COMDAT key.
 944 ``nodeduplicate``
 945     No deduplication is performed.
 946 ``samesize``
 947     The linker may choose any COMDAT key but the sections must contain the
 948     same amount of data.
 949
 950 - XCOFF and Mach-O don't support COMDATs.
 951 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
 952   a non-local linkage COMDAT symbol.
 953 - ELF supports ``any`` and ``nodeduplicate``.
 954 - WebAssembly only supports ``any``.
 955
 956 Here is an example of a COFF COMDAT where a function will only be selected if
 957 the COMDAT key's section is the largest:
 958
 959 .. code-block:: text
 960
 961    $foo = comdat largest
 962    @foo = global i32 2, comdat($foo)
 963
 964    define void @bar() comdat($foo) {
 965      ret void
 966    }
 967
 968 In a COFF object file, this will create a COMDAT section with selection kind
 969 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
 970 and another COMDAT section with selection kind
 971 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
 972 section and contains the contents of the ``@bar`` symbol.
 973
 974 As syntactic sugar, the ``$name`` can be omitted if the name is the same as
 975 the global name:
 976
 977 .. code-block:: llvm
 978
 979   $foo = comdat any
 980   @foo = global i32 2, comdat
 981   @bar = global i32 3, comdat($foo)
 982
 983 There are some restrictions on the properties of the global object.
 984 It, or an alias to it, must have the same name as the COMDAT group when
 985 targeting COFF.
 986 The contents and size of this object may be used during link-time to determine
 987 which COMDAT groups get selected depending on the selection kind.
 988 Because the name of the object must match the name of the COMDAT group, the
 989 linkage of the global object must not be local; local symbols can get renamed
 990 if a collision occurs in the symbol table.
 991
 992 The combined use of COMDATs and section attributes may yield surprising results.
 993 For example:
 994
 995 .. code-block:: llvm
 996
 997    $foo = comdat any
 998    $bar = comdat any
 999    @g1 = global i32 42, section "sec", comdat($foo)
1000    @g2 = global i32 42, section "sec", comdat($bar)
1001
1002 From the object file perspective, this requires the creation of two sections
1003 with the same name. This is necessary because both globals belong to different
1004 COMDAT groups and COMDATs, at the object file level, are represented by
1005 sections.
1006
1007 Note that certain IR constructs like global variables and functions may
1008 create COMDATs in the object file in addition to any which are specified using
1009 COMDAT IR. This arises when the code generator is configured to emit globals
1010 in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
1011 is supplied to ``llc``).
1012
1013 .. _namedmetadatastructure:
1014
1015 Named Metadata
1016 --------------
1017
1018 Named metadata is a collection of metadata. :ref:`Metadata
1019 nodes <metadata>` (but not metadata strings) are the only valid
1020 operands for a named metadata.
1021
1022 #. Named metadata are represented as a string of characters with the
1023    metadata prefix. The rules for metadata names are the same as for
1024    identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1025    are still valid, which allows any character to be part of a name.
1026
1027 Syntax::
1028
1029     ; Some unnamed metadata nodes, which are referenced by the named metadata.
1030     !0 = !{!"zero"}
1031     !1 = !{!"one"}
1032     !2 = !{!"two"}
1033     ; A named metadata.
1034     !name = !{!0, !1, !2}
1035
1036 .. _paramattrs:
1037
1038 Parameter Attributes
1039 --------------------
1040
1041 The return type and each parameter of a function type may have a set of
1042 *parameter attributes* associated with them. Parameter attributes are
1043 used to communicate additional information about the result or
1044 parameters of a function. Parameter attributes are considered to be part
1045 of the function, not of the function type, so functions with different
1046 parameter attributes can have the same function type.
1047
1048 Parameter attributes are simple keywords that follow the type specified.
1049 If multiple parameter attributes are needed, they are space separated.
1050 For example:
1051
1052 .. code-block:: llvm
1053
1054     declare i32 @printf(i8* noalias nocapture, ...)
1055     declare i32 @atoi(i8 zeroext)
1056     declare signext i8 @returns_signed_char()
1057
1058 Note that any attributes for the function itself (``nounwind``,
1059 ``readonly``) come immediately after the argument list.
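
As a sketch of where each kind of attribute goes (``@my_strlen`` and
``@low_byte`` are hypothetical functions):

.. code-block:: llvm

    ; Parameter attribute after the parameter type; function attributes
    ; after the argument list.
    declare i64 @my_strlen(i8* nocapture) nounwind readonly

    ; Return attribute before the return type.
    define zeroext i8 @low_byte(i32 %x) nounwind {
      %t = trunc i32 %x to i8
      ret i8 %t
    }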
1060
1061 Currently, only the following parameter attributes are defined:
1062
1063 ``zeroext``
1064     This indicates to the code generator that the parameter or return
1065     value should be zero-extended to the extent required by the target's
1066     ABI by the caller (for a parameter) or the callee (for a return value).
1067 ``signext``
1068     This indicates to the code generator that the parameter or return
1069     value should be sign-extended to the extent required by the target's
1070     ABI (which is usually 32 bits) by the caller (for a parameter) or
1071     the callee (for a return value).
1072 ``inreg``
1073     This indicates that this parameter or return value should be treated
1074     in a special target-dependent fashion while emitting code for
1075     a function call or return (usually, by putting it in a register as
1076     opposed to memory, though some targets use it to distinguish between
1077     two different kinds of registers). Use of this attribute is
1078     target-specific.
1079 ``byval(<ty>)``
1080     This indicates that the pointer parameter should really be passed by
1081     value to the function. The attribute implies that a hidden copy of
1082     the pointee is made between the caller and the callee, so the callee
1083     is unable to modify the value in the caller. This attribute is only
1084     valid on LLVM pointer arguments. It is generally used to pass
1085     structs and arrays by value, but is also valid on pointers to
1086     scalars. The copy is considered to belong to the caller, not the
1087     callee (for example, ``readonly`` functions should not write to
1088     ``byval`` parameters). This is not a valid attribute for return
1089     values.
1090
1091     The byval type argument indicates the in-memory value type, and
1092     must be the same as the pointee type of the argument.
1093
1094     The byval attribute also supports specifying an alignment with the
1095     align attribute. It indicates the alignment of the stack slot to
1096     form and the known alignment of the pointer specified to the call
1097     site. If the alignment is not specified, then the code generator
1098     makes a target-specific assumption.
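
    A minimal sketch of ``byval`` with an explicit alignment; the type
    ``%struct.S`` and the functions ``@take`` and ``@caller`` are invented
    for illustration:

    .. code-block:: llvm

        %struct.S = type { i32, i32 }

        declare void @take(%struct.S* byval(%struct.S) align 4)

        define void @caller(%struct.S* %p) {
          ; @take receives a hidden copy of *%p and cannot modify the original.
          call void @take(%struct.S* byval(%struct.S) align 4 %p)
          ret void
        }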
1099
1100 .. _attr_byref:
1101
1102 ``byref(<ty>)``
1103
1104     The ``byref`` argument attribute allows specifying the pointee
1105     memory type of an argument. This is similar to ``byval``, but does
1106     not imply a copy is made anywhere, or that the argument is passed
1107     on the stack. This implies the pointer is dereferenceable up to
1108     the storage size of the type.
1109
1110     It is not generally permissible to introduce a write to a
1111     ``byref`` pointer. The pointer may have any address space and may
1112     be read only.
1113
1114     This is not a valid attribute for return values.
1115
1116     The alignment for a ``byref`` parameter can be explicitly
1117     specified by combining it with the ``align`` attribute, similar to
1118     ``byval``. If the alignment is not specified, then the code generator
1119     makes a target-specific assumption.
1120
1121     This is intended for representing ABI constraints, and is not
1122     intended to be inferred for optimization use.
1123
1124 .. _attr_preallocated:
1125
1126 ``preallocated(<ty>)``
1127     This indicates that the pointer parameter should really be passed by
1128     value to the function, and that the pointer parameter's pointee has
1129     already been initialized before the call instruction. This attribute
1130     is only valid on LLVM pointer arguments. The argument must be the value
1131     returned by the appropriate
1132     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1133     ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1134     calls, although it is ignored during codegen.
1135
1136     A non ``musttail`` function call with a ``preallocated`` attribute in
1137     any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1138     function call cannot have a ``"preallocated"`` operand bundle.
1139
1140     The preallocated attribute requires a type argument, which must be
1141     the same as the pointee type of the argument.
1142
1143     The preallocated attribute also supports specifying an alignment with the
1144     align attribute. It indicates the alignment of the stack slot to
1145     form and the known alignment of the pointer specified to the call
1146     site. If the alignment is not specified, then the code generator
1147     makes a target-specific assumption.
1148
1149 .. _attr_inalloca:
1150
1151 ``inalloca(<ty>)``
1152
1153     The ``inalloca`` argument attribute allows the caller to take the
1154     address of outgoing stack arguments. An ``inalloca`` argument must
1155     be a pointer to stack memory produced by an ``alloca`` instruction.
1156     The alloca, or argument allocation, must also be tagged with the
1157     inalloca keyword. Only the last argument may have the ``inalloca``
1158     attribute, and that argument is guaranteed to be passed in memory.
1159
1160     An argument allocation may be used by a call at most once because
1161     the call may deallocate it. The ``inalloca`` attribute cannot be
1162     used in conjunction with other attributes that affect argument
1163     storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1164     ``inalloca`` attribute also disables LLVM's implicit lowering of
1165     large aggregate return values, which means that frontend authors
1166     must lower them with ``sret`` pointers.
1167
1168     When the call site is reached, the argument allocation must have
1169     been the most recent stack allocation that is still live, or the
1170     behavior is undefined. It is possible to allocate additional stack
1171     space after an argument allocation and before its call site, but it
1172     must be cleared off with :ref:`llvm.stackrestore
1173     <int_stackrestore>`.
1174
1175     The inalloca attribute requires a type argument, which must be the
1176     same as the pointee type of the argument.
1177
1178     See :doc:`InAlloca` for more information on how to use this
1179     attribute.
1180
1181 ``sret(<ty>)``
1182     This indicates that the pointer parameter specifies the address of a
1183     structure that is the return value of the function in the source
1184     program. This pointer must be guaranteed by the caller to be valid:
1185     loads and stores to the structure may be assumed by the callee not
1186     to trap and to be properly aligned. This is not a valid attribute
1187     for return values.
1188
1189     The sret type argument specifies the in memory type, which must be
1190     the same as the pointee type of the argument.
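
    As an illustration, a function returning a structure through an
    ``sret`` pointer might be written like this (``%struct.S`` and
    ``@make`` are hypothetical):

    .. code-block:: llvm

        %struct.S = type { i32, i32 }

        define void @make(%struct.S* sret(%struct.S) align 4 %out) {
          ; The caller guarantees %out is valid, so this store may be
          ; assumed not to trap.
          %x = getelementptr inbounds %struct.S, %struct.S* %out, i32 0, i32 0
          store i32 1, i32* %x
          ret void
        }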
1191
1192 .. _attr_elementtype:
1193
1194 ``elementtype(<ty>)``
1195
1196     The ``elementtype`` argument attribute can be used to specify a pointer
1197     element type in a way that is compatible with `opaque pointers
1198     <OpaquePointers.html>`__.
1199
1200     The ``elementtype`` attribute by itself does not carry any specific
1201     semantics. However, certain intrinsics may require this attribute to be
1202     present and assign it particular semantics. This will be documented on
1203     individual intrinsics.
1204
1205     The attribute may only be applied to pointer typed arguments of intrinsic
1206     calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1207     to parameters on function declarations. For non-opaque pointers, the type
1208     passed to ``elementtype`` must match the pointer element type.
1209
1210 .. _attr_align:
1211
1212 ``align <n>`` or ``align(<n>)``
1213     This indicates that the pointer value has the specified alignment.
1214     If the pointer value does not have the specified alignment, a
1215     :ref:`poison value <poisonvalues>` is returned or passed instead. The
1216     ``align`` attribute should be combined with the ``noundef`` attribute to
1217     ensure a pointer is aligned, or otherwise the behavior is undefined. Note
1218     that ``align 1`` has no effect on non-byval, non-preallocated arguments.
1219
1220     Note that this attribute has additional semantics when combined with the
1221     ``byval`` or ``preallocated`` attribute, which are documented there.
1222
1223 .. _noalias:
1224
1225 ``noalias``
1226     This indicates that memory locations accessed via pointer values
1227     :ref:`based <pointeraliasing>` on the argument or return value are not also
1228     accessed, during the execution of the function, via pointer values not
1229     *based* on the argument or return value. This guarantee only holds for
1230     memory locations that are *modified*, by any means, during the execution of
1231     the function. The attribute on a return value also has additional semantics
1232     described below. The caller shares the responsibility with the callee for
1233     ensuring that these requirements are met.  For further details, please see
1234     the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
1235     or No>`.
1236
1237     Note that this definition of ``noalias`` is intentionally similar
1238     to the definition of ``restrict`` in C99 for function arguments.
1239
1240     For function return values, C99's ``restrict`` is not meaningful,
1241     while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1242     attribute on return values are stronger than the semantics of the attribute
1243     when used on function arguments. On function return values, the ``noalias``
1244     attribute indicates that the function acts like a system memory allocation
1245     function, returning a pointer to allocated storage disjoint from the
1246     storage for any other object accessible to the caller.
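
    A sketch of both uses, with hypothetical names ``@my_alloc`` and
    ``@copy``:

    .. code-block:: llvm

        ; On a return value: behaves like an allocation function, returning
        ; storage disjoint from anything else accessible to the caller.
        declare noalias i8* @my_alloc(i64)

        ; On arguments: memory modified through %dst is not also accessed
        ; through pointers not based on %dst during the call, and likewise
        ; for %src.
        define void @copy(i32* noalias %dst, i32* noalias %src) {
          %v = load i32, i32* %src
          store i32 %v, i32* %dst
          ret void
        }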
1247
1248 .. _nocapture:
1249
1250 ``nocapture``
1251     This indicates that the callee does not :ref:`capture <pointercapture>` the
1252     pointer. This is not a valid attribute for return values.
1253     This attribute applies only to the particular copy of the pointer passed in
1254     this argument. A caller could pass two copies of the same pointer with one
1255     being annotated ``nocapture`` and the other not, and the callee could validly
1256     capture through the non-annotated parameter.
1257
1258 .. code-block:: llvm
1259
1260     define void @f(i8* nocapture %a, i8* %b) {
1261       ; (capture %b)
1262     }
1263
1264     call void @f(i8* @glb, i8* @glb) ; well-defined
1265
1266 ``nofree``
1267     This indicates that the callee does not free the pointer argument. This is not
1268     a valid attribute for return values.
1269
1270 .. _nest:
1271
1272 ``nest``
1273     This indicates that the pointer parameter can be excised using the
1274     :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1275     attribute for return values and can only be applied to one parameter.
1276
1277 ``returned``
1278     This indicates that the function always returns the argument as its return
1279     value. This is a hint to the optimizer and code generator used when
1280     generating the caller, allowing value propagation, tail call optimization,
1281     and omission of register saves and restores in some cases; it is not
1282     checked or enforced when generating the callee. The parameter and the
1283     function return type must be valid operands for the
1284     :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1285     return values and can only be applied to one parameter.
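
         A small sketch of how a caller might benefit (``@fill`` is a hypothetical
         function introduced for illustration): because the first argument is
         marked ``returned``, the optimizer may replace uses of the call result
         with the argument itself.

         .. code-block:: llvm

             ; Hypothetical function that always returns its first argument.
             declare i8* @fill(i8* returned, i8)

             define i8* @caller(i8* %buf) {
               %r = call i8* @fill(i8* %buf, i8 0)
               ret i8* %r              ; may be simplified to: ret i8* %buf
             }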
1286
1287 ``nonnull``
1288     This indicates that the parameter or return pointer is not null. This
1289     attribute may only be applied to pointer typed parameters. This is not
1290     checked or enforced by LLVM; if the parameter or return pointer is null, a
1291     :ref:`poison value <poisonvalues>` is returned or passed instead.
1292     Combine the ``nonnull`` attribute with the ``noundef`` attribute to make a
1293     null pointer undefined behavior instead of a poison value.
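
         For illustration (the function names are hypothetical), the attribute
         can appear on a return value or on a parameter, optionally together
         with ``noundef``:

         .. code-block:: llvm

             ; A null return value here would be poison.
             declare nonnull i8* @get_buffer()

             ; Passing null here is undefined behavior, because ``noundef``
             ; forbids the poison value that ``nonnull`` would produce.
             define void @use(i8* nonnull noundef %p) {
               store i8 0, i8* %p
               ret void
             }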
1294
1295 ``dereferenceable(<n>)``
1296     This indicates that the parameter or return pointer is dereferenceable. This
1297     attribute may only be applied to pointer typed parameters. A pointer that
1298     is dereferenceable can be loaded from speculatively without a risk of
1299     trapping. The number of bytes known to be dereferenceable must be provided
1300     in parentheses. It is legal for the number of bytes to be less than the
1301     size of the pointee type. The ``nonnull`` attribute does not imply
1302     dereferenceability (consider a pointer to one element past the end of an
1303     array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1304     ``addrspace(0)`` (which is the default address space), except if the
1305     ``null_pointer_is_valid`` function attribute is present.
1306     ``n`` should be a positive number. The pointer should be well defined,
1307     otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1308     implies ``noundef``.
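
         The following sketch (``@load_if`` is a hypothetical function) shows
         the kind of code where the attribute matters. Since ``%p`` is known to
         be dereferenceable for 4 bytes and suitably aligned, an optimizer may
         hoist the conditional load into the entry block without introducing a
         trap:

         .. code-block:: llvm

             define i32 @load_if(i32* align 4 dereferenceable(4) %p, i1 %c) {
             entry:
               br i1 %c, label %then, label %done

             then:
               %v = load i32, i32* %p, align 4   ; safe to execute speculatively
               br label %done

             done:
               %r = phi i32 [ %v, %then ], [ 0, %entry ]
               ret i32 %r
             }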
1309
1310 ``dereferenceable_or_null(<n>)``
1311     This indicates that the parameter or return value isn't both
1312     non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1313     time. All non-null pointers tagged with
1314     ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1315     For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1316     a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1317     and in other address spaces ``dereferenceable_or_null(<n>)``
1318     implies that a pointer is at least one of ``dereferenceable(<n>)``
1319     or ``null`` (i.e. it may be both ``null`` and
1320     ``dereferenceable(<n>)``). This attribute may only be applied to
1321     pointer typed parameters.
1322
1323 ``swiftself``
1324     This indicates that the parameter is the self/context parameter. This is not
1325     a valid attribute for return values and can only be applied to one
1326     parameter.
1327
1328 ``swiftasync``
1329     This indicates that the parameter is the asynchronous context parameter and
1330     triggers the creation of a target-specific extended frame record to store
1331     this pointer. This is not a valid attribute for return values and can only
1332     be applied to one parameter.
1333
1334 ``swifterror``
1335     This attribute exists to model and optimize Swift error handling. It
1336     can be applied to a parameter with pointer to pointer type or a
1337     pointer-sized alloca. At the call site, the actual argument that corresponds
1338     to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1339     the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1340     the parameter or the alloca) can only be loaded from, stored to, or used as
1341     a ``swifterror`` argument. This is not a valid attribute for return values
1342     and can only be applied to one parameter.
1343
1344     These constraints allow the calling convention to optimize access to
1345     ``swifterror`` variables by associating them with a specific register at
1346     call boundaries rather than placing them in memory. Since this does change
1347     the calling convention, a function which uses the ``swifterror`` attribute
1348     on a parameter is not ABI-compatible with one which does not.
1349
1350     These constraints also allow LLVM to assume that a ``swifterror`` argument
1351     does not alias any other memory visible within a function and that a
1352     ``swifterror`` alloca passed as an argument does not escape.
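
         A minimal sketch of the permitted uses, assuming a hypothetical callee
         ``@may_fail`` and using ``i8*`` as a stand-in for the error type:

         .. code-block:: llvm

             declare void @may_fail(i8** swifterror)

             define void @caller() {
             entry:
               ; The argument must come from a swifterror alloca (or from the
               ; caller's own swifterror parameter).
               %err = alloca swifterror i8*
               store i8* null, i8** %err
               call void @may_fail(i8** swifterror %err)
               ; Apart from being passed as a swifterror argument, the value
               ; may only be loaded from and stored to.
               %e = load i8*, i8** %err
               ret void
             }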
1353
1354 ``immarg``
1355     This indicates the parameter is required to be an immediate
1356     value. This must be a trivial immediate integer or floating-point
1357     constant. Undef or constant expressions are not valid. This is
1358     only valid on intrinsic declarations and cannot be applied to a
1359     call site or arbitrary function.
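
         As an example, the final ``isvolatile`` operand of the ``llvm.memcpy``
         intrinsic is an ``immarg`` parameter in LLVM releases that use the
         pointer syntax shown in this document; ``@copy16`` is a hypothetical
         caller:

         .. code-block:: llvm

             declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1 immarg)

             define void @copy16(i8* %dst, i8* %src) {
               ; The last operand must be a literal constant such as `false`;
               ; an SSA value such as `i1 %flag` would not be accepted.
               call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 16, i1 false)
               ret void
             }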
1360
1361 ``noundef``
1362     This attribute applies to parameters and return values. If the value
1363     representation contains any undefined or poison bits, the behavior is
1364     undefined. Note that this does not refer to padding introduced by the
1365     type's storage representation.
1366
1367 ``alignstack(<n>)``
1368     This indicates the alignment that should be considered by the backend when
1369     assigning this parameter to a stack slot during calling convention
1370     lowering. The enforcement of the specified alignment is target-dependent,
1371     as target-specific calling convention rules may override this value. This
1372     attribute serves the purpose of carrying language specific alignment
1373     information that is not mapped to base types in the backend (for example,
1374     over-alignment specification through language attributes).
1375
1376 .. _gc:
1377
1378 Garbage Collector Strategy Names
1379 --------------------------------
1380
1381 Each function may specify a garbage collector strategy name, which is simply a
1382 string:
1383
1384 .. code-block:: llvm
1385
1386     define void @f() gc "name" { ... }
1387
1388 The supported values of *name* include those :ref:`built in to LLVM
1389 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1390 strategy will cause the compiler to alter its output in order to support the
1391 named garbage collection algorithm. Note that LLVM itself does not contain a
1392 garbage collector; this functionality is restricted to generating machine code
1393 which can interoperate with a collector provided externally.
1394
1395 .. _prefixdata:
1396
1397 Prefix Data
1398 -----------
1399
1400 Prefix data is data associated with a function which the code
1401 generator will emit immediately before the function's entrypoint.
1402 The purpose of this feature is to allow frontends to associate
1403 language-specific runtime metadata with specific functions and make it
1404 available through the function pointer while still allowing the
1405 function pointer to be called.
1406
1407 To access the data for a given function, a program may bitcast the
1408 function pointer to a pointer to the constant's type and dereference
1409 index -1. This implies that the IR symbol points just past the end of
1410 the prefix data. For instance, take the example of a function annotated
1411 with a single ``i32``,
1412
1413 .. code-block:: llvm
1414
1415     define void @f() prefix i32 123 { ... }
1416
1417 The prefix data can be referenced as,
1418
1419 .. code-block:: llvm
1420
1421     %0 = bitcast void ()* @f to i32*
1422     %a = getelementptr inbounds i32, i32* %0, i32 -1
1423     %b = load i32, i32* %a
1424
1425 Prefix data is laid out as if it were an initializer for a global variable
1426 of the prefix data's type. The function will be placed such that the
1427 beginning of the prefix data is aligned. This means that if the size
1428 of the prefix data is not a multiple of the alignment size, the
1429 function's entrypoint will not be aligned. If alignment of the
1430 function's entrypoint is desired, padding must be added to the prefix
1431 data.
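
     As a sketch of the padding technique, assuming an 8-byte function alignment
     is wanted, the prefix data below is padded to 8 bytes so that the entrypoint
     following it stays 8-byte aligned. The payload is placed last, so it is still
     the ``i32`` found at index -1 from ``@f``:

     .. code-block:: llvm

         ; 4 bytes of padding followed by the 4-byte payload.
         define void @f() align 8 prefix <{ i32, i32 }> <{ i32 0, i32 123 }> {
           ret void
         }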
1432
1433 A function may have prefix data but no body. This has similar semantics
1434 to the ``available_externally`` linkage in that the data may be used by the
1435 optimizers but will not be emitted in the object file.
1436
1437 .. _prologuedata:
1438
1439 Prologue Data
1440 -------------
1441
1442 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1443 be inserted prior to the function body. This can be used for enabling
1444 function hot-patching and instrumentation.
1445
1446 To maintain the semantics of ordinary function calls, the prologue data must
1447 have a particular format. Specifically, it must begin with a sequence of
1448 bytes which decode to a sequence of machine instructions, valid for the
1449 module's target, which transfer control to the point immediately succeeding
1450 the prologue data, without performing any other visible action. This allows
1451 the inliner and other passes to reason about the semantics of the function
1452 definition without needing to reason about the prologue data. Obviously this
1453 makes the format of the prologue data highly target dependent.
1454
1455 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1456 which encodes the ``nop`` instruction:
1457
1458 .. code-block:: text
1459
1460     define void @f() prologue i8 144 { ... }
1461
1462 Generally prologue data can be formed by encoding a relative branch instruction
1463 which skips the metadata, as in this example of valid prologue data for the
1464 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1465
1466 .. code-block:: text
1467
1468     %0 = type <{ i8, i8, i8* }>
1469
1470     define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
1471
1472 A function may have prologue data but no body. This has similar semantics
1473 to the ``available_externally`` linkage in that the data may be used by the
1474 optimizers but will not be emitted in the object file.
1475
1476 .. _personalityfn:
1477
1478 Personality Function
1479 --------------------
1480
1481 The ``personality`` attribute permits functions to specify what function
1482 to use for exception handling.
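
     For illustration, a function that uses the Itanium C++ ABI personality
     routine ``__gxx_personality_v0`` might look as follows (``@may_throw`` is a
     hypothetical callee):

     .. code-block:: llvm

         declare i32 @__gxx_personality_v0(...)
         declare void @may_throw()

         define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
         entry:
           invoke void @may_throw()
                   to label %cont unwind label %lpad

         cont:
           ret void

         lpad:
           ; Run cleanups only, then continue unwinding.
           %lp = landingpad { i8*, i32 }
                   cleanup
           resume { i8*, i32 } %lp
         }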
1483
1484 .. _attrgrp:
1485
1486 Attribute Groups
1487 ----------------
1488
1489 Attribute groups are groups of attributes that are referenced by objects within
1490 the IR. They are important for keeping ``.ll`` files readable, because a lot of
1491 functions will use the same set of attributes. In the degenerate case of a
1492 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1493 group will capture the important command line flags used to build that file.
1494
1495 An attribute group is a module-level object. To use an attribute group, an
1496 object references the attribute group's ID (e.g. ``#37``). An object may refer
1497 to more than one attribute group. In that situation, the attributes from the
1498 different groups are merged.
1499
1500 Here is an example of attribute groups for a function that should always be
1501 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1502
1503 .. code-block:: llvm
1504
1505    ; Target-independent attributes:
1506    attributes #0 = { alwaysinline alignstack=4 }
1507
1508    ; Target-dependent attributes:
1509    attributes #1 = { "no-sse" }
1510
1511    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1512    define void @f() #0 #1 { ... }
1513
1514 .. _fnattrs:
1515
1516 Function Attributes
1517 -------------------
1518
1519 Function attributes are set to communicate additional information about
1520 a function. Function attributes are considered to be part of the
1521 function, not of the function type, so functions with different function
1522 attributes can have the same function type.
1523
1524 Function attributes are simple keywords that follow the type specified.
1525 If multiple attributes are needed, they are space separated. For
1526 example:
1527
1528 .. code-block:: llvm
1529
1530     define void @f() noinline { ... }
1531     define void @f() alwaysinline { ... }
1532     define void @f() alwaysinline optsize { ... }
1533     define void @f() optsize { ... }
1534
1535 ``alignstack(<n>)``
1536     This attribute indicates that, when emitting the prologue and
1537     epilogue, the backend should forcibly align the stack pointer.
1538     Specify the desired alignment, which must be a power of two, in
1539     parentheses.
1540 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1541     This attribute indicates that the annotated function will always return at
1542     least a given number of bytes (or null). Its arguments are zero-indexed
1543     parameter numbers; if one argument is provided, then it's assumed that at
1544     least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1545     returned pointer. If two are provided, then it's assumed that
1546     ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1547     available. The referenced parameters must be integer types. No assumptions
1548     are made about the contents of the returned block of memory.
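
         A short sketch of both forms, using hypothetical allocator
         declarations:

         .. code-block:: llvm

             ; One index: the size in bytes is argument 0.
             declare i8* @my_malloc(i64) allocsize(0)
             ; Two indices: the size in bytes is argument 0 times argument 1.
             declare i8* @my_calloc(i64, i64) allocsize(0, 1)

             define i8* @make() {
               ; Returns null or a pointer to at least 4 * 8 = 32 bytes.
               %p = call i8* @my_calloc(i64 4, i64 8)
               ret i8* %p
             }
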
1549 ``alwaysinline``
1550     This attribute indicates that the inliner should attempt to inline
1551     this function into callers whenever possible, ignoring any active
1552     inlining size threshold for this caller.
1553 ``builtin``
1554     This indicates that the callee function at a call site should be
1555     recognized as a built-in function, even though the function's declaration
1556     uses the ``nobuiltin`` attribute. This is only valid at call sites for
1557     direct calls to functions that are declared with the ``nobuiltin``
1558     attribute.
1559 ``cold``
1560     This attribute indicates that this function is rarely called. When
1561     computing edge weights, basic blocks post-dominated by a cold
1562     function call are also considered to be cold and are thus given low
1563     weight.
1564 ``convergent``
1565     In some parallel execution models, there exist operations that cannot be
1566     made control-dependent on any additional values.  We call such operations
1567     ``convergent``, and mark them with this attribute.
1568
1569     The ``convergent`` attribute may appear on functions or call/invoke
1570     instructions.  When it appears on a function, it indicates that calls to
1571     this function should not be made control-dependent on additional values.
1572     For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
1573     calls to this intrinsic cannot be made control-dependent on additional
1574     values.
1575
1576     When it appears on a call/invoke, the ``convergent`` attribute indicates
1577     that we should treat the call as though we're calling a convergent
1578     function.  This is particularly useful on indirect calls; without this we
1579     may treat such calls as though the target is non-convergent.
1580
1581     The optimizer may remove the ``convergent`` attribute on functions when it
1582     can prove that the function does not execute any convergent operations.
1583     Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1584     can prove that the call/invoke cannot call a convergent function.
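
         The two placements can be sketched as follows (``@group_barrier`` is a
         hypothetical convergent operation):

         .. code-block:: llvm

             declare void @group_barrier() convergent

             define void @kernel(void ()* %fp) {
               ; Direct call: convergence is known from the declaration.
               call void @group_barrier()
               ; Indirect call: the target is unknown, so the call site
               ; itself carries the attribute.
               call void %fp() convergent
               ret void
             }
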
1585 ``disable_sanitizer_instrumentation``
1586     When instrumenting code with sanitizers, it can be important to skip certain
1587     functions to ensure no instrumentation is applied to them.
1588
1589     This attribute is not always similar to absent ``sanitize_<name>``
1590     attributes: depending on the specific sanitizer, code can be inserted into
1591     functions regardless of the ``sanitize_<name>`` attribute to prevent false
1592     positive reports.
1593
1594     ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1595     taking precedence over the ``sanitize_<name>`` attributes and other compiler
1596     flags.
1597 ``"dontcall-error"``
1598     This attribute denotes that an error diagnostic should be emitted when a
1599     call of a function with this attribute is not eliminated via optimization.
1600     Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1601     such callees to attach information about where in the source language such a
1602     call came from. A string value can be provided as a note.
1603 ``"dontcall-warn"``
1604     This attribute denotes that a warning diagnostic should be emitted when a
1605     call of a function with this attribute is not eliminated via optimization.
1606     Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1607     such callees to attach information about where in the source language such a
1608     call came from. A string value can be provided as a note.
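
         A brief sketch of both attributes with a note attached as the string
         value (the function names and note texts are hypothetical):

         .. code-block:: llvm

             declare void @old_api() "dontcall-warn"="use new_api instead"
             declare void @forbidden() "dontcall-error"="call must be optimized away"
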
1609 ``"frame-pointer"``
1610     This attribute tells the code generator whether the function
1611     should keep the frame pointer. The code generator may emit the frame pointer
1612     even if this attribute says the frame pointer can be eliminated.
1613     The allowed string values are:
1614
1615      * ``"none"`` (default) - the frame pointer can be eliminated.
1616      * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1617        other functions.
1618      * ``"all"`` - the frame pointer should be kept.
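
         For example, a function that should keep its frame pointer whenever it
         calls other functions could be written as:

         .. code-block:: llvm

             define void @f() #0 {
               ret void
             }

             attributes #0 = { "frame-pointer"="non-leaf" }
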
1619 ``hot``
1620     This attribute indicates that this function is a hot spot of the program
1621     execution. The function will be optimized more aggressively and will be
1622     placed into a special subsection of the text section to improve locality.
1623
1624     When profile feedback is enabled, this attribute takes precedence over
1625     the profile information. By marking a function ``hot``, users can work
1626     around the cases where the training input does not have good coverage
1627     on all the hot functions.
1628 ``inaccessiblememonly``
1629     This attribute indicates that the function may only access memory that
1630     is not accessible by the module being compiled. This is a weaker form
1631     of ``readnone``. If the function reads or writes other memory, the
1632     behavior is undefined.
1633 ``inaccessiblemem_or_argmemonly``
1634     This attribute indicates that the function may only access memory that is
1635     either not accessible by the module being compiled, or is pointed to
1636     by its pointer arguments. This is a weaker form of ``argmemonly``. If the
1637     function reads or writes other memory, the behavior is undefined.
1638 ``inlinehint``
1639     This attribute indicates that the source code contained a hint that
1640     inlining this function is desirable (such as the "inline" keyword in
1641     C/C++). It is just a hint; it imposes no requirements on the
1642     inliner.
1643 ``jumptable``
1644     This attribute indicates that the function should be added to a
1645     jump-instruction table at code-generation time, and that all address-taken
1646     references to this function should be replaced with a reference to the
1647     appropriate jump-instruction-table function pointer. Note that this creates
1648     a new pointer for the original function, which means that code that depends
1649     on function-pointer identity can break. So, any function annotated with
1650     ``jumptable`` must also be ``unnamed_addr``.
1651 ``minsize``
1652     This attribute suggests that optimization passes and code generator
1653     passes make choices that keep the code size of this function as small
1654     as possible and perform optimizations that may sacrifice runtime
1655     performance in order to minimize the size of the generated code.
1656 ``naked``
1657     This attribute disables prologue / epilogue emission for the
1658     function. This can have very system-specific consequences.
1659 ``"no-inline-line-tables"``
1660     When this attribute is set to true, the inliner discards source locations
1661     when inlining code and instead uses the source location of the call site.
1662     Breakpoints set on code that was inlined into the current function will
1663     not fire during the execution of the inlined call sites. If the debugger
1664     stops inside an inlined call site, it will appear to be stopped at the
1665     outermost inlined call site.
1666 ``"no-jump-tables"``
1667     When this attribute is set to true, the jump tables and lookup tables that
1668     can be generated from a switch case lowering are disabled.
1669 ``nobuiltin``
1670     This indicates that the callee function at a call site is not recognized as
1671     a built-in function. LLVM will retain the original call and not replace it
1672     with equivalent code based on the semantics of the built-in function, unless
1673     the call site uses the ``builtin`` attribute. This is valid at call sites
1674     and on function declarations and definitions.
1675 ``noduplicate``
1676     This attribute indicates that calls to the function cannot be
1677     duplicated. A call to a ``noduplicate`` function may be moved
1678     within its parent function, but may not be duplicated within
1679     its parent function.
1680
1681     A function containing a ``noduplicate`` call may still
1682     be an inlining candidate, provided that the call is not
1683     duplicated by inlining. That implies that the function has
1684     internal linkage and only has one call site, so the original
1685     call is dead after inlining.
1686 ``nofree``
1687     This function attribute indicates that the function does not, directly or
1688     transitively, call a memory-deallocation function (``free``, for example)
1689     on a memory allocation which existed before the call.
1690
1691     As a result, uncaptured pointers that are known to be dereferenceable
1692     prior to a call to a function with the ``nofree`` attribute are still
1693     known to be dereferenceable after the call. The capturing condition is
1694     necessary in environments where the function might communicate the
1695     pointer to another thread which then deallocates the memory.  Alternatively,
1696     ``nosync`` would ensure such communication cannot happen and even captured
1697     pointers cannot be freed by the function.
1698
1699     A ``nofree`` function is explicitly allowed to free memory which it
1700     allocated or (if not ``nosync``) arrange for another thread to free
1701     memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
1702     function can return a pointer to a previously deallocated memory object.
1703 ``noimplicitfloat``
1704     Disallows implicit floating-point code. This inhibits optimizations that
1705     use floating-point code and floating-point/SIMD/vector registers for
1706     operations that are not nominally floating-point. LLVM instructions that
1707     perform floating-point operations or require access to floating-point
1708     registers may still cause floating-point code to be generated.
1709 ``noinline``
1710     This attribute indicates that the inliner should never inline this
1711     function in any situation. This attribute may not be used together
1712     with the ``alwaysinline`` attribute.
1713 ``nomerge``
1714     This attribute indicates that calls to this function should never be merged
1715     during optimization. For example, it will prevent tail merging otherwise
1716     identical code sequences that raise an exception or terminate the program.
1717     Tail merging normally reduces the precision of source location information,
1718     making stack traces less useful for debugging. This attribute gives the
1719     user control over the tradeoff between code size and debug information
1720     precision.
1721 ``nonlazybind``
1722     This attribute suppresses lazy symbol binding for the function. This
1723     may make calls to the function faster, at the cost of extra program
1724     startup time if the function is not called during program startup.
1725 ``noprofile``
1726     This function attribute prevents instrumentation based profiling, used for
1727     coverage or profile based optimization, from being added to a function,
1728     even when inlined.
1729 ``noredzone``
1730     This attribute indicates that the code generator should not use a
1731     red zone, even if the target-specific ABI normally permits it.
1732 ``"indirect-tls-seg-refs"``
1733     This attribute indicates that the code generator should not use
1734     direct TLS access through segment registers, even if the
1735     target-specific ABI normally permits it.
1736 ``noreturn``
1737     This function attribute indicates that the function never returns
1738     normally, i.e., through a return instruction. This produces undefined
1739     behavior at runtime if the function ever does dynamically return. Annotated
1740     functions may still raise an exception, i.e., ``nounwind`` is not implied.
1741 ``norecurse``
1742     This function attribute indicates that the function does not call itself
1743     either directly or indirectly down any possible call path. This produces
1744     undefined behavior at runtime if the function ever does recurse.
1745 ``willreturn``
1746     This function attribute indicates that a call of this function will
1747     either exhibit undefined behavior or come back and continue execution
1748     at a point in the existing call stack that includes the current invocation.
1749     Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
1750     If an invocation of an annotated function does not return control back
1751     to a point in the call stack, the behavior is undefined.
1752 ``nosync``
1753     This function attribute indicates that the function does not communicate
1754     (synchronize) with another thread through memory or other well-defined means.
1755     Synchronization is considered possible in the presence of ``atomic`` accesses
1756     that enforce an order, thus not "unordered" and "monotonic", ``volatile`` accesses,
1757     as well as ``convergent`` function calls. Note that through ``convergent`` function calls
1758     non-memory communication, e.g., cross-lane operations, is possible and is also
1759     considered synchronization. However ``convergent`` does not contradict ``nosync``.
1760     If an annotated function does ever synchronize with another thread,
1761     the behavior is undefined.
1762 ``nounwind``
1763     This function attribute indicates that the function never raises an
1764     exception. If the function does raise an exception, its runtime
1765     behavior is undefined. However, functions marked nounwind may still
1766     trap or generate asynchronous exceptions. Exception handling schemes
1767     that are recognized by LLVM to handle asynchronous exceptions, such
1768     as SEH, will still provide their implementation defined semantics.
1769 ``nosanitize_coverage``
1770     This attribute indicates that SanitizerCoverage instrumentation is disabled
1771     for this function.
1772 ``null_pointer_is_valid``
1773    If ``null_pointer_is_valid`` is set, then the ``null`` address
1774    in address-space 0 is considered to be a valid address for memory loads and
1775    stores. Any analysis or optimization should not treat dereferencing a
1776    pointer to ``null`` as undefined behavior in this function.
1777    Note: Comparing the address of a global variable to ``null`` may still
1778    evaluate to false because of a limitation in querying this attribute inside
1779    constant expressions.
1780 ``optforfuzzing``
1781     This attribute indicates that this function should be optimized
1782     for maximum fuzzing signal.
1783 ``optnone``
1784     This function attribute indicates that most optimization passes will skip
1785     this function, with the exception of interprocedural optimization passes.
1786     Code generation defaults to the "fast" instruction selector.
1787     This attribute cannot be used together with the ``alwaysinline``
1788     attribute; this attribute is also incompatible
1789     with the ``minsize`` attribute and the ``optsize`` attribute.
1790
1791     This attribute requires the ``noinline`` attribute to be specified on
1792     the function as well, so the function is never inlined into any caller.
1793     Only functions with the ``alwaysinline`` attribute are valid
1794     candidates for inlining into the body of this function.
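
         A minimal example of a valid combination (``@debug_me`` is a
         hypothetical function):

         .. code-block:: llvm

             ; optnone must be accompanied by noinline.
             define void @debug_me() noinline optnone {
               ret void
             }
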
1795 ``optsize``
1796     This attribute suggests that optimization passes and code generator
1797     passes make choices that keep the code size of this function low,
1798     and otherwise do optimizations specifically to reduce code size as
1799     long as they do not significantly impact runtime performance.
1800 ``"patchable-function"``
1801     This attribute tells the code generator that the code
1802     generated for this function needs to follow certain conventions that
1803     make it possible for a runtime function to patch over it later.
1804     The exact effect of this attribute depends on its string value,
1805     for which there currently is one legal possibility:
1806
1807      * ``"prologue-short-redirect"`` - This style of patchable
1808        function is intended to support patching a function prologue to
1809        redirect control away from the function in a thread safe
1810        manner.  It guarantees that the first instruction of the
1811        function will be large enough to accommodate a short jump
1812        instruction, and will be sufficiently aligned to allow being
1813        fully changed via an atomic compare-and-swap instruction.
1814        While the first requirement can be satisfied by inserting a large
1815        enough NOP, LLVM can and will try to re-purpose an existing
1816        instruction (i.e. one that would have to be emitted anyway) as
1817        the patchable instruction larger than a short jump.
1818
1819        ``"prologue-short-redirect"`` is currently only supported on
1820        x86-64.
1821
1822     This attribute by itself does not imply restrictions on
1823     inter-procedural optimizations.  Any semantic effects the patching
1824     may have must be conveyed separately via the linkage type.
1825 ``"probe-stack"``
1826     This attribute indicates that the function will trigger a guard region
1827     at the end of the stack. It ensures that accesses to the stack must be
1828     no further apart than the size of the guard region to a previous
1829     access of the stack. It takes one required string value, the name of
1830     the stack probing function that will be called.
1831
1832     If a function that has a ``"probe-stack"`` attribute is inlined into
1833     a function with another ``"probe-stack"`` attribute, the resulting
1834     function has the ``"probe-stack"`` attribute of the caller. If a
1835     function that has a ``"probe-stack"`` attribute is inlined into a
1836     function that has no ``"probe-stack"`` attribute at all, the resulting
1837     function has the ``"probe-stack"`` attribute of the callee.
1838 ``readnone``
1839     On a function, this attribute indicates that the function computes its
1840     result (or decides to unwind an exception) based strictly on its arguments,
1841     without dereferencing any pointer arguments or otherwise accessing
1842     any mutable state (e.g. memory, control registers, etc) visible to
1843     caller functions. It does not write through any pointer arguments
1844     (including ``byval`` arguments) and never changes any state visible
1845     to callers. This means while it cannot unwind exceptions by calling
1846     the ``C++`` exception throwing methods (since they write to memory), there may
1847     be non-``C++`` mechanisms that throw exceptions without writing to LLVM
1848     visible memory.
1849
1850     On an argument, this attribute indicates that the function does not
1851     dereference that pointer argument, even though it may read or write the
1852     memory that the pointer points to if accessed through other pointers.
1853
1854     If a readnone function reads or writes memory visible to the program, or
1855     has other side-effects, the behavior is undefined. If a function reads from
1856     or writes to a readnone pointer argument, the behavior is undefined.
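
         A minimal sketch of both uses; the function names are illustrative:

         .. code-block:: llvm

             ; The result depends only on the argument value; no memory is touched.
             define i32 @square(i32 %x) readnone {
               %r = mul i32 %x, %x
               ret i32 %r
             }

             ; The pointer is compared but never dereferenced, so the argument
             ; can carry readnone even though the function takes a pointer.
             define i1 @is_null(i8* readnone %p) {
               %c = icmp eq i8* %p, null
               ret i1 %c
             }

         Adding a ``load`` or ``store`` through ``%p`` to ``@is_null`` would make
         the argument attribute incorrect and the behavior undefined.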
1857 ``readonly``
1858     On a function, this attribute indicates that the function does not write
1859     through any pointer arguments (including ``byval`` arguments) or otherwise
1860     modify any state (e.g. memory, control registers, etc) visible to
1861     caller functions. It may dereference pointer arguments and read
1862     state that may be set in the caller. A readonly function always
1863     returns the same value (or unwinds an exception identically) when
1864     called with the same set of arguments and global state.  This means while it
1865     cannot unwind exceptions by calling the ``C++`` exception throwing methods
1866     (since they write to memory), there may be non-``C++`` mechanisms that throw
1867     exceptions without writing to LLVM visible memory.
1868
1869     On an argument, this attribute indicates that the function does not write
1870     through this pointer argument, even though it may write to the memory that
1871     the pointer points to.
1872
1873     If a readonly function writes memory visible to the program, or
1874     has other side-effects, the behavior is undefined. If a function writes to
1875     a readonly pointer argument, the behavior is undefined.
1876 ``"stack-probe-size"``
1877     This attribute controls the behavior of stack probes: either
1878     the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
1879     It defines the size of the guard region. It ensures that if the function
1880     may use more stack space than the size of the guard region, a stack
1881     probing sequence will be emitted. It takes one required integer value, which
1882     is 4096 by default.
1883
1884     If a function that has a ``"stack-probe-size"`` attribute is inlined into
1885     a function with another ``"stack-probe-size"`` attribute, the resulting
1886     function has the ``"stack-probe-size"`` attribute that has the lower
1887     numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1888     inlined into a function that has no ``"stack-probe-size"`` attribute
1889     at all, the resulting function has the ``"stack-probe-size"`` attribute
1890     of the callee.
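
         The inlining rule can be sketched as follows; the function names and
         sizes are illustrative assumptions:

         .. code-block:: llvm

             define void @callee() "stack-probe-size"="4096" {
               ret void
             }

             define void @caller() "stack-probe-size"="8192" {
               call void @callee()
               ret void
             }

             ; If @callee is inlined into @caller, the rule above gives the
             ; resulting @caller the lower value: "stack-probe-size"="4096".

         Taking the lower value is the conservative option: the merged function
         probes at least as often as either original function would have.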
1891 ``"no-stack-arg-probe"``
1892     This attribute disables ABI-required stack probes, if any.
1893 ``writeonly``
1894     On a function, this attribute indicates that the function may write to but
1895     does not read from memory.
1896
1897     On an argument, this attribute indicates that the function may write to but
1898     does not read through this pointer argument (even though it may read from
1899     the memory that the pointer points to).
1900
1901     If a writeonly function reads memory visible to the program, or
1902     has other side-effects, the behavior is undefined. If a function reads
1903     from a writeonly pointer argument, the behavior is undefined.
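
         A minimal sketch; ``@set_flag`` is an illustrative name:

         .. code-block:: llvm

             ; Only stores through %p; nothing is loaded.
             define void @set_flag(i32* writeonly %p) writeonly {
               store i32 1, i32* %p
               ret void
             }

         Adding a ``load`` from ``%p`` to this function would violate both
         attributes.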
1904 ``argmemonly``
1905     This attribute indicates that the only memory accesses inside the function
1906     are loads and stores from objects pointed to by its pointer-typed
1907     arguments, with arbitrary offsets. In other words, all memory operations
1908     in the function can refer to memory only using pointers based on its
1909     function arguments.
1910
1911     Note that ``argmemonly`` can be used together with the ``readonly``
1912     attribute to specify that the function reads only from its arguments.
1913
1914     If an argmemonly function reads or writes memory other than the pointer
1915     arguments, or has other side-effects, the behavior is undefined.
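
         A minimal sketch of combining ``argmemonly`` with ``readonly``;
         ``@sum_pair`` is an illustrative name:

         .. code-block:: llvm

             ; Every access is a load through a pointer based on %p.
             define i32 @sum_pair(i32* %p) argmemonly readonly {
               %q = getelementptr inbounds i32, i32* %p, i64 1
               %a = load i32, i32* %p
               %b = load i32, i32* %q
               %s = add i32 %a, %b
               ret i32 %s
             }

         Loading from a global variable inside ``@sum_pair`` would break the
         ``argmemonly`` contract.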
1916 ``returns_twice``
1917     This attribute indicates that this function can return twice. The C
1918     ``setjmp`` is an example of such a function. The compiler disables
1919     some optimizations (like tail calls) in the caller of these
1920     functions.
1921 ``safestack``
1922     This attribute indicates that
1923     `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
1924     protection is enabled for this function.
1925
1926     If a function that has a ``safestack`` attribute is inlined into a
1927     function that doesn't have a ``safestack`` attribute or which has an
1928     ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1929     function will have a ``safestack`` attribute.
1930 ``sanitize_address``
1931     This attribute indicates that AddressSanitizer checks
1932     (dynamic address safety analysis) are enabled for this function.
1933 ``sanitize_memory``
1934     This attribute indicates that MemorySanitizer checks (dynamic detection
1935     of accesses to uninitialized memory) are enabled for this function.
1936 ``sanitize_thread``
1937     This attribute indicates that ThreadSanitizer checks
1938     (dynamic thread safety analysis) are enabled for this function.
1939 ``sanitize_hwaddress``
1940     This attribute indicates that HWAddressSanitizer checks
1941     (dynamic address safety analysis based on tagged pointers) are enabled for
1942     this function.
1943 ``sanitize_memtag``
1944     This attribute indicates that MemTagSanitizer checks
1945     (dynamic address safety analysis based on Armv8 MTE) are enabled for
1946     this function.
1947 ``speculative_load_hardening``
1948     This attribute indicates that
1949     `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
1950     should be enabled for the function body.
1951
1952     Speculative Load Hardening is a best-effort mitigation against
1953     information leak attacks that make use of control flow
1954     misspeculation, specifically misspeculation of whether a branch
1955     is taken or not. Typically vulnerabilities enabling such attacks are
1956     classified as "Spectre variant #1". Notably, this does not attempt to
1957     mitigate against misspeculation of branch target, classified as
1958     "Spectre variant #2" vulnerabilities.
1959
1960     When inlining, the attribute is sticky. Inlining a function that carries
1961     this attribute will cause the caller to gain the attribute. This is intended
1962     to provide a maximally conservative model where the code in a function
1963     annotated with this attribute will always (even after inlining) end up
1964     hardened.
1965 ``speculatable``
1966     This function attribute indicates that the function does not have any
1967     effects besides calculating its result and does not have undefined behavior.
1968     Note that ``speculatable`` is not enough to conclude that along any
1969     particular execution path the number of calls to this function will not be
1970     externally observable. This attribute is only valid on functions
1971     and declarations, not on individual call sites. If a function is
1972     incorrectly marked as speculatable and really does exhibit
1973     undefined behavior, the undefined behavior may be observed even
1974     if the call site is dead code.
1975
1976 ``ssp``
1977     This attribute indicates that the function should emit a stack
1978     smashing protector. It is in the form of a "canary" --- a random value
1979     placed on the stack before the local variables that's checked upon
1980     return from the function to see if it has been overwritten. A
1981     heuristic is used to determine if a function needs stack protectors
1982     or not. The heuristic used will enable protectors for functions with:
1983
1984     - Character arrays larger than ``ssp-buffer-size`` (default 8).
1985     - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1986     - Calls to alloca() with variable sizes or constant sizes greater than
1987       ``ssp-buffer-size``.
1988
1989     Variables that are identified as requiring a protector will be arranged
1990     on the stack such that they are adjacent to the stack protector guard.
1991
1992     A function with the ``ssp`` attribute but without the ``alwaysinline``
1993     attribute cannot be inlined into a function without a
1994     ``ssp/sspreq/sspstrong`` attribute. If inlined, the caller will get the
1995     ``ssp`` attribute. ``call``, ``invoke``, and ``callbr`` instructions with
1996     the ``alwaysinline`` attribute force inlining.
1997 ``sspstrong``
1998     This attribute indicates that the function should emit a stack smashing
1999     protector. This attribute causes a strong heuristic to be used when
2000     determining if a function needs stack protectors. The strong heuristic
2001     will enable protectors for functions with:
2002
2003     - Arrays of any size and type.
2004     - Aggregates containing an array of any size and type.
2005     - Calls to alloca().
2006     - Local variables that have had their address taken.
2007
2008     Variables that are identified as requiring a protector will be arranged
2009     on the stack such that they are adjacent to the stack protector guard.
2010     The specific layout rules are:
2011
2012     #. Large arrays and structures containing large arrays
2013        (``>= ssp-buffer-size``) are closest to the stack protector.
2014     #. Small arrays and structures containing small arrays
2015        (``< ssp-buffer-size``) are 2nd closest to the protector.
2016     #. Variables that have had their address taken are 3rd closest to the
2017        protector.
2018
2019     This overrides the ``ssp`` function attribute.
2020
2021     A function with the ``sspstrong`` attribute but without the
2022     ``alwaysinline`` attribute cannot be inlined into a function without a
2023     ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2024     ``sspstrong`` attribute unless the ``sspreq`` attribute exists.  ``call``,
2025     ``invoke``, and ``callbr`` instructions with the ``alwaysinline`` attribute
2026     force inlining.
2027 ``sspreq``
2028     This attribute indicates that the function should *always* emit a stack
2029     smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2030     attributes.
2031
2032     Variables that are identified as requiring a protector will be arranged
2033     on the stack such that they are adjacent to the stack protector guard.
2034     The specific layout rules are:
2035
2036     #. Large arrays and structures containing large arrays
2037        (``>= ssp-buffer-size``) are closest to the stack protector.
2038     #. Small arrays and structures containing small arrays
2039        (``< ssp-buffer-size``) are 2nd closest to the protector.
2040     #. Variables that have had their address taken are 3rd closest to the
2041        protector.
2042
2043     A function with the ``sspreq`` attribute but without the ``alwaysinline``
2044     attribute cannot be inlined into a function without a
2045     ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2046     ``sspreq`` attribute.  ``call``, ``invoke``, and ``callbr`` instructions
2047     with the ``alwaysinline`` attribute force inlining.
2048
2049 ``strictfp``
2050     This attribute indicates that the function was called from a scope that
2051     requires strict floating-point semantics.  LLVM will not attempt any
2052     optimizations that require assumptions about the floating-point rounding
2053     mode or that might alter the state of floating-point status flags that
2054     might otherwise be set or cleared by calling this function. LLVM will
2055     not introduce any new floating-point instructions that may trap.
2056
2057 ``"denormal-fp-math"``
2058     This indicates the denormal (subnormal) handling that may be
2059     assumed for the default floating-point environment. This is a
2060     comma separated pair. The elements may be one of ``"ieee"``,
2061     ``"preserve-sign"``, or ``"positive-zero"``. The first entry
2062     indicates the flushing mode for the result of floating point
2063     operations. The second indicates the handling of denormal inputs
2064     to floating point instructions. For compatibility with older
2065     bitcode, if the second value is omitted, both input and output
2066     modes will assume the same mode.
2067
2068     If this attribute is not specified, the default is
2069     ``"ieee,ieee"``.
2070
2071     If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2072     denormal outputs may be flushed to zero by standard floating-point
2073     operations. It is not mandated that flushing to zero occurs, but if
2074     a denormal output is flushed to zero, it must respect the sign
2075     mode. Not all targets support all modes. While this indicates the
2076     expected floating point mode the function will be executed with,
2077     this does not make any attempt to ensure the mode is
2078     consistent. User or platform code is expected to set the floating
2079     point mode appropriately before function entry.
2080
2081     If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
2082     floating-point operation must treat any input denormal value as
2083     zero. In some situations, if an instruction does not respect this
2084     mode, the input may need to be converted to 0 as if by
2085     ``@llvm.canonicalize`` during lowering for correctness.
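
         A minimal sketch of the attribute syntax; ``@half`` is an illustrative
         name:

         .. code-block:: llvm

             ; First entry: denormal results may be flushed to a sign-preserving
             ; zero. Second entry: denormal inputs are treated as zero.
             define float @half(float %x) "denormal-fp-math"="preserve-sign,preserve-sign" {
               %r = fmul float %x, 5.000000e-01
               ret float %r
             }

         As noted above, the attribute only describes the expected mode; it
         does not set it.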
2086
2087 ``"denormal-fp-math-f32"``
2088     Same as ``"denormal-fp-math"``, but only controls the behavior of
2089     the 32-bit float type (or vectors of 32-bit floats). If both are
2090     present, this overrides ``"denormal-fp-math"``. Not all targets
2091     support separately setting the denormal mode per type, and no
2092     attempt is made to diagnose unsupported uses. Currently this
2093     attribute is respected by the AMDGPU and NVPTX backends.
2094
2095 ``"thunk"``
2096     This attribute indicates that the function will delegate to some other
2097     function with a tail call. The prototype of a thunk should not be used for
2098     optimization purposes. The caller is expected to cast the thunk prototype to
2099     match the thunk target prototype.
2100 ``uwtable``
2101     This attribute indicates that the ABI being targeted requires that
2102     an unwind table entry be produced for this function even if we can
2103     show that no exceptions pass through it. This is normally the case for
2104     the ELF x86-64 ABI, but it can be disabled for some compilation
2105     units.
2106 ``nocf_check``
2107     This attribute indicates that no control-flow check will be performed on
2108     the attributed entity. It disables ``-fcf-protection=<>`` for a specific
2109     entity, allowing fine-grained control over the HW control flow protection
2110     mechanism. The flag is target independent and currently appertains to a
2111     function or function pointer.
2112 ``shadowcallstack``
2113     This attribute indicates that the ShadowCallStack checks are enabled for
2114     the function. The instrumentation checks that the return address for the
2115     function has not changed between the function prolog and epilog. It is
2116     currently x86_64-specific.
2117 ``mustprogress``
2118     This attribute indicates that the function is required to return, unwind,
2119     or interact with the environment in an observable way, e.g. via a volatile
2120     memory access, I/O, or other synchronization.  The ``mustprogress``
2121     attribute is intended to model the requirements of the first section of
2122     [intro.progress] of the C++ Standard. As a consequence, a loop in a
2123     function with the ``mustprogress`` attribute can be assumed to terminate if
2124     it does not interact with the environment in an observable way, and
2125     terminating loops without side-effects can be removed. If a ``mustprogress``
2126     function does not satisfy this contract, the behavior is undefined.  This
2127     attribute does not apply transitively to callees, but does apply to call
2128     sites within the function. Note that ``willreturn`` implies ``mustprogress``.
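
         As an illustration, consider a loop whose termination is not obvious.
         The function name ``@spin`` is hypothetical:

         .. code-block:: llvm

             define void @spin(i32 %n) mustprogress {
             entry:
               br label %loop

             loop:
               ; %i.next only takes even values, so it never equals an odd %n.
               %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
               %i.next = add i32 %i, 2
               %done = icmp eq i32 %i.next, %n
               br i1 %done, label %exit, label %loop

             exit:
               ret void
             }

         The loop has no observable effects, so with ``mustprogress`` an
         optimizer may assume that it terminates and remove it. Without the
         attribute the loop would have to be kept, because it may run forever.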
2129 ``"warn-stack-size"="<threshold>"``
2130     This attribute sets a threshold to emit diagnostics once the frame size is
2131     known should the frame size exceed the specified value.  It takes one
2132     required integer value, which should be a non-negative integer, and less
2133     than ``UINT_MAX``.  It's unspecified which threshold will be used when
2134     duplicate definitions are linked together with differing values.
2135 ``vscale_range(<min>[, <max>])``
2136     This attribute indicates the minimum and maximum vscale value for the given
2137     function. A value of 0 means unbounded. If the optional max value is omitted
2138     then max is set to the value of min. If the attribute is not present, no
2139     assumptions are made about the range of vscale.
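
         A minimal sketch, assuming a target with scalable vectors; the function
         name and the bounds are illustrative:

         .. code-block:: llvm

             declare i64 @llvm.vscale.i64()

             ; Within this function vscale may be assumed to lie in [1, 16].
             define i64 @bytes_per_vector() vscale_range(1,16) {
               %vs = call i64 @llvm.vscale.i64()
               ; <vscale x 16 x i8> occupies 16 * vscale bytes.
               %bytes = mul i64 %vs, 16
               ret i64 %bytes
             }

         Given this range, an optimizer could conclude that ``%bytes`` lies
         between 16 and 256.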
2140
2141 Call Site Attributes
2142 --------------------
2143
2144 In addition to function attributes, the following call-site-only
2145 attributes are supported:
2146
2147 ``vector-function-abi-variant``
2148     This attribute can be attached to a :ref:`call <i_call>` to list
2149     the vector functions associated with the function. Notice that the
2150     attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2151     :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2152     comma separated list of mangled names. The order of the list does
2153     not imply preference (it is logically a set). The compiler is free
2154     to pick any listed vector function of its choosing.
2155
2156     The syntax for the mangled names is as follows::
2157
2158         _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2159
2160     When present, the attribute informs the compiler that the function
2161     ``<scalar_name>`` has a corresponding vector variant that can be
2162     used to perform the concurrent invocation of ``<scalar_name>`` on
2163     vectors. The shape of the vector function is described by the
2164     tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2165     token. The standard name of the vector function is
2166     ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2167     the optional token ``(<vector_redirection>)`` informs the compiler
2168     that a custom name is provided in addition to the standard one
2169     (custom names can be provided for example via the use of ``declare
2170     variant`` in OpenMP 5.0). The declaration of the variant must be
2171     present in the IR Module. The signature of the vector variant is
2172     determined by the rules of the Vector Function ABI (VFABI)
2173     specifications of the target. For Arm and X86, the VFABI can be
2174     found at https://github.com/ARM-software/abi-aa and
2175     https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2176     respectively.
2177
2178     For X86 and Arm targets, the values of the tokens in the standard
2179     name are those that are defined in the VFABI. LLVM has an internal
2180     ``<isa>`` token that can be used to create scalar-to-vector
2181     mappings for functions that are not directly associated to any of
2182     the target ISAs (for example, some of the mappings stored in the
2183     TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2184
2185         <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2186               | n | s          -> Armv8 Advanced SIMD, SVE
2187               | __LLVM__       -> Internal LLVM Vector ISA
2188
2189     For all targets currently supported (X86, Arm and Internal LLVM),
2190     the remaining tokens can have the following values::
2191
2192         <mask>:= M | N         -> mask | no mask
2193
2194         <vlen>:= number        -> number of lanes
2195                | x             -> VLA (Vector Length Agnostic)
2196
2197         <parameters>:= v              -> vector
2198                      | l | l <number> -> linear
2199                      | R | R <number> -> linear with ref modifier
2200                      | L | L <number> -> linear with val modifier
2201                      | U | U <number> -> linear with uval modifier
2202                      | ls <pos>       -> runtime linear
2203                      | Rs <pos>       -> runtime linear with ref modifier
2204                      | Ls <pos>       -> runtime linear with val modifier
2205                      | Us <pos>       -> runtime linear with uval modifier
2206                      | u              -> uniform
2207
2208         <scalar_name>:= name of the scalar function
2209
2210         <vector_redirection>:= optional, custom name of the vector function
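
         As a minimal sketch, a call to a scalar ``@sin`` might advertise one
         Advanced SIMD variant. The custom name ``@vector_sin`` and the caller
         ``@f`` are hypothetical:

         .. code-block:: llvm

             declare double @sin(double)
             ; Declaration of the variant; it must be present in the module.
             declare <2 x double> @vector_sin(<2 x double>)

             define double @f(double %x) {
               %r = call double @sin(double %x) #0
               ret double %r
             }

             ; _ZGV n N 2 v _sin: Advanced SIMD, no mask, 2 lanes, one vector
             ; parameter, scalar name "sin", redirected to @vector_sin.
             attributes #0 = { "vector-function-abi-variant"="_ZGVnN2v_sin(vector_sin)" }

         A vectorizer that widens the call by a factor of two could then
         replace it with a call to ``@vector_sin``.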
2211
2212 ``preallocated(<ty>)``
2213     This attribute is required on calls to ``llvm.call.preallocated.arg``
2214     and cannot be used on any other call. See
2215     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2216     details.
2217
2218 .. _glattrs:
2219
2220 Global Attributes
2221 -----------------
2222
2223 Attributes may be set to communicate additional information about a global variable.
2224 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2225 are grouped into a single :ref:`attribute group <attrgrp>`.
2226
2227 .. _opbundles:
2228
2229 Operand Bundles
2230 ---------------
2231
2232 Operand bundles are tagged sets of SSA values that can be associated
2233 with certain LLVM instructions (currently only ``call``\ s and
2234 ``invoke``\ s).  In a way they are like metadata, but dropping them is
2235 incorrect and will change program semantics.
2236
2237 Syntax::
2238
2239     operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2240     operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2241     bundle operand ::= SSA value
2242     tag ::= string constant
2243
2244 Operand bundles are **not** part of a function's signature, and a
2245 given function may be called from multiple places with different kinds
2246 of operand bundles.  This reflects the fact that the operand bundles
2247 are conceptually a part of the ``call`` (or ``invoke``), not the
2248 callee being dispatched to.
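
     The syntax can be illustrated with a minimal sketch. The tags ``"foo"``
     and ``"bar"`` are hypothetical and carry no meaning for LLVM beyond the
     "unknown" operand bundle semantics:

     .. code-block:: llvm

         declare void @callee(i32)

         define void @caller(i32 %a, i8* %p) {
           ; Two bundles: one with two operands, one with none.
           call void @callee(i32 %a) [ "foo"(i32 %a, i8* %p), "bar"() ]
           ; The same callee, called without any bundle.
           call void @callee(i32 %a)
           ret void
         }

     Both calls target the same ``@callee``; only the call sites differ.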
2249
2250 Operand bundles are a generic mechanism intended to support
2251 runtime-introspection-like functionality for managed languages.  While
2252 the exact semantics of an operand bundle depend on the bundle tag,
2253 there are certain limitations to how much the presence of an operand
2254 bundle can influence the semantics of a program.  These restrictions
2255 are described as the semantics of an "unknown" operand bundle.  As
2256 long as the behavior of an operand bundle is describable within these
2257 restrictions, LLVM does not need to have special knowledge of the
2258 operand bundle to not miscompile programs containing it.
2259
2260 - The bundle operands for an unknown operand bundle escape in unknown
2261   ways before control is transferred to the callee or invokee.
2262 - Calls and invokes with operand bundles have unknown read / write
2263   effect on the heap on entry and exit (even if the call target is
2264   ``readnone`` or ``readonly``), unless they're overridden with
2265   callsite specific attributes.
2266 - An operand bundle at a call site cannot change the implementation
2267   of the called function.  Inter-procedural optimizations work as
2268   usual as long as they take into account the first two properties.
2269
2270 More specific types of operand bundles are described below.
2271
2272 .. _deopt_opbundles:
2273
2274 Deoptimization Operand Bundles
2275 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2276
2277 Deoptimization operand bundles are characterized by the ``"deopt"``
2278 operand bundle tag.  These operand bundles represent an alternate
2279 "safe" continuation for the call site they're attached to, and can be
2280 used by a suitable runtime to deoptimize the compiled frame at the
2281 specified call site.  There can be at most one ``"deopt"`` operand
2282 bundle attached to a call site.  Exact details of deoptimization are
2283 out of scope for the language reference, but it usually involves
2284 rewriting a compiled frame into a set of interpreted frames.
2285
2286 From the compiler's perspective, deoptimization operand bundles make
2287 the call sites they're attached to at least ``readonly``.  They read
2288 through all of their pointer typed operands (even if they're not
2289 otherwise escaped) and the entire visible heap.  Deoptimization
2290 operand bundles do not capture their operands except during
2291 deoptimization, in which case control will not be returned to the
2292 compiled frame.
2293
2294 The inliner knows how to inline through calls that have deoptimization
2295 operand bundles.  Just like inlining through a normal call site
2296 involves composing the normal and exceptional continuations, inlining
2297 through a call site with a deoptimization operand bundle needs to
2298 appropriately compose the "safe" deoptimization continuation.  The
2299 inliner does this by prepending the parent's deoptimization
2300 continuation to every deoptimization continuation in the inlined body.
2301 E.g. inlining ``@f`` into ``@g`` in the following example
2302
2303 .. code-block:: llvm
2304
2305     define void @f() {
2306       call void @x()  ;; no deopt state
2307       call void @y() [ "deopt"(i32 10) ]
2308       call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
2309       ret void
2310     }
2311
2312     define void @g() {
2313       call void @f() [ "deopt"(i32 20) ]
2314       ret void
2315     }
2316
2317 will result in
2318
2319 .. code-block:: llvm
2320
2321     define void @g() {
2322       call void @x()  ;; still no deopt state
2323       call void @y() [ "deopt"(i32 20, i32 10) ]
2324       call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
2325       ret void
2326     }
2327
2328 It is the frontend's responsibility to structure or encode the
2329 deoptimization state in a way that syntactically prepending the
2330 caller's deoptimization state to the callee's deoptimization state is
2331 semantically equivalent to composing the caller's deoptimization
2332 continuation after the callee's deoptimization continuation.
2333
2334 .. _ob_funclet:
2335
2336 Funclet Operand Bundles
2337 ^^^^^^^^^^^^^^^^^^^^^^^
2338
2339 Funclet operand bundles are characterized by the ``"funclet"``
2340 operand bundle tag.  These operand bundles indicate that a call site
2341 is within a particular funclet.  There can be at most one
2342 ``"funclet"`` operand bundle attached to a call site and it must have
2343 exactly one bundle operand.
2344
2345 If any funclet EH pads have been "entered" but not "exited" (per the
2346 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2347 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2348
2349 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2350   intrinsic, or
2351 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2352   not-yet-exited funclet EH pad.
2353
2354 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2355 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
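
     A minimal sketch of a call inside a cleanup funclet. The functions
     ``@f``, ``@may_throw`` and ``@cleanup`` are hypothetical, and the MSVC
     C++ personality is used only as an example of a funclet-based one:

     .. code-block:: llvm

         declare void @may_throw()
         declare void @cleanup()
         declare i32 @__CxxFrameHandler3(...)

         define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
         entry:
           invoke void @may_throw()
                   to label %exit unwind label %ehcleanup

         ehcleanup:
           %cp = cleanuppad within none []
           ; The bundle operand names the innermost entered, not-yet-exited pad.
           call void @cleanup() [ "funclet"(token %cp) ]
           cleanupret from %cp unwind to caller

         exit:
           ret void
         }

     Omitting the bundle on the call to ``@cleanup``, or naming a different
     pad, would fall under the undefined behavior cases listed above.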
2356
2357 GC Transition Operand Bundles
2358 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2359
2360 GC transition operand bundles are characterized by the
2361 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
2362 call as a transition between a function with one GC strategy and a
2363 function with a different GC strategy. If coordinating the transition
2364 between GC strategies requires additional code generation at the call
2365 site, these bundles may contain any values that are needed by the
2366 generated code.  For more details, see :ref:`GC Transitions
2367 <gc_transition_args>`.
2368
2369 The bundle contains an arbitrary list of Values which need to be passed
2370 to GC transition code. They will be lowered and passed as operands to
2371 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2372 that these arguments must be available before and after (but not
2373 necessarily during) the execution of the callee.
2374
2375 .. _assume_opbundles:
2376
2377 Assume Operand Bundles
2378 ^^^^^^^^^^^^^^^^^^^^^^
2379
2380 Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2381 assumptions that a :ref:`parameter attribute <paramattrs>` or a
2382 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2383 location. Operand bundles enable assumptions that are either hard or impossible
2384 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2385
2386 An assume operand bundle has the form:
2387
2388 ::
2389
2390       "<tag>"([ <holds for value> [, <attribute argument>] ])
2391
2392 * The tag of the operand bundle is usually the name of the attribute that can
2393   be assumed to hold. It can also be ``ignore``; this tag doesn't contain any
2394   information and should be ignored.
2395 * The first argument, if present, is the value for which the attribute holds.
2396 * The second argument, if present, is an argument of the attribute.
2397
2398 If there are no arguments the attribute is a property of the call location.
2399
2400 If the represented attribute expects a constant argument, the argument provided
2401 to the operand bundle should be a constant as well.
2402
2403 For example:
2404
2405 .. code-block:: llvm
2406
2407       call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]
2408
2409 allows the optimizer to assume that at the location of the call to
2410 :ref:`llvm.assume <int_assume>`, ``%val`` has an alignment of at least 8.
2411
2412 .. code-block:: llvm
2413
2414       call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]
2415
2416 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2417 call location is cold and that ``%val`` is not null.
2418
2419 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2420 provided guarantees are violated at runtime the behavior is undefined.
2421
2422 Even if the assumed property can be encoded as a boolean value, like
2423 ``nonnull``, using operand bundles to express the property can still have
2424 benefits:
2425
2426 * Attributes that can be expressed via operand bundles are directly the
2427   property that the optimizer uses and cares about. Encoding attributes as
2428   operand bundles removes the need for an instruction sequence that represents
2429   the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
2430   optimizer to deduce the property from that instruction sequence.
2431 * Expressing the property using operand bundles makes it easy to identify the
2432   use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
2433   simplifies and improves heuristics, e.g., for "use-sensitive"
2434   optimizations.
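
     The two encodings can be compared in a minimal sketch; the function
     name ``@load_nonnull`` is illustrative:

     .. code-block:: llvm

         declare void @llvm.assume(i1)

         define i32 @load_nonnull(i32* %p) {
           ; Boolean encoding: the optimizer has to look through the compare.
           %c = icmp ne i32* %p, null
           call void @llvm.assume(i1 %c)

           ; Operand bundle encoding of the same fact, plus an alignment.
           call void @llvm.assume(i1 true) [ "nonnull"(i32* %p), "align"(i32* %p, i32 4) ]

           %v = load i32, i32* %p
           ret i32 %v
         }

     In the bundle form, ``%p`` is used directly by the assume, so no
     ``icmp`` has to be created or pattern-matched.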
2435
2436 .. _ob_preallocated:
2437
2438 Preallocated Operand Bundles
2439 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2440
2441 Preallocated operand bundles are characterized by the ``"preallocated"``
2442 operand bundle tag.  These operand bundles allow separation of the allocation
2443 of the call argument memory from the call site.  This is necessary to pass
2444 non-trivially copyable objects by value in a way that is compatible with MSVC
2445 on some targets.  There can be at most one ``"preallocated"`` operand bundle
2446 attached to a call site and it must have exactly one bundle operand, which is
2447 a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2448 operand bundle should not adjust the stack before entering the function, as
2449 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2450
2451 .. code-block:: llvm
2452
2453       %foo = type { i64, i32 }
2454
2455       ...
2456
2457       %t = call token @llvm.call.preallocated.setup(i32 1)
2458       %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2459       %b = bitcast i8* %a to %foo*
2460       ; initialize %b
2461       call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]
2462
2463 .. _ob_gc_live:
2464
2465 GC Live Operand Bundles
2466 ^^^^^^^^^^^^^^^^^^^^^^^
2467
2468 A ``"gc-live"`` operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2469 intrinsic. The operand bundle must contain every pointer to a garbage collected
2470 object which potentially needs to be updated by the garbage collector.
2471
2472 When lowered, any relocated value will be recorded in the corresponding
2473 :ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2474 for further details.
2475
2476 ObjC ARC Attached Call Operand Bundles
2477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2478
2479 A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2480 implicitly followed by a marker instruction and a call to an ObjC runtime
2481 function that uses the result of the call. The operand bundle takes either the
2482 pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2483 ``@objc_unsafeClaimAutoreleasedReturnValue``) or no arguments. If the bundle
2484 doesn't take any arguments, only the marker instruction has to be emitted after
2485 the call; the runtime function calls don't have to be emitted since they already
2486 have been emitted. The return value of a call with this bundle is used by a call
2487 to ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2488 void, in which case the operand bundle is ignored.
2489
2490 .. code-block:: llvm
2491
2492    ; The marker instruction and a runtime function call are inserted after the call
2493    ; to @foo.
2494    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_retainAutoreleasedReturnValue) ]
2495    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_unsafeClaimAutoreleasedReturnValue) ]
2496
2497    ; Only the marker instruction is inserted after the call to @foo.
2498    call i8* @foo() [ "clang.arc.attachedcall"() ]
2499
2500 The operand bundle is needed to ensure the call is immediately followed by the
2501 marker instruction or the ObjC runtime call in the final output.
2502
2503 .. _moduleasm:
2504
2505 Module-Level Inline Assembly
2506 ----------------------------
2507
2508 Modules may contain "module-level inline asm" blocks, which correspond
2509 to the GCC "file scope inline asm" blocks. These blocks are internally
2510 concatenated by LLVM and treated as a single unit, but may be separated
2511 in the ``.ll`` file if desired. The syntax is very simple:
2512
2513 .. code-block:: llvm
2514
2515     module asm "inline asm code goes here"
2516     module asm "more can go here"
2517
2518 The strings can contain any character by escaping non-printable
2519 characters. The escape sequence used is simply "\\xx" where "xx" is the
2520 two digit hex code for the number.
2521
2522 Note that the assembly string *must* be parseable by LLVM's integrated assembler
2523 (unless it is disabled), even when emitting a ``.s`` file.
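
As a small illustration of the escape rule above, the sketch below emits a
label and a tab-indented data directive. The symbol name ``my_marker`` is
hypothetical, and the directives assume a target whose assembler accepts
GNU-style syntax:

.. code-block:: llvm

    ; Each string is appended to the module's single inline asm unit.
    module asm ".globl my_marker"
    module asm "my_marker:"
    ; "\09" is the two-digit hex escape for a horizontal tab.
    module asm "\09.byte 0x2A"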
2524
2525 .. _langref_datalayout:
2526
2527 Data Layout
2528 -----------
2529
2530 A module may specify a target specific data layout string that specifies
2531 how data is to be laid out in memory. The syntax for the data layout is
2532 simply:
2533
2534 .. code-block:: llvm
2535
2536     target datalayout = "layout specification"
2537
2538 The *layout specification* consists of a list of specifications
2539 separated by the minus sign character ('-'). Each specification starts
2540 with a letter and may include other information after the letter to
2541 define some aspect of the data layout. The specifications accepted are
2542 as follows:
2543
2544 ``E``
2545     Specifies that the target lays out data in big-endian form. That is,
2546     the bits with the most significance have the lowest address
2547     location.
2548 ``e``
2549     Specifies that the target lays out data in little-endian form. That
2550     is, the bits with the least significance have the lowest address
2551     location.
2552 ``S<size>``
2553     Specifies the natural alignment of the stack in bits. Alignment
2554     promotion of stack variables is limited to the natural stack
2555     alignment to avoid dynamic stack realignment. The stack alignment
2556     must be a multiple of 8-bits. If omitted, the natural stack
2557     alignment defaults to "unspecified", which does not prevent any
2558     alignment promotions.
2559 ``P<address space>``
2560     Specifies the address space that corresponds to program memory.
2561     Harvard architectures can use this to specify what space LLVM
2562     should place things such as functions into. If omitted, the
2563     program memory space defaults to the default address space of 0,
2564     which corresponds to a Von Neumann architecture that has code
2565     and data in the same space.
2566 ``G<address space>``
2567     Specifies the address space to be used by default when creating global
2568     variables. If omitted, the globals address space defaults to the default
2569     address space 0.
2570     Note: variable declarations without an address space are always created in
2571     address space 0; this property only affects the default value to be used
2572     when creating globals without additional contextual information (e.g. in
2573     LLVM passes).
2574 ``A<address space>``
2575     Specifies the address space of objects created by '``alloca``'.
2576     Defaults to the default address space of 0.
2577 ``p[n]:<size>:<abi>[:<pref>][:<idx>]``
2578     This specifies the *size* of a pointer and its ``<abi>`` and
2579     ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
2580     and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
2581     index that is used for address calculation. If not
2582     specified, the default index size is equal to the pointer size. All sizes
2583     are in bits. The address space, ``n``, is optional, and if not specified,
2584     denotes the default address space 0. The value of ``n`` must be
2585     in the range [1,2^23).
2586 ``i<size>:<abi>[:<pref>]``
2587     This specifies the alignment for an integer type of a given bit
2588     ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2589     ``<pref>`` is optional and defaults to ``<abi>``.
2590 ``v<size>:<abi>[:<pref>]``
2591     This specifies the alignment for a vector type of a given bit
2592     ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2593     ``<pref>`` is optional and defaults to ``<abi>``.
2594 ``f<size>:<abi>[:<pref>]``
2595     This specifies the alignment for a floating-point type of a given bit
2596     ``<size>``. Only values of ``<size>`` that are supported by the target
2597     will work. 32 (float) and 64 (double) are supported on all targets; 80
2598     or 128 (different flavors of long double) are also supported on some
2599     targets. The value of ``<size>`` must be in the range [1,2^23).
2600     ``<pref>`` is optional and defaults to ``<abi>``.
2601 ``a:<abi>[:<pref>]``
2602     This specifies the alignment for an object of aggregate type.
2603     ``<pref>`` is optional and defaults to ``<abi>``.
2604 ``F<type><abi>``
2605     This specifies the alignment for function pointers.
2606     The options for ``<type>`` are:
2607
2608     * ``i``: The alignment of function pointers is independent of the alignment
2609       of functions, and is a multiple of ``<abi>``.
2610     * ``n``: The alignment of function pointers is a multiple of the explicit
2611       alignment specified on the function, and is a multiple of ``<abi>``.
2612 ``m:<mangling>``
2613     If present, specifies that LLVM names are mangled in the output. Symbols
2614     prefixed with the mangling escape character ``\01`` are passed through
2615     directly to the assembler without the escape character. The mangling style
2616     options are
2617
2618     * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2619     * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2620     * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
2621     * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
2622       symbols get a ``_`` prefix.
2623     * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2624       Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2625       ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2626       ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2627       starting with ``?`` are not mangled in any way.
2628     * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2629       symbols do not receive a ``_`` prefix.
2630     * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
2631 ``n<size1>:<size2>:<size3>...``
2632     This specifies a set of native integer widths for the target CPU in
2633     bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2634     ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2635     this set are considered to support most general arithmetic operations
2636     efficiently.
2637 ``ni:<address space0>:<address space1>:<address space2>...``
2638     This specifies pointer types with the specified address spaces
2639     as :ref:`Non-Integral Pointer Type <nointptrtype>` s.  The ``0``
2640     address space cannot be specified as non-integral.
2641
2642 On every specification that takes a ``<abi>:<pref>``, specifying the
2643 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
2644 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
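
To make the syntax concrete, here is one possible layout string, read
component by component. It resembles layouts used for x86-64 ELF targets, but
it is only an illustration; the string a real target expects may differ:

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling (private symbols get a .L prefix)
    ; i64:64       i64 has a 64-bit ABI alignment (<pref> defaults to <abi>)
    ; f80:128      80-bit floating-point values are 128-bit aligned
    ; n8:16:32:64  native integer widths are 8, 16, 32 and 64 bits
    ; S128         the natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

Any specification not mentioned in the string keeps its default value.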
2645
2646 When constructing the data layout for a given target, LLVM starts with a
2647 default set of specifications which are then (possibly) overridden by
2648 the specifications in the ``datalayout`` keyword. The default
2649 specifications are given in this list:
2650
2651 -  ``e`` - little endian
2652 -  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2653 -  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2654    same as the default address space.
2655 -  ``S0`` - natural stack alignment is unspecified
2656 -  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2657 -  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2658 -  ``i16:16:16`` - i16 is 16-bit aligned
2659 -  ``i32:32:32`` - i32 is 32-bit aligned
2660 -  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2661    alignment of 64-bits
2662 -  ``f16:16:16`` - half is 16-bit aligned
2663 -  ``f32:32:32`` - float is 32-bit aligned
2664 -  ``f64:64:64`` - double is 64-bit aligned
2665 -  ``f128:128:128`` - quad is 128-bit aligned
2666 -  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2667 -  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2668 -  ``a:0:64`` - aggregates are 64-bit aligned
2669
2670 When LLVM is determining the alignment for a given type, it uses the
2671 following rules:
2672
2673 #. If the type sought is an exact match for one of the specifications,
2674    that specification is used.
2675 #. If no match is found, and the type sought is an integer type, then
2676    the smallest integer type that is larger than the bitwidth of the
2677    sought type is used. If none of the specifications are larger than
2678    the bitwidth then the largest integer type is used. For example,
2679    given the default specifications above, the i7 type will use the
2680    alignment of i8 (next largest) while both i65 and i256 will use the
2681    alignment of i64 (largest specified).
2682 #. If no match is found, and the type sought is a vector type, then the
2683    largest vector type that is smaller than the sought vector type will
2684    be used as a fallback. This happens because ``<128 x double>`` can be
2685    implemented in terms of 64 ``<2 x double>``, for example.
2686
2687 The function of the data layout string may not be what you expect.
2688 Notably, this is not a specification from the frontend of what alignment
2689 the code generator should use.
2690
2691 Instead, if specified, the target data layout is required to match what
2692 the ultimate *code generator* expects. This string is used by the
2693 mid-level optimizers to improve code, and this only works if it matches
2694 what the ultimate code generator uses. There is no way to generate IR
2695 that does not embed this target-specific detail into the IR. If you
2696 don't specify the string, the default specifications will be used to
2697 generate a Data Layout and the optimization phases will operate
2698 accordingly and introduce target specificity into the IR with respect to
2699 these default specifications.
2700
2701 .. _langref_triple:
2702
2703 Target Triple
2704 -------------
2705
2706 A module may specify a target triple string that describes the target
2707 host. The syntax for the target triple is simply:
2708
2709 .. code-block:: llvm
2710
2711     target triple = "x86_64-apple-macosx10.7.0"
2712
2713 The *target triple* string consists of a series of identifiers delimited
2714 by the minus sign character ('-'). The canonical forms are:
2715
2716 ::
2717
2718     ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2719     ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2720
2721 This information is passed along to the backend so that it generates
2722 code for the proper architecture. It's possible to override this on the
2723 command line with the ``-mtriple`` command line option.
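
For comparison with the three-component example above, a triple in the
four-component canonical form might look like the following (shown purely as
an illustration):

.. code-block:: llvm

    ; ARCHITECTURE = x86_64, VENDOR = unknown,
    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
    target triple = "x86_64-unknown-linux-gnu"

A tool that accepts the ``-mtriple`` option, such as ``llc``, can then be
invoked as ``llc -mtriple=x86_64-unknown-linux-gnu input.ll``, where
``input.ll`` is a placeholder file name.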
2724
2725 .. _objectlifetime:
2726
2727 Object Lifetime
2728 ---------------
2729
2730 A memory object, or simply object, is a region of a memory space that is
2731 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
2732 allocation calls, and global variable definitions.
2733 Once it is allocated, the bytes stored in the region can only be read or written
2734 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
2735 value.
2736 If a pointer that is not based on the object tries to read or write to the
2737 object, it is undefined behavior.
2738
2739 The lifetime of a memory object is a property that determines its accessibility.
2740 Unless stated otherwise, a memory object is alive from its allocation until its
2741 deallocation, and dead afterwards.
2742 It is undefined behavior to access a memory object that isn't alive, but
2743 operations that don't dereference it such as
2744 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
2745 :ref:`icmp <i_icmp>` return a valid result.
2746 This explains code motion of these instructions across operations that
2747 impact the object's lifetime.
2748 A stack object's lifetime can be explicitly specified using
2749 :ref:`llvm.lifetime.start <int_lifestart>` and
2750 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
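
The following is a minimal sketch of these rules for a stack object. The
function name ``@lifetime_example`` is illustrative, and the intrinsic
declarations assume the typed-pointer ``p0i8`` overloads:

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture)
    declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

    define i32 @lifetime_example() {
    entry:
      %x = alloca i32
      %p = bitcast i32* %x to i8*
      call void @llvm.lifetime.start.p0i8(i64 4, i8* %p)
      store i32 42, i32* %x          ; %x is alive: the access is defined
      %v = load i32, i32* %x
      call void @llvm.lifetime.end.p0i8(i64 4, i8* %p)
      ; %x is dead from here on: loading from or storing to it would be
      ; undefined behavior. Operations that do not dereference the pointer
      ; still return a valid result.
      %addr = ptrtoint i32* %x to i64
      ret i32 %v
    }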
2751
2752 .. _pointeraliasing:
2753
2754 Pointer Aliasing Rules
2755 ----------------------
2756
2757 Any memory access must be done through a pointer value associated with
2758 an address range of the memory access, otherwise the behavior is
2759 undefined. Pointer values are associated with address ranges according
2760 to the following rules:
2761
2762 -  A pointer value is associated with the addresses associated with any
2763    value it is *based* on.
2764 -  An address of a global variable is associated with the address range
2765    of the variable's storage.
2766 -  The result value of an allocation instruction is associated with the
2767    address range of the allocated storage.
2768 -  A null pointer in the default address-space is associated with no
2769    address.
2770 -  An :ref:`undef value <undefvalues>` in *any* address-space is
2771    associated with no address.
2772 -  An integer constant other than zero or a pointer value returned from
2773    a function not defined within LLVM may be associated with address
2774    ranges allocated through mechanisms other than those provided by
2775    LLVM. Such ranges shall not overlap with any ranges of addresses
2776    allocated by mechanisms provided by LLVM.
2777
2778 A pointer value is *based* on another pointer value according to the
2779 following rules:
2780
2781 -  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2782    the pointer-typed operand of the ``getelementptr``.
2783 -  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2784    is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2785    of the ``getelementptr``.
2786 -  The result value of a ``bitcast`` is *based* on the operand of the
2787    ``bitcast``.
2788 -  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2789    values that contribute (directly or indirectly) to the computation of
2790    the pointer's value.
2791 -  The "*based* on" relationship is transitive.
2792
2793 Note that this definition of *"based"* is intentionally similar to the
2794 definition of *"based"* in C99, though it is slightly weaker.
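
As a short sketch of the *based* on rules (the function and value names are
illustrative):

.. code-block:: llvm

    define i8 @based_on(i32* %p) {
      ; %q is based on %p (scalar getelementptr).
      %q = getelementptr i32, i32* %p, i64 1
      ; %r is based on %q (bitcast) and, by transitivity, on %p.
      %r = bitcast i32* %q to i8*
      ; This access is done through a pointer associated with the
      ; addresses associated with %p.
      %v = load i8, i8* %r
      ret i8 %v
    }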
2795
2796 LLVM IR does not associate types with memory. The result type of a
2797 ``load`` merely indicates the size and alignment of the memory from
2798 which to load, as well as the interpretation of the value. The first
2799 operand type of a ``store`` similarly only indicates the size and
2800 alignment of the store.
2801
2802 Consequently, type-based alias analysis, aka TBAA, aka
2803 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2804 :ref:`Metadata <metadata>` may be used to encode additional information
2805 which specialized optimization passes may use to implement type-based
2806 alias analysis.
2807
2808 .. _pointercapture:
2809
2810 Pointer Capture
2811 ---------------
2812
2813 Given a function call and a pointer that is passed as an argument or stored in
2814 the memory before the call, a pointer is *captured* by the call if it makes a
2815 copy of any part of the pointer that outlives the call.
2816 To be precise, a pointer is captured if one or more of the following conditions
2817 hold:
2818
2819 1. The call stores any bit of the pointer carrying information into a place,
2820    and the stored bits can be read from the place by the caller after this call
2821    exits.
2822
2823 .. code-block:: llvm
2824
2825     @glb  = global i8* null
2826     @glb2 = global i8* null
2827     @glb3 = global i8* null
2828     @glbi = global i32 0
2829
2830     define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
2831       store i8* %a, i8** @glb ; %a is captured by this call
2832
2833       store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
2834       store i8* null, i8** @glb2
2835
2836       store i8* %c,   i8** @glb3
2837       call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
2838       store i8* null, i8** @glb3
2839
2840       %i = ptrtoint i8* %d to i64
2841       %j = trunc i64 %i to i32
2842       store i32 %j, i32* @glbi ; %d is captured
2843
2844       ret i8* %e ; %e is captured
2845     }
2846
2847 2. The call stores any bit of the pointer carrying information into a place,
2848    and the stored bits can be safely read from the place by another thread via
2849    synchronization.
2850
2851 .. code-block:: llvm
2852
2853     @lock = global i1 true
2854
2855     define void @f(i8* %a) {
2856       store i8* %a, i8** @glb
2857       store atomic i1 false, i1* @lock release ; %a is captured because another thread can safely read @glb
2858       store i8* null, i8** @glb
2859       ret void
2860     }
2861
2862 3. The call's behavior depends on any bit of the pointer carrying information.
2863
2864 .. code-block:: llvm
2865
2866     @glb = global i8 0
2867
2868     define void @f(i8* %a) {
2869       %c = icmp eq i8* %a, @glb
2870       br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; %a is captured
2871     BB_EXIT:
2872       call void @exit()
2873       unreachable
2874     BB_CONTINUE:
2875       ret void
2876     }
2877
2878 4. The pointer is used in a volatile access as its address.
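
A minimal sketch of this last condition, using an illustrative function name:

.. code-block:: llvm

    define i8 @f(i8* %a) {
      ; %a is captured because it is the address of a volatile access.
      %v = load volatile i8, i8* %a
      ret i8 %v
    }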
2879
2880
2881 .. _volatile:
2882
2883 Volatile Memory Accesses
2884 ------------------------
2885
2886 Certain memory accesses, such as :ref:`load <i_load>`'s,
2887 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
2888 marked ``volatile``. The optimizers must not change the number of
2889 volatile operations or change their order of execution relative to other
2890 volatile operations. The optimizers *may* change the order of volatile
2891 operations relative to non-volatile operations. This is not Java's
2892 "volatile" and has no cross-thread synchronization behavior.
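
The sketch below illustrates these ordering rules; ``@read_twice``, ``%reg``
and ``%buf`` are illustrative names only:

.. code-block:: llvm

    define i32 @read_twice(i32* %reg, i32* %buf) {
      ; The two volatile loads must both be performed, in this order;
      ; they may not be merged into a single load.
      %a = load volatile i32, i32* %reg
      ; This non-volatile store may be reordered relative to the
      ; volatile loads.
      store i32 0, i32* %buf
      %b = load volatile i32, i32* %reg
      %sum = add i32 %a, %b
      ret i32 %sum
    }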
2893
2894 A volatile load or store may have additional target-specific semantics.
2895 Any volatile operation can have side effects, and any volatile operation
2896 can read and/or modify state which is not accessible via a regular load
2897 or store in this module. Volatile operations may use addresses which do
2898 not point to memory (like MMIO registers). This means the compiler may
2899 not use a volatile operation to prove a non-volatile access to that
2900 address has defined behavior.
2901
2902 The allowed side-effects for volatile accesses are limited.  If a
2903 non-volatile store to a given address would be legal, a volatile
2904 operation may modify the memory at that address. A volatile operation
2905 may not modify any other memory accessible by the module being compiled.
2906 A volatile operation may not call any code in the current module.
2907
2908 The compiler may assume execution will continue after a volatile operation,
2909 so operations which modify memory or may have undefined behavior can be
2910 hoisted past a volatile operation.
2911
2912 As an exception to the preceding rule, the compiler may not assume execution
2913 will continue after a volatile store operation. This restriction is necessary
2914 to support the somewhat common pattern in C of intentionally storing to an
2915 invalid pointer to crash the program. In the future, it might make sense to
2916 allow frontends to control this behavior.
2917
2918 IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
2919 or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
2920 Likewise, the backend should never split or merge target-legal volatile
2921 load/store instructions. Similarly, IR-level volatile loads and stores cannot
2922 change from integer to floating-point or vice versa.
2923
2924 .. admonition:: Rationale
2925
2926  Platforms may rely on volatile loads and stores of natively supported
2927  data width to be executed as a single instruction. For example, in C
2928  this holds for an l-value of volatile primitive type with native
2929  hardware support, but not necessarily for aggregate types. The
2930  frontend upholds these expectations, which are intentionally
2931  unspecified in the IR. The rules above ensure that IR transformations
2932  do not violate the frontend's contract with the language.
2933
2934 .. _memmodel:
2935
2936 Memory Model for Concurrent Operations
2937 --------------------------------------
2938
2939 The LLVM IR does not define any way to start parallel threads of
2940 execution or to register signal handlers. Nonetheless, there are
2941 platform-specific ways to create them, and we define LLVM IR's behavior
2942 in their presence. This model is inspired by the C++0x memory model.
2943
2944 For a more informal introduction to this model, see the :doc:`Atomics`.
2945
2946 We define a *happens-before* partial order as the least partial order
2947 that
2948
2949 -  Is a superset of single-thread program order, and
2950 -  When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
2951    ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2952    techniques, like pthread locks, thread creation, thread joining,
2953    etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2954    Constraints <ordering>`).
2955
2956 Note that program order does not introduce *happens-before* edges
2957 between a thread and signals executing inside that thread.
2958
2959 Every (defined) read operation (load instructions, memcpy, atomic
2960 loads/read-modify-writes, etc.) R reads a series of bytes written by
2961 (defined) write operations (store instructions, atomic
2962 stores/read-modify-writes, memcpy, etc.). For the purposes of this
2963 section, initialized globals are considered to have a write of the
2964 initializer which is atomic and happens before any other read or write
2965 of the memory in question. For each byte of a read R, R\ :sub:`byte`
2966 may see any write to the same byte, except:
2967
2968 -  If write\ :sub:`1`  happens before write\ :sub:`2`, and
2969    write\ :sub:`2` happens before R\ :sub:`byte`, then
2970    R\ :sub:`byte` does not see write\ :sub:`1`.
2971 -  If R\ :sub:`byte` happens before write\ :sub:`3`, then
2972    R\ :sub:`byte` does not see write\ :sub:`3`.
2973
2974 Given that definition, R\ :sub:`byte` is defined as follows:
2975
2976 -  If R is volatile, the result is target-dependent. (Volatile is
2977    supposed to give guarantees which can support ``sig_atomic_t`` in
2978    C/C++, and may be used for accesses to addresses that do not behave
2979    like normal memory. It does not generally provide cross-thread
2980    synchronization.)
2981 -  Otherwise, if there is no write to the same byte that happens before
2982    R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2983 -  Otherwise, if R\ :sub:`byte` may see exactly one write,
2984    R\ :sub:`byte` returns the value written by that write.
2985 -  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2986    see are atomic, it chooses one of the values written. See the :ref:`Atomic
2987    Memory Ordering Constraints <ordering>` section for additional
2988    constraints on how the choice is made.
2989 -  Otherwise R\ :sub:`byte` returns ``undef``.
2990
2991 R returns the value composed of the series of bytes it read. This
2992 implies that some bytes within the value may be ``undef`` **without**
2993 the entire value being ``undef``. Note that this only defines the
2994 semantics of the operation; it doesn't mean that targets will emit more
2995 than one instruction to read the series of bytes.
2996
2997 Note that in cases where none of the atomic intrinsics are used, this
2998 model places only one restriction on IR transformations on top of what
2999 is required for single-threaded execution: introducing a store to a byte
3000 which might not otherwise be stored is not allowed in general.
3001 (Specifically, in the case where another thread might write to and read
3002 from an address, introducing a store can change a load that may see
3003 exactly one write into a load that may see multiple writes.)
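
As an illustration of the restriction on introducing stores, consider the
following sketch (the function names are hypothetical). Rewriting ``@f`` into
``@f_speculated`` is not allowed in general, because the second function
stores to ``%p`` even when ``%c`` is false:

.. code-block:: llvm

    ; Original: the store is executed only when %c is true.
    define void @f(i1 %c, i32* %p) {
    entry:
      br i1 %c, label %then, label %exit
    then:
      store i32 1, i32* %p
      br label %exit
    exit:
      ret void
    }

    ; Not a valid replacement in general: when %c is false, this writes
    ; back the old value, introducing a store that another thread's load
    ; may now see alongside that thread's own write.
    define void @f_speculated(i1 %c, i32* %p) {
    entry:
      %old = load i32, i32* %p
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }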
3004
3005 .. _ordering:
3006
3007 Atomic Memory Ordering Constraints
3008 ----------------------------------
3009
3010 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3011 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3012 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3013 ordering parameters that determine which other atomic instructions on
3014 the same address they *synchronize with*. These semantics are borrowed
3015 from Java and C++0x, but are somewhat more colloquial. If these
3016 descriptions aren't precise enough, check those specs (see spec
3017 references in the :doc:`atomics guide <Atomics>`).
3018 :ref:`fence <i_fence>` instructions treat these orderings somewhat
3019 differently since they don't take an address. See that instruction's
3020 documentation for details.
3021
3022 For a simpler introduction to the ordering constraints, see the
3023 :doc:`Atomics`.
3024
3025 ``unordered``
3026     The set of values that can be read is governed by the happens-before
3027     partial order. A value cannot be read unless some operation wrote
3028     it. This is intended to provide a guarantee strong enough to model
3029     Java's non-volatile shared variables. This ordering cannot be
3030     specified for read-modify-write operations; it is not strong enough
3031     to make them atomic in any interesting way.
3032 ``monotonic``
3033     In addition to the guarantees of ``unordered``, there is a single
3034     total order for modifications by ``monotonic`` operations on each
3035     address. All modification orders must be compatible with the
3036     happens-before order. There is no guarantee that the modification
3037     orders can be combined to a global total order for the whole program
3038     (and this often will not be possible). The read in an atomic
3039     read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3040     :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3041     order immediately before the value it writes. If one atomic read
3042     happens before another atomic read of the same address, the later
3043     read must see the same value or a later value in the address's
3044     modification order. This disallows reordering of ``monotonic`` (or
3045     stronger) operations on the same address. If an address is written
3046     ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3047     read that address repeatedly, the other threads must eventually see
3048     the write. This corresponds to the C++0x/C1x
3049     ``memory_order_relaxed``.
3050 ``acquire``
3051     In addition to the guarantees of ``monotonic``, a
3052     *synchronizes-with* edge may be formed with a ``release`` operation.
3053     This is intended to model C++'s ``memory_order_acquire``.
3054 ``release``
3055     In addition to the guarantees of ``monotonic``, if this operation
3056     writes a value which is subsequently read by an ``acquire``
3057     operation, it *synchronizes-with* that operation. (This isn't a
3058     complete description; see the C++0x definition of a release
3059     sequence.) This corresponds to the C++0x/C1x
3060     ``memory_order_release``.
3061 ``acq_rel`` (acquire+release)
3062     Acts as both an ``acquire`` and ``release`` operation on its
3063     address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3064 ``seq_cst`` (sequentially consistent)
3065     In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3066     operation that only reads, ``release`` for an operation that only
3067     writes), there is a global total order on all
3068     sequentially-consistent operations on all addresses, which is
3069     consistent with the *happens-before* partial order and with the
3070     modification orders of all the affected addresses. Each
3071     sequentially-consistent read sees the last preceding write to the
3072     same address in this global order. This corresponds to the C++0x/C1x
3073     ``memory_order_seq_cst`` and Java volatile.
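
A common use of ``release`` and ``acquire`` is publishing data through a flag.
The sketch below assumes two threads running ``@producer`` and ``@consumer``
respectively; all names are illustrative:

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data
      ; Publishes the preceding store to any acquire load that reads 1.
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The release store synchronizes-with the acquire load that read 1,
      ; so the store to @data happens before this load.
      %v = load i32, i32* @data
      ret i32 %v
    }

Had both atomic operations been ``monotonic`` instead, no *synchronizes-with*
edge would be formed and the final load would race with the store to ``@data``.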
3074
3075 .. _syncscope:
3076
3077 If an atomic operation is marked ``syncscope("singlethread")``, it only
3078 *synchronizes with* and only participates in the seq\_cst total orderings of
3079 other operations running in the same thread (for example, in signal handlers).
3080
3081 If an atomic operation is marked ``syncscope("<target-scope>")``, where
3082 ``<target-scope>`` is a target-specific synchronization scope, it is
3083 target-dependent whether it *synchronizes with* and participates in the
3084 seq\_cst total orderings of other operations.
3085
3086 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3087 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3088 seq\_cst total orderings of other operations that are not marked
3089 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
3090
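The following sketch shows where the ``syncscope`` argument appears on atomic
instructions, using the signal-handler case mentioned above. The global
``@handled`` and the function names are hypothetical.

.. code-block:: llvm

    @handled = global i32 0

    ; Runs as a signal handler in the same thread as @poll, so ordering is
    ; only needed with respect to that thread.
    define void @on_signal(i32 %signo) {
    entry:
      store atomic i32 1, i32* @handled syncscope("singlethread") release, align 4
      ret void
    }

    define i32 @poll() {
    entry:
      %v = load atomic i32, i32* @handled syncscope("singlethread") acquire, align 4
      ; A fence restricted to the same scope.
      fence syncscope("singlethread") seq_cst
      ret i32 %v
    }
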
3091 .. _floatenv:
3092
3093 Floating-Point Environment
3094 --------------------------
3095
3096 The default LLVM floating-point environment assumes that floating-point
3097 instructions do not have side effects. Results assume the round-to-nearest
3098 rounding mode. No floating-point exception state is maintained in this
3099 environment. Therefore, there is no attempt to create or preserve invalid
3100 operation (SNaN) or division-by-zero exceptions.
3101
3102 The benefit of this exception-free assumption is that floating-point
3103 operations may be speculated freely without any other fast-math relaxations
3104 to the floating-point model.
3105
3106 Code that requires different behavior than this should use the
3107 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
3108
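For contrast with the default environment, here is a sketch of an addition
that must honor the dynamic rounding mode and preserve exception behavior.
It uses the constrained ``fadd`` intrinsic; the function name ``@strict_add``
is hypothetical.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double,
                                                           metadata, metadata)

    define double @strict_add(double %a, double %b) #0 {
    entry:
      ; The rounding mode is read at run time, and the operation may not be
      ; speculated or removed in a way that changes exception behavior.
      %sum = call double @llvm.experimental.constrained.fadd.f64(
                            double %a, double %b,
                            metadata !"round.dynamic",
                            metadata !"fpexcept.strict") #0
      ret double %sum
    }

    attributes #0 = { strictfp }
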
3109 .. _fastmath:
3110
3111 Fast-Math Flags
3112 ---------------
3113
3114 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3115 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3116 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3117 :ref:`select <i_select>` and :ref:`call <i_call>`
3118 may use the following flags to enable otherwise unsafe
3119 floating-point transformations.
3120
3121 ``nnan``
3122    No NaNs - Allow optimizations to assume the arguments and result are not
3123    NaN. If an argument is a NaN, or the result would be a NaN, it produces
3124    a :ref:`poison value <poisonvalues>` instead.
3125
3126 ``ninf``
3127    No Infs - Allow optimizations to assume the arguments and result are not
3128    +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3129    produces a :ref:`poison value <poisonvalues>` instead.
3130
3131 ``nsz``
3132    No Signed Zeros - Allow optimizations to treat the sign of a zero
3133    argument or result as insignificant. This does not imply that -0.0
3134    is poison and/or guaranteed to not exist in the operation.
3135
3136 ``arcp``
3137    Allow Reciprocal - Allow optimizations to use the reciprocal of an
3138    argument rather than perform division.
3139
3140 ``contract``
3141    Allow floating-point contraction (e.g. fusing a multiply followed by an
3142    addition into a fused multiply-and-add). This does not enable reassociating
3143    to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3144    be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3145
3146 ``afn``
3147    Approximate functions - Allow substitution of approximate calculations for
3148    functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3149    for places where this can apply to LLVM's intrinsic math functions.
3150
3151 ``reassoc``
3152    Allow reassociation transformations for floating-point instructions.
3153    This may dramatically change results in floating-point.
3154
3155 ``fast``
3156    This flag implies all of the others.
3157
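The flags are written between the opcode and the type. The function below is
an illustrative sketch only; its name and value names are hypothetical.

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    define float @fmf_demo(float %a, float %b, float %c) {
    entry:
      ; Poison if an operand or the result is NaN or +/-Inf.
      %sum = fadd nnan ninf float %a, %b

      ; With contract on both operations, this pair may be fused into a
      ; single fused multiply-add.
      %mul = fmul contract float %a, %b
      %fma = fadd contract float %mul, %c

      ; May be rewritten as a multiplication by the reciprocal of %c.
      %div = fdiv arcp float %sum, %c

      ; An approximate square root may be substituted.
      %root = call afn float @llvm.sqrt.f32(float %div)

      ; All of the flags at once.
      %res = fadd fast float %root, %fma
      ret float %res
    }
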
3158 .. _uselistorder:
3159
3160 Use-list Order Directives
3161 -------------------------
3162
3163 Use-list directives encode the in-memory order of each use-list, allowing the
3164 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3165 indexes that are assigned to the referenced value's uses. The referenced
3166 value's use-list is immediately sorted by these indexes.
3167
3168 Use-list directives may appear at function scope or global scope. They are not
3169 instructions, and have no effect on the semantics of the IR. When they're at
3170 function scope, they must appear after the terminator of the final basic block.
3171
3172 If basic blocks have their address taken via ``blockaddress()`` expressions,
3173 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
3174 function's scope.
3175
3176 :Syntax:
3177
3178 ::
3179
3180     uselistorder <ty> <value>, { <order-indexes> }
3181     uselistorder_bb @function, %block, { <order-indexes> }
3182
3183 :Examples:
3184
3185 ::
3186
3187     define void @foo(i32 %arg1, i32 %arg2) {
3188     entry:
3189       ; ... instructions ...
3190     bb:
3191       ; ... instructions ...
3192
3193       ; At function scope.
3194       uselistorder i32 %arg1, { 1, 0, 2 }
3195       uselistorder label %bb, { 1, 0 }
3196     }
3197
3198     ; At global scope.
3199     uselistorder i32* @global, { 1, 2, 0 }
3200     uselistorder i32 7, { 1, 0 }
3201     uselistorder i32 (i32)* @bar, { 1, 0 }
3202     uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3203
3204 .. _source_filename:
3205
3206 Source Filename
3207 ---------------
3208
3209 The *source filename* string is set to the original module identifier,
3210 which will be the name of the compiled source file when compiling from
3211 source through the clang front end, for example. It is then preserved through
3212 the IR and bitcode.
3213
3214 This is currently necessary to generate a consistent unique global
3215 identifier for local functions used in profile data, which prepends the
3216 source file name to the local function name.
3217
3218 The syntax for the source file name is simply:
3219
3220 .. code-block:: text
3221
3222     source_filename = "/path/to/source.c"
3223
3224 .. _typesystem:
3225
3226 Type System
3227 ===========
3228
3229 The LLVM type system is one of the most important features of the
3230 intermediate representation. Being typed enables a number of
3231 optimizations to be performed on the intermediate representation
3232 directly, without having to do extra analyses on the side before the
3233 transformation. A strong type system makes it easier to read the
3234 generated code and enables novel analyses and transformations that are
3235 not feasible to perform on normal three address code representations.
3236
3237 .. _t_void:
3238
3239 Void Type
3240 ---------
3241
3242 :Overview:
3243
3244
3245 The void type does not represent any value and has no size.
3246
3247 :Syntax:
3248
3249
3250 ::
3251
3252       void
3253
3254
3255 .. _t_function:
3256
3257 Function Type
3258 -------------
3259
3260 :Overview:
3261
3262
3263 The function type can be thought of as a function signature. It consists of a
3264 return type and a list of formal parameter types. The return type of a function
3265 type is a void type or first class type --- except for :ref:`label <t_label>`
3266 and :ref:`metadata <t_metadata>` types.
3267
3268 :Syntax:
3269
3270 ::
3271
3272       <returntype> (<parameter list>)
3273
3274 ...where '``<parameter list>``' is a comma-separated list of type
3275 specifiers. Optionally, the parameter list may include a type ``...``, which
3276 indicates that the function takes a variable number of arguments. Variable
3277 argument functions can access their arguments with the :ref:`variable argument
3278 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3279 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3280
3281 :Examples:
3282
3283 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3284 | ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
3285 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3286 | ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
3287 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3288 | ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
3289 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3290 | ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
3291 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3292
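As a sketch of how these function types appear in a module, the declarations
and definitions below use a vararg type, a structure return type, and a
pointer to a function. Apart from ``printf``, all names are hypothetical.

.. code-block:: llvm

    ; i32 (i8*, ...): a vararg function type.
    declare i32 @printf(i8*, ...)

    ; {i32, i32} (i32): returns a structure of two i32 values.
    define { i32, i32 } @divmod10(i32 %n) {
    entry:
      %q = sdiv i32 %n, 10
      %r = srem i32 %n, 10
      %t0 = insertvalue { i32, i32 } undef, i32 %q, 0
      %t1 = insertvalue { i32, i32 } %t0, i32 %r, 1
      ret { i32, i32 } %t1
    }

    ; i32 (i32)*: a pointer to a function, called indirectly.
    define i32 @apply(i32 (i32)* %fn, i32 %x) {
    entry:
      %r = call i32 %fn(i32 %x)
      ret i32 %r
    }

    ; Calls to vararg functions spell out the function type.
    define i32 @log_value(i8* %fmt, i32 %x) {
    entry:
      %n = call i32 (i8*, ...) @printf(i8* %fmt, i32 %x)
      ret i32 %n
    }
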
3293 .. _t_firstclass:
3294
3295 First Class Types
3296 -----------------
3297
3298 The :ref:`first class <t_firstclass>` types are perhaps the most important.
3299 Values of these types are the only ones which can be produced by
3300 instructions.
3301
3302 .. _t_single_value:
3303
3304 Single Value Types
3305 ^^^^^^^^^^^^^^^^^^
3306
3307 These are the types that are valid in registers from CodeGen's perspective.
3308
3309 .. _t_integer:
3310
3311 Integer Type
3312 """"""""""""
3313
3314 :Overview:
3315
3316 The integer type is a simple type that specifies an arbitrary bit width
3317 for the desired integer. Any bit width from 1 bit to 2\ :sup:`23`\ (about
3318 8 million) can be specified.
3319
3320 :Syntax:
3321
3322 ::
3323
3324       iN
3325
3326 The number of bits the integer will occupy is specified by the ``N``
3327 value.
3328
3329 Examples:
3330 *********
3331
3332 +----------------+------------------------------------------------+
3333 | ``i1``         | a single-bit integer.                          |
3334 +----------------+------------------------------------------------+
3335 | ``i32``        | a 32-bit integer.                              |
3336 +----------------+------------------------------------------------+
3337 | ``i1942652``   | a really big integer of over 1 million bits.   |
3338 +----------------+------------------------------------------------+
3339
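The sketch below uses a few integer widths, including one that is not a
power of two. The function names are hypothetical.

.. code-block:: llvm

    ; Truncating to i1 keeps only the lowest bit.
    define i1 @is_odd(i32 %x) {
    entry:
      %bit = trunc i32 %x to i1
      ret i1 %bit
    }

    ; Arithmetic on i7 wraps modulo 2^7.
    define i7 @inc7(i7 %x) {
    entry:
      %y = add i7 %x, 1
      ret i7 %y
    }

    ; Zero-extending to a wider integer type.
    define i128 @widen(i64 %x) {
    entry:
      %w = zext i64 %x to i128
      ret i128 %w
    }
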
3340 .. _t_floating:
3341
3342 Floating-Point Types
3343 """"""""""""""""""""
3344
3345 .. list-table::
3346    :header-rows: 1
3347
3348    * - Type
3349      - Description
3350
3351    * - ``half``
3352      - 16-bit floating-point value
3353
3354    * - ``bfloat``
3355      - 16-bit "brain" floating-point value (7-bit significand).  Provides the
3356        same number of exponent bits as ``float``, so that it matches its dynamic
3357        range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
3358        extensions and Arm's ARMv8.6-A extensions, among others.
3359
3360    * - ``float``
3361      - 32-bit floating-point value
3362
3363    * - ``double``
3364      - 64-bit floating-point value
3365
3366    * - ``fp128``
3367      - 128-bit floating-point value (113-bit significand)
3368
3369    * - ``x86_fp80``
3370      - 80-bit floating-point value (X87)
3371
3372    * - ``ppc_fp128``
3373      - 128-bit floating-point value (two 64-bit doubles)
3374
3375 The binary formats of half, float, double, and fp128 correspond to the
3376 IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
3377 respectively.
3378
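As a brief sketch, values move between these types with ``fpext`` and
``fptrunc``. The function names below are hypothetical.

.. code-block:: llvm

    ; Widen half to float, add, then widen the result to double.
    define double @widen_sum(half %h, float %f) {
    entry:
      %hf = fpext half %h to float
      %s = fadd float %hf, %f
      %d = fpext float %s to double
      ret double %d
    }

    ; Narrow float to bfloat: the exponent range is kept, precision is lost.
    define bfloat @narrow(float %f) {
    entry:
      %b = fptrunc float %f to bfloat
      ret bfloat %b
    }
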
3379 X86_amx Type
3380 """"""""""""
3381
3382 :Overview:
3383
3384 The x86_amx type represents a value held in an AMX tile register on an x86
3385 machine. The operations allowed on it are quite limited. Only a few intrinsics
3386 are allowed: stride load and store, zero and dot product. No instruction is
3387 allowed for this type. There are no arguments, arrays, pointers, vectors
3388 or constants of this type.
3389
3390 :Syntax:
3391
3392 ::
3393
3394       x86_amx
3395
3396
3397 X86_mmx Type
3398 """"""""""""
3399
3400 :Overview:
3401
3402 The x86_mmx type represents a value held in an MMX register on an x86
3403 machine. The operations allowed on it are quite limited: parameters and
3404 return values, load and store, and bitcast. User-specified MMX
3405 instructions are represented as intrinsic or asm calls with arguments
3406 and/or results of this type. There are no arrays, vectors or constants
3407 of this type.
3408
3409 :Syntax:
3410
3411 ::
3412
3413       x86_mmx
3414
3415
3416 .. _t_pointer:
3417
3418 Pointer Type
3419 """"""""""""
3420
3421 :Overview:
3422
3423 The pointer type is used to specify memory locations. Pointers are
3424 commonly used to reference objects in memory.
3425
3426 Pointer types may have an optional address space attribute defining the
3427 numbered address space where the pointed-to object resides. The default
3428 address space is number zero. The semantics of non-zero address spaces
3429 are target-specific.
3430
3431 Note that LLVM does not permit pointers to void (``void*``) nor does it
3432 permit pointers to labels (``label*``). Use ``i8*`` instead.
3433
3434 LLVM is in the process of transitioning to
3435 `opaque pointers <OpaquePointers.html#opaque-pointers>`_.
3436 Opaque pointers do not have a pointee type. Rather, instructions
3437 interacting through pointers specify the type of the underlying memory
3438 they are interacting with. Opaque pointer support is still under
3439 development and is not yet complete.
3440
3441 :Syntax:
3442
3443 ::
3444
3445       <type> *
3446       ptr
3447
3448 :Examples:
3449
3450 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3451 | ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
3452 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3453 | ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
3454 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3455 | ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space 5.                            |
3456 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3457 | ``ptr``                 | An opaque pointer type to a value that resides in address space 0.                                           |
3458 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3459 | ``ptr addrspace(5)``    | An opaque pointer type to a value that resides in address space 5.                                           |
3460 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3461
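The typed pointer forms from the table might be used as in the sketch below.
The function names are hypothetical, and the meaning of address space 5
depends on the target.

.. code-block:: llvm

    ; Address arithmetic and a load through an i32*.
    define i32 @third(i32* %base) {
    entry:
      %p = getelementptr i32, i32* %base, i64 2
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

    ; A store through a pointer into address space 5.
    define void @store_as5(i32 addrspace(5)* %p, i32 %v) {
    entry:
      store i32 %v, i32 addrspace(5)* %p, align 4
      ret void
    }

    ; i8* takes the place of C's void*.
    define i8* @as_bytes([4 x i32]* %arr) {
    entry:
      %raw = bitcast [4 x i32]* %arr to i8*
      ret i8* %raw
    }
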
3462 .. _t_vector:
3463
3464 Vector Type
3465 """""""""""
3466
3467 :Overview:
3468
3469 A vector type is a simple derived type that represents a vector of
3470 elements. Vector types are used when multiple primitive data are
3471 operated on in parallel using a single instruction (SIMD). A vector type
3472 requires a size (number of elements), an underlying primitive data type,
3473 and a scalable property to represent vectors where the exact hardware
3474 vector length is unknown at compile time. Vector types are considered
3475 :ref:`first class <t_firstclass>`.
3476
3477 :Memory Layout:
3478
3479 In general vector elements are laid out in memory in the same way as
3480 :ref:`array types <t_array>`. Such an analogy works fine as long as the vector
3481 elements are byte sized. However, when the elements of the vector aren't byte
3482 sized it gets a bit more complicated. One way to describe the layout is by
3483 describing what happens when a vector such as ``<N x iM>`` is bitcast to an
3484 integer type with N*M bits, and then following the rules for storing such an
3485 integer to memory.
3486
3487 A bitcast from a vector type to a scalar integer type will see the elements
3488 being packed together (without padding). The order in which elements are
3489 inserted in the integer depends on endianness. For little endian, element zero
3490 is put in the least significant bits of the integer, and for big endian,
3491 element zero is put in the most significant bits.
3492
3493 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3494 with the analogy that we can replace a vector store by a bitcast followed by
3495 an integer store, we get this for big endian:
3496
3497 .. code-block:: llvm
3498
3499       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3500
3501       ; Bitcasting from a vector to an integral type can be seen as
3502       ; concatenating the values:
3503       ;   %val now has the hexadecimal value 0x1235.
3504
3505       store i16 %val, i16* %ptr
3506
3507       ; In memory the content will be (8-bit addressing):
3508       ;
3509       ;    [%ptr + 0]: 00010010  (0x12)
3510       ;    [%ptr + 1]: 00110101  (0x35)
3511
3512 The same example for little endian:
3513
3514 .. code-block:: llvm
3515
3516       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3517
3518       ; Bitcasting from a vector to an integral type can be seen as
3519       ; concatenating the values:
3520       ;   %val now has the hexadecimal value 0x5321.
3521
3522       store i16 %val, i16* %ptr
3523
3524       ; In memory the content will be (8-bit addressing):
3525       ;
3526       ;    [%ptr + 0]: 01010011  (0x53)
3527       ;    [%ptr + 1]: 00100001  (0x21)
3528
3529 When ``N*M`` isn't evenly divisible by the byte size, the exact memory layout
3530 is unspecified (just like it is for an integral type of the same size). This
3531 is because different targets could put the padding at different positions when
3532 the type size is smaller than the type's store size.
3533
3534 :Syntax:
3535
3536 ::
3537
3538       < <# elements> x <elementtype> >          ; Fixed-length vector
3539       < vscale x <# elements> x <elementtype> > ; Scalable vector
3540
3541 The number of elements is a constant integer value larger than 0;
3542 elementtype may be any integer, floating-point or pointer type. Vectors
3543 of size zero are not allowed. For scalable vectors, the total number of
3544 elements is a constant multiple (called vscale) of the specified number
3545 of elements; vscale is a positive integer that is unknown at compile time
3546 and the same hardware-dependent constant for all scalable vectors at run
3547 time. The size of a specific scalable vector type is thus constant within
3548 IR, even if the exact size in bytes cannot be determined until run time.
3549
3550 :Examples:
3551
3552 +------------------------+----------------------------------------------------+
3553 | ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
3554 +------------------------+----------------------------------------------------+
3555 | ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
3556 +------------------------+----------------------------------------------------+
3557 | ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
3558 +------------------------+----------------------------------------------------+
3559 | ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
3560 +------------------------+----------------------------------------------------+
3561 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3562 +------------------------+----------------------------------------------------+
3563
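A short sketch of fixed-length and scalable vectors in use follows; the
function names are hypothetical.

.. code-block:: llvm

    ; Element-wise add of two fixed-length vectors, then read element 0.
    define i32 @first_of_sum(<4 x i32> %a, <4 x i32> %b) {
    entry:
      %sum = add <4 x i32> %a, %b
      %e0 = extractelement <4 x i32> %sum, i32 0
      ret i32 %e0
    }

    ; The same operation on scalable vectors. The element count is
    ; vscale * 4 and is only known at run time.
    define <vscale x 4 x i32> @scalable_add(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
    entry:
      %sum = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %sum
    }
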
3564 .. _t_label:
3565
3566 Label Type
3567 ^^^^^^^^^^
3568
3569 :Overview:
3570
3571 The label type represents code labels.
3572
3573 :Syntax:
3574
3575 ::
3576
3577       label
3578
3579 .. _t_token:
3580
3581 Token Type
3582 ^^^^^^^^^^
3583
3584 :Overview:
3585
3586 The token type is used when a value is associated with an instruction
3587 but no use of the value may attempt to introspect or obscure it.
3588 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3589 :ref:`select <i_select>` of type token.
3590
3591 :Syntax:
3592
3593 ::
3594
3595       token
3596
3597
3598
3599 .. _t_metadata:
3600
3601 Metadata Type
3602 ^^^^^^^^^^^^^
3603
3604 :Overview:
3605
3606 The metadata type represents embedded metadata. No derived types may be
3607 created from metadata except for :ref:`function <t_function>` arguments.
3608
3609 :Syntax:
3610
3611 ::
3612
3613       metadata
3614
3615 .. _t_aggregate:
3616
3617 Aggregate Types
3618 ^^^^^^^^^^^^^^^
3619
3620 Aggregate Types are a subset of derived types that can contain multiple
3621 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3622 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3623 aggregate types.
3624
3625 .. _t_array:
3626
3627 Array Type
3628 """"""""""
3629
3630 :Overview:
3631
3632 The array type is a very simple derived type that arranges elements
3633 sequentially in memory. The array type requires a size (number of
3634 elements) and an underlying data type.
3635
3636 :Syntax:
3637
3638 ::
3639
3640       [<# elements> x <elementtype>]
3641
3642 The number of elements is a constant integer value; ``elementtype`` may
3643 be any type with a size.
3644
3645 :Examples:
3646
3647 +------------------+--------------------------------------+
3648 | ``[40 x i32]``   | Array of 40 32-bit integer values.   |
3649 +------------------+--------------------------------------+
3650 | ``[41 x i32]``   | Array of 41 32-bit integer values.   |
3651 +------------------+--------------------------------------+
3652 | ``[4 x i8]``     | Array of 4 8-bit integer values.     |
3653 +------------------+--------------------------------------+
3654
3655 Here are some examples of multidimensional arrays:
3656
3657 +-----------------------------+----------------------------------------------------------+
3658 | ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
3659 +-----------------------------+----------------------------------------------------------+
3660 | ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
3661 +-----------------------------+----------------------------------------------------------+
3662 | ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
3663 +-----------------------------+----------------------------------------------------------+
3664
3665 There is no restriction on indexing beyond the end of the array implied
3666 by a static type (though there are restrictions on indexing beyond the
3667 bounds of an allocated object in some cases). This means that
3668 single-dimension 'variable sized array' addressing can be implemented in
3669 LLVM with a zero length array type. An implementation of 'pascal style
3670 arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
3671 example.
3672
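The 'pascal style array' mentioned above could be accessed as in this
sketch. The type name ``%pascal`` and the function names are hypothetical,
and the caller is assumed to have allocated enough storage for the elements
it indexes.

.. code-block:: llvm

    %pascal = type { i32, [0 x float] }

    ; Read the length stored in field 0.
    define i32 @length(%pascal* %arr) {
    entry:
      %len.addr = getelementptr %pascal, %pascal* %arr, i64 0, i32 0
      %len = load i32, i32* %len.addr, align 4
      ret i32 %len
    }

    ; Index past the static bound of the zero-length array in field 1.
    define float @element(%pascal* %arr, i64 %i) {
    entry:
      %p = getelementptr %pascal, %pascal* %arr, i64 0, i32 1, i64 %i
      %v = load float, float* %p, align 4
      ret float %v
    }
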
3673 .. _t_struct:
3674
3675 Structure Type
3676 """"""""""""""
3677
3678 :Overview:
3679
3680 The structure type is used to represent a collection of data members
3681 together in memory. The elements of a structure may be any type that has
3682 a size.
3683
3684 Structures in memory are accessed using '``load``' and '``store``' by
3685 getting a pointer to a field with the '``getelementptr``' instruction.
3686 Structures in registers are accessed using the '``extractvalue``' and
3687 '``insertvalue``' instructions.
3688
3689 Structures may optionally be "packed" structures, which indicate that
3690 the alignment of the struct is one byte, and that there is no padding
3691 between the elements. In non-packed structs, padding between field types
3692 is inserted as defined by the DataLayout string in the module, which is
3693 required to match what the underlying code generator expects.
3694
3695 Structures can either be "literal" or "identified". A literal structure
3696 is defined inline with other types (e.g. ``{i32, i32}*``) whereas
3697 identified types are always defined at the top level with a name.
3698 Literal types are uniqued by their contents and can never be recursive
3699 or opaque since there is no way to write one. Identified types can be
3700 recursive, can be opaque, and are never uniqued.
3701
3702 :Syntax:
3703
3704 ::
3705
3706       %T1 = type { <type list> }     ; Identified normal struct type
3707       %T2 = type <{ <type list> }>   ; Identified packed struct type
3708
3709 :Examples:
3710
3711 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3712 | ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
3713 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3714 | ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
3715 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3716 | ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
3717 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3718
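The sketch below shows both access styles: a structure in memory reached
through ``getelementptr``, and a structure in registers handled with
``extractvalue`` and ``insertvalue``. The type and function names are
hypothetical.

.. code-block:: llvm

    %Node = type { i32, %Node* }     ; identified and recursive
    %Packed = type <{ i8, i32 }>     ; identified and packed

    ; In memory: compute field addresses, then load.
    define i32 @next_value(%Node* %n) {
    entry:
      %next.addr = getelementptr %Node, %Node* %n, i32 0, i32 1
      %next = load %Node*, %Node** %next.addr
      %val.addr = getelementptr %Node, %Node* %next, i32 0, i32 0
      %val = load i32, i32* %val.addr, align 4
      ret i32 %val
    }

    ; In registers: a literal struct type passed and returned by value.
    define { i32, i32 } @swap({ i32, i32 } %p) {
    entry:
      %a = extractvalue { i32, i32 } %p, 0
      %b = extractvalue { i32, i32 } %p, 1
      %t = insertvalue { i32, i32 } undef, i32 %b, 0
      %r = insertvalue { i32, i32 } %t, i32 %a, 1
      ret { i32, i32 } %r
    }
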
3719 .. _t_opaque:
3720
3721 Opaque Structure Types
3722 """"""""""""""""""""""
3723
3724 :Overview:
3725
3726 Opaque structure types are used to represent structure types that
3727 do not have a body specified. This corresponds (for example) to the C
3728 notion of a forward declared structure. They can be named (``%X``) or
3729 unnamed (``%52``).
3730
3731 :Syntax:
3732
3733 ::
3734
3735       %X = type opaque
3736       %52 = type opaque
3737
3738 :Examples:
3739
3740 +--------------+-------------------+
3741 | ``opaque``   | An opaque type.   |
3742 +--------------+-------------------+
3743
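An opaque type is typically only handled through pointers, much like a
forward-declared C structure. In this sketch, ``%Handle`` and the declared
functions are hypothetical.

.. code-block:: llvm

    %Handle = type opaque

    declare %Handle* @handle_create()
    declare void @handle_destroy(%Handle*)

    ; The body of %Handle is unknown here, so it is only passed around.
    define void @roundtrip() {
    entry:
      %h = call %Handle* @handle_create()
      call void @handle_destroy(%Handle* %h)
      ret void
    }
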
3744 .. _constants:
3745
3746 Constants
3747 =========
3748
3749 LLVM has several different basic types of constants. This section
3750 describes them all and their syntax.
3751
3752 Simple Constants
3753 ----------------
3754
3755 **Boolean constants**
3756     The two strings '``true``' and '``false``' are both valid constants
3757     of the ``i1`` type.
3758 **Integer constants**
3759     Standard integers (such as '4') are constants of the
3760     :ref:`integer <t_integer>` type. Negative numbers may be used with
3761     integer types.
3762 **Floating-point constants**
3763     Floating-point constants use standard decimal notation (e.g.
3764     123.421), exponential notation (e.g. 1.23421e+2), or a more precise
3765     hexadecimal notation (see below). The assembler requires the exact
3766     decimal value of a floating-point constant. For example, the
3767     assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
3768     decimal in binary. Floating-point constants must have a
3769     :ref:`floating-point <t_floating>` type.
3770 **Null pointer constants**
3771     The identifier '``null``' is recognized as a null pointer constant
3772     and must be of :ref:`pointer type <t_pointer>`.
3773 **Token constants**
3774     The identifier '``none``' is recognized as an empty token constant
3775     and must be of :ref:`token type <t_token>`.
3776
3777 The one non-intuitive notation for constants is the hexadecimal form of
3778 floating-point constants. For example, the form
3779 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read
3780 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
3781 constants are required (and the only time that they are generated by the
3782 disassembler) is when a floating-point constant must be emitted but it
3783 cannot be represented as a decimal floating-point number in a reasonable
3784 number of digits. For example, NaNs, infinities, and other special
3785 values are represented in their IEEE hexadecimal format so that assembly
3786 and disassembly do not cause any bits to change in the constants.
3787
3788 When using the hexadecimal form, constants of types bfloat, half, float, and
3789 double are represented using the 16-digit form shown above (which matches the
3790 IEEE754 representation for double); bfloat, half and float values must, however,
3791 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
3792 precision respectively. Hexadecimal format is always used for long double, and
3793 there are three forms of long double. The 80-bit format used by x86 is
3794 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
3795 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
3796 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
3797 by 32 hexadecimal digits. Long doubles will only work if they match the long
3798 double format on your target.  The IEEE 16-bit format (half precision) is
3799 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
3800 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
3801 hexadecimal formats are big-endian (sign bit at the left).
3802
3803 There are no constants of type x86_mmx or x86_amx.
3804
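As an illustration, the simple constant forms can be used as global
initializers. The global names below are hypothetical.

.. code-block:: llvm

    @flag   = global i1 true
    @count  = global i32 -4
    @exact  = global double 1.25
    @big    = global double 0x432ff973cafa8000   ; same value as 4.5e+15
    @inf    = global float 0x7FF0000000000000    ; +Inf, in the 16-digit form
    @one_h  = global half 0xH3C00                ; 1.0 as IEEE half
    @one_bf = global bfloat 0xR3F80              ; 1.0 as bfloat
    @nowhere = global i8* null
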
3805 .. _complexconstants:
3806
3807 Complex Constants
3808 -----------------
3809
3810 Complex constants are a (potentially recursive) combination of simple
3811 constants and smaller complex constants.
3812
3813 **Structure constants**
3814     Structure constants are represented with notation similar to
3815     structure type definitions (a comma separated list of elements,
3816     surrounded by braces (``{}``)). For example:
3817     "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
3818     "``@G = external global i32``". Structure constants must have
3819     :ref:`structure type <t_struct>`, and the number and types of elements
3820     must match those specified by the type.
3821 **Array constants**
3822     Array constants are represented with notation similar to array type
3823     definitions (a comma separated list of elements, surrounded by
3824     square brackets (``[]``)). For example:
3825     "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
3826     :ref:`array type <t_array>`, and the number and types of elements must
3827     match those specified by the type. As a special case, character array
3828     constants may also be represented as a double-quoted string using the ``c``
3829     prefix. For example: "``c"Hello World\0A\00"``".
3830 **Vector constants**
3831     Vector constants are represented with notation similar to vector
3832     type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than signs (``<>``)). For example:
3834     "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
3835     must have :ref:`vector type <t_vector>`, and the number and types of
3836     elements must match those specified by the type.
3837 **Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
3840     :ref:`aggregate <t_aggregate>` types. This is often used to avoid
3841     having to print large zero initializers (e.g. for large arrays) and
3842     is always exactly equivalent to using explicit zero initializers.
3843 **Metadata node**
3844     A metadata node is a constant tuple without types. For example:
3845     "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
3846     for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
3847     Unlike other typed constants that are meant to be interpreted as part of
3848     the instruction stream, metadata is a place to attach additional
3849     information such as debug info.
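
The following sketch puts the structure, array, string, vector, and
``zeroinitializer`` forms above to use as global initializers. Apart from
``@G``, the global names are hypothetical, and the typed-pointer syntax follows
the other examples in this document.

.. code-block:: llvm

    @G   = external global i32

    ; Structure constant: element count and types match the struct type.
    @S   = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

    ; Array constant, and the equivalent string shorthand for an i8 array
    ; (11 characters plus the two escaped bytes gives 13 elements).
    @A   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant.
    @V   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Zero initialization of a large aggregate.
    @Z   = global [1024 x i32] zeroinitializer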
3850
3851 Global Variable and Function Addresses
3852 --------------------------------------
3853
3854 The addresses of :ref:`global variables <globalvars>` and
3855 :ref:`functions <functionstructure>` are always implicitly valid
3856 (link-time) constants. These constants are explicitly referenced when
3857 the :ref:`identifier for the global <identifiers>` is used and always have
3858 :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
3859 file:
3860
3861 .. code-block:: llvm
3862
3863     @X = global i32 17
3864     @Y = global i32 42
3865     @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
3866
3867 .. _undefvalues:
3868
3869 Undefined Values
3870 ----------------
3871
3872 The string '``undef``' can be used anywhere a constant is expected, and
3873 indicates that the user of the value may receive an unspecified
3874 bit-pattern. Undefined values may be of any type (other than '``label``'
3875 or '``void``') and be used anywhere a constant is permitted.
3876
3877 Undefined values are useful because they indicate to the compiler that
3878 the program is well defined no matter what value is used. This gives the
3879 compiler more freedom to optimize. Here are some examples of
3880 (potentially surprising) transformations that are valid (in pseudo IR):
3881
3882 .. code-block:: llvm
3883
3884       %A = add %X, undef
3885       %B = sub %X, undef
3886       %C = xor %X, undef
3887     Safe:
3888       %A = undef
3889       %B = undef
3890       %C = undef
3891
3892 This is safe because all of the output bits are affected by the undef
3893 bits. Any output bit can have a zero or one depending on the input bits.
3894
3895 .. code-block:: llvm
3896
3897       %A = or %X, undef
3898       %B = and %X, undef
3899     Safe:
3900       %A = -1
3901       %B = 0
3902     Safe:
3903       %A = %X  ;; By choosing undef as 0
3904       %B = %X  ;; By choosing undef as -1
3905     Unsafe:
3906       %A = undef
3907       %B = undef
3908
3909 These logical operations have bits that are not always affected by the
3910 input. For example, if ``%X`` has a zero bit, then the output of the
3911 '``and``' operation will always be a zero for that bit, no matter what
3912 the corresponding bit from the '``undef``' is. As such, it is unsafe to
3913 optimize or assume that the result of the '``and``' is '``undef``'.
3914 However, it is safe to assume that all bits of the '``undef``' could be
3915 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
3916 all the bits of the '``undef``' operand to the '``or``' could be set,
3917 allowing the '``or``' to be folded to -1.
3918
3919 .. code-block:: llvm
3920
3921       %A = select undef, %X, %Y
3922       %B = select undef, 42, %Y
3923       %C = select %X, %Y, undef
3924     Safe:
3925       %A = %X     (or %Y)
3926       %B = 42     (or %Y)
3927       %C = %Y
3928     Unsafe:
3929       %A = undef
3930       %B = undef
3931       %C = undef
3932
3933 This set of examples shows that undefined '``select``' (and conditional
3934 branch) conditions can go *either way*, but they have to come from one
3935 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
3936 both known to have a clear low bit, then ``%A`` would have to have a
3937 cleared low bit. However, in the ``%C`` example, the optimizer is
3938 allowed to assume that the '``undef``' operand could be the same as
3939 ``%Y``, allowing the whole '``select``' to be eliminated.
3940
3941 .. code-block:: llvm
3942
3943       %A = xor undef, undef
3944
3945       %B = undef
3946       %C = xor %B, %B
3947
3948       %D = undef
3949       %E = icmp slt %D, 4
      %F = icmp sge %D, 4
3951
3952     Safe:
3953       %A = undef
3954       %B = undef
3955       %C = undef
3956       %D = undef
3957       %E = undef
3958       %F = undef
3959
This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero even if ``X`` is undefined, although it matches
C semantics. This isn't true for a number of reasons, but the
3964 short answer is that an '``undef``' "variable" can arbitrarily change
3965 its value over its "live range". This is true because the variable
3966 doesn't actually *have a live range*. Instead, the value is logically
3967 read from arbitrary registers that happen to be around when needed, so
3968 the value is not necessarily consistent over time. In fact, ``%A`` and
3969 ``%C`` need to have the same semantics or the core LLVM "replace all
3970 uses with" concept would not hold.
3971
3972 To ensure all uses of a given register observe the same value (even if
3973 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
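
As a small sketch of that point (the function name is hypothetical), freezing
the value once gives every later use the same bit-pattern, so the '``xor``' is
guaranteed to produce zero:

.. code-block:: llvm

    define i32 @xor_of_frozen_undef() {
      %f = freeze i32 undef     ; some arbitrary, but now fixed, value
      %r = xor i32 %f, %f       ; both uses observe the same value, so %r is 0
      ret i32 %r
    }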
3974
3975 .. code-block:: llvm
3976
3977       %A = sdiv undef, %X
3978       %B = sdiv %X, undef
3979     Safe:
3980       %A = 0
      unreachable  ;; In place of %B and all code after it
3982
3983 These examples show the crucial difference between an *undefined value*
3984 and *undefined behavior*. An undefined value (like '``undef``') is
3985 allowed to have an arbitrary bit-pattern. This means that the ``%A``
3986 operation can be constant folded to '``0``', because the '``undef``'
3987 could be zero, and zero divided by any value is zero.
3988 However, in the second example, we can make a more aggressive
3989 assumption: because the ``undef`` is allowed to be an arbitrary value,
3990 we are allowed to assume that it could be zero. Since a divide by zero
3991 has *undefined behavior*, we are allowed to assume that the operation
3992 does not execute at all. This allows us to delete the divide and all
3993 code after it. Because the undefined operation "can't happen", the
3994 optimizer can assume that it occurs in dead code.
3995
3996 .. code-block:: text
3997
3998     a:  store undef -> %X
3999     b:  store %X -> undef
4000     Safe:
4001     a: <deleted>
4002     b: unreachable
4003
4004 A store *of* an undefined value can be assumed to not have any effect;
4005 we can assume that the value is overwritten with bits that happen to
4006 match what was already there. However, a store *to* an undefined
4007 location could clobber arbitrary memory, therefore, it has undefined
4008 behavior.
4009
Branching on an undefined value is undefined behavior.
This justifies optimizations that rely on branch conditions to construct
predicates, such as Correlated Value Propagation and Global Value Numbering.
For a '``switch``' instruction, the condition must likewise be frozen if it
may be undefined; otherwise, the behavior is undefined.
4015
4016 .. code-block:: llvm
4017
4018     Unsafe:
4019       br undef, BB1, BB2 ; UB
4020
4021       %X = and i32 undef, 255
4022       switch %X, label %ret [ .. ] ; UB
4023
4024       store undef, i8* %ptr
4025       %X = load i8* %ptr ; %X is undef
4026       switch i8 %X, label %ret [ .. ] ; UB
4027
4028     Safe:
4029       %X = or i8 undef, 255 ; always 255
4030       switch i8 %X, label %ret [ .. ] ; Well-defined
4031
4032       %X = freeze i1 undef
4033       br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4034
4035
This is also consistent with the behavior of MemorySanitizer.
MemorySanitizer, a detector of uses of uninitialized memory,
defines a branch with a condition that depends on an undef value (or
certain other values, such as the result of a load from heap-allocated
memory that has never been stored to) to have an externally visible
side effect. For this reason, functions with the *sanitize_memory*
attribute are not allowed to produce such branches "out of thin
air". More strictly, an optimization that inserts a conditional branch
is only valid if, in all executions where the branch condition has at
least one undefined bit, the same branch condition is evaluated in the
input IR as well.
4047
4048 .. _poisonvalues:
4049
4050 Poison Values
4051 -------------
4052
4053 A poison value is a result of an erroneous operation.
4054 In order to facilitate speculative execution, many instructions do not
4055 invoke immediate undefined behavior when provided with illegal operands,
4056 and return a poison value instead.
4057 The string '``poison``' can be used anywhere a constant is expected, and
4058 operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4059 a poison value.
4060
4061 Poison value behavior is defined in terms of value *dependence*:
4062
4063 -  Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and
4064    :ref:`freeze <i_freeze>` instructions depend on their operands.
4065 -  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
4066    their dynamic predecessor basic block.
4067 -  :ref:`Select <i_select>` instructions depend on their condition operand and
4068    their selected operand.
4069 -  Function arguments depend on the corresponding actual argument values
4070    in the dynamic callers of their functions.
4071 -  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
4072    instructions that dynamically transfer control back to them.
4073 -  :ref:`Invoke <i_invoke>` instructions depend on the
4074    :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
4075    call instructions that dynamically transfer control back to them.
4076 -  Non-volatile loads and stores depend on the most recent stores to all
4077    of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
4080 -  An instruction with externally visible side effects depends on the
4081    most recent preceding instruction with externally visible side
4082    effects, following the order in the IR. (This includes :ref:`volatile
4083    operations <volatile>`.)
4084 -  An instruction *control-depends* on a :ref:`terminator
4085    instruction <terminators>` if the terminator instruction has
4086    multiple successors and the instruction is always executed when
4087    control transfers to one of the successors, and may not be executed
4088    when control is transferred to another.
4089 -  Additionally, an instruction also *control-depends* on a terminator
4090    instruction if the set of instructions it otherwise depends on would
4091    be different if the terminator had transferred control to a different
4092    successor.
4093 -  Dependence is transitive.
4094 -  Vector elements may be independently poisoned. Therefore, transforms
4095    on instructions such as shufflevector must be careful to propagate
4096    poison across values or elements only as allowed by the original code.
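
As a sketch of the '``select``' rule above (the function name is hypothetical),
the result below depends only on the condition and on whichever operand is
actually selected. When ``%c`` is false, ``%r`` is ``0`` even if ``%inc`` is
poison.

.. code-block:: llvm

    define i32 @select_dependence(i1 %c, i32 %x) {
      %inc = add nsw i32 %x, 1            ; poison if %x is the maximum signed value
      %r = select i1 %c, i32 %inc, i32 0  ; depends on %c and the selected operand only
      ret i32 %r
    }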
4097
An instruction that *depends* on a poison value produces a poison value
4099 itself. A poison value may be relaxed into an
4100 :ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
4101 Propagation of poison can be stopped with the
4102 :ref:`freeze instruction <i_freeze>`.
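
A minimal sketch of stopping propagation with '``freeze``' follows; the
function and label names are hypothetical. Without the '``freeze``', the
branch would have undefined behavior whenever the '``add``' overflows.

.. code-block:: llvm

    define i32 @stop_poison(i32 %x) {
    entry:
      %sum = add nsw i32 %x, 1     ; poison on signed overflow
      %fr = freeze i32 %sum        ; arbitrary but fixed value, never poison
      %neg = icmp slt i32 %fr, 0   ; well defined, since %fr is not poison
      br i1 %neg, label %negative, label %nonnegative

    negative:
      ret i32 -1

    nonnegative:
      ret i32 %fr
    }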
4103
This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand for which some values trigger undefined
behavior. Notably this includes (but is not limited to):
4107
4108 -  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4109    any other pointer dereferencing instruction (independent of address
4110    space).
4111 -  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4112    instruction.
4113 -  The condition operand of a :ref:`br <i_br>` instruction.
4114 -  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4115    instruction.
4116 -  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4117    instruction, when the function or invoking call site has a ``noundef``
4118    attribute in the corresponding position.
4119 -  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.
4121
4122 Here are some examples:
4123
4124 .. code-block:: llvm
4125
4126     entry:
4127       %poison = sub nuw i32 0, 1           ; Results in a poison value.
4128       %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4129       %still_poison = and i32 %poison, 0   ; 0, but also poison.
4130       %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
4131       store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
4132                                            ; store to poison.
4133
4134       store i32 %poison, i32* @g           ; Poison value stored to memory.
4135       %poison3 = load i32, i32* @g         ; Poison value loaded back from memory.
4136
4137       %narrowaddr = bitcast i32* @g to i16*
4138       %wideaddr = bitcast i32* @g to i64*
4139       %poison4 = load i16, i16* %narrowaddr ; Returns a poison value.
4140       %poison5 = load i64, i64* %wideaddr   ; Returns a poison value.
4141
4142       %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4143       br i1 %cmp, label %end, label %end   ; undefined behavior
4144
4145     end:
4146
4147 .. _welldefinedvalues:
4148
4149 Well-Defined Values
4150 -------------------
4151
4152 Given a program execution, a value is *well defined* if the value does not
4153 have an undef bit and is not poison in the execution.
4154 An aggregate value or vector is well defined if its elements are well defined.
4155 The padding of an aggregate isn't considered, since it isn't visible
4156 without storing it into memory and loading it with a different type.
4157
4158 A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
4161 of its operand.
4162
4163 .. _blockaddress:
4164
4165 Addresses of Basic Blocks
4166 -------------------------
4167
4168 ``blockaddress(@function, %block)``
4169
4170 The '``blockaddress``' constant computes the address of the specified
4171 basic block in the specified function.
4172
4173 It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space
4174 of the function containing ``%block`` (usually ``addrspace(0)``).
4175
4176 Taking the address of the entry block is illegal.
4177
4178 This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior, though, again, comparison against null is ok,
and no label is equal to the null pointer. This may be passed around as an
opaque pointer-sized value as long as the bits are not inspected. This
4184 allows ``ptrtoint`` and arithmetic to be performed on these values so
4185 long as the original value is reconstituted before the ``indirectbr`` or
4186 ``callbr`` instruction.
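
As a sketch of the intended use (the function and label names are
hypothetical), the block addresses below are only selected between and then
consumed by an '``indirectbr``'. Neither address refers to the entry block.

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      %target = select i1 %c,
                       i8* blockaddress(@dispatch, %one),
                       i8* blockaddress(@dispatch, %two)
      indirectbr i8* %target, [ label %one, label %two ]

    one:
      ret i32 1

    two:
      ret i32 2
    }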
4187
4188 Finally, some targets may provide defined semantics when using the value
4189 as the operand to an inline assembly, but that is target specific.
4190
4191 .. _dso_local_equivalent:
4192
4193 DSO Local Equivalent
4194 --------------------
4195
4196 ``dso_local_equivalent @func``
4197
4198 A '``dso_local_equivalent``' constant represents a function which is
4199 functionally equivalent to a given function, but is always defined in the
4200 current linkage unit. The resulting pointer has the same type as the underlying
4201 function. The resulting pointer is permitted, but not required, to be different
4202 from a pointer to the function, and it may have different values in different
4203 translation units.
4204
4205 The target function may not have ``extern_weak`` linkage.
4206
4207 ``dso_local_equivalent`` can be implemented as such:
4208
4209 - If the function has local linkage, hidden visibility, or is
4210   ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4211   to the function.
4212 - ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4213   function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined within the
4215   linkage unit; LLVM will use this when available. (This is commonly called a
4216   "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4217
4218 This can be used wherever a ``dso_local`` instance of a function is needed without
4219 needing to explicitly make the original function ``dso_local``. An instance where
4220 this can be used is for static offset calculations between a function and some other
4221 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4222 where dynamic relocations for function pointers in VTables can be replaced with
4223 static relocations for offsets between the VTable and virtual functions which
4224 may not be ``dso_local``.
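
The offset calculation described above might be sketched as follows, assuming
an ELF target. The names ``@virtual_func`` and ``@vtable`` are hypothetical,
and a real relative VTable would contain several such entries.

.. code-block:: llvm

    declare void @virtual_func()

    ; A 32-bit offset from the table to a dso_local stand-in for the function.
    @vtable = dso_local constant i32 trunc (
        i64 sub (
          i64 ptrtoint (void ()* dso_local_equivalent @virtual_func to i64),
          i64 ptrtoint (i32* @vtable to i64)
        ) to i32)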
4225
4226 This is currently only supported for ELF binary formats.
4227
4228 .. _constantexprs:
4229
4230 Constant Expressions
4231 --------------------
4232
4233 Constant expressions are used to allow expressions involving other
4234 constants to be used as constants. Constant expressions may be of any
4235 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4236 that does not have side effects (e.g. load and call are not supported).
4237 The following is the syntax for constant expressions:
4238
4239 ``trunc (CST to TYPE)``
4240     Perform the :ref:`trunc operation <i_trunc>` on constants.
4241 ``zext (CST to TYPE)``
4242     Perform the :ref:`zext operation <i_zext>` on constants.
4243 ``sext (CST to TYPE)``
4244     Perform the :ref:`sext operation <i_sext>` on constants.
4245 ``fptrunc (CST to TYPE)``
4246     Truncate a floating-point constant to another floating-point type.
4247     The size of CST must be larger than the size of TYPE. Both types
4248     must be floating-point.
4249 ``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
4252     floating-point.
4253 ``fptoui (CST to TYPE)``
4254     Convert a floating-point constant to the corresponding unsigned
4255     integer constant. TYPE must be a scalar or vector integer type. CST
4256     must be of scalar or vector floating-point type. Both CST and TYPE
4257     must be scalars, or vectors of the same number of elements. If the
4258     value won't fit in the integer type, the result is a
4259     :ref:`poison value <poisonvalues>`.
4260 ``fptosi (CST to TYPE)``
4261     Convert a floating-point constant to the corresponding signed
4262     integer constant. TYPE must be a scalar or vector integer type. CST
4263     must be of scalar or vector floating-point type. Both CST and TYPE
4264     must be scalars, or vectors of the same number of elements. If the
4265     value won't fit in the integer type, the result is a
4266     :ref:`poison value <poisonvalues>`.
4267 ``uitofp (CST to TYPE)``
4268     Convert an unsigned integer constant to the corresponding
4269     floating-point constant. TYPE must be a scalar or vector floating-point
4270     type.  CST must be of scalar or vector integer type. Both CST and TYPE must
4271     be scalars, or vectors of the same number of elements.
4272 ``sitofp (CST to TYPE)``
4273     Convert a signed integer constant to the corresponding floating-point
4274     constant. TYPE must be a scalar or vector floating-point type.
4275     CST must be of scalar or vector integer type. Both CST and TYPE must
4276     be scalars, or vectors of the same number of elements.
4277 ``ptrtoint (CST to TYPE)``
4278     Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4279 ``inttoptr (CST to TYPE)``
4280     Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4281     This one is *really* dangerous!
4282 ``bitcast (CST to TYPE)``
4283     Convert a constant, CST, to another TYPE.
4284     The constraints of the operands are the same as those for the
4285     :ref:`bitcast instruction <i_bitcast>`.
4286 ``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
4288     TYPE in a different address space. The constraints of the operands are the
4289     same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4290 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4291     Perform the :ref:`getelementptr operation <i_getelementptr>` on
4292     constants. As with the :ref:`getelementptr <i_getelementptr>`
4293     instruction, the index list may have one or more indexes, which are
4294     required to make sense for the type of "pointer to TY".
4295 ``select (COND, VAL1, VAL2)``
4296     Perform the :ref:`select operation <i_select>` on constants.
4297 ``icmp COND (VAL1, VAL2)``
4298     Perform the :ref:`icmp operation <i_icmp>` on constants.
4299 ``fcmp COND (VAL1, VAL2)``
4300     Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4301 ``extractelement (VAL, IDX)``
4302     Perform the :ref:`extractelement operation <i_extractelement>` on
4303     constants.
4304 ``insertelement (VAL, ELT, IDX)``
4305     Perform the :ref:`insertelement operation <i_insertelement>` on
4306     constants.
4307 ``shufflevector (VEC1, VEC2, IDXMASK)``
4308     Perform the :ref:`shufflevector operation <i_shufflevector>` on
4309     constants.
4310 ``extractvalue (VAL, IDX0, IDX1, ...)``
4311     Perform the :ref:`extractvalue operation <i_extractvalue>` on
4312     constants. The index list is interpreted in a similar manner as
4313     indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
4314     least one index value must be specified.
4315 ``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
4316     Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
4317     The index list is interpreted in a similar manner as indices in a
4318     ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
4319     value must be specified.
4320 ``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
4322     may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
4323     binary <bitwiseops>` operations. The constraints on operands are
4324     the same as those for the corresponding instruction (e.g. no bitwise
4325     operations on floating-point values are allowed).
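
The following sketch shows a few of these forms used as global initializers.
The global names are hypothetical, and the typed-pointer syntax follows the
rest of this document.

.. code-block:: llvm

    @arr   = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; The same storage viewed through another pointer type, and as an integer.
    @bytes = global i8* bitcast ([4 x i32]* @arr to i8*)
    @addr  = global i64 ptrtoint ([4 x i32]* @arr to i64)

    ; A binary operation on two constants.
    @sum   = global i32 add (i32 40, i32 2)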
4326
4327 Other Values
4328 ============
4329
4330 .. _inlineasmexprs:
4331
4332 Inline Assembler Expressions
4333 ----------------------------
4334
4335 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4336 Inline Assembly <moduleasm>`) through the use of a special value. This value
4337 represents the inline assembler as a template string (containing the
4338 instructions to emit), a list of operand constraints (stored as a string), a
4339 flag that indicates whether or not the inline asm expression has side effects,
4340 and a flag indicating whether the function containing the asm needs to align its
4341 stack conservatively.
4342
4343 The template string supports argument substitution of the operands using "``$``"
4344 followed by a number, to indicate substitution of the given register/memory
4345 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4346 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (see :ref:`inline-asm-modifiers`).
4348
4349 A literal "``$``" may be included by using "``$$``" in the template. To include
4350 other special characters into the output, the usual "``\XX``" escapes may be
4351 used, just as in other strings. Note that after template substitution, the
4352 resulting assembly string is parsed by LLVM's integrated assembler unless it is
4353 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4354 syntax known to LLVM.
4355
4356 LLVM also supports a few more substitutions useful for writing inline assembly:
4357
4358 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4359   This substitution is useful when declaring a local label. Many standard
4360   compiler optimizations, such as inlining, may duplicate an inline asm blob.
4361   Adding a blob-unique identifier ensures that the two labels will not conflict
4362   during assembly. This is used to implement `GCC's %= special format
4363   string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4364 - ``${:comment}``: Expands to the comment character of the current target's
4365   assembly dialect. This is usually ``#``, but many targets use other strings,
4366   such as ``;``, ``//``, or ``!``.
4367 - ``${:private}``: Expands to the assembler private label prefix. Labels with
4368   this prefix will not appear in the symbol table of the assembled object.
4369   Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4370   relatively popular.
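
As a sketch of how these substitutions combine, the following declares a local
label that stays unique even if the blob is duplicated. It assumes an x86
target with AT&T syntax; the function name and the label stem ``skip`` are
hypothetical.

.. code-block:: llvm

    define void @local_label() {
      ; "\0A" is a newline; the label is built from the private prefix,
      ; the stem "skip", and the blob-unique integer.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A${:private}skip${:uid}:", ""()
      ret void
    }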
4371
4372 LLVM's support for inline asm is modeled closely on the requirements of Clang's
4373 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4374 modifier codes listed here are similar or identical to those in GCC's inline asm
4375 support. However, to be clear, the syntax of the template and constraint strings
4376 described here is *not* the same as the syntax accepted by GCC and Clang, and,
4377 while most constraint letters are passed through as-is by Clang, some get
4378 translated to other codes when converting from the C source to the LLVM
4379 assembly.
4380
4381 An example inline assembler expression is:
4382
4383 .. code-block:: llvm
4384
4385     i32 (i32) asm "bswap $0", "=r,r"
4386
4387 Inline assembler expressions may **only** be used as the callee operand
4388 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4389 Thus, typically we have:
4390
4391 .. code-block:: llvm
4392
4393     %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4394
4395 Inline asms with side effects not visible in the constraint list must be
4396 marked as having side effects. This is done through the use of the
4397 '``sideeffect``' keyword, like so:
4398
4399 .. code-block:: llvm
4400
4401     call void asm sideeffect "eieio", ""()
4402
4403 In some cases inline asms will contain code that will not work unless
4404 the stack is aligned in some way, such as calls or SSE instructions on
4405 x86, yet will not contain code that does that alignment within the asm.
4406 The compiler should make conservative assumptions about what the asm
4407 might contain and should generate its usual stack alignment code in the
4408 prologue if the '``alignstack``' keyword is present:
4409
4410 .. code-block:: llvm
4411
4412     call void asm alignstack "eieio", ""()
4413
4414 Inline asms also support using non-standard assembly dialects. The
4415 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4416 the inline asm is using the Intel dialect. Currently, ATT and Intel are
4417 the only supported dialects. An example is:
4418
4419 .. code-block:: llvm
4420
4421     call void asm inteldialect "eieio", ""()
4422
4423 In the case that the inline asm might unwind the stack,
4424 the '``unwind``' keyword must be used, so that the compiler emits
4425 unwinding information:
4426
4427 .. code-block:: llvm
4428
4429     call void asm unwind "call func", ""()
4430
4431 If the inline asm unwinds the stack and isn't marked with
4432 the '``unwind``' keyword, the behavior is undefined.
4433
4434 If multiple keywords appear, the '``sideeffect``' keyword must come
4435 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4436 third and the '``unwind``' keyword last.
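
For illustration only, a call carrying all four keywords in the required order
might look like the sketch below. The function name is hypothetical, and the
``nop`` template assumes an x86 target.

.. code-block:: llvm

    define void @all_keywords() {
      call void asm sideeffect alignstack inteldialect unwind "nop", ""()
      ret void
    }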
4437
4438 Inline Asm Constraint String
4439 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4440
4441 The constraint list is a comma-separated string, each element containing one or
4442 more constraint codes.
4443
4444 For each element in the constraint list an appropriate register or memory
4445 operand will be chosen, and it will be made available to assembly template
4446 string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4447 second, etc.
4448
4449 There are three different types of constraints, which are distinguished by a
4450 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4451 constraints must always be given in that order: outputs first, then inputs, then
4452 clobbers. They cannot be intermingled.
4453
4454 There are also three different categories of constraint codes:
4455
4456 - Register constraint. This is either a register class, or a fixed physical
4457   register. This kind of constraint will allocate a register, and if necessary,
4458   bitcast the argument or result to the appropriate type.
4459 - Memory constraint. This kind of constraint is for use with an instruction
4460   taking a memory operand. Different constraints allow for different addressing
4461   modes used by the target.
4462 - Immediate value constraint. This kind of constraint is for an integer or other
4463   immediate value which can be rendered directly into an instruction. The
4464   various target-specific constraints allow the selection of a value in the
4465   proper range for the instruction you wish to use it with.
4466
4467 Output constraints
4468 """"""""""""""""""
4469
4470 Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4471 indicates that the assembly will write to this operand, and the operand will
4472 then be made available as a return value of the ``asm`` expression. Output
4473 constraints do not consume an argument from the call instruction. (Except, see
4474 below about indirect outputs).
4475
4476 Normally, it is expected that no output locations are written to by the assembly
4477 expression until *all* of the inputs have been read. As such, LLVM may assign
4478 the same register to an output and an input. If this is not safe (e.g. if the
4479 assembly contains two instructions, where the first writes to one output, and
4480 the second reads an input and writes to a second output), then the "``&``"
4481 modifier must be used (e.g. "``=&r``") to specify that the output is an
4482 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4483 will not use the same register for any inputs (other than an input tied to this
4484 output).
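
As a minimal sketch of the rules above (assuming an x86 target and AT&T
syntax; the function names are hypothetical), the first function below uses a
plain register output, and the second uses an early-clobber output because its
first instruction writes output 0 before the input (operand 2) has been read:

.. code-block:: llvm

    ; A single register output: the asm writes operand 0, which becomes the
    ; return value of the asm expression. "$$" prints a literal "$".
    define i32 @plain_output() {
      %r = call i32 asm "movl $$42, $0", "=r"()
      ret i32 %r
    }

    ; Two outputs and one input. Output 0 is written before input 2 is read,
    ; so it is marked "=&r" to keep it out of the input's register.
    define { i32, i32 } @early_clobber(i32 %x) {
      %pair = call { i32, i32 } asm "movl $$0, $0\0A\09movl $2, $1", "=&r,=r,r"(i32 %x)
      ret { i32, i32 } %pair
    }

Without the "``&``" modifier, LLVM would be free to assign operands 0 and 2 to
the same register, and the first instruction would then destroy the input.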
4485
4486 Input constraints
4487 """""""""""""""""
4488
4489 Input constraints do not have a prefix -- just the constraint codes. Each input
4490 constraint will consume one argument from the call instruction. It is not
4491 permitted for the asm to write to any input register or memory location (unless
4492 that input is tied to an output). Note also that multiple inputs may all be
4493 assigned to the same register, if LLVM can determine that they necessarily all
4494 contain the same value.
4495
4496 Instead of providing a Constraint Code, input constraints may also "tie"
4497 themselves to an output constraint, by providing an integer as the constraint
4498 string. Tied inputs still consume an argument from the call instruction, and
4499 take up a position in the asm template numbering as is usual -- they will simply
4500 be constrained to always use the same register as the output they've been tied
4501 to. For example, a constraint string of "``=r,0``" says to assign a register for
4502 output, and use that register as an input as well (it being the 0'th
4503 constraint).
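
A tied input might look like the following sketch (assuming an x86 target and
AT&T syntax; the function name is hypothetical). Operand 1 is tied to output 0,
so ``%a`` arrives in the same register that the result leaves in, while ``%b``
is operand 2:

.. code-block:: llvm

    define i32 @add_tied(i32 %a, i32 %b) {
      ; "=r" is operand 0, "0" (tied to operand 0) is operand 1, "r" is
      ; operand 2. The flags clobber reflects that addl modifies EFLAGS.
      %sum = call i32 asm "addl $2, $0", "=r,0,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %sum
    }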
4504
4505 It is permitted to tie an input to an "early-clobber" output. In that case, no
4506 *other* input may share the same register as the input tied to the early-clobber
4507 (even when the other input has the same value).
4508
4509 You may only tie an input to an output which has a register constraint, not a
4510 memory constraint. Only a single input may be tied to an output.
4511
4512 There is also an "interesting" feature which deserves a bit of explanation: if a
4513 register class constraint allocates a register which is too small for the value
4514 type operand provided as input, the input value will be split into multiple
4515 registers, and all of them passed to the inline asm.
4516
4517 However, this feature is often not as useful as you might think.
4518
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)
4525
4526 A few of the targets provide a template string modifier allowing explicit access
4527 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4528 ``D``). On such an architecture, you can actually access the second allocated
4529 register (yet, still, not any subsequent ones). But, in that case, you're still
4530 probably better off simply splitting the value into two separate operands, for
4531 clarity. (e.g. see the description of the ``A`` constraint on X86, which,
4532 despite existing only for use with this feature, is not really a good idea to
4533 use)
4534
4535 Indirect inputs and outputs
4536 """""""""""""""""""""""""""
4537
4538 Indirect output or input constraints can be specified by the "``*``" modifier
4539 (which goes after the "``=``" in case of an output). This indicates that the asm
4540 will write to or read from the contents of an *address* provided as an input
4541 argument. (Note that in this way, indirect outputs act more like an *input* than
4542 an output: just like an input, they consume an argument of the call expression,
4543 rather than producing a return value. An indirect output constraint is an
4544 "output" only in that the asm is expected to write to the contents of the input
4545 memory location, instead of just read from it).
4546
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.
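
As an illustrative sketch (assuming an x86 target, AT&T syntax, and the
typed-pointer spelling ``i32*``; the function name is hypothetical), an
indirect memory output consumes a pointer argument and produces no return
value:

.. code-block:: llvm

    define void @store_one(i32* %p) {
      ; "=*m" is an indirect output: the asm writes through the address in %p,
      ; so the call returns void and %p is passed as an argument.
      call void asm "movl $$1, $0", "=*m"(i32* %p)
      ret void
    }

LLVM releases that use opaque pointers spell the argument as
``ptr elementtype(i32) %p`` instead.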
4549
4550 It is also possible to use an indirect *register* constraint, but only on output
4551 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4552 value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)
4557
4558
4559 Clobber constraints
4560 """""""""""""""""""
4561
4562 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4563 consume an input operand, nor generate an output. Clobbers cannot use any of the
4564 general constraint code letters -- they may use only explicit register
4565 constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4566 "``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4567 memory locations -- not only the memory pointed to by a declared indirect
4568 output.
4569
4570 Note that clobbering named registers that are also present in output
4571 constraints is not legal.
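
The two kinds of clobber can be sketched as follows (assuming an x86 target
and AT&T syntax; the function names are hypothetical):

.. code-block:: llvm

    ; An empty asm with a memory clobber: emits no instructions, but tells
    ; LLVM that arbitrary memory may have been written.
    define void @compiler_barrier() {
      call void asm sideeffect "", "~{memory}"()
      ret void
    }

    ; Explicit register clobbers: the asm overwrites EAX and the flags
    ; without declaring either as an output.
    define void @zero_eax() {
      call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
      ret void
    }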
4572
4573
4574 Constraint Codes
4575 """"""""""""""""
After an optional prefix comes the constraint code, or codes.
4577
4578 A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4579 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4580 (e.g. "``{eax}``").
4581
4582 The one and two letter constraint codes are typically chosen to be the same as
4583 GCC's constraint codes.
4584
A single constraint may include one or more constraint codes in it, leaving
4586 it up to LLVM to choose which one to use. This is included mainly for
4587 compatibility with the translation of GCC inline asm coming from clang.
4588
4589 There are two ways to specify alternatives, and either or both may be used in an
4590 inline asm constraint list:
4591
4592 1) Append the codes to each other, making a constraint code set. E.g. "``im``"
4593    or "``{eax}m``". This means "choose any of the options in the set". The
4594    choice of constraint is made independently for each constraint in the
4595    constraint list.
4596
4597 2) Use "``|``" between constraint code sets, creating alternatives. Every
4598    constraint in the constraint list must have the same number of alternative
4599    sets. With this syntax, the same alternative in *all* of the items in the
4600    constraint list will be chosen together.
4601
4602 Putting those together, you might have a two operand constraint string like
4603 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
4604 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
4605 may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
4606
4607 However, the use of either of the alternatives features is *NOT* recommended, as
4608 LLVM is not able to make an intelligent choice about which one to use. (At the
4609 point it currently needs to choose, not enough information is available to do so
4610 in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
4612 always choose to use memory, not registers). And, if given multiple registers,
4613 or multiple register classes, it will simply choose the first one. (In fact, it
4614 doesn't currently even ensure explicitly specified physical registers are
4615 unique, so specifying multiple physical registers as alternatives, like
4616 ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
4617 intended.)
4618
4619 Supported Constraint Code List
4620 """"""""""""""""""""""""""""""
4621
4622 The constraint codes are, in general, expected to behave the same way they do in
4623 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4624 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4625 and GCC likely indicates a bug in LLVM.
4626
4627 Some constraint codes are typically supported by all targets:
4628
4629 - ``r``: A register in the target's general purpose register class.
4630 - ``m``: A memory address operand. It is target-specific what addressing modes
4631   are supported, typical examples are register, or register + register offset,
4632   or register + immediate offset (of some target-specific size).
4633 - ``i``: An integer constant (of target-specific width). Allows either a simple
4634   immediate, or a relocatable value.
4635 - ``n``: An integer constant -- *not* including relocatable values.
4636 - ``s``: An integer constant, but allowing *only* relocatable values.
4637 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4638   useful to pass a label for an asm branch or call.
4639
4640   .. FIXME: but that surely isn't actually okay to jump out of an asm
4641      block without telling llvm about the control transfer???)
4642
4643 - ``{register-name}``: Requires exactly the named physical register.
4644
4645 Other constraints are target-specific:
4646
4647 AArch64:
4648
4649 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4650 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4651   i.e. 0 to 4095 with optional shift by 12.
4652 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4653   ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4654 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4655   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4656 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4657   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
4658 - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
4659   32-bit register. This is a superset of ``K``: in addition to the bitmask
4660   immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
4662 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4663   64-bit register. This is a superset of ``L``.
4664 - ``Q``: Memory address operand must be in a single register (no
4665   offsets). (However, LLVM currently does this for the ``m`` constraint as
4666   well.)
4667 - ``r``: A 32 or 64-bit integer register (W* or X*).
4668 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4669 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4670 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
4671 - ``Upl``: One of the low eight SVE predicate registers (P0 to P7)
4672 - ``Upa``: Any of the SVE predicate registers (P0 to P15)
4673
4674 AMDGPU:
4675
4676 - ``r``: A 32 or 64-bit integer register.
4677 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4678 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4679 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4680 - ``I``: An integer inline constant in the range from -16 to 64.
4681 - ``J``: A 16-bit signed integer constant.
4682 - ``A``: An integer or a floating-point inline constant.
4683 - ``B``: A 32-bit signed integer constant.
4684 - ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
4685 - ``DA``: A 64-bit constant that can be split into two "A" constants.
4686 - ``DB``: A 64-bit constant that can be split into two "B" constants.
4687
4688 All ARM modes:
4689
4690 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
4691   operand. Treated the same as operand ``m``, at the moment.
4692 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
4693 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
4694
4695 ARM and ARM's Thumb2 mode:
4696
4697 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
4698 - ``I``: An immediate integer valid for a data-processing instruction.
4699 - ``J``: An immediate integer between -4095 and 4095.
4700 - ``K``: An immediate integer whose bitwise inverse is valid for a
4701   data-processing instruction. (Can be used with template modifier "``B``" to
4702   print the inverted value).
4703 - ``L``: An immediate integer whose negation is valid for a data-processing
4704   instruction. (Can be used with template modifier "``n``" to print the negated
4705   value).
4706 - ``M``: A power of two or an integer between 0 and 32.
4707 - ``N``: Invalid immediate constraint.
4708 - ``O``: Invalid immediate constraint.
4709 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
4710 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
4711   as ``r``.
4712 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
4713   invalid.
4714 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4715   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4716 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4717   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4718 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4719   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4720
4721 ARM's Thumb1 mode:
4722
4723 - ``I``: An immediate integer between 0 and 255.
4724 - ``J``: An immediate integer between -255 and -1.
4725 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
4726   some amount.
4727 - ``L``: An immediate integer between -7 and 7.
4728 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
4729 - ``N``: An immediate integer between 0 and 31.
4730 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
4731 - ``r``: A low 32-bit GPR register (``r0-r7``).
4732 - ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
4734 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4735   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4736 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4737   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4738 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4739   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4740
4741
4742 Hexagon:
4743
4744 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
4745   at the moment.
4746 - ``r``: A 32 or 64-bit register.
4747
4748 MSP430:
4749
4750 - ``r``: An 8 or 16-bit register.
4751
4752 MIPS:
4753
4754 - ``I``: An immediate signed 16-bit integer.
4755 - ``J``: An immediate integer zero.
4756 - ``K``: An immediate unsigned 16-bit integer.
4757 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
4758 - ``N``: An immediate integer between -65535 and -1.
4759 - ``O``: An immediate signed 15-bit integer.
4760 - ``P``: An immediate integer between 1 and 65535.
4761 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
4762   register plus 16-bit immediate offset. In MIPS mode, just a base register.
4763 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
4764   register plus a 9-bit signed offset. In MIPS mode, the same as constraint
4765   ``m``.
4766 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
4767   ``sc`` instruction on the given subtarget (details vary).
4768 - ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
4769 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
4770   (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
4771   argument modifier for compatibility with GCC.
4772 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
4773   ``25``).
4774 - ``l``: The ``lo`` register, 32 or 64-bit.
4775 - ``x``: Invalid.
4776
4777 NVPTX:
4778
4779 - ``b``: A 1-bit integer register.
4780 - ``c`` or ``h``: A 16-bit integer register.
4781 - ``r``: A 32-bit integer register.
4782 - ``l`` or ``N``: A 64-bit integer register.
4783 - ``f``: A 32-bit float register.
4784 - ``d``: A 64-bit float register.
4785
4786
4787 PowerPC:
4788
4789 - ``I``: An immediate signed 16-bit integer.
4790 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
4791 - ``K``: An immediate unsigned 16-bit integer.
4792 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
4793 - ``M``: An immediate integer greater than 31.
4794 - ``N``: An immediate integer that is an exact power of 2.
4795 - ``O``: The immediate integer constant 0.
4796 - ``P``: An immediate integer constant whose negation is a signed 16-bit
4797   constant.
4798 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
4799   treated the same as ``m``.
4800 - ``r``: A 32 or 64-bit integer register.
4801 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
4802   ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
4806
4807 - ``y``: Condition register (``CR0-CR7``).
4808 - ``wc``: An individual CR bit in a CR register.
4809 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
4810   register set (overlapping both the floating-point and vector register files).
4811 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
4812   set.
4813
4814 RISC-V:
4815
4816 - ``A``: An address operand (using a general-purpose register, without an
4817   offset).
4818 - ``I``: A 12-bit signed integer immediate operand.
4819 - ``J``: A zero integer immediate operand.
4820 - ``K``: A 5-bit unsigned integer immediate operand.
4821 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
4822 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform
4823   ``XLEN``).
4824 - ``vr``: A vector register. (requires V extension).
4825 - ``vm``: A vector mask register. (requires V extension).
4826
4827 Sparc:
4828
4829 - ``I``: An immediate 13-bit signed integer.
4830 - ``r``: A 32-bit integer register.
4831 - ``f``: Any floating-point register on SparcV8, or a floating-point
4832   register in the "low" half of the registers on SparcV9.
4833 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
4834
4835 SystemZ:
4836
4837 - ``I``: An immediate unsigned 8-bit integer.
4838 - ``J``: An immediate unsigned 12-bit integer.
4839 - ``K``: An immediate signed 16-bit integer.
4840 - ``L``: An immediate signed 20-bit integer.
4841 - ``M``: An immediate integer 0x7fffffff.
4842 - ``Q``: A memory address operand with a base address and a 12-bit immediate
4843   unsigned displacement.
4844 - ``R``: A memory address operand with a base address, a 12-bit immediate
4845   unsigned displacement, and an index register.
4846 - ``S``: A memory address operand with a base address and a 20-bit immediate
4847   signed displacement.
4848 - ``T``: A memory address operand with a base address, a 20-bit immediate
4849   signed displacement, and an index register.
4850 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
4851 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
4852   address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
4855 - ``f``: A 32, 64, or 128-bit floating-point register.
4856
4857 X86:
4858
4859 - ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
4861 - ``K``: An immediate signed 8-bit integer.
4862 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
4863   0xffffffff.
4864 - ``M``: An immediate integer between 0 and 3.
4865 - ``N``: An immediate unsigned 8-bit integer.
4866 - ``O``: An immediate integer between 0 and 127.
4867 - ``e``: An immediate 32-bit signed integer.
4868 - ``Z``: An immediate 32-bit unsigned integer.
4869 - ``o``, ``v``: Treated the same as ``m``, at the moment.
4870 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4871   ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
4872   registers, and on X86-64, it is all of the integer registers.
4873 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4874   ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
4875 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
4876 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
4877   existed since i386, and can be accessed without the REX prefix.
4878 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
4879 - ``y``: A 64-bit MMX register, if MMX is enabled.
4880 - ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
4881   operand in a SSE register. If AVX is also enabled, can also be a 256-bit
4882   vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
4884 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
4885 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
4886   32-bit mode, a 64-bit integer operand will get split into two registers). It
4887   is not recommended to use this constraint, as in 64-bit mode, the 64-bit
4888   operand will get allocated only to RAX -- if two 32-bit operands are needed,
4889   you're better off splitting it yourself, before passing it to the asm
4890   statement.
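
As a sketch of an immediate value constraint from the X86 list above (AT&T
syntax assumed; the function name is hypothetical), the ``I`` constraint
requires the shift count to be a constant between 0 and 31:

.. code-block:: llvm

    define i32 @shl_by_3(i32 %x) {
      ; Operand 2 uses "I", so its argument must be a constant in [0, 31].
      ; It is rendered directly into the instruction as an immediate.
      %r = call i32 asm "shll $2, $0", "=r,0,I,~{flags}"(i32 %x, i32 3)
      ret i32 %r
    }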
4891
4892 XCore:
4893
4894 - ``r``: A 32-bit integer register.
4895
4896
4897 .. _inline-asm-modifiers:
4898
4899 Asm template argument modifiers
4900 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4901
4902 In the asm template string, modifiers can be used on the operand reference, like
4903 "``${0:n}``".
4904
4905 The modifiers are, in general, expected to behave the same way they do in
4906 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4907 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4908 and GCC likely indicates a bug in LLVM.
4909
4910 Target-independent:
4911
4912 - ``c``: Print an immediate integer constant unadorned, without
4913   the target-specific immediate punctuation (e.g. no ``$`` prefix).
4914 - ``n``: Negate and print immediate integer constant unadorned, without the
4915   target-specific immediate punctuation (e.g. no ``$`` prefix).
4916 - ``l``: Print as an unadorned label, without the target-specific label
4917   punctuation (e.g. no ``$`` prefix).
4918
4919 AArch64:
4920
4921 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
4922   instead of ``x30``, print ``w30``.
4923 - ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
4924 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
4925   ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
4926   ``v*``.
4927
4928 AMDGPU:
4929
4930 - ``r``: No effect.
4931
4932 ARM:
4933
4934 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
4935   register).
4936 - ``P``: No effect.
4937 - ``q``: No effect.
4938 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
4939   as ``d4[1]`` instead of ``s9``)
4940 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
4941   prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
4943 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
4944   register operands subsequent to the specified one (!), so use carefully.
4945 - ``Q``: Print the low-order register of a register-pair, or the low-order
4946   register of a two-register operand.
4947 - ``R``: Print the high-order register of a register-pair, or the high-order
4948   register of a two-register operand.
4949 - ``H``: Print the second register of a register-pair. (On a big-endian system,
4950   ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
4951   to ``R``.)
4952
4953   .. FIXME: H doesn't currently support printing the second register
4954      of a two-register operand.
4955
4956 - ``e``: Print the low doubleword register of a NEON quad register.
4957 - ``f``: Print the high doubleword register of a NEON quad register.
4958 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
4959   adornment.
4960
4961 Hexagon:
4962
4963 - ``L``: Print the second register of a two-register operand. Requires that it
4964   has been allocated consecutively to the first.
4965
4966   .. FIXME: why is it restricted to consecutive ones? And there's
4967      nothing that ensures that happens, is there?
4968
4969 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4970   nothing. Used to print 'addi' vs 'add' instructions.
4971
4972 MSP430:
4973
4974 No additional modifiers.
4975
4976 MIPS:
4977
4978 - ``X``: Print an immediate integer as hexadecimal
4979 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
4980 - ``d``: Print an immediate integer as decimal.
4981 - ``m``: Subtract one and print an immediate integer as decimal.
4982 - ``z``: Print $0 if an immediate zero, otherwise print normally.
4983 - ``L``: Print the low-order register of a two-register operand, or prints the
4984   address of the low-order word of a double-word memory operand.
4985
4986   .. FIXME: L seems to be missing memory operand support.
4987
4988 - ``M``: Print the high-order register of a two-register operand, or prints the
4989   address of the high-order word of a double-word memory operand.
4990
4991   .. FIXME: M seems to be missing memory operand support.
4992
4993 - ``D``: Print the second register of a two-register operand, or prints the
4994   second word of a double-word memory operand. (On a big-endian system, ``D`` is
4995   equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
4996   ``M``.)
4997 - ``w``: No effect. Provided for compatibility with GCC which requires this
4998   modifier in order to print MSA registers (``W0-W31``) with the ``f``
4999   constraint.
5000
5001 NVPTX:
5002
5003 - ``r``: No effect.
5004
5005 PowerPC:
5006
5007 - ``L``: Print the second register of a two-register operand. Requires that it
5008   has been allocated consecutively to the first.
5009
5010   .. FIXME: why is it restricted to consecutive ones? And there's
5011      nothing that ensures that happens, is there?
5012
5013 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5014   nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints the operand formatted for a two-register
  X-form instruction. (Currently always prints ``r0,OPERAND``).
5017 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
5018   otherwise. (NOTE: LLVM does not support update form, so this will currently
5019   always print nothing)
5020 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5021   not support indexed form, so this will currently always print nothing)
5022
5023 RISC-V:
5024
5025 - ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5026   nothing. Used to print 'addi' vs 'add' instructions, etc.
5027 - ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5028   normally.
5029
5030 Sparc:
5031
5032 - ``r``: No effect.
5033
5034 SystemZ:
5035
5036 SystemZ implements only ``n``, and does *not* support any of the other
5037 target-independent modifiers.
5038
5039 X86:
5040
5041 - ``c``: Print an unadorned integer or symbol name. (The latter is
5042   target-specific behavior for this typically target-independent modifier).
5043 - ``A``: Print a register name with a '``*``' before it.
5044 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5045   operand.
5046 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5047   memory operand.
5048 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5049   operand.
5050 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5051   operand.
5052 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5053   available, otherwise the 32-bit register name; do nothing on a memory operand.
5054 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5055   immediate integer (e.g. a relocatable symbol expression), print a '-' before
5056   the operand. (The behavior for relocatable symbol expressions is a
5057   target-specific behavior for this typically target-independent modifier)
5058 - ``H``: Print a memory reference with additional offset +8.
5059 - ``P``: Print a memory reference or operand for use as the argument of a call
5060   instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
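
As a sketch of the X86 register-size modifiers above (AT&T syntax assumed; the
function name is hypothetical), the ``b`` modifier prints the 8-bit name of the
register allocated for operand 1. The ``q`` constraint is used so that the
allocated register has an 8-bit ``l`` form on X86-32 as well:

.. code-block:: llvm

    define i32 @zext_low_byte(i32 %x) {
      ; If operand 1 is allocated to EAX, "${1:b}" prints "%al".
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }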
5061
5062 XCore:
5063
5064 No additional modifiers.
5065
5066
5067 Inline Asm Metadata
5068 ^^^^^^^^^^^^^^^^^^^
5069
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:
5077
5078 .. code-block:: llvm
5079
5080     call void asm sideeffect "something bad", ""(), !srcloc !42
5081     ...
5082     !42 = !{ i32 1234567 }
5083
5084 It is up to the front-end to make sense of the magic numbers it places
5085 in the IR. If the MDNode contains multiple constants, the code generator
5086 will use the one that corresponds to the line of the asm that the error
5087 occurs on.
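
For a multi-line asm string, the node might carry one constant per line, as in
this sketch (the cookie values are arbitrary placeholders):

.. code-block:: llvm

    ; The asm string has two lines, separated by "\0A".
    call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !43
    ...
    ; One cookie per asm line: an error on the second line is reported
    ; with the second constant.
    !43 = !{ i32 1234567, i32 1234568 }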
5088
5089 .. _metadata:
5090
5091 Metadata
5092 ========
5093
5094 LLVM IR allows metadata to be attached to instructions and global objects in the
5095 program that can convey extra information about the code to the optimizers and
5096 code generator. One example application of metadata is source-level
5097 debug information. There are two metadata primitives: strings and nodes.
5098
5099 Metadata does not have a type, and is not a value. If referenced from a
5100 ``call`` instruction, it uses the ``metadata`` type.
5101
5102 All metadata are identified in syntax by an exclamation point ('``!``').
5103
5104 .. _metadata-string:
5105
5106 Metadata Nodes and Metadata Strings
5107 -----------------------------------
5108
5109 A metadata string is a string surrounded by double quotes. It can
5110 contain any character by escaping non-printable characters with
5111 "``\xx``" where "``xx``" is the two digit hex code. For example:
5112 "``!"test\00"``".
5113
5114 Metadata nodes are represented with notation similar to structure
5115 constants (a comma separated list of elements, surrounded by braces and
5116 preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:
5118
5119 .. code-block:: llvm
5120
5121     !{ !"test\00", i32 10}
5122
5123 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5124
5125 .. code-block:: text
5126
5127     !0 = distinct !{!"test\00", i32 10}
5128
5129 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5130 content. They can also occur when transformations cause uniquing collisions
5131 when metadata operands change.
5132
5133 A :ref:`named metadata <namedmetadatastructure>` is a collection of
5134 metadata nodes, which can be looked up in the module symbol table. For
5135 example:
5136
5137 .. code-block:: llvm
5138
5139     !foo = !{!4, !3}
5140
5141 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5142 intrinsic is using three metadata arguments:
5143
5144 .. code-block:: llvm
5145
5146     call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5147
5148 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5149 to the ``add`` instruction using the ``!dbg`` identifier:
5150
5151 .. code-block:: llvm
5152
5153     %indvar.next = add i64 %indvar, 1, !dbg !21
5154
5155 Instructions may not have multiple metadata attachments with the same
5156 identifier.
5157
5158 Metadata can also be attached to a function or a global variable. Here metadata
5159 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5160 and ``g2`` using the ``!dbg`` identifier:
5161
5162 .. code-block:: llvm
5163
5164     declare !dbg !22 void @f1()
5165     define void @f2() !dbg !22 {
5166       ret void
5167     }
5168
5169     @g1 = global i32 0, !dbg !22
5170     @g2 = external global i32, !dbg !22
5171
5172 Unlike instructions, global objects (functions and global variables) may have
5173 multiple metadata attachments with the same identifier.
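
For instance, a global variable may carry several ``!type`` attachments at
once. This is only a sketch: the global ``@vt`` and the type identifier
strings are hypothetical.

.. code-block:: llvm

    @vt = constant [2 x i64] zeroinitializer, !type !0, !type !1

    !0 = !{i64 0, !"typeid.a"}
    !1 = !{i64 8, !"typeid.b"}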
5174
A transformation is required to drop any metadata attachment that it does not
recognize or that it knows it cannot preserve. Currently there is an exception
for the ``!type`` and ``!absolute_symbol`` attachments on globals, which cannot
be unconditionally dropped unless the global itself is deleted.
5179
5180 Metadata attached to a module using named metadata may not be dropped, with
5181 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5182
5183 More information about specific metadata nodes recognized by the
5184 optimizers and code generator is found below.
5185
5186 .. _specialized-metadata:
5187
5188 Specialized Metadata Nodes
5189 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5190
5191 Specialized metadata nodes are custom data structures in metadata (as opposed
5192 to generic tuples). Their fields are labelled, and can be specified in any
5193 order.
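
As a minimal illustration (the file name and directory are placeholders), the
two lines below spell the same :ref:`DIFile` with its fields in a different
order:

.. code-block:: text

    ; Both lines describe the same file; only the field order differs.
    !0 = !DIFile(filename: "a.c", directory: "/path/to/dir")
    !1 = !DIFile(directory: "/path/to/dir", filename: "a.c")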
5194
5195 These aren't inherently debug info centric, but currently all the specialized
5196 metadata nodes are related to debug info.
5197
5198 .. _DICompileUnit:
5199
5200 DICompileUnit
5201 """""""""""""
5202
5203 ``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5204 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5205 containing the debug info to be emitted along with the compile unit, regardless
5206 of code optimizations (some nodes are only emitted if there are references to
5207 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5208 indicating whether or not line-table discriminators are updated to provide
5209 more-accurate debug info for profiling results.
5210
5211 .. code-block:: text
5212
    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", isOptimized: true,
                                 flags: "-O2", runtimeVersion: 2,
                                 splitDebugFilename: "abc.debug",
                                 emissionKind: FullDebug, enums: !2,
                                 retainedTypes: !3, globals: !4, imports: !5,
                                 macros: !6, dwoId: 0x0abcd)
5218
5219 Compile unit descriptors provide the root scope for objects declared in a
5220 specific compilation unit. File descriptors are defined using this scope.  These
5221 descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5222 track of global variables, type information, and imported entities (declarations
5223 and namespaces).
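
A minimal sketch of how a compile unit is registered in ``!llvm.dbg.cu`` might
look as follows. The file name, directory and producer string are placeholders,
and only a handful of the available fields are shown:

.. code-block:: text

    !llvm.dbg.cu = !{!0}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", emissionKind: FullDebug)
    !1 = !DIFile(filename: "a.c", directory: "/path/to/dir")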
5224
5225 .. _DIFile:
5226
5227 DIFile
5228 """"""
5229
5230 ``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5231
5232 .. code-block:: none
5233
5234     !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5235                  checksumkind: CSK_MD5,
5236                  checksum: "000102030405060708090a0b0c0d0e0f")
5237
Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``,
``CSK_SHA1`` and ``CSK_SHA256``.
5241
5242 .. _DIBasicType:
5243
5244 DIBasicType
5245 """""""""""
5246
5247 ``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5248 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5249
5250 .. code-block:: text
5251
5252     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5253                       encoding: DW_ATE_unsigned_char)
5254     !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5255
5256 The ``encoding:`` describes the details of the type. Usually it's one of the
5257 following:
5258
5259 .. code-block:: text
5260
5261   DW_ATE_address       = 1
5262   DW_ATE_boolean       = 2
5263   DW_ATE_float         = 4
5264   DW_ATE_signed        = 5
5265   DW_ATE_signed_char   = 6
5266   DW_ATE_unsigned      = 7
5267   DW_ATE_unsigned_char = 8
5268
5269 .. _DISubroutineType:
5270
5271 DISubroutineType
5272 """"""""""""""""
5273
5274 ``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5275 refers to a tuple; the first operand is the return type, while the rest are the
5276 types of the formal arguments in order. If the first operand is ``null``, that
5277 represents a function with no return value (such as ``void foo() {}`` in C++).
5278
5279 .. code-block:: text
5280
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
5283     !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
5284
5285 .. _DIDerivedType:
5286
5287 DIDerivedType
5288 """""""""""""
5289
5290 ``DIDerivedType`` nodes represent types derived from other types, such as
5291 qualified types.
5292
5293 .. code-block:: text
5294
5295     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5296                       encoding: DW_ATE_unsigned_char)
5297     !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5298                         align: 32)
5299
5300 The following ``tag:`` values are valid:
5301
5302 .. code-block:: text
5303
5304   DW_TAG_member             = 13
5305   DW_TAG_pointer_type       = 15
5306   DW_TAG_reference_type     = 16
5307   DW_TAG_typedef            = 22
5308   DW_TAG_inheritance        = 28
5309   DW_TAG_ptr_to_member_type = 31
5310   DW_TAG_const_type         = 38
5311   DW_TAG_friend             = 42
5312   DW_TAG_volatile_type      = 53
5313   DW_TAG_restrict_type      = 55
5314   DW_TAG_atomic_type        = 71
5315
5316 .. _DIDerivedTypeMember:
5317
5318 ``DW_TAG_member`` is used to define a member of a :ref:`composite type
5319 <DICompositeType>`. The type of the member is the ``baseType:``. The
5320 ``offset:`` is the member's bit offset.  If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
5322 uniqued based only on its ``name:`` and ``scope:``.
5323
5324 ``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5325 field of :ref:`composite types <DICompositeType>` to describe parents and
5326 friends.
5327
5328 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5329
5330 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5331 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
5332 are used to qualify the ``baseType:``.
5333
5334 Note that the ``void *`` type is expressed as a type derived from NULL.
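
The tags above can be combined by chaining ``baseType:`` references. The
following sketch assumes a target with 64-bit pointers; the file, the typedef
name ``cip`` and the node numbers are illustrative only:

.. code-block:: text

    !0 = !DIFile(filename: "a.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

    ; const int
    !2 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !1)
    ; const int *
    !3 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !2, size: 64)
    ; typedef const int *cip;
    !4 = !DIDerivedType(tag: DW_TAG_typedef, name: "cip", file: !0, line: 3,
                        baseType: !3)
    ; void *
    !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)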
5335
5336 .. _DICompositeType:
5337
5338 DICompositeType
5339 """""""""""""""
5340
5341 ``DICompositeType`` nodes represent types composed of other types, like
5342 structures and unions. ``elements:`` points to a tuple of the composed types.
5343
5344 If the source language supports ODR, the ``identifier:`` field gives the unique
5345 identifier used for type merging between modules.  When specified,
5346 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5347 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5348 ``scope:`` change uniquing rules.
5349
5350 For a given ``identifier:``, there should only be a single composite type that
5351 does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
5352 together will unique such definitions at parse time via the ``identifier:``
5353 field, even if the nodes are ``distinct``.
5354
5355 .. code-block:: text
5356
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5358     !1 = !DIEnumerator(name: "SevenKind", value: 7)
5359     !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5360     !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
5361                           line: 2, size: 32, align: 32, identifier: "_M4Enum",
5362                           elements: !{!0, !1, !2})
5363
5364 The following ``tag:`` values are valid:
5365
5366 .. code-block:: text
5367
5368   DW_TAG_array_type       = 1
5369   DW_TAG_class_type       = 2
5370   DW_TAG_enumeration_type = 4
5371   DW_TAG_structure_type   = 19
5372   DW_TAG_union_type       = 23
5373
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

The optional ``dataLocation`` is a DIExpression that describes how to get from
an object's address to the actual raw data, if they aren't equivalent. This is
only supported for array types, particularly to describe Fortran arrays, which
have an array descriptor in addition to the array data. Alternatively, it can
be a DIVariable that holds the address of the actual raw data.

Fortran supports pointer arrays that can be attached to actual arrays; this
attachment between pointer and pointee is called association. The optional
``associated`` is a DIExpression that describes whether the pointer array is
currently associated. The optional ``allocated`` is a DIExpression that
describes whether the allocatable array is currently allocated. The optional
``rank`` is a DIExpression that describes the rank (number of dimensions) of a
Fortran assumed-rank array, whose rank is known only at run time.
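
As an illustrative sketch, a two-dimensional C array such as ``int a[2][3]``
could be described with one subrange per dimension. The node numbers are
arbitrary, and the total size assumes a 32-bit ``int`` (2 * 3 * 32 = 192
bits):

.. code-block:: text

    ; int a[2][3];
    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 192,
                          elements: !2)
    !2 = !{!3, !4}
    !3 = !DISubrange(count: 2)
    !4 = !DISubrange(count: 3)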
5390
5391 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5392 descriptors <DIEnumerator>`, each representing the definition of an enumeration
5393 value for the set. All enumeration type descriptors are collected in the
5394 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5395
5396 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5397 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5398 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5399 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5400 ``isDefinition: false``.
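
A small structure with two members might be described along these lines. This
is a sketch only: the ``Point`` type, the file name and the line numbers are
hypothetical, and a 32-bit ``int`` is assumed.

.. code-block:: text

    ; struct Point { int x; int y; };
    !0 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64, elements: !3)
    !3 = !{!4, !5}
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)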
5401
5402 .. _DISubrange:
5403
5404 DISubrange
5405 """"""""""
5406
5407 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5408 :ref:`DICompositeType`.
5409
5410 - ``count: -1`` indicates an empty array.
5411 - ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5412 - ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5413
5414 .. code-block:: text
5415
5416     !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5417     !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5418     !2 = !DISubrange(count: -1) ; empty array.
5419
5420     ; Scopes used in rest of example
5421     !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5422     !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5423     !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5424
5425     ; Use of local variable as count value
5426     !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5427     !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5428     !11 = !DISubrange(count: !10, lowerBound: 0)
5429
5430     ; Use of global variable as count value
5431     !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5432     !13 = !DISubrange(count: !12, lowerBound: 0)
5433
5434 .. _DIEnumerator:
5435
5436 DIEnumerator
5437 """"""""""""
5438
5439 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5440 variants of :ref:`DICompositeType`.
5441
5442 .. code-block:: text
5443
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5445     !1 = !DIEnumerator(name: "SevenKind", value: 7)
5446     !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5447
5448 DITemplateTypeParameter
5449 """""""""""""""""""""""
5450
5451 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
5452 language constructs. They are used (optionally) in :ref:`DICompositeType` and
5453 :ref:`DISubprogram` ``templateParams:`` fields.
5454
5455 .. code-block:: text
5456
5457     !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5458
5459 DITemplateValueParameter
5460 """"""""""""""""""""""""
5461
5462 ``DITemplateValueParameter`` nodes represent value parameters to generic source
5463 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5464 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5465 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5466 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5467
5468 .. code-block:: text
5469
5470     !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5471
5472 DINamespace
5473 """""""""""
5474
5475 ``DINamespace`` nodes represent namespaces in the source language.
5476
5477 .. code-block:: text
5478
5479     !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5480
5481 .. _DIGlobalVariable:
5482
5483 DIGlobalVariable
5484 """"""""""""""""
5485
5486 ``DIGlobalVariable`` nodes represent global variables in the source language.
5487
5488 .. code-block:: text
5489
    @foo = global i32 0, !dbg !0
5491     !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5492     !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5493                            file: !3, line: 7, type: !4, isLocal: true,
5494                            isDefinition: false, declaration: !5)
5495
5496
5497 DIGlobalVariableExpression
5498 """"""""""""""""""""""""""
5499
5500 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5501 with a :ref:`DIExpression`.
5502
5503 .. code-block:: text
5504
    @lower = global i32 0, !dbg !0
    @upper = global i32 0, !dbg !1
5507     !0 = !DIGlobalVariableExpression(
5508              var: !2,
5509              expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
5510              )
5511     !1 = !DIGlobalVariableExpression(
5512              var: !2,
5513              expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
5514              )
5515     !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
5516                            file: !4, line: 8, type: !5, declaration: !6)
5517
All global variable expressions should be referenced by the ``globals:`` field
of a :ref:`compile unit <DICompileUnit>`.
5520
5521 .. _DISubprogram:
5522
5523 DISubprogram
5524 """"""""""""
5525
5526 ``DISubprogram`` nodes represent functions from the source language. A distinct
5527 ``DISubprogram`` may be attached to a function definition using ``!dbg``
5528 metadata. A unique ``DISubprogram`` may be attached to a function declaration
5529 used for call site debug info. The ``retainedNodes:`` field is a list of
5530 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5531 retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
5533
5534 .. _DISubprogramDeclaration:
5535
5536 When ``isDefinition: false``, subprograms describe a declaration in the type
5537 tree as opposed to a definition of a function.  If the scope is a composite
type with an ODR ``identifier:`` and does not set ``flags: DIFlagFwdDecl``,
5539 then the subprogram declaration is uniqued based only on its ``linkageName:``
5540 and ``scope:``.
5541
5542 .. code-block:: text
5543
5544     define void @_Z3foov() !dbg !0 {
5545       ...
5546     }
5547
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
5549                                 file: !2, line: 7, type: !3, isLocal: true,
5550                                 isDefinition: true, scopeLine: 8,
5551                                 containingType: !4,
5552                                 virtuality: DW_VIRTUALITY_pure_virtual,
5553                                 virtualIndex: 10, flags: DIFlagPrototyped,
5554                                 isOptimized: true, unit: !5, templateParams: !6,
5555                                 declaration: !7, retainedNodes: !8,
5556                                 thrownTypes: !9)
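
By contrast, a member function *declaration* inside a composite type could be
sketched as below. The type ``S``, its ODR identifier, the mangled name and
the 64-bit pointer size are assumptions chosen for illustration; the
declaration is not ``distinct`` and is listed in the ``elements:`` of its
scope:

.. code-block:: text

    ; struct S { int get(); };
    !0 = !DIFile(filename: "s.cpp", directory: "/path/to/dir")
    !1 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S",
                                   file: !0, line: 1, size: 8, elements: !2,
                                   identifier: "_ZTS1S")
    !2 = !{!3}
    !3 = !DISubprogram(name: "get", linkageName: "_ZN1S3getEv", scope: !1,
                       file: !0, line: 1, type: !4, isLocal: false,
                       isDefinition: false, flags: DIFlagPrototyped,
                       isOptimized: false)
    !4 = !DISubroutineType(types: !5)
    !5 = !{!6, !7}
    !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !7 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !1, size: 64,
                        flags: DIFlagArtificial | DIFlagObjectPointer)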
5557
5558 .. _DILexicalBlock:
5559
5560 DILexicalBlock
5561 """"""""""""""
5562
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish two
lexical blocks at the same depth. They are valid targets for ``scope:``
fields.
5567
5568 .. code-block:: text
5569
5570     !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5571
5572 Usually lexical blocks are ``distinct`` to prevent node merging based on
5573 operands.
5574
5575 .. _DILexicalBlockFile:
5576
5577 DILexicalBlockFile
5578 """"""""""""""""""
5579
5580 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5581 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5582 indicate textual inclusion, or the ``discriminator:`` field can be used to
5583 discriminate between control flow within a single block in the source language.
5584
5585 .. code-block:: text
5586
5587     !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5588     !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5589     !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5590
5591 .. _DILocation:
5592
5593 DILocation
5594 """"""""""
5595
``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5599
5600 .. code-block:: text
5601
5602     !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5603
5604 .. _DILocalVariable:
5605
5606 DILocalVariable
5607 """""""""""""""
5608
5609 ``DILocalVariable`` nodes represent local variables in the source language. If
5610 the ``arg:`` field is set to non-zero, then this variable is a subprogram
5611 parameter, and it will be included in the ``retainedNodes:`` field of its
5612 :ref:`DISubprogram`.
5613
5614 .. code-block:: text
5615
5616     !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5617                           type: !3, flags: DIFlagArtificial)
5618     !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5619                           type: !3)
5620     !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5621
5622 .. _DIExpression:
5623
5624 DIExpression
5625 """"""""""""
5626
5627 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
5628 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5629 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
5630 referenced LLVM variable relates to the source language variable. Debug
5631 intrinsics are interpreted left-to-right: start by pushing the value/address
5632 operand of the intrinsic onto a stack, then repeatedly push and evaluate
5633 opcodes from the DIExpression until the final variable description is produced.
5634
The currently supported opcode vocabulary is limited:
5636
5637 - ``DW_OP_deref`` dereferences the top of the expression stack.
5638 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5639   them together and appends the result to the expression stack.
5640 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5641   the last entry from the second last entry and appends the result to the
5642   expression stack.
5643 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
5644 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
5645   here, respectively) of the variable fragment from the working expression. Note
5646   that contrary to DW_OP_bit_piece, the offset is describing the location
5647   within the described source variable.
5648 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5649   (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5650   expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5651   that references a base type constructed from the supplied values.
5652 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5653   optionally applied to the pointer. The memory tag is derived from the
5654   given tag offset in an implementation-defined manner.
5655 - ``DW_OP_swap`` swaps top two stack entries.
5656 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
5657   of the stack is treated as an address. The second stack entry is treated as an
5658   address space identifier.
5659 - ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
  beginning of a ``DIExpression``. In DWARF a ``DBG_VALUE``
  instruction binding a ``DIExpression`` that starts with
  ``DW_OP_LLVM_entry_value`` to a register is lowered to a
  ``DW_OP_entry_value [reg]``, pushing the value the register had upon
  function entry onto the stack.  The next ``(N - 1)`` operations will
  be part of the ``DW_OP_entry_value`` block argument. For example,
  ``!DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_plus_uconst, 123,
  DW_OP_stack_value)`` specifies an expression where the entry value of
  the debug value instruction's value/address operand is pushed to the
  stack, and 123 is added to it. Due to framework limitations ``N`` can
  currently only be 1.
5672
5673   The operation is introduced by the ``LiveDebugValues`` pass, which
5674   applies it only to function parameters that are unmodified
5675   throughout the function. Support is limited to simple register
5676   location descriptions, or as indirect locations (e.g., when a struct
5677   is passed-by-value to a callee via a pointer to a temporary copy
5678   made in the caller). The entry value op is also introduced by the
5679   ``AsmPrinter`` pass when a call site parameter value
5680   (``DW_AT_call_site_parameter_value``) is represented as entry value
5681   of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the element at index ``N`` of that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
  signed offset from the specified register. The opcode is only generated by
  the ``AsmPrinter`` pass to describe a call site parameter value that requires
  an expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array that has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointed-to value is known. This operator is required because it differs from
  the DWARF operator ``DW_OP_implicit_pointer`` in representation and
  specification (number and types of operands), and the latter cannot be used
  for multiple levels of indirection.
5706
5707 .. code-block:: text
5708
5709     IR for "*ptr = 4;"
5710     --------------
5711     call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
5712     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
5713                            type: !18)
5714     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
5715     !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)
5717
5718     IR for "**ptr = 4;"
5719     --------------
5720     call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
5721     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
5722                            type: !18)
5723     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
5724     !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
5725     !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5726     !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)
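
The left-to-right evaluation of a ``DIExpression`` can be traced by hand. The
sketch below is not taken from real compiler output; ``%p`` stands for a
hypothetical first operand of the debug intrinsic, and the comments show the
expression stack after each step:

.. code-block:: text

    ; Debug intrinsic operand: %p (hypothetical)
    ; Expression: !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref)
    ;
    ; step                      expression stack
    ; (push intrinsic operand)  [ %p ]
    ; DW_OP_plus_uconst, 8      [ %p + 8 ]
    ; DW_OP_deref               [ *(%p + 8) ]
    !0 = !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref)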
5728
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions.  Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
5732 change over the course of the program. Register and memory location
5733 descriptions describe the *concrete location* of a source variable (in the
5734 sense that a debugger might modify its value), whereas *implicit locations*
5735 describe merely the actual *value* of a source variable which might not exist
5736 in registers or in memory (see ``DW_OP_stack_value``).
5737
An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
5739 value (the address) of a source variable. The first operand of the intrinsic
5740 must be an address of some kind. A DIExpression attached to the intrinsic
5741 refines this address to produce a concrete location for the source variable.
5742
An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
5744 The first operand of the intrinsic may be a direct or indirect value. A
5745 DIExpression attached to the intrinsic refines the first operand to produce a
5746 direct value. For example, if the first operand is an indirect value, it may be
5747 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
5748 valid debug intrinsic.
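
The following fragment sketches the three situations. It is illustrative
rather than complete: ``%x.addr``, ``%v``, ``%p`` and ``!10`` are hypothetical
names, ``!dbg`` locations on the calls are omitted, and the pointers are
written in typed form (with opaque pointers ``i32*`` would be spelled
``ptr``):

.. code-block:: text

    ; %x.addr is the address of the variable described by !10.
    %x.addr = alloca i32
    call void @llvm.dbg.declare(metadata i32* %x.addr, metadata !10,
                                metadata !DIExpression())

    ; %v is the value of the variable itself.
    call void @llvm.dbg.value(metadata i32 %v, metadata !10,
                              metadata !DIExpression())

    ; %p points at the value, so DW_OP_deref turns it into a direct value.
    call void @llvm.dbg.value(metadata i32* %p, metadata !10,
                              metadata !DIExpression(DW_OP_deref))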
5749
5750 .. note::
5751
5752    A DIExpression is interpreted in the same way regardless of which kind of
5753    debug intrinsic it's attached to.
5754
5755 .. code-block:: text
5756
5757     !0 = !DIExpression(DW_OP_deref)
5758     !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
5764
DIArgList
"""""""""
5767
5768 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
5769 used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
5770 ``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
5771 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
5772 within a function, it must only be used as a function argument, must always be
5773 inlined, and cannot appear in named metadata.
5774
5775 .. code-block:: text
5776
5777     llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
5778                    metadata !16,
5779                    metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
5780
DIFlags
"""""""
5783
5784 These flags encode various properties of DINodes.
5785
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether ``DW_AT_export_symbols`` can
be used for the structure type.
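
As a sketch, an anonymous struct nested in another type is one case where a
front-end might set this flag. The node numbers, scope and size below are
hypothetical:

.. code-block:: text

    ; struct Outer { struct { int a; }; };
    ; The anonymous struct's members are accessible as if declared in Outer.
    !5 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !1,
                                   file: !2, line: 2, size: 32,
                                   flags: DIFlagExportSymbols, elements: !6)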
5790
5791 DIObjCProperty
5792 """"""""""""""
5793
5794 ``DIObjCProperty`` nodes represent Objective-C property nodes.
5795
5796 .. code-block:: text
5797
5798     !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
5799                          getter: "getFoo", attributes: 7, type: !2)
5800
5801 DIImportedEntity
5802 """"""""""""""""
5803
``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit. The ``elements:`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).
5807
5808 .. code-block:: text
5809
5810    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
5811                           entity: !1, line: 7, elements: !3)
5812    !3 = !{!4}
5813    !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
5814                           entity: !5, line: 7)
5815
5816 DIMacro
5817 """""""
5818
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.
5823
5824 .. code-block:: text
5825
   !2 = !DIMacro(type: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(type: DW_MACINFO_undef, line: 30, name: "foo")
5829
5830 DIMacroFile
5831 """""""""""
5832
5833 ``DIMacroFile`` nodes represent inclusion of source files.
5834 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
5835 appear in the included source file.
5836
5837 .. code-block:: text
5838
   !2 = !DIMacroFile(type: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)
5841
5842 .. _DILabel:
5843
5844 DILabel
5845 """""""
5846
``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory. The ``scope:`` field must be a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5850 The ``name:`` field is the label identifier. The ``file:`` field is the
5851 :ref:`DIFile` the label is present in. The ``line:`` field is the source line
5852 within the file where the label is declared.
5853
5854 .. code-block:: text
5855
5856   !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
5857
5858 '``tbaa``' Metadata
5859 ^^^^^^^^^^^^^^^^^^^
5860
5861 In LLVM IR, memory does not have types, so LLVM's own type system is not
5862 suitable for doing type based alias analysis (TBAA). Instead, metadata is
5863 added to the IR to describe a type system of a higher level language. This
5864 can be used to implement C/C++ strict type aliasing rules, but it can also
5865 be used to implement custom alias analysis behavior for other languages.
5866
5867 This description of LLVM's TBAA system is broken into two parts:
5868 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
5869 :ref:`Representation<tbaa_node_representation>` talks about the metadata
5870 encoding of various entities.
5871
5872 It is always possible to trace any TBAA node to a "root" TBAA node (details
5873 in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
5874 nodes with different roots have an unknown aliasing relationship, and LLVM
5875 conservatively infers ``MayAlias`` between them.  The rules mentioned in
5876 this section only pertain to TBAA nodes living under the same root.
5877
5878 .. _tbaa_node_semantics:
5879
5880 Semantics
5881 """""""""
5882
5883 The TBAA metadata system, referred to as "struct path TBAA" (not to be
5884 confused with ``tbaa.struct``), consists of the following high level
5885 concepts: *Type Descriptors*, further subdivided into scalar type
5886 descriptors and struct type descriptors; and *Access Tags*.
5887
5888 **Type descriptors** describe the type system of the higher level language
5889 being compiled.  **Scalar type descriptors** describe types that do not
5890 contain other types.  Each scalar type has a parent type, which must also
5891 be a scalar type or the TBAA root.  Via this parent relation, scalar types
5892 within a TBAA root form a tree.  **Struct type descriptors** denote types
5893 that contain a sequence of other type descriptors, at known offsets.  These
5894 contained type descriptors can either be struct type descriptors themselves
5895 or scalar type descriptors.
5896
5897 **Access tags** are metadata nodes attached to load and store instructions.
5898 Access tags use type descriptors to describe the *location* being accessed
5899 in terms of the type system of the higher level language.  Access tags are
5900 tuples consisting of a base type, an access type and an offset.  The base
5901 type is a scalar type descriptor or a struct type descriptor, the access
5902 type is a scalar type descriptor, and the offset is a constant integer.
5903
5904 The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
5905 things:
5906
5907  * If ``BaseTy`` is a struct type, the tag describes a memory access (load
5908    or store) of a value of type ``AccessTy`` contained in the struct type
5909    ``BaseTy`` at offset ``Offset``.
5910
5911  * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
5912    ``AccessTy`` must be the same; and the access tag describes a scalar
5913    access with scalar type ``AccessTy``.
5914
5915 We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
5916 tuples this way:
5917
5918  * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
5919    ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
5920    described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
5921    undefined if ``Offset`` is non-zero.
5922
5923  * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
5924    is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
5925    ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
5926    to be relative within that inner type.
5927
5928 A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
5929 aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
5930 Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
5931 Offset2)`` via the ``ImmediateParent`` relation or vice versa.
5932
5933 As a concrete example, the type descriptor graph for the following program
5934
5935 .. code-block:: c
5936
5937     struct Inner {
5938       int i;    // offset 0
5939       float f;  // offset 4
5940     };
5941
5942     struct Outer {
5943       float f;  // offset 0
5944       double d; // offset 4
5945       struct Inner inner_a;  // offset 12
5946     };
5947
5948     void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
5949       outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
5950       outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
5951       outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
5952       *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
5953     }
5954
5955 is (note that in C and C++, ``char`` can be used to access any arbitrary
5956 type):
5957
5958 .. code-block:: text
5959
5960     Root = "TBAA Root"
5961     CharScalarTy = ("char", Root, 0)
5962     FloatScalarTy = ("float", CharScalarTy, 0)
5963     DoubleScalarTy = ("double", CharScalarTy, 0)
5964     IntScalarTy = ("int", CharScalarTy, 0)
5965     InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
5966     OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
5967                      (InnerStructTy, 12)}
5968
5969
5970 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
5971 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
5972 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
5973
5974 .. _tbaa_node_representation:
5975
5976 Representation
5977 """"""""""""""
5978
5979 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
5980 with exactly one ``MDString`` operand.
5981
5982 Scalar type descriptors are represented as ``MDNode`` s with two
5983 operands.  The first operand is an ``MDString`` denoting the name of the
5984 scalar type.  LLVM does not assign meaning to the value of this operand; it
5985 only cares about it being an ``MDString``.  The second operand is an
5986 ``MDNode`` which points to the parent for said scalar type descriptor,
5987 which is either another scalar type descriptor or the TBAA root.  Scalar
5988 type descriptors can have an optional third operand, but that must be the
5989 constant integer zero.
5990
5991 Struct type descriptors are represented as ``MDNode`` s with an odd number
5992 of operands greater than 1.  The first operand is an ``MDString`` denoting
5993 the name of the struct type.  Like in scalar type descriptors the actual
5994 value of this name operand is irrelevant to LLVM.  After the name operand,
5995 the struct type descriptors have a sequence of alternating ``MDNode`` and
5996 ``ConstantInt`` operands.  With N starting from 1, the 2N - 1 th operand,
5997 an ``MDNode``, denotes a contained field, and the 2N th operand, a
5998 ``ConstantInt``, is the offset of the said contained field.  The offsets
5999 must be in non-decreasing order.
6000
6001 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6002 The first operand is an ``MDNode`` pointing to the node representing the
6003 base type.  The second operand is an ``MDNode`` pointing to the node
6004 representing the access type.  The third operand is a ``ConstantInt`` that
6005 states the offset of the access.  If a fourth field is present, it must be
6006 a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
6007 that the location being accessed is "constant" (meaning
6008 ``pointsToConstantMemory`` should return true; see `other useful
6009 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
6010 the access type and the base type of an access tag must be the same, and
6011 that is the TBAA root of the access tag.
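
As a sketch of how these rules combine, the type descriptor graph and access
tags from the :ref:`Semantics<tbaa_node_semantics>` example could be encoded as
follows. The node numbering and the pointer names (``%outer.f``, ``%inner.i``,
``%inner.f``, ``%f``) are assumptions made for this illustration; a frontend is
free to number and name them differently:

.. code-block:: llvm

      store float 0.000000e+00, float* %outer.f, align 4, !tbaa !7  ; tag0
      store i32 0, i32* %inner.i, align 4, !tbaa !8                 ; tag1
      store float 0.000000e+00, float* %inner.f, align 4, !tbaa !9  ; tag2
      store float 0.000000e+00, float* %f, align 4, !tbaa !10       ; tag3
    ...
    !0 = !{!"TBAA Root"}                                ; root
    !1 = !{!"char", !0, i64 0}                          ; CharScalarTy
    !2 = !{!"float", !1, i64 0}                         ; FloatScalarTy
    !3 = !{!"double", !1, i64 0}                        ; DoubleScalarTy
    !4 = !{!"int", !1, i64 0}                           ; IntScalarTy
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}              ; InnerStructTy
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}  ; OuterStructTy
    !7 = !{!6, !2, i64 0}    ; (OuterStructTy, FloatScalarTy, 0)
    !8 = !{!6, !4, i64 12}   ; (OuterStructTy, IntScalarTy, 12)
    !9 = !{!6, !2, i64 16}   ; (OuterStructTy, FloatScalarTy, 16)
    !10 = !{!2, !2, i64 0}   ; (FloatScalarTy, FloatScalarTy, 0)

Under the semantics above, ``tag2`` and ``tag3`` may alias because
``(FloatScalarTy, 0)`` is reachable from ``(OuterStructTy, 16)`` via
``(InnerStructTy, 4)``, whereas ``tag1`` and ``tag3`` do not alias because
neither ``(OuterStructTy, 12)`` nor ``(FloatScalarTy, 0)`` is reachable from
the other.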
6012
6013 '``tbaa.struct``' Metadata
6014 ^^^^^^^^^^^^^^^^^^^^^^^^^^
6015
6016 The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
6017 aggregate assignment operations in C and similar languages. However, it
6018 is defined to copy a contiguous region of memory, which is more than
6019 strictly necessary for aggregate types which contain holes due to
6020 padding. Also, it doesn't contain any TBAA information about the fields
6021 of the aggregate.
6022
6023 ``!tbaa.struct`` metadata can describe which memory subregions in a
6024 memcpy are padding and what the TBAA tags of the struct are.
6025
6026 The current metadata format is very simple. ``!tbaa.struct`` metadata
6027 nodes are a list of operands which are in conceptual groups of three.
6028 For each group of three, the first operand gives the offset of a
6029 field in bytes, the second gives its size in bytes, and the third gives
6030 its TBAA tag. For example:
6031
6032 .. code-block:: llvm
6033
6034     !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6035
6036 This describes a struct with two fields. The first is at offset 0 bytes
6037 with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
6038 and has size 4 bytes and has tbaa tag !2.
6039
6040 Note that the fields need not be contiguous. In this example, there is a
6041 4 byte gap between the two fields. This gap represents padding which
6042 does not carry useful data and need not be preserved.
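
For illustration, such a node might be attached to the ``llvm.memcpy`` call
that implements the struct copy. The operand names ``%dst`` and ``%src`` and
the 12-byte length are assumptions for this sketch, and ``!1`` and ``!2`` stand
for TBAA access tags defined elsewhere:

.. code-block:: llvm

      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                           i64 12, i1 false), !tbaa.struct !4
    ...
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }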
6043
6044 '``noalias``' and '``alias.scope``' Metadata
6045 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6046
6047 ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
6048 noalias memory-access sets. This means that some collection of memory access
6049 instructions (loads, stores, memory-accessing calls, etc.) that carry
6050 ``noalias`` metadata can be declared not to alias with some other
6051 collection of memory access instructions that carry ``alias.scope`` metadata.
6052 Each type of metadata specifies a list of scopes where each scope has an id and
6053 a domain.
6054
6055 When evaluating an aliasing query, if for some domain, the set
6056 of scopes with that domain in one instruction's ``alias.scope`` list is a
6057 subset of (or equal to) the set of scopes for that domain in another
6058 instruction's ``noalias`` list, then the two memory accesses are assumed not to
6059 alias.
6060
6061 Because scopes in one domain don't affect scopes in other domains, separate
6062 domains can be used to compose multiple independent noalias sets.  This is
6063 used for example during inlining.  As the noalias function parameters are
6064 turned into noalias scope metadata, a new domain is used every time the
6065 function is inlined.
6066
6067 The metadata identifying each domain is itself a list containing one or two
6068 entries. The first entry is the name of the domain. Note that if the name is a
6069 string then it can be combined across functions and translation units. A
6070 self-reference can be used to create globally unique domain names. A
6071 descriptive string may optionally be provided as a second list entry.
6072
6073 The metadata identifying each scope is also itself a list containing two or
6074 three entries. The first entry is the name of the scope. Note that if the name
6075 is a string then it can be combined across functions and translation units. A
6076 self-reference can be used to create globally unique scope names. A metadata
6077 reference to the scope's domain is the second entry. A descriptive string may
6078 optionally be provided as a third list entry.
6079
6080 For example,
6081
6082 .. code-block:: llvm
6083
6084     ; Two scope domains:
6085     !0 = !{!0}
6086     !1 = !{!1}
6087
6088     ; Some scopes in these domains:
6089     !2 = !{!2, !0}
6090     !3 = !{!3, !0}
6091     !4 = !{!4, !1}
6092
6093     ; Some scope lists:
6094     !5 = !{!4} ; A list containing only scope !4
6095     !6 = !{!4, !3, !2}
6096     !7 = !{!3}
6097
6098     ; These two instructions don't alias:
6099     %0 = load float, float* %c, align 4, !alias.scope !5
6100     store float %0, float* %arrayidx.i, align 4, !noalias !5
6101
6102     ; These two instructions also don't alias (for domain !1, the set of scopes
6103     ; in the !alias.scope equals that in the !noalias list):
6104     %1 = load float, float* %c, align 4, !alias.scope !5
6105     store float %1, float* %arrayidx.i2, align 4, !noalias !6
6106
6107     ; These two instructions may alias (for domain !0, the set of scopes in
6108     ; the !noalias list is not a superset of, or equal to, the scopes in the
6109     ; !alias.scope list):
6110     %2 = load float, float* %c, align 4, !alias.scope !6
6111     store float %0, float* %arrayidx.i, align 4, !noalias !7
6112
6113 '``fpmath``' Metadata
6114 ^^^^^^^^^^^^^^^^^^^^^
6115
6116 ``fpmath`` metadata may be attached to any instruction of floating-point
6117 type. It can be used to express the maximum acceptable error in the
6118 result of that instruction, in ULPs, thus potentially allowing the
6119 compiler to use a more efficient but less accurate method of computing
6120 it. ULP is defined as follows:
6121
6122     If ``x`` is a real number that lies between two finite consecutive
6123     floating-point numbers ``a`` and ``b``, without being equal to one
6124     of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6125     distance between the two non-equal finite floating-point numbers
6126     nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6127
6128 The metadata node shall consist of a single positive float type number
6129 representing the maximum relative error, for example:
6130
6131 .. code-block:: llvm
6132
6133     !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
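
The node is then referenced from the instruction whose accuracy requirement it
relaxes. A minimal sketch, assuming ``%x`` and ``%y`` are ``float`` values
defined earlier:

.. code-block:: llvm

    %q = fdiv float %x, %y, !fpmath !0 ; result may be off by up to 2.5 ULPs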
6134
6135 .. _range-metadata:
6136
6137 '``range``' Metadata
6138 ^^^^^^^^^^^^^^^^^^^^
6139
6140 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6141 integer types. It expresses the possible ranges the loaded value or the value
6142 returned by the called function at this call site is in. If the loaded or
6143 returned value is not in the specified range, the behavior is undefined. The
6144 ranges are represented with a flattened list of integers. The loaded value or
6145 the value returned is known to be in the union of the ranges defined by each
6146 consecutive pair. Each pair has the following properties:
6147
6148 -  The type must match the type loaded by the instruction.
6149 -  The pair ``a,b`` represents the range ``[a,b)``.
6150 -  Both ``a`` and ``b`` are constants.
6151 -  The range is allowed to wrap.
6152 -  The range should not represent the full or empty set. That is,
6153    ``a!=b``.
6154
6155 In addition, the pairs must be in signed order of the lower bound and
6156 they must be non-contiguous.
6157
6158 Examples:
6159
6160 .. code-block:: llvm
6161
6162       %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
6163       %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6164       %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
6165       %d = invoke i8 @bar() to label %cont
6166              unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6167     ...
6168     !0 = !{ i8 0, i8 2 }
6169     !1 = !{ i8 255, i8 2 }
6170     !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6171     !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
6172
6173 '``absolute_symbol``' Metadata
6174 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6175
6176 ``absolute_symbol`` metadata may be attached to a global variable
6177 declaration. It marks the declaration as a reference to an absolute symbol,
6178 which causes the backend to use absolute relocations for the symbol even
6179 in position independent code, and expresses the possible ranges that the
6180 global variable's *address* (not its value) is in, in the same format as
6181 ``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6182 may be used to represent the full set.
6183
6184 Example (assuming 64-bit pointers):
6185
6186 .. code-block:: llvm
6187
6188       @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6189       @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6190
6191     ...
6192     !0 = !{ i64 0, i64 256 }
6193     !1 = !{ i64 -1, i64 -1 }
6194
6195 '``callees``' Metadata
6196 ^^^^^^^^^^^^^^^^^^^^^^
6197
6198 ``callees`` metadata may be attached to indirect call sites. If ``callees``
6199 metadata is attached to a call site, and any callee is not among the set of
6200 functions provided by the metadata, the behavior is undefined. The intent of
6201 this metadata is to facilitate optimizations such as indirect-call promotion.
6202 For example, in the code below, the call instruction may only target the
6203 ``add`` or ``sub`` functions:
6204
6205 .. code-block:: llvm
6206
6207     %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6208
6209     ...
6210     !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}
6211
6212 '``callback``' Metadata
6213 ^^^^^^^^^^^^^^^^^^^^^^^
6214
6215 ``callback`` metadata may be attached to a function declaration or definition.
6216 (Call sites are excluded only due to the lack of a use case.) For ease of
6217 exposition, we'll refer to the function annotated with metadata as a broker
6218 function. The metadata describes how the arguments of a call to the broker are
6219 in turn passed to the callback function specified by the metadata. Thus, the
6220 ``callback`` metadata provides a partial description of a call site inside the
6221 broker function with regard to the arguments of a call to the broker. The only
6222 semantic restriction on the broker function itself is that it is not allowed to
6223 inspect or modify arguments referenced in the ``callback`` metadata as
6224 pass-through to the callback function.
6225
6226 The broker is not required to actually invoke the callback function at runtime.
6227 However, the assumptions about not inspecting or modifying arguments that would
6228 be passed to the specified callback function still hold, even if the callback
6229 function is not dynamically invoked. The broker is allowed to invoke the
6230 callback function more than once per invocation of the broker. The broker is
6231 also allowed to invoke (directly or indirectly) the function passed as a
6232 callback through another use. Finally, the broker is also allowed to relay the
6233 callback callee invocation to a different thread.
6234
6235 The metadata is structured as follows: At the outer level, ``callback``
6236 metadata is a list of ``callback`` encodings. Each encoding starts with a
6237 constant ``i64`` which describes the argument position of the callback function
6238 in the call to the broker. The following elements, except the last, describe
6239 what arguments are passed to the callback function. Each element is again an
6240 ``i64`` constant identifying the argument of the broker that is passed through,
6241 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6242 they are listed has to be the same in which they are passed to the callback
6243 callee. The last element of the encoding is a boolean which specifies how
6244 variadic arguments of the broker are handled. If it is true, all variadic
6245 arguments of the broker are passed through to the callback function *after* the
6246 arguments encoded explicitly before.
6247
6248 In the code below, the ``pthread_create`` function is marked as a broker
6249 through the ``!callback !1`` metadata. In the example, there is only one
6250 callback encoding, namely ``!2``, associated with the broker. This encoding
6251 identifies the callback function as the broker argument at position 2 (``i64
6252 2``, counting from zero) and the sole argument of the callback function as the
6253 broker argument at position 3 (``i64 3``).
6254
6255 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6256    error if the below is set to highlight as 'llvm', despite that we
6257    have misc.highlighting_failure set?
6258
6259 .. code-block:: text
6260
6261     declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)
6262
6263     ...
6264     !2 = !{i64 2, i64 3, i1 false}
6265     !1 = !{!2}
6266
6267 Another example is shown below. The callback callee is the argument at
6268 position 2 of the ``__kmpc_fork_call`` function (``i64 2``, counting from zero).
6269 The callee is given two unknown values (each identified by a ``i64 -1``) and
6270 afterwards all variadic arguments that are passed to the ``__kmpc_fork_call``
6271 call (due to the final ``i1 true``).
6272
6273 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6274    error if the below is set to highlight as 'llvm', despite that we
6275    have misc.highlighting_failure set?
6276
6277 .. code-block:: text
6278
6279     declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)
6280
6281     ...
6282     !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6283     !0 = !{!1}
6284
6285
6286 '``unpredictable``' Metadata
6287 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6288
6289 ``unpredictable`` metadata may be attached to any branch or switch
6290 instruction. It can be used to express the unpredictability of control
6291 flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6292 optimizations related to compare and branch instructions. The metadata
6293 is treated as a boolean value; if it exists, it signals that the branch
6294 or switch that it is attached to is completely unpredictable.
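
Because only the presence of the metadata matters, an empty node is sufficient.
A minimal sketch, with ``%cond`` and the block labels chosen for illustration:

.. code-block:: llvm

      br i1 %cond, label %then, label %else, !unpredictable !0
    ...
    !0 = !{}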
6295
6296 .. _md_dereferenceable:
6297
6298 '``dereferenceable``' Metadata
6299 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6300
6301 The existence of the ``!dereferenceable`` metadata on the instruction
6302 tells the optimizer that the value loaded is known to be dereferenceable.
6303 The number of bytes known to be dereferenceable is specified by the integer
6304 value in the metadata node. This is analogous to the ``dereferenceable``
6305 attribute on parameters and return values.
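
For example, the following sketch (the names ``%p`` and ``%pp`` are
illustrative) states that the loaded pointer ``%p`` can be dereferenced for at
least 8 bytes:

.. code-block:: llvm

      %p = load i32*, i32** %pp, align 8, !dereferenceable !0
    ...
    !0 = !{i64 8}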
6306
6307 .. _md_dereferenceable_or_null:
6308
6309 '``dereferenceable_or_null``' Metadata
6310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6311
6312 The existence of the ``!dereferenceable_or_null`` metadata on the
6313 instruction tells the optimizer that the value loaded is known to be either
6314 dereferenceable or null.
6315 The number of bytes known to be dereferenceable is specified by the integer
6316 value in the metadata node. This is analogous to the ``dereferenceable_or_null``
6317 attribute on parameters and return values.
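
For example, the following sketch (the names ``%q`` and ``%qq`` are
illustrative) states that the loaded pointer ``%q`` is either null or can be
dereferenced for at least 16 bytes:

.. code-block:: llvm

      %q = load i8*, i8** %qq, align 8, !dereferenceable_or_null !0
    ...
    !0 = !{i64 16}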
6318
6319 .. _llvm.loop:
6320
6321 '``llvm.loop``'
6322 ^^^^^^^^^^^^^^^
6323
6324 It is sometimes useful to attach information to loop constructs. Currently,
6325 loop metadata is implemented as metadata attached to the branch instruction
6326 in the loop latch block. The loop metadata node is a list of
6327 other metadata nodes, each representing a property of the loop. Usually,
6328 the first item of the property node is a string. For example, the
6329 ``llvm.loop.unroll.count`` suggests an unroll factor to the loop
6330 unroller:
6331
6332 .. code-block:: llvm
6333
6334       br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6335     ...
6336     !0 = !{!0, !1, !2}
6337     !1 = !{!"llvm.loop.unroll.enable"}
6338     !2 = !{!"llvm.loop.unroll.count", i32 4}
6339
6340 For legacy reasons, the first item of a loop metadata node must be a
6341 reference to itself. Before the advent of the 'distinct' keyword, this
6342 forced the preservation of otherwise identical metadata nodes. Since
6343 the loop-metadata node can be attached to multiple nodes, the 'distinct'
6344 keyword has become unnecessary.
6345
6346 Prior to the property nodes, one or two ``DILocation`` (debug location)
6347 nodes can be present in the list. The first, if present, identifies the
6348 source-code location where the loop begins. The second, if present,
6349 identifies the source-code location where the loop ends.
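
A sketch of such a node is shown below. The line and column numbers, and the
scope ``!4`` (assumed to be a :ref:`DISubprogram` defined elsewhere), are
placeholders for this example:

.. code-block:: llvm

    !0 = !{!0, !1, !2, !3}
    !1 = !DILocation(line: 10, column: 3, scope: !4) ; where the loop begins
    !2 = !DILocation(line: 14, column: 3, scope: !4) ; where the loop ends
    !3 = !{!"llvm.loop.unroll.count", i32 4}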
6350
6351 Loop metadata nodes cannot be used as unique identifiers. They are
6352 neither persistent for the same loop through transformations nor
6353 necessarily unique to just one loop.
6354
6355 '``llvm.loop.disable_nonforced``'
6356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6357
6358 This metadata disables all optional loop transformations unless
6359 explicitly instructed using other transformation metadata such as
6360 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
6361 whether a transformation is profitable. The purpose is to prevent the loop
6362 from being transformed into a different loop before an explicitly requested
6363 (forced) transformation is applied. For instance, loop fusion can make
6364 other transformations impossible. Mandatory loop canonicalizations such
6365 as loop rotation are still applied.
6366
6367 It is recommended to use this metadata in addition to any ``llvm.loop.*``
6368 transformation directive. Also, any loop should have at most one
6369 directive applied to it (and a sequence of transformations built using
6370 followup-attributes). Otherwise, which transformation will be applied
6371 depends on implementation details such as the pass pipeline order.
6372
6373 See :ref:`transformation-metadata` for details.
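
As a minimal sketch, a loop that should be unrolled by a factor of four and
otherwise left untouched by heuristic-driven transformations could carry both
properties in its ``llvm.loop`` node. The unroll factor is an arbitrary value
chosen for illustration:

.. code-block:: llvm

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}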
6374
6375 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6376 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6377
6378 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6379 used to control per-loop vectorization and interleaving parameters such as
6380 vectorization width and interleave count. These metadata should be used in
6381 conjunction with ``llvm.loop`` loop identification metadata. The
6382 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6383 optimization hints and the optimizer will only interleave and vectorize loops if
6384 it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
6385 which contains information about loop-carried memory dependencies can be helpful
6386 in determining the safety of these transformations.
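
For instance, a loop latch could reference a node that combines several of
these hints. The branch operands and the chosen values are illustrative only:

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}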
6387
6388 '``llvm.loop.interleave.count``' Metadata
6389 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6390
6391 This metadata suggests an interleave count to the loop interleaver.
6392 The first operand is the string ``llvm.loop.interleave.count`` and the
6393 second operand is an integer specifying the interleave count. For
6394 example:
6395
6396 .. code-block:: llvm
6397
6398    !0 = !{!"llvm.loop.interleave.count", i32 4}
6399
6400 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6401 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6402 then the interleave count will be determined automatically.
6403
6404 '``llvm.loop.vectorize.enable``' Metadata
6405 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6406
6407 This metadata selectively enables or disables vectorization for the loop. The
6408 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
6409 is a bit. If the bit operand value is 1 vectorization is enabled. A value of
6410 0 disables vectorization:
6411
6412 .. code-block:: llvm
6413
6414    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6415    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6416
6417 '``llvm.loop.vectorize.predicate.enable``' Metadata
6418 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6419
6420 This metadata selectively enables or disables creating predicated instructions
6421 for the loop, which can enable folding of the scalar epilogue loop into the
6422 main loop. The first operand is the string
6423 ``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
6424 the bit operand value is 1 predication is enabled. A value of 0 disables
6425 predication:
6426
6427 .. code-block:: llvm
6428
6429    !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6430    !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6431
6432 '``llvm.loop.vectorize.scalable.enable``' Metadata
6433 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6434
6435 This metadata selectively enables or disables scalable vectorization for the
6436 loop, and only has any effect if vectorization for the loop is already enabled.
6437 The first operand is the string ``llvm.loop.vectorize.scalable.enable``
6438 and the second operand is a bit. If the bit operand value is 1 scalable
6439 vectorization is enabled, whereas a value of 0 reverts to the default fixed
6440 width vectorization:
6441
6442 .. code-block:: llvm
6443
6444    !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6445    !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6446
6447 '``llvm.loop.vectorize.width``' Metadata
6448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6449
6450 This metadata sets the target width of the vectorizer. The first
6451 operand is the string ``llvm.loop.vectorize.width`` and the second
6452 operand is an integer specifying the width. For example:
6453
6454 .. code-block:: llvm
6455
6456    !0 = !{!"llvm.loop.vectorize.width", i32 4}
6457
6458 Note that setting ``llvm.loop.vectorize.width`` to 1 disables
6459 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
6460 0 or if the loop does not have this metadata the width will be
6461 determined automatically.
6462
6463 '``llvm.loop.vectorize.followup_vectorized``' Metadata
6464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6465
6466 This metadata defines which loop attributes the vectorized loop will
6467 have. See :ref:`transformation-metadata` for details.
6468
6469 '``llvm.loop.vectorize.followup_epilogue``' Metadata
6470 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6471
6472 This metadata defines which loop attributes the epilogue will have. The
6473 epilogue is not vectorized and is executed when either the vectorized
6474 loop is not known to preserve semantics (because e.g., it processes two
6475 arrays that are found to alias by a runtime check) or for the last
6476 iterations that do not fill a complete set of vector lanes. See
6477 :ref:`Transformation Metadata <transformation-metadata>` for details.
6478
6479 '``llvm.loop.vectorize.followup_all``' Metadata
6480 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6481
6482 Attributes in the metadata will be added to both the vectorized and
6483 epilogue loop.
6484 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6485
6486 '``llvm.loop.unroll``'
6487 ^^^^^^^^^^^^^^^^^^^^^^
6488
6489 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6490 optimization hints such as the unroll factor. ``llvm.loop.unroll``
6491 metadata should be used in conjunction with ``llvm.loop`` loop
6492 identification metadata. The ``llvm.loop.unroll`` metadata are only
6493 optimization hints and the unrolling will only be performed if the
6494 optimizer believes it is safe to do so.
6495
6496 '``llvm.loop.unroll.count``' Metadata
6497 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6498
6499 This metadata suggests an unroll factor to the loop unroller. The
6500 first operand is the string ``llvm.loop.unroll.count`` and the second
6501 operand is a positive integer specifying the unroll factor. For
6502 example:
6503
6504 .. code-block:: llvm
6505
6506    !0 = !{!"llvm.loop.unroll.count", i32 4}
6507
6508 If the trip count of the loop is less than the unroll count the loop
6509 will be partially unrolled.
6510
6511 '``llvm.loop.unroll.disable``' Metadata
6512 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6513
6514 This metadata disables loop unrolling. The metadata has a single operand
6515 which is the string ``llvm.loop.unroll.disable``. For example:
6516
6517 .. code-block:: llvm
6518
6519    !0 = !{!"llvm.loop.unroll.disable"}
6520
6521 '``llvm.loop.unroll.runtime.disable``' Metadata
6522 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6523
6524 This metadata disables runtime loop unrolling. The metadata has a single
6525 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6526
6527 .. code-block:: llvm
6528
6529    !0 = !{!"llvm.loop.unroll.runtime.disable"}
6530
6531 '``llvm.loop.unroll.enable``' Metadata
6532 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6533
6534 This metadata suggests that the loop should be fully unrolled if the trip count
6535 is known at compile time and partially unrolled if the trip count is not known
6536 at compile time. The metadata has a single operand which is the string
6537 ``llvm.loop.unroll.enable``.  For example:
6538
6539 .. code-block:: llvm
6540
6541    !0 = !{!"llvm.loop.unroll.enable"}
6542
6543 '``llvm.loop.unroll.full``' Metadata
6544 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6545
6546 This metadata suggests that the loop should be unrolled fully. The
6547 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
6548 For example:
6549
6550 .. code-block:: llvm
6551
6552    !0 = !{!"llvm.loop.unroll.full"}
6553
6554 '``llvm.loop.unroll.followup``' Metadata
6555 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6556
6557 This metadata defines which loop attributes the unrolled loop will have.
6558 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6559
6560 '``llvm.loop.unroll.followup_remainder``' Metadata
6561 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6562
6563 This metadata defines which loop attributes the remainder loop after
6564 partial/runtime unrolling will have. See
6565 :ref:`Transformation Metadata <transformation-metadata>` for details.
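
A hedged sketch of a followup attribute, assuming the operand layout
described in Transformation Metadata (the attribute name followed by the
attribute nodes to apply). Here the remainder loop is asked not to be
unrolled again; the choice of attributes is purely illustrative:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.unroll.count", i32 4}
   ; Attributes to place on the remainder loop after unrolling.
   !2 = !{!"llvm.loop.unroll.followup_remainder", !{!"llvm.loop.unroll.disable"}}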
6566
6567 '``llvm.loop.unroll_and_jam``'
6568 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6569
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
6576
6577 The metadata for unroll and jam otherwise is the same as for ``unroll``.
6578 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
6579 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
6580 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
6581 and the normal safety checks will still be performed.
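
The interaction between the two metadata families can be sketched as
follows. Both loop identifiers are hypothetical and belong to two
different loops:

.. code-block:: llvm

   ; First loop: requests unroll and jam by a factor of 2.
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.unroll_and_jam.count", i32 2}

   ; Second loop: carries only unroll metadata, so unroll and jam is
   ; disabled for it and the hint is left to the loop unroller.
   !2 = distinct !{!2, !3}
   !3 = !{!"llvm.loop.unroll.count", i32 4}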
6582
6583 '``llvm.loop.unroll_and_jam.count``' Metadata
6584 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6585
6586 This metadata suggests an unroll and jam factor to use, similarly to
6587 ``llvm.loop.unroll.count``. The first operand is the string
6588 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6589 specifying the unroll factor. For example:
6590
6591 .. code-block:: llvm
6592
6593    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
6594
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
6597
6598 '``llvm.loop.unroll_and_jam.disable``' Metadata
6599 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6600
This metadata disables loop unroll and jam. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6603
6604 .. code-block:: llvm
6605
6606    !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6607
6608 '``llvm.loop.unroll_and_jam.enable``' Metadata
6609 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6610
This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:
6615
6616 .. code-block:: llvm
6617
6618    !0 = !{!"llvm.loop.unroll_and_jam.enable"}
6619
6620 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
6621 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6622
6623 This metadata defines which loop attributes the outer unrolled loop will
6624 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6625 details.
6626
6627 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
6628 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6629
6630 This metadata defines which loop attributes the inner jammed loop will
6631 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6632 details.
6633
6634 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
6635 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6636
6637 This metadata defines which attributes the epilogue of the outer loop
6638 will have. This loop is usually unrolled, meaning there is no such
6639 loop. This attribute will be ignored in this case. See
6640 :ref:`Transformation Metadata <transformation-metadata>` for details.
6641
6642 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
6643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6644
6645 This metadata defines which attributes the inner loop of the epilogue
6646 will have. The outer epilogue will usually be unrolled, meaning there
6647 can be multiple inner remainder loops. See
6648 :ref:`Transformation Metadata <transformation-metadata>` for details.
6649
6650 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
6651 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6652
Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
6656
6657 '``llvm.loop.licm_versioning.disable``' Metadata
6658 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6659
6660 This metadata indicates that the loop should not be versioned for the purpose
6661 of enabling loop-invariant code motion (LICM). The metadata has a single operand
6662 which is the string ``llvm.loop.licm_versioning.disable``. For example:
6663
6664 .. code-block:: llvm
6665
6666    !0 = !{!"llvm.loop.licm_versioning.disable"}
6667
6668 '``llvm.loop.distribute.enable``' Metadata
6669 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6670
6671 Loop distribution allows splitting a loop into multiple loops.  Currently,
6672 this is only performed if the entire loop cannot be vectorized due to unsafe
6673 memory dependencies.  The transformation will attempt to isolate the unsafe
6674 dependencies into their own loop.
6675
6676 This metadata can be used to selectively enable or disable distribution of the
6677 loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
6678 second operand is a bit. If the bit operand value is 1 distribution is
6679 enabled. A value of 0 disables distribution:
6680
6681 .. code-block:: llvm
6682
6683    !0 = !{!"llvm.loop.distribute.enable", i1 0}
6684    !1 = !{!"llvm.loop.distribute.enable", i1 1}
6685
6686 This metadata should be used in conjunction with ``llvm.loop`` loop
6687 identification metadata.
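
A minimal sketch of that conjunction, with hypothetical block names and a
hypothetical ``%exitcond``, enabling distribution for one loop:

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}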
6688
6689 '``llvm.loop.distribute.followup_coincident``' Metadata
6690 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6691
6692 This metadata defines which attributes extracted loops with no cyclic
6693 dependencies will have (i.e. can be vectorized). See
6694 :ref:`Transformation Metadata <transformation-metadata>` for details.
6695
6696 '``llvm.loop.distribute.followup_sequential``' Metadata
6697 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6698
6699 This metadata defines which attributes the isolated loops with unsafe
6700 memory dependencies will have. See
6701 :ref:`Transformation Metadata <transformation-metadata>` for details.
6702
6703 '``llvm.loop.distribute.followup_fallback``' Metadata
6704 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6705
If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
6709
6710 '``llvm.loop.distribute.followup_all``' Metadata
6711 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6712
The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
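
As an illustrative sketch, assuming the followup operand layout described
in Transformation Metadata, the following asks for distribution and then
requests vectorization of the extracted loops that have no cyclic
dependencies. The combination of attributes is an example, not a
recommendation:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}
   ; Attributes for the extracted loops without cyclic dependencies.
   !2 = !{!"llvm.loop.distribute.followup_coincident", !{!"llvm.loop.vectorize.enable", i1 1}}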
6716
6717 '``llvm.licm.disable``' Metadata
6718 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6719
6720 This metadata indicates that loop-invariant code motion (LICM) should not be
6721 performed on this loop. The metadata has a single operand which is the string
6722 ``llvm.licm.disable``. For example:
6723
6724 .. code-block:: llvm
6725
6726    !0 = !{!"llvm.licm.disable"}
6727
Note that although it operates per loop, it is not given the ``llvm.loop``
prefix because it is not affected by the ``llvm.loop.disable_nonforced``
metadata.
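
Despite the different prefix, a sketch of its use looks like the other
loop hints: the node is listed in the loop's ``llvm.loop`` identifier. The
block names and ``%exitcond`` below are hypothetical:

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.licm.disable"}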
6730
6731 '``llvm.access.group``' Metadata
6732 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6733
``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
6740
6741 .. code-block:: llvm
6742
6743    %val = load i32, i32* %arrayidx, !llvm.access.group !0
6744    ...
6745    !0 = !{!1, !2}
6746    !1 = distinct !{}
6747    !2 = distinct !{}
6748
6749 It is illegal for the list node to be empty since it might be confused
6750 with an access group.
6751
The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation where the content must be updated, which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.
6759
6760 The access group can be used to refer to a memory access instruction
6761 without pointing to it directly (which is not possible in global
6762 metadata). Currently, the only metadata making use of it is
6763 ``llvm.loop.parallel_accesses``.
6764
6765 '``llvm.loop.parallel_accesses``' Metadata
6766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6767
The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
in the loop with this metadata.
6772
Let ``m1`` and ``m2`` be two instructions that both have
``llvm.access.group`` metadata, referring to the access groups ``g1`` and
``g2`` respectively (which might be identical). If a loop contains both
access groups in its ``llvm.loop.parallel_accesses`` metadata, then the
compiler can assume that there is no dependency between ``m1`` and ``m2``
carried by this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
6781
6782 If all memory-accessing instructions in a loop have
6783 ``llvm.access.group`` metadata that each refer to one of the access
6784 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
6785 loop has no loop carried memory dependences and is considered to be a
6786 parallel loop.
6787
Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional memory dependence analysis
is required to make that determination. As a fail-safe mechanism, this causes
loops that were originally parallel to be considered sequential (if
optimization passes that are unaware of the parallel semantics insert new
memory instructions into the loop body).
6795
The following is an example of a loop that is considered parallel due to its
correct use of both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types:
6799
6800 .. code-block:: llvm
6801
6802    for.body:
6803      ...
6804      %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
6805      ...
6806      store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
6807      ...
6808      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
6809
6810    for.end:
6811    ...
6812    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
6813    !1 = distinct !{}
6814
6815 It is also possible to have nested parallel loops:
6816
6817 .. code-block:: llvm
6818
6819    outer.for.body:
6820      ...
6821      %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
6822      ...
6823      br label %inner.for.body
6824
6825    inner.for.body:
6826      ...
6827      %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
6828      ...
6829      store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
6830      ...
6831      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
6832
6833    inner.for.end:
6834      ...
6835      store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
6836      ...
6837      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
6838
6839    outer.for.end:                                          ; preds = %for.body
6840    ...
6841    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
6842    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
6843    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
6844    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
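
An instruction that belongs to several access groups only needs one of
them to match. In the following sketch (block and value names are
hypothetical), the store refers to a list of two access groups, and the
loop lists only one of them. Every memory access still matches the
``llvm.loop.parallel_accesses`` list, so the loop is considered parallel:

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.access.group !3
     ...
     store i32 %val0, i32* %arrayidx1, !llvm.access.group !2
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !3}}
   !2 = !{!3, !4}    ; list of access groups the store belongs to
   !3 = distinct !{} ; access group listed by the loop
   !4 = distinct !{} ; access group not listed by this loop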
6845
6846 '``llvm.loop.mustprogress``' Metadata
6847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6848
6849 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
6850 terminate, unwind, or interact with the environment in an observable way e.g.
6851 via a volatile memory access, I/O, or other synchronization. If such a loop is
6852 not found to interact with the environment in an observable way, the loop may
6853 be removed. This corresponds to the ``mustprogress`` function attribute.
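
A minimal sketch, using a hypothetical loop with no exit, shows how the
attribute is attached through the loop's ``llvm.loop`` identifier:

.. code-block:: llvm

   while.body:
     ...
     br label %while.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}

If the elided body has no observable interaction with the environment,
the optimizer may remove this loop.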
6854
6855 '``irr_loop``' Metadata
6856 ^^^^^^^^^^^^^^^^^^^^^^^
6857
``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
6861 terminator instruction of a basic block that is not really an irreducible loop
6862 header, the behavior is undefined. The intent of this metadata is to improve the
6863 accuracy of the block frequency propagation. For example, in the code below, the
6864 block ``header0`` may have a loop header weight (relative to the other headers of
6865 the irreducible loop) of 100:
6866
6867 .. code-block:: llvm
6868
6869     header0:
6870     ...
6871     br i1 %cmp, label %t1, label %t2, !irr_loop !0
6872
6873     ...
    !0 = !{!"loop_header_weight", i64 100}
6875
6876 Irreducible loop header weights are typically based on profile data.
6877
6878 .. _md_invariant.group:
6879
6880 '``invariant.group``' Metadata
6881 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6882
6883 The experimental ``invariant.group`` metadata may be attached to
6884 ``load``/``store`` instructions referencing a single metadata with no entries.
6885 The existence of the ``invariant.group`` metadata on the instruction tells
6886 the optimizer that every ``load`` and ``store`` to the same pointer operand
6887 can be assumed to load or store the same
6888 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
6889 when two pointers are considered the same). Pointers returned by bitcast or
6890 getelementptr with only zero indices are considered the same.
6891
6892 Examples:
6893
6894 .. code-block:: llvm
6895
6896    @unknownPtr = external global i8
6897    ...
6898    %ptr = alloca i8
6899    store i8 42, i8* %ptr, !invariant.group !0
6900    call void @foo(i8* %ptr)
6901
6902    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
6903    call void @foo(i8* %ptr)
6904
6905    %newPtr = call i8* @getPointer(i8* %ptr)
6906    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
6907
6908    %unknownValue = load i8, i8* @unknownPtr
6909    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
6910
6911    call void @foo(i8* %ptr)
6912    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
6913    %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
6914
6915    ...
6916    declare void @foo(i8*)
6917    declare i8* @getPointer(i8*)
6918    declare i8* @llvm.launder.invariant.group(i8*)
6919
6920    !0 = !{}
6921
6922 The invariant.group metadata must be dropped when replacing one pointer by
6923 another based on aliasing information. This is because invariant.group is tied
6924 to the SSA value of the pointer operand.
6925
6926 .. code-block:: llvm
6927
6928   %v = load i8, i8* %x, !invariant.group !0
6929   ; if %x mustalias %y then we can replace the above instruction with
6930   %v = load i8, i8* %y
6931
6932 Note that this is an experimental feature, which means that its semantics might
6933 change in the future.
6934
6935 '``type``' Metadata
6936 ^^^^^^^^^^^^^^^^^^^
6937
6938 See :doc:`TypeMetadata`.
6939
6940 '``associated``' Metadata
6941 ^^^^^^^^^^^^^^^^^^^^^^^^^
6942
6943 The ``associated`` metadata may be attached to a global variable definition with
6944 a single argument that references a global object (optionally through an alias).
6945
6946 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
6947 discarding of the global variable in linker GC unless the referenced object is
6948 also discarded. The linker support for this feature is spotty. For best
6949 compatibility, globals carrying this metadata should:
6950
6951 - Be in ``@llvm.compiler.used``.
6952 - If the referenced global variable is in a comdat, be in the same comdat.
6953
``!associated`` cannot express a many-to-one relationship. A global variable with
6955 the metadata should generally not be referenced by a function: the function may
6956 be inlined into other functions, leading to more references to the metadata.
6957 Ideally we would want to keep metadata alive as long as any inline location is
6958 alive, but this many-to-one relationship is not representable. Moreover, if the
6959 metadata is retained while the function is discarded, the linker will report an
6960 error of a relocation referencing a discarded section.
6961
6962 The metadata is often used with an explicit section consisting of valid C
6963 identifiers so that the runtime can find the metadata section with
6964 linker-defined encapsulation symbols ``__start_<section_name>`` and
6965 ``__stop_<section_name>``.
6966
6967 It does not have any effect on non-ELF targets.
6968
6969 Example:
6970
6971 .. code-block:: text
6972
6973     $a = comdat any
6974     @a = global i32 1, comdat $a
6975     @b = internal global i32 2, comdat $a, section "abc", !associated !0
6976     !0 = !{i32* @a}
6977
6978
6979 '``prof``' Metadata
6980 ^^^^^^^^^^^^^^^^^^^
6981
6982 The ``prof`` metadata is used to record profile data in the IR.
6983 The first operand of the metadata node indicates the profile metadata
6984 type. There are currently 3 types:
6985 :ref:`branch_weights<prof_node_branch_weights>`,
6986 :ref:`function_entry_count<prof_node_function_entry_count>`, and
6987 :ref:`VP<prof_node_VP>`.
6988
6989 .. _prof_node_branch_weights:
6990
6991 branch_weights
6992 """"""""""""""
6993
Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
6996 For more information, see :doc:`BranchWeightMetadata`.
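
As a brief sketch with hypothetical weights, a conditional branch whose
true successor is taken far more often than its false successor could be
annotated like this (one weight per successor, in successor order):

.. code-block:: llvm

   br i1 %cmp, label %if.then, label %if.else, !prof !0
   ...
   !0 = !{!"branch_weights", i32 64, i32 4}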
6997
6998 .. _prof_node_function_entry_count:
6999
7000 function_entry_count
7001 """"""""""""""""""""
7002
7003 Function entry count metadata can be attached to function definitions
7004 to record the number of times the function is called. Used with BFI
7005 information, it is also used to derive the basic block profile count.
7006 For more information, see :doc:`BranchWeightMetadata`.
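
For illustration, a function definition recorded as having been entered
100 times (a made-up count) might look like this:

.. code-block:: llvm

   define void @f() !prof !0 {
     ret void
   }

   !0 = !{!"function_entry_count", i64 100}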
7007
7008 .. _prof_node_VP:
7009
7010 VP
7011 ""
7012
7013 VP (value profile) metadata can be attached to instructions that have
7014 value profile information. Currently this is indirect calls (where it
7015 records the hottest callees) and calls to memory intrinsics such as memcpy,
7016 memmove, and memset (where it records the hottest byte lengths).
7017
Each VP metadata node contains the "VP" string, then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
7021 The value profiling kind is 0 for indirect call targets and 1 for memory
7022 operations. For indirect call targets, each profile value is a hash
7023 of the callee function name, and for memory operations each value is the
7024 byte length.
7025
7026 Note that the value counts do not need to add up to the total count
7027 listed in the third operand (in practice only the top hottest values
7028 are tracked and reported).
7029
7030 Indirect call example:
7031
7032 .. code-block:: llvm
7033
7034     call void %f(), !prof !1
7035     !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7036
Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.
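
A memory operation profile follows the same layout with a profiling kind
of 1. The following sketch uses invented counts, and ``%dst``, ``%src``
and ``%n`` are hypothetical values:

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i1 false), !prof !1
    ...
    !1 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the call executed 1000 times in total; a byte length of 8 was
observed 600 times and a byte length of 16 was observed 300 times. The
remaining executions used lengths that were not among the tracked values.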
7044
7045 .. _md_annotation:
7046
7047 '``annotation``' Metadata
7048 ^^^^^^^^^^^^^^^^^^^^^^^^^
7049
7050 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7051 to any instruction. This metadata does not impact the semantics of the program
7052 and may only be used to provide additional insight about the program and
7053 transformations to users.
7054
7055 Example:
7056
7057 .. code-block:: text
7058
7059     %a.addr = alloca float*, align 8, !annotation !0
7060     !0 = !{!"auto-init"}
7061
7062 Module Flags Metadata
7063 =====================
7064
7065 Information about the module as a whole is difficult to convey to LLVM's
7066 subsystems. The LLVM IR isn't sufficient to transmit this information.
7067 The ``llvm.module.flags`` named metadata exists in order to facilitate
7068 this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
7070 look it up.
7071
7072 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7073 Each triplet has the following form:
7074
7075 -  The first element is a *behavior* flag, which specifies the behavior
7076    when two (or more) modules are merged together, and it encounters two
7077    (or more) metadata with the same ID. The supported behaviors are
7078    described below.
7079 -  The second element is a metadata string that is a unique ID for the
7080    metadata. Each module may only have one flag entry for each unique ID (not
7081    including entries with the **Require** behavior).
7082 -  The third element is the value of the flag.
7083
7084 When two (or more) modules are merged together, the resulting
7085 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
7086 each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
7088 be determined by the merge behavior flag, as described below. The only exception
7089 is that entries with the *Require* behavior are always preserved.
7090
7091 The following behaviors are supported:
7092
7093 .. list-table::
7094    :header-rows: 1
7095    :widths: 10 90
7096
7097    * - Value
7098      - Behavior
7099
7100    * - 1
7101      - **Error**
7102            Emits an error if two values disagree, otherwise the resulting value
7103            is that of the operands.
7104
7105    * - 2
7106      - **Warning**
7107            Emits a warning if two values disagree. The result value will be the
7108            operand for the flag from the first module being linked, or the max
7109            if the other module uses **Max** (in which case the resulting flag
7110            will be **Max**).
7111
7112    * - 3
7113      - **Require**
7114            Adds a requirement that another module flag be present and have a
7115            specified value after linking is performed. The value must be a
7116            metadata pair, where the first element of the pair is the ID of the
7117            module flag to be restricted, and the second element of the pair is
7118            the value the module flag should be restricted to. This behavior can
7119            be used to restrict the allowable results (via triggering of an
7120            error) of linking IDs with the **Override** behavior.
7121
7122    * - 4
7123      - **Override**
7124            Uses the specified value, regardless of the behavior or value of the
7125            other module. If both modules specify **Override**, but the values
7126            differ, an error will be emitted.
7127
7128    * - 5
7129      - **Append**
7130            Appends the two values, which are required to be metadata nodes.
7131
7132    * - 6
7133      - **AppendUnique**
7134            Appends the two values, which are required to be metadata
7135            nodes. However, duplicate entries in the second list are dropped
7136            during the append operation.
7137
7138    * - 7
7139      - **Max**
7140            Takes the max of the two values, which are required to be integers.
7141
7142 It is an error for a particular unique flag ID to have multiple behaviors,
7143 except in the case of **Require** (which adds restrictions on another metadata
7144 value) or **Override**.
7145
7146 An example of module flags:
7147
7148 .. code-block:: llvm
7149
7150     !0 = !{ i32 1, !"foo", i32 1 }
7151     !1 = !{ i32 4, !"bar", i32 37 }
7152     !2 = !{ i32 2, !"qux", i32 42 }
7153     !3 = !{ i32 3, !"qux",
7154       !{
7155         !"foo", i32 1
7156       }
7157     }
7158     !llvm.module.flags = !{ !0, !1, !2, !3 }
7159
7160 -  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7161    if two or more ``!"foo"`` flags are seen is to emit an error if their
7162    values are not equal.
7163
7164 -  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7165    behavior if two or more ``!"bar"`` flags are seen is to use the value
7166    '37'.
7167
7168 -  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7169    behavior if two or more ``!"qux"`` flags are seen is to emit a
7170    warning if their values are not equal.
7171
7172 -  Metadata ``!3`` has the ID ``!"qux"`` and the value:
7173
7174    ::
7175
7176        !{ !"foo", i32 1 }
7177
7178    The behavior is to emit an error if the ``llvm.module.flags`` does not
7179    contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7180    performed.
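
The merge behaviors can be sketched with two hypothetical modules. The
flag IDs ``!"baz"`` and ``!"opts"`` are invented for illustration, and the
last group shows the flags one would expect after linking according to
the **Max** and **AppendUnique** rules in the table above:

.. code-block:: llvm

    ; Module A
    !0 = !{ i32 7, !"baz", i32 2 }
    !1 = !{ i32 6, !"opts", !{ !"a", !"b" } }
    !llvm.module.flags = !{ !0, !1 }

    ; Module B
    !0 = !{ i32 7, !"baz", i32 5 }
    !1 = !{ i32 6, !"opts", !{ !"b", !"c" } }
    !llvm.module.flags = !{ !0, !1 }

    ; Expected flags after linking A and B:
    ; "baz" takes the max of 2 and 5; "opts" appends B's list to A's,
    ; dropping the duplicate "b".
    !0 = !{ i32 7, !"baz", i32 5 }
    !1 = !{ i32 6, !"opts", !{ !"a", !"b", !"c" } }
    !llvm.module.flags = !{ !0, !1 }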
7181
7182 Synthesized Functions Module Flags Metadata
7183 -------------------------------------------
7184
7185 These metadata specify the default attributes synthesized functions should have.
7186 These metadata are currently respected by a few instrumentation passes, such as
7187 sanitizers.
7188
7189 These metadata correspond to a few function attributes with significant code
7190 generation behaviors. Function attributes with just optimization purposes
7191 should not be listed because the performance impact of these synthesized
7192 functions is small.
7193
7194 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7195   will get the "frame-pointer" function attribute, with value being "none",
7196   "non-leaf", or "all", respectively.
7197 - "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
7198   function will get the ``uwtable`` function attribute.
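
For example, the following sketch requests that synthesized functions keep
a frame pointer in all functions and get the ``uwtable`` attribute. Both
flags use behavior 7 (**Max**), as listed above:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}
    !1 = !{i32 7, !"uwtable", i32 1}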
7199
7200 Objective-C Garbage Collection Module Flags Metadata
7201 ----------------------------------------------------
7202
7203 On the Mach-O platform, Objective-C stores metadata about garbage
7204 collection in a special section called "image info". The metadata
7205 consists of a version number and a bitmask specifying what types of
7206 garbage collection are supported (if any) by the file. If two or more
7207 modules are linked together their garbage collection metadata needs to
7208 be merged rather than appended together.
7209
7210 The Objective-C garbage collection module flags metadata consists of the
7211 following key-value pairs:
7212
7213 .. list-table::
7214    :header-rows: 1
7215    :widths: 30 70
7216
7217    * - Key
7218      - Value
7219
7220    * - ``Objective-C Version``
7221      - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7222
7223    * - ``Objective-C Image Info Version``
7224      - **[Required]** --- The version of the image info section. Currently
7225        always 0.
7226
7227    * - ``Objective-C Image Info Section``
7228      - **[Required]** --- The section to place the metadata. Valid values are
7229        ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7230        ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7231        Objective-C ABI version 2.
7232
7233    * - ``Objective-C Garbage Collection``
7234      - **[Required]** --- Specifies whether garbage collection is supported or
7235        not. Valid values are 0, for no garbage collection, and 2, for garbage
7236        collection supported.
7237
7238    * - ``Objective-C GC Only``
7239      - **[Optional]** --- Specifies that only garbage collection is supported.
7240        If present, its value must be 6. This flag requires that the
7241        ``Objective-C Garbage Collection`` flag have the value 2.
7242
7243 Some important flag interactions:
7244
7245 -  If a module with ``Objective-C Garbage Collection`` set to 0 is
7246    merged with a module with ``Objective-C Garbage Collection`` set to
7247    2, then the resulting module has the
7248    ``Objective-C Garbage Collection`` flag set to 0.
7249 -  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7250    merged with a module with ``Objective-C GC Only`` set to 6.
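
A sketch of these flags for a module using Objective-C ABI version 2
without garbage collection is shown below. The keys and values come from
the table above; the behavior values (1 for **Error**, 4 for **Override**)
are assumptions chosen to be consistent with the interactions just
described and may differ from what a given frontend emits:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    ; Override: a non-GC module wins over a module that supports GC.
    !3 = !{i32 4, !"Objective-C Garbage Collection", i32 0}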
7251
7252 C type width Module Flags Metadata
7253 ----------------------------------
7254
7255 The ARM backend emits a section into each generated object file describing the
7256 options that it was compiled with (in a compiler-independent way) to prevent
7257 linking incompatible objects, and to allow automatic library selection. Some
7258 of these options are not visible at the IR level, namely wchar_t width and enum
7259 width.
7260
7261 To pass this information to the backend, these options are encoded in module
7262 flags metadata, using the following key-value pairs:
7263
7264 .. list-table::
7265    :header-rows: 1
7266    :widths: 30 70
7267
7268    * - Key
7269      - Value
7270
7271    * - short_wchar
7272      - * 0 --- sizeof(wchar_t) == 4
7273        * 1 --- sizeof(wchar_t) == 2
7274
7275    * - short_enum
7276      - * 0 --- Enums are at least as large as an ``int``.
7277        * 1 --- Enums are stored in the smallest integer type which can
7278          represent all of its values.
7279
For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
large as an ``int``::
7283
7284     !llvm.module.flags = !{!0, !1}
7285     !0 = !{i32 1, !"short_wchar", i32 1}
7286     !1 = !{i32 1, !"short_enum", i32 0}
7287
7288 LTO Post-Link Module Flags Metadata
7289 -----------------------------------
7290
Some optimisations are only valid when the entire LTO unit is present in the
current module. This is represented by the ``LTOPostLink`` module flags
metadata, which will be created with a value of ``1`` when LTO linking occurs.
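
As a hedged illustration, the flag could appear in a post-link module as
shown below. The merge behavior ``i32 1`` (error on conflicting values) is an
assumption made for this sketch; the text above only fixes the key name and
the value ``1``.

.. code-block:: llvm

    ; Sketch: module flag marking that LTO linking has already happened.
    ; The behavior field (first operand) is assumed, not specified above.
    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"LTOPostLink", i32 1}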
7294
7295 Automatic Linker Flags Named Metadata
7296 =====================================
7297
7298 Some targets support embedding of flags to the linker inside individual object
7299 files. Typically this is used in conjunction with language extensions which
7300 allow source files to contain linker command line options, and have these
7301 automatically be transmitted to the linker via object files.
7302
7303 These flags are encoded in the IR using named metadata with the name
7304 ``!llvm.linker.options``. Each operand is expected to be a metadata node
7305 which should be a list of other metadata nodes, each of which should be a
7306 list of metadata strings defining linker options.
7307
7308 For example, the following metadata section specifies two separate sets of
7309 linker options, presumably to link against ``libz`` and the ``Cocoa``
7310 framework::
7311
7312     !0 = !{ !"-lz" }
7313     !1 = !{ !"-framework", !"Cocoa" }
7314     !llvm.linker.options = !{ !0, !1 }
7315
7316 The metadata encoding as lists of lists of options, as opposed to a collapsed
7317 list of options, is chosen so that the IR encoding can use multiple option
7318 strings to specify e.g., a single library, while still having that specifier be
7319 preserved as an atomic element that can be recognized by a target specific
7320 assembly writer or object file emitter.
7321
7322 Each individual option is required to be either a valid option for the target's
7323 linker, or an option that is reserved by the target specific assembly writer or
7324 object file emitter. No other aspect of these options is defined by the IR.
7325
7326 Dependent Libs Named Metadata
7327 =============================
7328
7329 Some targets support embedding of strings into object files to indicate
7330 a set of libraries to add to the link. Typically this is used in conjunction
7331 with language extensions which allow source files to explicitly declare the
7332 libraries they depend on, and have these automatically be transmitted to the
7333 linker via object files.
7334
7335 The list is encoded in the IR using named metadata with the name
7336 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7337 which should contain a single string operand.
7338
7339 For example, the following metadata section contains two library specifiers::
7340
7341     !0 = !{!"a library specifier"}
7342     !1 = !{!"another library specifier"}
7343     !llvm.dependent-libraries = !{ !0, !1 }
7344
Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
7347
7348 .. _summary:
7349
7350 ThinLTO Summary
7351 ===============
7352
Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is also emitted into the LLVM assembly, where it is
identified in syntax by a caret ('``^``').
7357
7358 The summary is parsed into a bitcode output, along with the Module
7359 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7360 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7361 summary entries (just as they currently ignore summary entries in a bitcode
7362 input file).
7363
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode:
specifically, in tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), and when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
This part is not yet implemented; for now, use llvm-as to create a bitcode
object before feeding it into thin link tools.
7371
7372 There are currently 3 types of summary entries in the LLVM assembly:
7373 :ref:`module paths<module_path_summary>`,
7374 :ref:`global values<gv_summary>`, and
7375 :ref:`type identifiers<typeid_summary>`.
7376
7377 .. _module_path_summary:
7378
7379 Module Path Summary Entry
7380 -------------------------
7381
7382 Each module path summary entry lists a module containing global values included
7383 in the summary. For a single IR module there will be one such entry, but
7384 in a combined summary index produced during the thin link, there will be
7385 one module path entry per linked module with summary.
7386
7387 Example:
7388
7389 .. code-block:: text
7390
7391     ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7392
7393 The ``path`` field is a string path to the bitcode file, and the ``hash``
7394 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7395 incremental builds and caching.
7396
7397 .. _gv_summary:
7398
7399 Global Value Summary Entry
7400 --------------------------
7401
7402 Each global value summary entry corresponds to a global value defined or
7403 referenced by a summarized module.
7404
7405 Example:
7406
7407 .. code-block:: text
7408
7409     ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7410
7411 For declarations, there will not be a summary list. For definitions, a
7412 global value will contain a list of summaries, one per module containing
7413 a definition. There can be multiple entries in a combined summary index
7414 for symbols with weak linkage.
7415
7416 Each ``Summary`` format will depend on whether the global value is a
7417 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7418 :ref:`alias<alias_summary>`.
7419
7420 .. _function_summary:
7421
7422 Function Summary
7423 ^^^^^^^^^^^^^^^^
7424
7425 If the global value is a function, the ``Summary`` entry will look like:
7426
.. code-block:: text

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7430
7431 The ``module`` field includes the summary entry id for the module containing
7432 this definition, and the ``flags`` field contains information such as
7433 the linkage type, a flag indicating whether it is legal to import the
7434 definition, whether it is globally live and whether the linker resolved it
7435 to a local definition (the latter two are populated during the thin link).
7436 The ``insts`` field contains the number of IR instructions in the function.
7437 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7438 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7439 :ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
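
Putting the pieces together, a small module and its summary might look like
the following sketch. The function names, the path ``example.o``, and the
all-zero hash are placeholders chosen for illustration; this is not output
captured from an actual compile.

.. code-block:: text

    ; Two trivial functions: @caller calls @callee.
    define void @callee() {
      ret void
    }

    define void @caller() {
      call void @callee()
      ret void
    }

    ; Hypothetical summary entries for the module above.
    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "callee", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
    ^2 = gv: (name: "caller", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^1)))))

Here ``insts`` counts the ``ret`` in ``@callee`` and the ``call`` plus ``ret``
in ``@caller``, and the ``calls`` list of ``^2`` refers back to the summary
entry ``^1``.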
7440
7441 .. _variable_summary:
7442
7443 Global Variable Summary
7444 ^^^^^^^^^^^^^^^^^^^^^^^
7445
7446 If the global value is a variable, the ``Summary`` entry will look like:
7447
.. code-block:: text

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7451
7452 The variable entry contains a subset of the fields in a
7453 :ref:`function summary <function_summary>`, see the descriptions there.
7454
7455 .. _alias_summary:
7456
7457 Alias Summary
7458 ^^^^^^^^^^^^^
7459
7460 If the global value is an alias, the ``Summary`` entry will look like:
7461
7462 .. code-block:: text
7463
7464     alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7465
7466 The ``module`` and ``flags`` fields are as described for a
7467 :ref:`function summary <function_summary>`. The ``aliasee`` field
7468 contains a reference to the global value summary entry of the aliasee.
7469
7470 .. _funcflags_summary:
7471
7472 Function Flags
7473 ^^^^^^^^^^^^^^
7474
7475 The optional ``FuncFlags`` field looks like:
7476
7477 .. code-block:: text
7478
7479     funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
7480
7481 If unspecified, flags are assumed to hold the conservative ``false`` value of
7482 ``0``.
7483
7484 .. _calls_summary:
7485
7486 Calls
7487 ^^^^^
7488
7489 The optional ``Calls`` field looks like:
7490
7491 .. code-block:: text
7492
7493     calls: ((Callee)[, (Callee)]*)
7494
7495 where each ``Callee`` looks like:
7496
7497 .. code-block:: text
7498
7499     callee: ^1[, hotness: None]?[, relbf: 0]?
7500
7501 The ``callee`` refers to the summary entry id of the callee. At most one
7502 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
7503 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
7504 branch frequency relative to the entry frequency, scaled down by 2^8)
7505 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
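
As a worked example of the ``relbf`` scaling, assume the stored integer is the
relative frequency multiplied by 2^8. A call site in a block that executes
twice as often as the function entry would then be recorded as
2 * 256 = 512. The summary ids below are hypothetical.

.. code-block:: text

    calls: ((callee: ^1, relbf: 512), (callee: ^2))

The second callee specifies neither field, so the defaults described above
apply to it.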
7506
7507 .. _params_summary:
7508
7509 Params
7510 ^^^^^^
7511
7512 The optional ``Params`` is used by ``StackSafety`` and looks like:
7513
.. code-block:: text

    params: ((Param)[, (Param)]*)
7517
7518 where each ``Param`` describes pointer parameter access inside of the
7519 function and looks like:
7520
7521 .. code-block:: text
7522
7523     param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
7524
where ``param`` is the number of the parameter it describes and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:
7532
7533 .. code-block:: text
7534
7535     callee: ^3, param: 5, offset: [-3, 3]
7536
The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter
with an offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.
7545
7546 Example:
7547
7548 If we have the following function:
7549
7550 .. code-block:: text
7551
7552     define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
7553       store i32* %1, i32** @x
7554       %5 = getelementptr inbounds i8, i8* %2, i64 5
7555       %6 = load i8, i8* %5
7556       %7 = getelementptr inbounds i8, i8* %2, i8 %3
7557       tail call void @bar(i8 %3, i8* %7)
7558       %8 = load i64, i64* %0
7559       ret i64 %8
7560     }
7561
We can expect a record like this:
7563
7564 .. code-block:: text
7565
7566     params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
7567
The function may access just 8 bytes of the parameter ``%0``. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter ``%1`` escapes, so access is unknown.
The function itself can access just a single byte of the parameter ``%2``.
Additional access is possible inside of ``@bar`` (``^3``). The function adds a
signed offset to the pointer and passes the result as the argument ``%1`` into
``^3``. This record itself does not tell us how ``^3`` will access the parameter.
Parameter ``%3`` is not a pointer.
7577
7578 .. _refs_summary:
7579
7580 Refs
7581 ^^^^
7582
7583 The optional ``Refs`` field looks like:
7584
.. code-block:: text

    refs: (Ref[, Ref]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).
7591
7592 .. _typeidinfo_summary:
7593
7594 TypeIdInfo
7595 ^^^^^^^^^^
7596
7597 The optional ``TypeIdInfo`` field, used for
7598 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7599 looks like:
7600
7601 .. code-block:: text
7602
7603     typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
7604
7605 These optional fields have the following forms:
7606
7607 TypeTests
7608 """""""""
7609
7610 .. code-block:: text
7611
7612     typeTests: (TypeIdRef[, TypeIdRef]*)
7613
7614 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7615 by summary id or ``GUID``.
7616
7617 TypeTestAssumeVCalls
7618 """"""""""""""""""""
7619
7620 .. code-block:: text
7621
7622     typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
7623
7624 Where each VFuncId has the format:
7625
7626 .. code-block:: text
7627
7628     vFuncId: (TypeIdRef, offset: 16)
7629
7630 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7631 by summary id or ``GUID`` preceded by a ``guid:`` tag.
7632
7633 TypeCheckedLoadVCalls
7634 """""""""""""""""""""
7635
7636 .. code-block:: text
7637
7638     typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
7639
7640 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
7641
7642 TypeTestAssumeConstVCalls
7643 """""""""""""""""""""""""
7644
7645 .. code-block:: text
7646
7647     typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
7648
7649 Where each ConstVCall has the format:
7650
7651 .. code-block:: text
7652
7653     (VFuncId, args: (Arg[, Arg]*))
7654
7655 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
7656 and each Arg is an integer argument number.
7657
7658 TypeCheckedLoadConstVCalls
7659 """"""""""""""""""""""""""
7660
7661 .. code-block:: text
7662
7663     typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
7664
7665 Where each ConstVCall has the format described for
7666 ``TypeTestAssumeConstVCalls``.
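
For illustration, a ``TypeIdInfo`` field combining two of these forms might be
written as in the sketch below. The summary id ``^4``, the offsets, and the
argument value are hypothetical, and the outer parentheses follow the
convention of the other summary fields rather than anything stated above.

.. code-block:: text

    typeIdInfo: (typeTestAssumeVCalls: (vFuncId: (^4, offset: 16)), typeTestAssumeConstVCalls: ((vFuncId: (^4, offset: 24), args: (42))))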
7667
7668 .. _typeid_summary:
7669
7670 Type ID Summary Entry
7671 ---------------------
7672
7673 Each type id summary entry corresponds to a type identifier resolution
7674 which is generated during the LTO link portion of the compile when building
7675 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7676 so these are only present in a combined summary index.
7677
7678 Example:
7679
7680 .. code-block:: text
7681
7682     ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
7683
7684 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
7685 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
7686 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
7687 and an optional WpdResolutions (whole program devirtualization resolution)
7688 field that looks like:
7689
.. code-block:: text

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
7693
7694 where each entry is a mapping from the given byte offset to the whole-program
7695 devirtualization resolution WpdRes, that has one of the following formats:
7696
7697 .. code-block:: text
7698
7699     wpdRes: (kind: branchFunnel)
7700     wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
7701     wpdRes: (kind: indir)
7702
7703 Additionally, each wpdRes has an optional ``resByArg`` field, which
7704 describes the resolutions for calls with all constant integer arguments:
7705
7706 .. code-block:: text
7707
7708     resByArg: (ResByArg[, ResByArg]*)
7709
7710 where ResByArg is:
7711
.. code-block:: text

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0]?[, byte: 0]?[, bit: 0]?)
7715
7716 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
7717 or ``VirtualConstProp``. The ``info`` field is only used if the kind
7718 is ``UniformRetVal`` (indicates the uniform return value), or
7719 ``UniqueRetVal`` (holds the return value associated with the unique vtable
7720 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
7721 not support the use of absolute symbols to store constants.
7722
7723 .. _intrinsicglobalvariables:
7724
7725 Intrinsic Global Variables
7726 ==========================
7727
7728 LLVM has a number of "magic" global variables that contain data that
7729 affect code generation or other IR semantics. These are documented here.
7730 All globals of this sort should have a section specified as
7731 "``llvm.metadata``". This section and all globals that start with
7732 "``llvm.``" are reserved for use by LLVM.
7733
7734 .. _gv_llvmused:
7735
7736 The '``llvm.used``' Global Variable
7737 -----------------------------------
7738
7739 The ``@llvm.used`` global is an array which has
7740 :ref:`appending linkage <linkage_appending>`. This array contains a list of
7741 pointers to named global variables, functions and aliases which may optionally
7742 have a pointer cast formed of bitcast or getelementptr. For example, a legal
7743 use of it is:
7744
7745 .. code-block:: llvm
7746
7747     @X = global i8 4
7748     @Y = global i32 123
7749
7750     @llvm.used = appending global [2 x i8*] [
7751        i8* @X,
7752        i8* bitcast (i32* @Y to i8*)
7753     ], section "llvm.metadata"
7754
If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.
7762
7763 On some targets, the code generator must emit a directive to the
7764 assembler or object file to prevent the assembler and linker from
7765 removing the symbol.
7766
7767 .. _gv_llvmcompilerused:
7768
7769 The '``llvm.compiler.used``' Global Variable
7770 --------------------------------------------
7771
7772 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
7773 directive, except that it only prevents the compiler from touching the
7774 symbol. On targets that support it, this allows an intelligent linker to
7775 optimize references to the symbol without being impeded as it would be
7776 by ``@llvm.used``.
7777
This construct should only be used in rare circumstances,
and should not be exposed to source languages.
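
A minimal sketch of its use, mirroring the ``@llvm.used`` example and assuming
the same array-of-pointers layout, is shown below. The global ``@Z`` is a
hypothetical name introduced for illustration.

.. code-block:: llvm

    ; @Z has internal linkage and no other references, so the compiler
    ; could otherwise delete it.
    @Z = internal global i32 0

    ; Listing @Z here keeps the compiler from removing it, while leaving
    ; the linker free to optimize references to it.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"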
7780
7781 .. _gv_llvmglobalctors:
7782
7783 The '``llvm.global_ctors``' Global Variable
7784 -------------------------------------------
7785
7786 .. code-block:: llvm
7787
7788     %0 = type { i32, void ()*, i8* }
7789     @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
7790
7791 The ``@llvm.global_ctors`` array contains a list of constructor
7792 functions, priorities, and an associated global or function.
7793 The functions referenced by this array will be called in ascending order
7794 of priority (i.e. lowest first) when the module is loaded. The order of
7795 functions with the same priority is not defined.
7796
7797 If the third field is non-null, and points to a global variable
7798 or function, the initializer function will only run if the associated
7799 data from the current module is not discarded.
7800 On ELF the referenced global variable or function must be in a comdat.
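
The ordering rule can be sketched with two constructors of different
priority. The function names and the priority ``101`` are hypothetical, and
the third field is ``null`` so neither constructor is tied to associated data.

.. code-block:: llvm

    ; @init_early (priority 101) is called before @init_late (priority 65535),
    ; because constructors run in ascending order of priority.
    @llvm.global_ctors = appending global [2 x { i32, void ()*, i8* }] [
      { i32, void ()*, i8* } { i32 101, void ()* @init_early, i8* null },
      { i32, void ()*, i8* } { i32 65535, void ()* @init_late, i8* null }
    ]

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }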
7801
7802 .. _llvmglobaldtors:
7803
7804 The '``llvm.global_dtors``' Global Variable
7805 -------------------------------------------
7806
7807 .. code-block:: llvm
7808
7809     %0 = type { i32, void ()*, i8* }
7810     @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
7811
7812 The ``@llvm.global_dtors`` array contains a list of destructor
7813 functions, priorities, and an associated global or function.
7814 The functions referenced by this array will be called in descending
7815 order of priority (i.e. highest first) when the module is unloaded. The
7816 order of functions with the same priority is not defined.
7817
7818 If the third field is non-null, and points to a global variable
7819 or function, the destructor function will only run if the associated
7820 data from the current module is not discarded.
7821 On ELF the referenced global variable or function must be in a comdat.
7822
7823 Instruction Reference
7824 =====================
7825
7826 The LLVM instruction set consists of several different classifications
7827 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
7828 instructions <binaryops>`, :ref:`bitwise binary
7829 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
7830 :ref:`other instructions <otherops>`.
7831
7832 .. _terminators:
7833
7834 Terminator Instructions
7835 -----------------------
7836
7837 As mentioned :ref:`previously <functionstructure>`, every basic block in a
7838 program ends with a "Terminator" instruction, which indicates which
7839 block should be executed after the current block is finished. These
7840 terminator instructions typically yield a '``void``' value: they produce
7841 control flow, not values (the one exception being the
7842 ':ref:`invoke <i_invoke>`' instruction).
7843
The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.
7852
7853 .. _i_ret:
7854
7855 '``ret``' Instruction
7856 ^^^^^^^^^^^^^^^^^^^^^
7857
7858 Syntax:
7859 """""""
7860
7861 ::
7862
7863       ret <type> <value>       ; Return a value from a non-void function
7864       ret void                 ; Return from void function
7865
7866 Overview:
7867 """""""""
7868
7869 The '``ret``' instruction is used to return control flow (and optionally
7870 a value) from a function back to the caller.
7871
7872 There are two forms of the '``ret``' instruction: one that returns a
7873 value and then causes control flow, and one that just causes control
7874 flow to occur.
7875
7876 Arguments:
7877 """"""""""
7878
7879 The '``ret``' instruction optionally accepts a single argument, the
7880 return value. The type of the return value must be a ':ref:`first
7881 class <t_firstclass>`' type.
7882
7883 A function is not :ref:`well formed <wellformed>` if it has a non-void
7884 return type and contains a '``ret``' instruction with no return value or
7885 a return value with a type that does not match its type, or if it has a
7886 void return type and contains a '``ret``' instruction with a return
7887 value.
7888
7889 Semantics:
7890 """"""""""
7891
7892 When the '``ret``' instruction is executed, control flow returns back to
7893 the calling function's context. If the caller is a
7894 ":ref:`call <i_call>`" instruction, execution continues at the
7895 instruction after the call. If the caller was an
7896 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
7897 beginning of the "normal" destination block. If the instruction returns
7898 a value, that value shall set the call or invoke instruction's return
7899 value.
7900
7901 Example:
7902 """"""""
7903
7904 .. code-block:: llvm
7905
7906       ret i32 5                       ; Return an integer value of 5
7907       ret void                        ; Return from a void function
7908       ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
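
To show how a returned value becomes the result of the calling instruction,
the following sketch places ``ret`` inside two complete functions. The names
``@five`` and ``@caller`` are hypothetical.

.. code-block:: llvm

    define i32 @five() {
      ret i32 5                 ; this value becomes the result of the call
    }

    define i32 @caller() {
      %v = call i32 @five()     ; %v receives the value returned by @five
      ret i32 %v                ; execution resumed here after the call
    }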
7909
7910 .. _i_br:
7911
7912 '``br``' Instruction
7913 ^^^^^^^^^^^^^^^^^^^^
7914
7915 Syntax:
7916 """""""
7917
7918 ::
7919
7920       br i1 <cond>, label <iftrue>, label <iffalse>
7921       br label <dest>          ; Unconditional branch
7922
7923 Overview:
7924 """""""""
7925
7926 The '``br``' instruction is used to cause control flow to transfer to a
7927 different basic block in the current function. There are two forms of
7928 this instruction, corresponding to a conditional branch and an
7929 unconditional branch.
7930
7931 Arguments:
7932 """"""""""
7933
7934 The conditional branch form of the '``br``' instruction takes a single
7935 '``i1``' value and two '``label``' values. The unconditional form of the
7936 '``br``' instruction takes a single '``label``' value as a target.
7937
7938 Semantics:
7939 """"""""""
7940
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
7947
7948 Example:
7949 """"""""
7950
7951 .. code-block:: llvm
7952
7953     Test:
7954       %cond = icmp eq i32 %a, %b
7955       br i1 %cond, label %IfEqual, label %IfUnequal
7956     IfEqual:
7957       ret i32 1
7958     IfUnequal:
7959       ret i32 0
7960
7961 .. _i_switch:
7962
7963 '``switch``' Instruction
7964 ^^^^^^^^^^^^^^^^^^^^^^^^
7965
7966 Syntax:
7967 """""""
7968
7969 ::
7970
7971       switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
7972
7973 Overview:
7974 """""""""
7975
7976 The '``switch``' instruction is used to transfer control flow to one of
7977 several different places. It is a generalization of the '``br``'
7978 instruction, allowing a branch to occur to one of many possible
7979 destinations.
7980
7981 Arguments:
7982 """"""""""
7983
7984 The '``switch``' instruction uses three parameters: an integer
7985 comparison value '``value``', a default '``label``' destination, and an
7986 array of pairs of comparison value constants and '``label``'s. The table
7987 is not allowed to contain duplicate constant entries.
7988
7989 Semantics:
7990 """"""""""
7991
7992 The ``switch`` instruction specifies a table of values and destinations.
7993 When the '``switch``' instruction is executed, this table is searched
7994 for the given value. If the value is found, control flow is transferred
7995 to the corresponding destination; otherwise, control flow is transferred
7996 to the default destination.
7997 If '``value``' is ``poison`` or ``undef``, this instruction has undefined
7998 behavior.
7999
8000 Implementation:
8001 """""""""""""""
8002
8003 Depending on properties of the target machine and the particular
8004 ``switch`` instruction, this instruction may be code generated in
8005 different ways. For example, it could be generated as a series of
8006 chained conditional branches or with a lookup table.
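
The chained-branch strategy can be pictured at the IR level, as in the sketch
below. This only illustrates the idea; actual lowering happens in the code
generator and need not look like this. The function and label names are
hypothetical.

.. code-block:: llvm

    ; Behaves like:
    ;   switch i32 %val, label %otherwise [ i32 0, label %onzero
    ;                                       i32 1, label %onone ]
    define i32 @classify(i32 %val) {
    entry:
      %is0 = icmp eq i32 %val, 0
      br i1 %is0, label %onzero, label %test1
    test1:
      %is1 = icmp eq i32 %val, 1
      br i1 %is1, label %onone, label %otherwise
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:
      ret i32 -1
    }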
8007
8008 Example:
8009 """"""""
8010
8011 .. code-block:: llvm
8012
8013      ; Emulate a conditional br instruction
8014      %Val = zext i1 %value to i32
8015      switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8016
8017      ; Emulate an unconditional br instruction
8018      switch i32 0, label %dest [ ]
8019
8020      ; Implement a jump table:
8021      switch i32 %val, label %otherwise [ i32 0, label %onzero
8022                                          i32 1, label %onone
8023                                          i32 2, label %ontwo ]
8024
8025 .. _i_indirectbr:
8026
8027 '``indirectbr``' Instruction
8028 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8029
8030 Syntax:
8031 """""""
8032
8033 ::
8034
8035       indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
8036
8037 Overview:
8038 """""""""
8039
8040 The '``indirectbr``' instruction implements an indirect branch to a
8041 label within the current function, whose address is specified by
8042 "``address``". Address must be derived from a
8043 :ref:`blockaddress <blockaddress>` constant.
8044
8045 Arguments:
8046 """"""""""
8047
8048 The '``address``' argument is the address of the label to jump to. The
8049 rest of the arguments indicate the full set of possible destinations
8050 that the address may point to. Blocks are allowed to occur multiple
8051 times in the destination list, though this isn't particularly useful.
8052
8053 This destination list is required so that dataflow analysis has an
8054 accurate understanding of the CFG.
8055
8056 Semantics:
8057 """"""""""
8058
8059 Control transfers to the block specified in the address argument. All
8060 possible destination blocks must be listed in the label list, otherwise
8061 this instruction has undefined behavior. This implies that jumps to
8062 labels defined in other functions have undefined behavior as well.
8063 If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8064 behavior.
8065
8066 Implementation:
8067 """""""""""""""
8068
8069 This is typically implemented with a jump through a register.
8070
8071 Example:
8072 """"""""
8073
8074 .. code-block:: llvm
8075
8076      indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
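
A fuller sketch, showing where such an address might come from, loads a
``blockaddress`` constant from a table and branches to it. The names
``@targets`` and ``@dispatch`` are hypothetical, and the caller is assumed to
pass an index of 0 or 1; any other index reads outside the table.

.. code-block:: llvm

    ; Table of block addresses taken from @dispatch.
    @targets = private constant [2 x i8*] [
      i8* blockaddress(@dispatch, %bb1),
      i8* blockaddress(@dispatch, %bb2)
    ]

    define i32 @dispatch(i64 %idx) {
    entry:
      %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i64 0, i64 %idx
      %addr = load i8*, i8** %slot
      ; Every block the address may refer to is listed as a destination.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }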
8077
8078 .. _i_invoke:
8079
8080 '``invoke``' Instruction
8081 ^^^^^^^^^^^^^^^^^^^^^^^^
8082
8083 Syntax:
8084 """""""
8085
8086 ::
8087
8088       <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8089                     [operand bundles] to label <normal label> unwind label <exception label>
8090
8091 Overview:
8092 """""""""
8093
8094 The '``invoke``' instruction causes control to transfer to a specified
8095 function, with the possibility of control flow transfer to either the
8096 '``normal``' label or the '``exception``' label. If the callee function
8097 returns with the "``ret``" instruction, control flow will return to the
8098 "normal" label. If the callee (or any indirect callees) returns via the
8099 ":ref:`resume <i_resume>`" instruction or other exception handling
8100 mechanism, control is interrupted and continued at the dynamically
8101 nearest "exception" label.
8102
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
8112
8113 Arguments:
8114 """"""""""
8115
8116 This instruction requires several arguments:
8117
8118 #. The optional "cconv" marker indicates which :ref:`calling
8119    convention <callingconv>` the call should use. If none is
8120    specified, the call defaults to using C calling conventions.
8121 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8122    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8123    are valid here.
8124 #. The optional addrspace attribute can be used to indicate the address space
8125    of the called function. If it is not specified, the program address space
8126    from the :ref:`datalayout string<langref_datalayout>` will be used.
8127 #. '``ty``': the type of the call instruction itself, which is also the
8128    type of the return value. Functions that return no value are marked
8129    ``void``.
8130 #. '``fnty``': shall be the signature of the function being invoked. The
8131    argument types must match the types implied by this signature. This
8132    type can be omitted if the function is not varargs.
8133 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8134    be invoked. In most cases, this is a direct function invocation, but
8135    indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8136    to function value.
8137 #. '``function args``': argument list whose types match the function
8138    signature argument types and parameter attributes. All arguments must
8139    be of :ref:`first class <t_firstclass>` type. If the function signature
8140    indicates the function accepts a variable number of arguments, the
8141    extra arguments can be specified.
8142 #. '``normal label``': the label reached when the called function
8143    executes a '``ret``' instruction.
8144 #. '``exception label``': the label reached when a callee returns via
8145    the :ref:`resume <i_resume>` instruction or other exception handling
8146    mechanism.
8147 #. The optional :ref:`function attributes <fnattrs>` list.
8148 #. The optional :ref:`operand bundles <opbundles>` list.
8149
8150 Semantics:
8151 """"""""""
8152
8153 This instruction is designed to operate as a standard '``call``'
8154 instruction in most regards. The primary difference is that it
8155 establishes an association with a label, which is used by the runtime
8156 library to unwind the stack.
8157
8158 This instruction is used in languages with destructors to ensure that
8159 proper cleanup is performed in the case of either a ``longjmp`` or a
8160 thrown exception. Additionally, this is important for implementation of
8161 '``catch``' clauses in high-level languages that support them.
8162
8163 For the purposes of the SSA form, the definition of the value returned
8164 by the '``invoke``' instruction is deemed to occur on the edge from the
8165 current block to the "normal" label. If the callee unwinds then no
8166 return value is available.
8167
8168 Example:
8169 """"""""
8170
8171 .. code-block:: llvm
8172
8173       %retval = invoke i32 @Test(i32 15) to label %Continue
8174                   unwind label %TestCleanup              ; i32:retval set
8175       %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8176                   unwind label %TestCleanup              ; i32:retval set
8177
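As an illustrative sketch, a complete function using '``invoke``' might look
as follows. The names ``@may_throw`` and ``@caller`` are hypothetical, and the
use of the Itanium C++ personality routine ``@__gxx_personality_v0`` is an
assumption made for this example. The result ``%r`` is usable only in the
normal destination, and the unwind destination begins with a
'``landingpad``'.

.. code-block:: llvm

      declare i32 @may_throw(i32)
      declare i32 @__gxx_personality_v0(...)

      define i32 @caller(i32 %x) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        %r = invoke i32 @may_throw(i32 %x)
                to label %cont unwind label %lpad

      cont:                               ; normal edge: %r is defined here
        ret i32 %r

      lpad:                               ; unwind edge: %r is not available
        %exn = landingpad { i8*, i32 }
                catch i8* null            ; catch-all clause
        ret i32 -1
      }

A real C++ front end would also call runtime routines such as
``__cxa_begin_catch`` in the handler; they are omitted here to keep the sketch
short.
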
8178 .. _i_callbr:
8179
8180 '``callbr``' Instruction
8181 ^^^^^^^^^^^^^^^^^^^^^^^^
8182
8183 Syntax:
8184 """""""
8185
8186 ::
8187
8188       <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8189                     [operand bundles] to label <fallthrough label> [indirect labels]
8190
8191 Overview:
8192 """""""""
8193
8194 The '``callbr``' instruction causes control to transfer to a specified
8195 function, with the possibility of control flow transfer to either the
8196 '``fallthrough``' label or one of the '``indirect``' labels.
8197
8198 This instruction should only be used to implement the "goto" feature of
8199 gcc-style inline assembly. Any other usage is an error in the IR verifier.
8200
8201 Arguments:
8202 """"""""""
8203
8204 This instruction requires several arguments:
8205
8206 #. The optional "cconv" marker indicates which :ref:`calling
8207    convention <callingconv>` the call should use. If none is
8208    specified, the call defaults to using C calling conventions.
8209 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8210    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8211    are valid here.
8212 #. The optional addrspace attribute can be used to indicate the address space
8213    of the called function. If it is not specified, the program address space
8214    from the :ref:`datalayout string<langref_datalayout>` will be used.
8215 #. '``ty``': the type of the call instruction itself, which is also the
8216    type of the return value. Functions that return no value are marked
8217    ``void``.
8218 #. '``fnty``': shall be the signature of the function being called. The
8219    argument types must match the types implied by this signature. This
8220    type can be omitted if the function is not varargs.
8221 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8222    be called. In most cases, this is a direct function call, but
8223    indirect ``callbr``'s are just as possible, calling an arbitrary pointer
8224    to function value.
8225 #. '``function args``': argument list whose types match the function
8226    signature argument types and parameter attributes. All arguments must
8227    be of :ref:`first class <t_firstclass>` type. If the function signature
8228    indicates the function accepts a variable number of arguments, the
8229    extra arguments can be specified.
8230 #. '``fallthrough label``': the label reached when the inline assembly's
8231    execution exits the bottom.
8232 #. '``indirect labels``': the labels reached when a callee transfers control
8233    to a location other than the '``fallthrough label``'. The blockaddress
8234    constant for these should also be in the list of '``function args``'.
8235 #. The optional :ref:`function attributes <fnattrs>` list.
8236 #. The optional :ref:`operand bundles <opbundles>` list.
8237
8238 Semantics:
8239 """"""""""
8240
8241 This instruction is designed to operate as a standard '``call``'
8242 instruction in most regards. The primary difference is that it
8243 establishes an association with additional labels to define where control
8244 flow goes after the call.
8245
8246 The output values of a '``callbr``' instruction are available only to
8247 the '``fallthrough``' block, not to any '``indirect``' block(s).
8248
8249 The only use of this today is to implement the "goto" feature of gcc inline
8250 assembly where additional labels can be provided as locations for the inline
8251 assembly to jump to.
8252
8253 Example:
8254 """"""""
8255
8256 .. code-block:: llvm
8257
8258       ; "asm goto" without output constraints.
8259       callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8260                   to label %fallthrough [label %indirect]
8261
8262       ; "asm goto" with output constraints.
8263       <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8264                   to label %fallthrough [label %indirect]
8265
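For context, the first form can be placed in a complete function. This is a
sketch in the syntax this section documents: the function name ``@foo``
matches the ``blockaddress`` operand, while the block bodies and return values
are invented for illustration. The assembly string is empty, so at run time
only the fallthrough path would be taken.

.. code-block:: llvm

      define i32 @foo(i32 %x) {
      entry:
        callbr void asm sideeffect "", "r,X"(i32 %x, i8* blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

      fallthrough:                        ; the assembly ran off its end
        ret i32 0

      indirect:                           ; the assembly jumped to the extra label
        ret i32 1
      }
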
8266 .. _i_resume:
8267
8268 '``resume``' Instruction
8269 ^^^^^^^^^^^^^^^^^^^^^^^^
8270
8271 Syntax:
8272 """""""
8273
8274 ::
8275
8276       resume <type> <value>
8277
8278 Overview:
8279 """""""""
8280
8281 The '``resume``' instruction is a terminator instruction that has no
8282 successors.
8283
8284 Arguments:
8285 """"""""""
8286
8287 The '``resume``' instruction requires one argument, which must have the
8288 same type as the result of any '``landingpad``' instruction in the same
8289 function.
8290
8291 Semantics:
8292 """"""""""
8293
8294 The '``resume``' instruction resumes propagation of an existing
8295 (in-flight) exception whose unwinding was interrupted with a
8296 :ref:`landingpad <i_landingpad>` instruction.
8297
8298 Example:
8299 """"""""
8300
8301 .. code-block:: llvm
8302
8303       resume { i8*, i32 } %exn
8304
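For context, a sketch of a cleanup landing pad that ends in '``resume``'. The
functions ``@may_throw`` and ``@release`` are hypothetical, and the use of
``@__gxx_personality_v0`` as the personality routine is an assumption for this
example. The operand of '``resume``' has the same ``{ i8*, i32 }`` type as the
'``landingpad``' result.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__gxx_personality_v0(...)

      define void @with_cleanup() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %lpad

      done:
        call void @release()
        ret void

      lpad:
        %exn = landingpad { i8*, i32 }
                cleanup
        call void @release()              ; run the cleanup, then keep unwinding
        resume { i8*, i32 } %exn
      }
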
8305 .. _i_catchswitch:
8306
8307 '``catchswitch``' Instruction
8308 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8309
8310 Syntax:
8311 """""""
8312
8313 ::
8314
8315       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8316       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8317
8318 Overview:
8319 """""""""
8320
8321 The '``catchswitch``' instruction is used by `LLVM's exception handling system
8322 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8323 that may be executed by the :ref:`EH personality routine <personalityfn>`.
8324
8325 Arguments:
8326 """"""""""
8327
8328 The ``parent`` argument is the token of the funclet that contains the
8329 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8330 this operand may be the token ``none``.
8331
8332 The ``default`` argument is the label of another basic block beginning with
8333 either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
8334 must be a legal target with respect to the ``parent`` links, as described in
8335 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8336
8337 The ``handlers`` are a nonempty list of successor blocks that each begin with a
8338 :ref:`catchpad <i_catchpad>` instruction.
8339
8340 Semantics:
8341 """"""""""
8342
8343 Executing this instruction transfers control to one of the successors in
8344 ``handlers``, if appropriate, or continues to unwind via the unwind label if
8345 present.
8346
8347 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8348 it must be both the first non-phi instruction and last instruction in the basic
8349 block. Therefore, it must be the only non-phi instruction in the block.
8350
8351 Example:
8352 """"""""
8353
8354 .. code-block:: text
8355
8356     dispatch1:
8357       %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8358     dispatch2:
8359       %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8360
8361 .. _i_catchret:
8362
8363 '``catchret``' Instruction
8364 ^^^^^^^^^^^^^^^^^^^^^^^^^^
8365
8366 Syntax:
8367 """""""
8368
8369 ::
8370
8371       catchret from <token> to label <normal>
8372
8373 Overview:
8374 """""""""
8375
8376 The '``catchret``' instruction is a terminator instruction that has a
8377 single successor.
8378
8379
8380 Arguments:
8381 """"""""""
8382
8383 The first argument to a '``catchret``' indicates which ``catchpad`` it
8384 exits.  It must be a :ref:`catchpad <i_catchpad>`.
8385 The second argument to a '``catchret``' specifies where control will
8386 transfer to next.
8387
8388 Semantics:
8389 """"""""""
8390
8391 The '``catchret``' instruction ends an existing (in-flight) exception whose
8392 unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
8393 :ref:`personality function <personalityfn>` gets a chance to execute arbitrary
8394 code to, for example, destroy the active exception.  Control then transfers to
8395 ``normal``.
8396
8397 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8398 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8399 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8400 the ``catchret``'s behavior is undefined.
8401
8402 Example:
8403 """"""""
8404
8405 .. code-block:: text
8406
8407       catchret from %catch to label %continue
8408
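The following sketch shows how '``catchswitch``', '``catchpad``', and
'``catchret``' can fit together in one function. It assumes the MSVC C++
personality ``@__CxxFrameHandler3``; the function names are hypothetical, and
the ``catchpad`` arguments are the ones that personality conventionally uses
for a catch-all handler.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define i32 @try_catch() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:                           ; catchswitch is alone in its block
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        %cp = catchpad within %cs [i8* null, i32 64, i8* null]
        catchret from %cp to label %caught

      caught:                             ; ordinary code again, outside the funclet
        ret i32 1

      done:
        ret i32 0
      }
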
8409 .. _i_cleanupret:
8410
8411 '``cleanupret``' Instruction
8412 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8413
8414 Syntax:
8415 """""""
8416
8417 ::
8418
8419       cleanupret from <value> unwind label <continue>
8420       cleanupret from <value> unwind to caller
8421
8422 Overview:
8423 """""""""
8424
8425 The '``cleanupret``' instruction is a terminator instruction that has
8426 an optional successor.
8427
8428
8429 Arguments:
8430 """"""""""
8431
8432 The '``cleanupret``' instruction requires one argument, which indicates
8433 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8434 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8435 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8436 the ``cleanupret``'s behavior is undefined.
8437
8438 The '``cleanupret``' instruction also has an optional successor, ``continue``,
8439 which must be the label of another basic block beginning with either a
8440 ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
8441 be a legal target with respect to the ``parent`` links, as described in the
8442 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8443
8444 Semantics:
8445 """"""""""
8446
8447 The '``cleanupret``' instruction indicates to the
8448 :ref:`personality function <personalityfn>` that one
8449 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8450 It transfers control to ``continue`` or unwinds out of the function.
8451
8452 Example:
8453 """"""""
8454
8455 .. code-block:: text
8456
8457       cleanupret from %cleanup unwind to caller
8458       cleanupret from %cleanup unwind label %continue
8459
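A sketch of a cleanup funclet that ends in '``cleanupret``'. The functions
``@may_throw`` and ``@run_dtor`` are hypothetical, and the MSVC C++ personality
``@__CxxFrameHandler3`` is assumed. The call inside the funclet carries a
``"funclet"`` operand bundle naming the pad it belongs to.

.. code-block:: llvm

      declare void @may_throw()
      declare void @run_dtor()
      declare i32 @__CxxFrameHandler3(...)

      define void @with_funclet_cleanup() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      cleanup:
        %cp = cleanuppad within none []
        call void @run_dtor() [ "funclet"(token %cp) ]
        cleanupret from %cp unwind to caller

      done:
        call void @run_dtor()
        ret void
      }
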
8460 .. _i_unreachable:
8461
8462 '``unreachable``' Instruction
8463 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8464
8465 Syntax:
8466 """""""
8467
8468 ::
8469
8470       unreachable
8471
8472 Overview:
8473 """""""""
8474
8475 The '``unreachable``' instruction has no defined semantics. This
8476 instruction is used to inform the optimizer that a particular portion of
8477 the code is not reachable. This can be used to indicate that the code
8478 after a no-return function cannot be reached, and other facts.
8479
8480 Semantics:
8481 """"""""""
8482
8483 The '``unreachable``' instruction has no defined semantics.
8484
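A minimal sketch of the no-return case mentioned in the overview. The function
``@checked`` is hypothetical, and ``@abort`` is assumed to be declared
``noreturn``; the '``unreachable``' terminator closes the block that follows
the call.

.. code-block:: llvm

      declare void @abort() noreturn

      define i32 @checked(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %fail, label %ok

      fail:
        call void @abort()
        unreachable                       ; control never returns from @abort

      ok:
        ret i32 %x
      }
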
8485 .. _unaryops:
8486
8487 Unary Operations
8488 -----------------
8489
8490 Unary operators require a single operand, execute an operation on
8491 it, and produce a single value. The operand might represent multiple
8492 data, as is the case with the :ref:`vector <t_vector>` data type. The
8493 result value has the same type as its operand.
8494
8495 .. _i_fneg:
8496
8497 '``fneg``' Instruction
8498 ^^^^^^^^^^^^^^^^^^^^^^
8499
8500 Syntax:
8501 """""""
8502
8503 ::
8504
8505       <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
8506
8507 Overview:
8508 """""""""
8509
8510 The '``fneg``' instruction returns the negation of its operand.
8511
8512 Arguments:
8513 """"""""""
8514
8515 The argument to the '``fneg``' instruction must be a
8516 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8517 floating-point values.
8518
8519 Semantics:
8520 """"""""""
8521
8522 The value produced is a copy of the operand with its sign bit flipped.
8523 This instruction can also take any number of :ref:`fast-math
8524 flags <fastmath>`, which are optimization hints to enable otherwise
8525 unsafe floating-point optimizations.
8526
8527 Example:
8528 """"""""
8529
8530 .. code-block:: text
8531
8532       <result> = fneg float %val          ; yields float:result = -%val
8533
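A short sketch of the vector form and of a form carrying fast-math flags. The
function names and the particular flags are arbitrary choices for
illustration.

.. code-block:: llvm

      define <4 x float> @negate_all(<4 x float> %v) {
        %r = fneg <4 x float> %v          ; flips the sign bit of every element
        ret <4 x float> %r
      }

      define float @negate_flagged(float %x) {
        %r = fneg nnan ninf float %x      ; flags go between opcode and type
        ret float %r
      }
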
8534 .. _binaryops:
8535
8536 Binary Operations
8537 -----------------
8538
8539 Binary operators are used to do most of the computation in a program.
8540 They require two operands of the same type, execute an operation on
8541 them, and produce a single value. The operands might represent multiple
8542 data, as is the case with the :ref:`vector <t_vector>` data type. The
8543 result value has the same type as its operands.
8544
8545 There are several different binary operators:
8546
8547 .. _i_add:
8548
8549 '``add``' Instruction
8550 ^^^^^^^^^^^^^^^^^^^^^
8551
8552 Syntax:
8553 """""""
8554
8555 ::
8556
8557       <result> = add <ty> <op1>, <op2>          ; yields ty:result
8558       <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
8559       <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
8560       <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8561
8562 Overview:
8563 """""""""
8564
8565 The '``add``' instruction returns the sum of its two operands.
8566
8567 Arguments:
8568 """"""""""
8569
8570 The two arguments to the '``add``' instruction must be
8571 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8572 arguments must have identical types.
8573
8574 Semantics:
8575 """"""""""
8576
8577 The value produced is the integer sum of the two operands.
8578
8579 If the sum has unsigned overflow, the result returned is the
8580 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8581 the result.
8582
8583 Because LLVM integers use a two's complement representation, this
8584 instruction is appropriate for both signed and unsigned integers.
8585
8586 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8587 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8588 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
8589 unsigned and/or signed overflow, respectively, occurs.
8590
8591 Example:
8592 """"""""
8593
8594 .. code-block:: text
8595
8596       <result> = add i32 4, %var          ; yields i32:result = 4 + %var
8597
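The overflow rules can be illustrated with constant ``i8`` operands. This is a
sketch only; the function name is hypothetical and the results are never used.

.. code-block:: llvm

      define void @add_wraps() {
        %a = add i8 255, 1                ; 0: the sum wraps modulo 2^8
        %b = add nuw i8 255, 1            ; poison: unsigned overflow
        %c = add nsw i8 127, 1            ; poison: signed overflow
        %d = add nsw i8 -1, 1             ; 0: no signed overflow occurs
        ret void
      }
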
8598 .. _i_fadd:
8599
8600 '``fadd``' Instruction
8601 ^^^^^^^^^^^^^^^^^^^^^^
8602
8603 Syntax:
8604 """""""
8605
8606 ::
8607
8608       <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8609
8610 Overview:
8611 """""""""
8612
8613 The '``fadd``' instruction returns the sum of its two operands.
8614
8615 Arguments:
8616 """"""""""
8617
8618 The two arguments to the '``fadd``' instruction must be
8619 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8620 floating-point values. Both arguments must have identical types.
8621
8622 Semantics:
8623 """"""""""
8624
8625 The value produced is the floating-point sum of the two operands.
8626 This instruction is assumed to execute in the default :ref:`floating-point
8627 environment <floatenv>`.
8628 This instruction can also take any number of :ref:`fast-math
8629 flags <fastmath>`, which are optimization hints to enable otherwise
8630 unsafe floating-point optimizations.
8631
8632 Example:
8633 """"""""
8634
8635 .. code-block:: text
8636
8637       <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
8638
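A minimal sketch of '``fadd``' with fast-math flags attached. The function
name and the choice of ``reassoc`` and ``nsz`` are assumptions for
illustration; the flags permit, but do not require, the optimizer to
reassociate the two additions.

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        %t = fadd reassoc nsz float %a, %b
        %s = fadd reassoc nsz float %t, %c
        ret float %s
      }
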
8639 .. _i_sub:
8640
8641 '``sub``' Instruction
8642 ^^^^^^^^^^^^^^^^^^^^^
8643
8644 Syntax:
8645 """""""
8646
8647 ::
8648
8649       <result> = sub <ty> <op1>, <op2>          ; yields ty:result
8650       <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
8651       <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
8652       <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8653
8654 Overview:
8655 """""""""
8656
8657 The '``sub``' instruction returns the difference of its two operands.
8658
8659 Note that the '``sub``' instruction is used to represent the '``neg``'
8660 instruction present in most other intermediate representations.
8661
8662 Arguments:
8663 """"""""""
8664
8665 The two arguments to the '``sub``' instruction must be
8666 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8667 arguments must have identical types.
8668
8669 Semantics:
8670 """"""""""
8671
8672 The value produced is the integer difference of the two operands.
8673
8674 If the difference has unsigned overflow, the result returned is the
8675 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8676 the result.
8677
8678 Because LLVM integers use a two's complement representation, this
8679 instruction is appropriate for both signed and unsigned integers.
8680
8681 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8682 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8683 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
8684 unsigned and/or signed overflow, respectively, occurs.
8685
8686 Example:
8687 """"""""
8688
8689 .. code-block:: text
8690
8691       <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
8692       <result> = sub i32 0, %val          ; yields i32:result = -%val
8693
8694 .. _i_fsub:
8695
8696 '``fsub``' Instruction
8697 ^^^^^^^^^^^^^^^^^^^^^^
8698
8699 Syntax:
8700 """""""
8701
8702 ::
8703
8704       <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8705
8706 Overview:
8707 """""""""
8708
8709 The '``fsub``' instruction returns the difference of its two operands.
8710
8711 Arguments:
8712 """"""""""
8713
8714 The two arguments to the '``fsub``' instruction must be
8715 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8716 floating-point values. Both arguments must have identical types.
8717
8718 Semantics:
8719 """"""""""
8720
8721 The value produced is the floating-point difference of the two operands.
8722 This instruction is assumed to execute in the default :ref:`floating-point
8723 environment <floatenv>`.
8724 This instruction can also take any number of :ref:`fast-math
8725 flags <fastmath>`, which are optimization hints to enable otherwise
8726 unsafe floating-point optimizations.
8727
8728 Example:
8729 """"""""
8730
8731 .. code-block:: text
8732
8733       <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
8734       <result> = fsub float -0.0, %val          ; yields float:result = -%val
8735
8736 .. _i_mul:
8737
8738 '``mul``' Instruction
8739 ^^^^^^^^^^^^^^^^^^^^^
8740
8741 Syntax:
8742 """""""
8743
8744 ::
8745
8746       <result> = mul <ty> <op1>, <op2>          ; yields ty:result
8747       <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
8748       <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
8749       <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8750
8751 Overview:
8752 """""""""
8753
8754 The '``mul``' instruction returns the product of its two operands.
8755
8756 Arguments:
8757 """"""""""
8758
8759 The two arguments to the '``mul``' instruction must be
8760 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8761 arguments must have identical types.
8762
8763 Semantics:
8764 """"""""""
8765
8766 The value produced is the integer product of the two operands.
8767
8768 If the result of the multiplication has unsigned overflow, the result
8769 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
8770 bit width of the result.
8771
8772 Because LLVM integers use a two's complement representation, and the
8773 result is the same width as the operands, this instruction returns the
8774 correct result for both signed and unsigned integers. If a full product
8775 (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
8776 sign-extended or zero-extended as appropriate to the width of the full
8777 product.
8778
8779 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8780 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8781 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
8782 unsigned and/or signed overflow, respectively, occurs.
8783
8784 Example:
8785 """"""""
8786
8787 .. code-block:: text
8788
8789       <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
8790
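The full-product case described above can be sketched as follows for unsigned
operands. The function name ``@full_umul`` is hypothetical. Both operands are
zero-extended to ``i64`` before the multiplication, so the 64-bit product
cannot wrap and ``nuw`` is safe to attach.

.. code-block:: llvm

      define i64 @full_umul(i32 %a, i32 %b) {
        %a.wide = zext i32 %a to i64
        %b.wide = zext i32 %b to i64
        %p = mul nuw i64 %a.wide, %b.wide ; both factors are below 2^32
        ret i64 %p
      }

For a signed full product, ``sext`` would be used in place of ``zext``.
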
8791 .. _i_fmul:
8792
8793 '``fmul``' Instruction
8794 ^^^^^^^^^^^^^^^^^^^^^^
8795
8796 Syntax:
8797 """""""
8798
8799 ::
8800
8801       <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8802
8803 Overview:
8804 """""""""
8805
8806 The '``fmul``' instruction returns the product of its two operands.
8807
8808 Arguments:
8809 """"""""""
8810
8811 The two arguments to the '``fmul``' instruction must be
8812 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8813 floating-point values. Both arguments must have identical types.
8814
8815 Semantics:
8816 """"""""""
8817
8818 The value produced is the floating-point product of the two operands.
8819 This instruction is assumed to execute in the default :ref:`floating-point
8820 environment <floatenv>`.
8821 This instruction can also take any number of :ref:`fast-math
8822 flags <fastmath>`, which are optimization hints to enable otherwise
8823 unsafe floating-point optimizations.
8824
8825 Example:
8826 """"""""
8827
8828 .. code-block:: text
8829
8830       <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
8831
8832 .. _i_udiv:
8833
8834 '``udiv``' Instruction
8835 ^^^^^^^^^^^^^^^^^^^^^^
8836
8837 Syntax:
8838 """""""
8839
8840 ::
8841
8842       <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
8843       <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
8844
8845 Overview:
8846 """""""""
8847
8848 The '``udiv``' instruction returns the quotient of its two operands.
8849
8850 Arguments:
8851 """"""""""
8852
8853 The two arguments to the '``udiv``' instruction must be
8854 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8855 arguments must have identical types.
8856
8857 Semantics:
8858 """"""""""
8859
8860 The value produced is the unsigned integer quotient of the two operands.
8861
8862 Note that unsigned integer division and signed integer division are
8863 distinct operations; for signed integer division, use '``sdiv``'.
8864
8865 Division by zero is undefined behavior. For vectors, if any element
8866 of the divisor is zero, the operation has undefined behavior.
8867
8868
8869 If the ``exact`` keyword is present, the result value of the ``udiv`` is
8870 a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
8871 such, "((a udiv exact b) mul b) == a").
8872
8873 Example:
8874 """"""""
8875
8876 .. code-block:: text
8877
8878       <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
8879
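The effect of ``exact`` can be illustrated with constant operands. This is a
sketch only; the function name is hypothetical and the results are never used.

.. code-block:: llvm

      define void @udiv_cases() {
        %q1 = udiv i32 7, 2               ; 3: the quotient is truncated
        %q2 = udiv exact i32 8, 2         ; 4: 8 is a multiple of 2
        %q3 = udiv exact i32 7, 2         ; poison: 7 is not a multiple of 2
        ret void
      }
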
8880 .. _i_sdiv:
8881
8882 '``sdiv``' Instruction
8883 ^^^^^^^^^^^^^^^^^^^^^^
8884
8885 Syntax:
8886 """""""
8887
8888 ::
8889
8890       <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
8891       <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
8892
8893 Overview:
8894 """""""""
8895
8896 The '``sdiv``' instruction returns the quotient of its two operands.
8897
8898 Arguments:
8899 """"""""""
8900
8901 The two arguments to the '``sdiv``' instruction must be
8902 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8903 arguments must have identical types.
8904
8905 Semantics:
8906 """"""""""
8907
8908 The value produced is the signed integer quotient of the two operands
8909 rounded towards zero.
8910
8911 Note that signed integer division and unsigned integer division are
8912 distinct operations; for unsigned integer division, use '``udiv``'.
8913
8914 Division by zero is undefined behavior. For vectors, if any element
8915 of the divisor is zero, the operation has undefined behavior.
8916 Overflow also leads to undefined behavior; this is a rare case, but can
8917 occur, for example, by doing a 32-bit division of -2147483648 by -1.
8918
8919 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
8920 a :ref:`poison value <poisonvalues>` if the result would be rounded.
8921
8922 Example:
8923 """"""""
8924
8925 .. code-block:: text
8926
8927       <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
8928
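Because both division by zero and the overflowing case are undefined behavior,
a front end for a language that defines those cases has to test for them
before the '``sdiv``' executes. A minimal sketch of such a guard follows; the
function name ``@checked_sdiv`` and the choice of returning 0 on the guarded
paths are assumptions made for illustration.

.. code-block:: llvm

      define i32 @checked_sdiv(i32 %a, i32 %b) {
      entry:
        %is.zero = icmp eq i32 %b, 0
        %is.min = icmp eq i32 %a, -2147483648
        %is.m1 = icmp eq i32 %b, -1
        %ovf = and i1 %is.min, %is.m1     ; -2147483648 / -1 would overflow
        %bad = or i1 %is.zero, %ovf
        br i1 %bad, label %fallback, label %divide

      divide:                             ; neither undefined case can occur here
        %q = sdiv i32 %a, %b
        ret i32 %q

      fallback:
        ret i32 0
      }
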
8929 .. _i_fdiv:
8930
8931 '``fdiv``' Instruction
8932 ^^^^^^^^^^^^^^^^^^^^^^
8933
8934 Syntax:
8935 """""""
8936
8937 ::
8938
8939       <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8940
8941 Overview:
8942 """""""""
8943
8944 The '``fdiv``' instruction returns the quotient of its two operands.
8945
8946 Arguments:
8947 """"""""""
8948
8949 The two arguments to the '``fdiv``' instruction must be
8950 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8951 floating-point values. Both arguments must have identical types.
8952
8953 Semantics:
8954 """"""""""
8955
8956 The value produced is the floating-point quotient of the two operands.
8957 This instruction is assumed to execute in the default :ref:`floating-point
8958 environment <floatenv>`.
8959 This instruction can also take any number of :ref:`fast-math
8960 flags <fastmath>`, which are optimization hints to enable otherwise
8961 unsafe floating-point optimizations.
8962
8963 Example:
8964 """"""""
8965
8966 .. code-block:: text
8967
8968       <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
8969
8970 .. _i_urem:
8971
8972 '``urem``' Instruction
8973 ^^^^^^^^^^^^^^^^^^^^^^
8974
8975 Syntax:
8976 """""""
8977
8978 ::
8979
8980       <result> = urem <ty> <op1>, <op2>   ; yields ty:result
8981
8982 Overview:
8983 """""""""
8984
8985 The '``urem``' instruction returns the remainder from the unsigned
8986 division of its two arguments.
8987
8988 Arguments:
8989 """"""""""
8990
8991 The two arguments to the '``urem``' instruction must be
8992 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8993 arguments must have identical types.
8994
8995 Semantics:
8996 """"""""""
8997
8998 This instruction returns the unsigned integer *remainder* of a division.
8999 This instruction always performs an unsigned division to get the
9000 remainder.
9001
9002 Note that unsigned integer remainder and signed integer remainder are
9003 distinct operations; for signed integer remainder, use '``srem``'.
9004
9005 Taking the remainder of a division by zero is undefined behavior.
9006 For vectors, if any element of the divisor is zero, the operation has
9007 undefined behavior.
9008
9009 Example:
9010 """"""""
9011
9012 .. code-block:: text
9013
9014       <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
9015
9016 .. _i_srem:
9017
9018 '``srem``' Instruction
9019 ^^^^^^^^^^^^^^^^^^^^^^
9020
9021 Syntax:
9022 """""""
9023
9024 ::
9025
9026       <result> = srem <ty> <op1>, <op2>   ; yields ty:result
9027
9028 Overview:
9029 """""""""
9030
9031 The '``srem``' instruction returns the remainder from the signed
9032 division of its two operands. This instruction can also take
9033 :ref:`vector <t_vector>` versions of the values in which case the elements
9034 must be integers.
9035
9036 Arguments:
9037 """"""""""
9038
9039 The two arguments to the '``srem``' instruction must be
9040 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9041 arguments must have identical types.
9042
9043 Semantics:
9044 """"""""""
9045
9046 This instruction returns the *remainder* of a division (where the result
9047 is either zero or has the same sign as the dividend, ``op1``), not the
9048 *modulo* operator (where the result is either zero or has the same sign
9049 as the divisor, ``op2``) of a value. For more information about the
9050 difference, see `The Math
9051 Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9052 table of how this is implemented in various languages, please see
9053 `Wikipedia: modulo
9054 operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9055
9056 Note that signed integer remainder and unsigned integer remainder are
9057 distinct operations; for unsigned integer remainder, use '``urem``'.
9058
9059 Taking the remainder of a division by zero is undefined behavior.
9060 For vectors, if any element of the divisor is zero, the operation has
9061 undefined behavior.
9062 Overflow also leads to undefined behavior; this is a rare case, but can
9063 occur, for example, by taking the remainder of a 32-bit division of
9064 -2147483648 by -1. (The remainder doesn't actually overflow, but this
9065 rule lets ``srem`` be implemented using instructions that return both the
9066 result of the division and the remainder.)
9067
9068 Example:
9069 """"""""
9070
9071 .. code-block:: text
9072
9073       <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
9074
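As a brief illustration of the sign rule described above, the following
sketch applies ``srem`` to constant operands. The result names ``%a`` through
``%d`` are illustrative only. In each case the remainder is either zero or
takes the sign of the dividend:

.. code-block:: llvm

      %a = srem i32  7,  3    ; yields i32:a = 1
      %b = srem i32 -7,  3    ; yields i32:b = -1
      %c = srem i32  7, -3    ; yields i32:c = 1
      %d = srem i32 -7, -3    ; yields i32:d = -1
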
9075 .. _i_frem:
9076
9077 '``frem``' Instruction
9078 ^^^^^^^^^^^^^^^^^^^^^^
9079
9080 Syntax:
9081 """""""
9082
9083 ::
9084
9085       <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9086
9087 Overview:
9088 """""""""
9089
9090 The '``frem``' instruction returns the remainder from the division of
9091 its two operands.
9092
9093 Arguments:
9094 """"""""""
9095
9096 The two arguments to the '``frem``' instruction must be
9097 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9098 floating-point values. Both arguments must have identical types.
9099
9100 Semantics:
9101 """"""""""
9102
9103 The value produced is the floating-point remainder of the two operands.
9104 This is the same output as a libm '``fmod``' function, but without any
9105 possibility of setting ``errno``. The remainder has the same sign as the
9106 dividend.
9107 This instruction is assumed to execute in the default :ref:`floating-point
9108 environment <floatenv>`.
9109 This instruction can also take any number of :ref:`fast-math
9110 flags <fastmath>`, which are optimization hints to enable otherwise
9111 unsafe floating-point optimizations.
9112
9113 Example:
9114 """"""""
9115
9116 .. code-block:: text
9117
9118       <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
9119
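To make the ``fmod``-like behavior concrete, here is a small sketch with
constant operands. The result names are illustrative only. As with ``srem``,
the sign of the result follows the dividend:

.. code-block:: llvm

      %a = frem float  5.5,  2.0    ; yields float:a = 1.5
      %b = frem float -5.5,  2.0    ; yields float:b = -1.5
      %c = frem float  5.5, -2.0    ; yields float:c = 1.5
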
9120 .. _bitwiseops:
9121
9122 Bitwise Binary Operations
9123 -------------------------
9124
9125 Bitwise binary operators are used to do various forms of bit-twiddling
9126 in a program. They are generally very efficient instructions and can
9127 commonly be strength reduced from other instructions. They require two
9128 operands of the same type, execute an operation on them, and produce a
9129 single value. The resulting value is the same type as its operands.
9130
9131 .. _i_shl:
9132
9133 '``shl``' Instruction
9134 ^^^^^^^^^^^^^^^^^^^^^
9135
9136 Syntax:
9137 """""""
9138
9139 ::
9140
9141       <result> = shl <ty> <op1>, <op2>           ; yields ty:result
9142       <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
9143       <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
9144       <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
9145
9146 Overview:
9147 """""""""
9148
9149 The '``shl``' instruction returns the first operand shifted to the left
9150 a specified number of bits.
9151
9152 Arguments:
9153 """"""""""
9154
9155 Both arguments to the '``shl``' instruction must be the same
9156 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9157 '``op2``' is treated as an unsigned value.
9158
9159 Semantics:
9160 """"""""""
9161
9162 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9163 where ``n`` is the width of the result. If ``op2`` is (statically or
9164 dynamically) equal to or larger than the number of bits in
9165 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9166 If the arguments are vectors, each vector element of ``op1`` is shifted
9167 by the corresponding shift amount in ``op2``.
9168
9169 If the ``nuw`` keyword is present, then the shift produces a poison
9170 value if it shifts out any non-zero bits.
9171 If the ``nsw`` keyword is present, then the shift produces a poison
9172 value if it shifts out any bits that disagree with the resultant sign bit.
9173
9174 Example:
9175 """"""""
9176
9177 .. code-block:: text
9178
9179       <result> = shl i32 4, %var   ; yields i32: 4 << %var
9180       <result> = shl i32 4, 2      ; yields i32: 16
9181       <result> = shl i32 1, 10     ; yields i32: 1024
9182       <result> = shl i32 1, 32     ; yields i32: poison
9183       <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
9184
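The effect of the ``nuw`` and ``nsw`` keywords can be sketched on ``i8``
values, where the bit patterns are easy to follow. The result names are
illustrative only:

.. code-block:: llvm

      %a = shl i8 64, 1          ; yields i8: -128 (0x80)
      %b = shl nuw i8 64, 1      ; yields i8: -128; only a zero bit is shifted out
      %c = shl nsw i8 64, 1      ; poison: the shifted-out bit (0) disagrees with the new sign bit (1)
      %d = shl nuw i8 -128, 1    ; poison: a non-zero bit is shifted out
      %e = shl nsw i8 -64, 1     ; yields i8: -128; the shifted-out bit (1) matches the new sign bit (1)
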
9185 .. _i_lshr:
9186
9187
9188 '``lshr``' Instruction
9189 ^^^^^^^^^^^^^^^^^^^^^^
9190
9191 Syntax:
9192 """""""
9193
9194 ::
9195
9196       <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
9197       <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
9198
9199 Overview:
9200 """""""""
9201
9202 The '``lshr``' instruction (logical shift right) returns the first
9203 operand shifted to the right a specified number of bits with zero fill.
9204
9205 Arguments:
9206 """"""""""
9207
9208 Both arguments to the '``lshr``' instruction must be the same
9209 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9210 '``op2``' is treated as an unsigned value.
9211
9212 Semantics:
9213 """"""""""
9214
9215 This instruction always performs a logical shift right operation. The
9216 most significant bits of the result will be filled with zero bits after
9217 the shift. If ``op2`` is (statically or dynamically) equal to or larger
9218 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9219 value <poisonvalues>`. If the arguments are vectors, each vector element
9220 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9221
9222 If the ``exact`` keyword is present, the result value of the ``lshr`` is
9223 a poison value if any of the bits shifted out are non-zero.
9224
9225 Example:
9226 """"""""
9227
9228 .. code-block:: text
9229
9230       <result> = lshr i32 4, 1   ; yields i32:result = 2
9231       <result> = lshr i32 4, 2   ; yields i32:result = 1
9232       <result> = lshr i8  4, 3   ; yields i8:result = 0
9233       <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
9234       <result> = lshr i32 1, 32  ; yields i32:result = poison
9235       <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9236
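A short sketch of the ``exact`` keyword, using illustrative result names: the
shift is only well defined when every bit that is shifted out is zero.

.. code-block:: llvm

      %a = lshr i32 9, 2          ; yields i32:a = 2
      %b = lshr exact i32 8, 2    ; yields i32:b = 2; both shifted-out bits are zero
      %c = lshr exact i32 9, 2    ; poison: bit 0 of the operand is set and is shifted out
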
9237 .. _i_ashr:
9238
9239 '``ashr``' Instruction
9240 ^^^^^^^^^^^^^^^^^^^^^^
9241
9242 Syntax:
9243 """""""
9244
9245 ::
9246
9247       <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
9248       <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
9249
9250 Overview:
9251 """""""""
9252
9253 The '``ashr``' instruction (arithmetic shift right) returns the first
9254 operand shifted to the right a specified number of bits with sign
9255 extension.
9256
9257 Arguments:
9258 """"""""""
9259
9260 Both arguments to the '``ashr``' instruction must be the same
9261 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9262 '``op2``' is treated as an unsigned value.
9263
9264 Semantics:
9265 """"""""""
9266
9267 This instruction always performs an arithmetic shift right operation.
9268 The most significant bits of the result will be filled with the sign bit
9269 of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
9270 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9271 value <poisonvalues>`. If the arguments are vectors, each vector element
9272 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9273
9274 If the ``exact`` keyword is present, the result value of the ``ashr`` is
9275 a poison value if any of the bits shifted out are non-zero.
9276
9277 Example:
9278 """"""""
9279
9280 .. code-block:: text
9281
9282       <result> = ashr i32 4, 1   ; yields i32:result = 2
9283       <result> = ashr i32 4, 2   ; yields i32:result = 1
9284       <result> = ashr i8  4, 3   ; yields i8:result = 0
9285       <result> = ashr i8 -2, 1   ; yields i8:result = -1
9286       <result> = ashr i32 1, 32  ; yields i32:result = poison
9287       <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
9288
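The difference between the logical and arithmetic right shifts shows up only
for negative operands. The following sketch (result names are illustrative)
shifts the same ``i8`` bit pattern, ``0xF0``, both ways:

.. code-block:: llvm

      %l = lshr i8 -16, 2          ; yields i8:l = 60 (0x3C), zero fill
      %a = ashr i8 -16, 2          ; yields i8:a = -4 (0xFC), sign fill
      %e = ashr exact i8 -16, 2    ; yields i8:e = -4; both shifted-out bits are zero
      %p = ashr exact i8 -15, 2    ; poison: a non-zero bit is shifted out
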
9289 .. _i_and:
9290
9291 '``and``' Instruction
9292 ^^^^^^^^^^^^^^^^^^^^^
9293
9294 Syntax:
9295 """""""
9296
9297 ::
9298
9299       <result> = and <ty> <op1>, <op2>   ; yields ty:result
9300
9301 Overview:
9302 """""""""
9303
9304 The '``and``' instruction returns the bitwise logical and of its two
9305 operands.
9306
9307 Arguments:
9308 """"""""""
9309
9310 The two arguments to the '``and``' instruction must be
9311 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9312 arguments must have identical types.
9313
9314 Semantics:
9315 """"""""""
9316
9317 The truth table used for the '``and``' instruction is:
9318
9319 +-----+-----+-----+
9320 | In0 | In1 | Out |
9321 +-----+-----+-----+
9322 |   0 |   0 |   0 |
9323 +-----+-----+-----+
9324 |   0 |   1 |   0 |
9325 +-----+-----+-----+
9326 |   1 |   0 |   0 |
9327 +-----+-----+-----+
9328 |   1 |   1 |   1 |
9329 +-----+-----+-----+
9330
9331 Example:
9332 """"""""
9333
9334 .. code-block:: text
9335
9336       <result> = and i32 4, %var         ; yields i32:result = 4 & %var
9337       <result> = and i32 15, 40          ; yields i32:result = 8
9338       <result> = and i32 4, 8            ; yields i32:result = 0
9339
9340 .. _i_or:
9341
9342 '``or``' Instruction
9343 ^^^^^^^^^^^^^^^^^^^^
9344
9345 Syntax:
9346 """""""
9347
9348 ::
9349
9350       <result> = or <ty> <op1>, <op2>   ; yields ty:result
9351
9352 Overview:
9353 """""""""
9354
9355 The '``or``' instruction returns the bitwise logical inclusive or of its
9356 two operands.
9357
9358 Arguments:
9359 """"""""""
9360
9361 The two arguments to the '``or``' instruction must be
9362 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9363 arguments must have identical types.
9364
9365 Semantics:
9366 """"""""""
9367
9368 The truth table used for the '``or``' instruction is:
9369
9370 +-----+-----+-----+
9371 | In0 | In1 | Out |
9372 +-----+-----+-----+
9373 |   0 |   0 |   0 |
9374 +-----+-----+-----+
9375 |   0 |   1 |   1 |
9376 +-----+-----+-----+
9377 |   1 |   0 |   1 |
9378 +-----+-----+-----+
9379 |   1 |   1 |   1 |
9380 +-----+-----+-----+
9381
9382 Example:
9383 """"""""
9384
9385 .. code-block:: text
9386
9387       <result> = or i32 4, %var         ; yields i32:result = 4 | %var
9388       <result> = or i32 15, 40          ; yields i32:result = 47
9389       <result> = or i32 4, 8            ; yields i32:result = 12
9390
9391 .. _i_xor:
9392
9393 '``xor``' Instruction
9394 ^^^^^^^^^^^^^^^^^^^^^
9395
9396 Syntax:
9397 """""""
9398
9399 ::
9400
9401       <result> = xor <ty> <op1>, <op2>   ; yields ty:result
9402
9403 Overview:
9404 """""""""
9405
9406 The '``xor``' instruction returns the bitwise logical exclusive or of
9407 its two operands. The ``xor`` is used to implement the "one's
9408 complement" operation, which is the "~" operator in C.
9409
9410 Arguments:
9411 """"""""""
9412
9413 The two arguments to the '``xor``' instruction must be
9414 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9415 arguments must have identical types.
9416
9417 Semantics:
9418 """"""""""
9419
9420 The truth table used for the '``xor``' instruction is:
9421
9422 +-----+-----+-----+
9423 | In0 | In1 | Out |
9424 +-----+-----+-----+
9425 |   0 |   0 |   0 |
9426 +-----+-----+-----+
9427 |   0 |   1 |   1 |
9428 +-----+-----+-----+
9429 |   1 |   0 |   1 |
9430 +-----+-----+-----+
9431 |   1 |   1 |   0 |
9432 +-----+-----+-----+
9433
9434 Example:
9435 """"""""
9436
9437 .. code-block:: text
9438
9439       <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
9440       <result> = xor i32 15, 40          ; yields i32:result = 39
9441       <result> = xor i32 4, 8            ; yields i32:result = 12
9442       <result> = xor i32 %V, -1          ; yields i32:result = ~%V
9443
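The same complement idiom extends to ``i1`` and vector operands. In this
sketch, ``%b``, ``%v`` and ``%x`` are hypothetical values of the indicated
types:

.. code-block:: llvm

      %not  = xor i1 %b, true                                        ; logical negation of an i1
      %vnot = xor <4 x i32> %v, <i32 -1, i32 -1, i32 -1, i32 -1>     ; ~%v applied to each element
      %flip = xor i32 %x, -2147483648                                ; toggles only the sign bit of %x
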
9444 Vector Operations
9445 -----------------
9446
9447 LLVM supports several instructions to represent vector operations in a
9448 target-independent manner. These instructions cover the element-access
9449 and vector-specific operations needed to process vectors effectively.
9450 While LLVM does directly support these vector operations, many
9451 sophisticated algorithms will want to use target-specific intrinsics to
9452 take full advantage of a specific target.
9453
9454 .. _i_extractelement:
9455
9456 '``extractelement``' Instruction
9457 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9458
9459 Syntax:
9460 """""""
9461
9462 ::
9463
9464       <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
9465       <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
9466
9467 Overview:
9468 """""""""
9469
9470 The '``extractelement``' instruction extracts a single scalar element
9471 from a vector at a specified index.
9472
9473 Arguments:
9474 """"""""""
9475
9476 The first operand of an '``extractelement``' instruction is a value of
9477 :ref:`vector <t_vector>` type. The second operand is an index indicating
9478 the position from which to extract the element. The index may be a
9479 variable of any integer type.
9480
9481 Semantics:
9482 """"""""""
9483
9484 The result is a scalar of the same type as the element type of ``val``.
9485 Its value is the value at position ``idx`` of ``val``. If ``idx`` is
9486 greater than or equal to the length of ``val`` for a fixed-length vector,
9487 the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
9488 if the value of ``idx`` is greater than or equal to the runtime length of
9489 the vector, the result is a :ref:`poison value <poisonvalues>`.
9490
9491 Example:
9492 """"""""
9493
9494 .. code-block:: text
9495
9496       <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
9497
9498 .. _i_insertelement:
9499
9500 '``insertelement``' Instruction
9501 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9502
9503 Syntax:
9504 """""""
9505
9506 ::
9507
9508       <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
9509       <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
9510
9511 Overview:
9512 """""""""
9513
9514 The '``insertelement``' instruction inserts a scalar element into a
9515 vector at a specified index.
9516
9517 Arguments:
9518 """"""""""
9519
9520 The first operand of an '``insertelement``' instruction is a value of
9521 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
9522 type must equal the element type of the first operand. The third operand
9523 is an index indicating the position at which to insert the value. The
9524 index may be a variable of any integer type.
9525
9526 Semantics:
9527 """"""""""
9528
9529 The result is a vector of the same type as ``val``. Its element values
9530 are those of ``val`` except at position ``idx``, where it gets the value
9531 ``elt``. If ``idx`` is greater than or equal to the length of ``val`` for a
9532 fixed-length vector, the result is a :ref:`poison value <poisonvalues>`. For
9533 a scalable vector, if the value of ``idx`` is greater than or equal to the
9534 runtime length of the vector, the result is a :ref:`poison value <poisonvalues>`.
9535
9536 Example:
9537 """"""""
9538
9539 .. code-block:: text
9540
9541       <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
9542
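A common pattern is to build a vector one element at a time with a chain of
``insertelement`` instructions and then read elements back with
``extractelement``. The function below is a minimal sketch; its name and
parameters are hypothetical:

.. code-block:: llvm

      define float @second_of_pair(float %x, float %y) {
        ; Start from an undefined vector and fill in both lanes.
        %v0 = insertelement <2 x float> undef, float %x, i32 0
        %v1 = insertelement <2 x float> %v0, float %y, i32 1
        ; Lane 1 now holds %y.
        %e = extractelement <2 x float> %v1, i32 1
        ret float %e
      }
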
9543 .. _i_shufflevector:
9544
9545 '``shufflevector``' Instruction
9546 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9547
9548 Syntax:
9549 """""""
9550
9551 ::
9552
9553       <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
9554       <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>
9555
9556 Overview:
9557 """""""""
9558
9559 The '``shufflevector``' instruction constructs a permutation of elements
9560 from two input vectors, returning a vector with the same element type as
9561 the input and length that is the same as the shuffle mask.
9562
9563 Arguments:
9564 """"""""""
9565
9566 The first two operands of a '``shufflevector``' instruction are vectors
9567 with the same type. The third argument is a shuffle mask vector constant
9568 whose element type is ``i32``. The mask vector elements must be constant
9569 integers or ``undef`` values. The result of the instruction is a vector
9570 whose length is the same as the shuffle mask and whose element type is the
9571 same as the element type of the first two operands.
9572
9573 Semantics:
9574 """"""""""
9575
9576 The elements of the two input vectors are numbered from left to right
9577 across both of the vectors. For each element of the result vector, the
9578 shuffle mask selects an element from one of the input vectors to copy
9579 to the result. Non-negative elements in the mask represent an index
9580 into the concatenated pair of input vectors.
9581
9582 If the shuffle mask is undefined, the result vector is undefined. If
9583 the shuffle mask selects an undefined element from one of the input
9584 vectors, the resulting element is undefined. An undefined element
9585 in the mask vector specifies that the resulting element is undefined.
9586 An undefined element in the mask vector prevents a poisoned vector
9587 element from propagating.
9588
9589 For scalable vectors, the only valid mask values at present are
9590 ``zeroinitializer`` and ``undef``, since we cannot write all indices as
9591 literals for a vector with a length unknown at compile time.
9592
9593 Example:
9594 """"""""
9595
9596 .. code-block:: text
9597
9598       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9599                               <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
9600       <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
9601                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
9602       <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
9603                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
9604       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9605                               <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>  ; yields <8 x i32>
9606
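A few frequently seen mask shapes are sketched below. Here ``%x`` is a
hypothetical ``i32`` value and ``%v``, ``%a`` and ``%b`` are hypothetical
``<4 x i32>`` values. Mask indices 0 to 3 select from the first input and
indices 4 to 7 select from the second:

.. code-block:: llvm

      ; Splat: broadcast lane 0 of %ins to every lane of the result.
      %ins   = insertelement <4 x i32> undef, i32 %x, i32 0
      %splat = shufflevector <4 x i32> %ins, <4 x i32> undef,
                             <4 x i32> zeroinitializer

      ; Reverse the lanes of %v.
      %rev   = shufflevector <4 x i32> %v, <4 x i32> undef,
                             <4 x i32> <i32 3, i32 2, i32 1, i32 0>

      ; Blend: lanes 0 and 1 from %a, lanes 2 and 3 from %b.
      %blend = shufflevector <4 x i32> %a, <4 x i32> %b,
                             <4 x i32> <i32 0, i32 1, i32 6, i32 7>
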
9607 Aggregate Operations
9608 --------------------
9609
9610 LLVM supports several instructions for working with
9611 :ref:`aggregate <t_aggregate>` values.
9612
9613 .. _i_extractvalue:
9614
9615 '``extractvalue``' Instruction
9616 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9617
9618 Syntax:
9619 """""""
9620
9621 ::
9622
9623       <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
9624
9625 Overview:
9626 """""""""
9627
9628 The '``extractvalue``' instruction extracts the value of a member field
9629 from an :ref:`aggregate <t_aggregate>` value.
9630
9631 Arguments:
9632 """"""""""
9633
9634 The first operand of an '``extractvalue``' instruction is a value of
9635 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
9636 constant indices to specify which value to extract in a similar manner
9637 as indices in a '``getelementptr``' instruction.
9638
9639 The major differences from ``getelementptr`` indexing are:
9640
9641 -  Since the value being indexed is not a pointer, the first index is
9642    omitted and assumed to be zero.
9643 -  At least one index must be specified.
9644 -  Not only struct indices but also array indices must be in bounds.
9645
9646 Semantics:
9647 """"""""""
9648
9649 The result is the value at the position in the aggregate specified by
9650 the index operands.
9651
9652 Example:
9653 """"""""
9654
9655 .. code-block:: text
9656
9657       <result> = extractvalue {i32, float} %agg, 0    ; yields i32
9658
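When the aggregate is nested, additional indices descend one level each. The
sketch below assumes a hypothetical value ``%agg`` of the nested type shown.
No leading zero index is needed, unlike with ``getelementptr``:

.. code-block:: llvm

      %inner = extractvalue {i32, {float, [2 x i8]}} %agg, 1          ; yields {float, [2 x i8]}
      %f     = extractvalue {i32, {float, [2 x i8]}} %agg, 1, 0       ; yields float
      %byte  = extractvalue {i32, {float, [2 x i8]}} %agg, 1, 1, 1    ; yields i8
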
9659 .. _i_insertvalue:
9660
9661 '``insertvalue``' Instruction
9662 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9663
9664 Syntax:
9665 """""""
9666
9667 ::
9668
9669       <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
9670
9671 Overview:
9672 """""""""
9673
9674 The '``insertvalue``' instruction inserts a value into a member field in
9675 an :ref:`aggregate <t_aggregate>` value.
9676
9677 Arguments:
9678 """"""""""
9679
9680 The first operand of an '``insertvalue``' instruction is a value of
9681 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
9682 a first-class value to insert. The following operands are constant
9683 indices indicating the position at which to insert the value in a
9684 similar manner as indices in an '``extractvalue``' instruction. The value
9685 to insert must have the same type as the value identified by the
9686 indices.
9687
9688 Semantics:
9689 """"""""""
9690
9691 The result is an aggregate of the same type as ``val``. Its value is
9692 that of ``val`` except that the value at the position specified by the
9693 indices is that of ``elt``.
9694
9695 Example:
9696 """"""""
9697
9698 .. code-block:: llvm
9699
9700       %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
9701       %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
9702       %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
9703
9704 .. _memoryops:
9705
9706 Memory Access and Addressing Operations
9707 ---------------------------------------
9708
9709 A key design point of an SSA-based representation is how it represents
9710 memory. In LLVM, no memory locations are in SSA form, which makes things
9711 very simple. This section describes how to read, write, and allocate
9712 memory in LLVM.
9713
9714 .. _i_alloca:
9715
9716 '``alloca``' Instruction
9717 ^^^^^^^^^^^^^^^^^^^^^^^^
9718
9719 Syntax:
9720 """""""
9721
9722 ::
9723
9724       <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
9725
9726 Overview:
9727 """""""""
9728
9729 The '``alloca``' instruction allocates memory on the stack frame of the
9730 currently executing function, to be automatically released when this
9731 function returns to its caller.  If the address space is not explicitly
9732 specified, the object is allocated in the alloca address space from the
9733 :ref:`datalayout string<langref_datalayout>`.
9734
9735 Arguments:
9736 """"""""""
9737
9738 The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
9739 bytes of memory on the runtime stack, returning a pointer of the
9740 appropriate type to the program. If ``NumElements`` is specified, it is
9741 the number of elements allocated; otherwise ``NumElements`` defaults
9742 to one. If a constant alignment is specified, the result of the
9743 allocation is guaranteed to be aligned to at least that boundary. The
9744 alignment may not be greater than ``1 << 32``. If not specified, or if
9745 zero, the target can choose to align the allocation on any convenient
9746 boundary compatible with the type.
9747
9748 '``type``' may be any sized type.
9749
9750 Semantics:
9751 """"""""""
9752
9753 Memory is allocated; a pointer is returned. The allocated memory is
9754 uninitialized, and loading from uninitialized memory produces an undefined
9755 value. The operation itself is undefined if there is insufficient stack
9756 space for the allocation. '``alloca``'d memory is automatically released
9757 when the function returns. The '``alloca``' instruction is commonly used
9758 to represent automatic variables that must have an address available. When
9759 the function returns (either with the ``ret`` or ``resume`` instructions),
9760 the memory is reclaimed. Allocating zero bytes is legal, but the returned
9761 pointer may not be unique. The order in which memory is allocated (i.e.,
9762 which way the stack grows) is not specified.
9763
9764 Note that '``alloca``' outside of the alloca address space from the
9765 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
9766 target has assigned it a semantics.
9767
9768 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
9769 the returned object is initially dead.
9770 See :ref:`llvm.lifetime.start <int_lifestart>` and
9771 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
9772 lifetime-manipulating intrinsics.
9773
9774 Example:
9775 """"""""
9776
9777 .. code-block:: llvm
9778
9779       %ptr = alloca i32                             ; yields i32*:ptr
9780       %ptr = alloca i32, i32 4                      ; yields i32*:ptr
9781       %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
9782       %ptr = alloca i32, align 1024                 ; yields i32*:ptr
9783
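The following function is a minimal sketch of how a stack slot is typically
used: the memory is allocated, written, read back, and released implicitly
when the function returns. The function name and value names are
hypothetical, and the typed-pointer syntax matches the rest of this document:

.. code-block:: llvm

      define i32 @sum_pair(i32 %a, i32 %b) {
      entry:
        ; Reserve space for two i32 values in the current stack frame.
        %buf = alloca [2 x i32], align 4
        %p0 = getelementptr inbounds [2 x i32], [2 x i32]* %buf, i64 0, i64 0
        %p1 = getelementptr inbounds [2 x i32], [2 x i32]* %buf, i64 0, i64 1
        ; The memory is uninitialized until it is stored to.
        store i32 %a, i32* %p0, align 4
        store i32 %b, i32* %p1, align 4
        %x = load i32, i32* %p0, align 4
        %y = load i32, i32* %p1, align 4
        %s = add i32 %x, %y
        ; %buf is released automatically on return.
        ret i32 %s
      }
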
9784 .. _i_load:
9785
9786 '``load``' Instruction
9787 ^^^^^^^^^^^^^^^^^^^^^^
9788
9789 Syntax:
9790 """""""
9791
9792 ::
9793
9794       <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
9795       <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
9796       !<nontemp_node> = !{ i32 1 }
9797       !<empty_node> = !{}
9798       !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
9799       !<align_node> = !{ i64 <value_alignment> }
9800
9801 Overview:
9802 """""""""
9803
9804 The '``load``' instruction is used to read from memory.
9805
9806 Arguments:
9807 """"""""""
9808
9809 The argument to the ``load`` instruction specifies the memory address from which
9810 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
9811 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
9812 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
9813 modify the number or order of execution of this ``load`` with other
9814 :ref:`volatile operations <volatile>`.
9815
9816 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
9817 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9818 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
9819 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9820 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9821 floating-point type whose bit width is a power of two greater than or equal to
9822 eight and less than or equal to a target-specific size limit.  ``align`` must be
9823 explicitly specified on atomic loads, and the load has undefined behavior if the
9824 alignment is not set to a value which is at least the size in bytes of the
9825 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
9826
9827 The optional constant ``align`` argument specifies the alignment of the
9828 operation (that is, the alignment of the memory address). A value of 0
9829 or an omitted ``align`` argument means that the operation has the ABI
9830 alignment for the target. It is the responsibility of the code emitter
9831 to ensure that the alignment information is correct. Overestimating the
9832 alignment results in undefined behavior. Underestimating the alignment
9833 may produce less efficient code. An alignment of 1 is always safe. The
9834 maximum possible alignment is ``1 << 32``. An alignment value higher
9835 than the size of the loaded type implies memory up to the alignment
9836 value bytes can be safely loaded without trapping in the default
9837 address space. Accessing the high bytes can interfere with debugging
9838 tools, so they should not be accessed if the function has the
9839 ``sanitize_thread`` or ``sanitize_address`` attributes.
9840
9841 The optional ``!nontemporal`` metadata must reference a single
9842 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
9843 ``i32`` entry of value 1. The existence of the ``!nontemporal``
9844 metadata on the instruction tells the optimizer and code generator
9845 that this load is not expected to be reused in the cache. The code
9846 generator may select special instructions to save cache bandwidth, such
9847 as the ``MOVNT`` instruction on x86.
9848
9849 The optional ``!invariant.load`` metadata must reference a single
9850 metadata name ``<empty_node>`` corresponding to a metadata node with no
9851 entries. If a load instruction tagged with the ``!invariant.load``
9852 metadata is executed, the memory location referenced by the load has
9853 to contain the same value at all points in the program where the
9854 memory location is dereferenceable; otherwise, the behavior is
9855 undefined.
9856
9857 The optional ``!invariant.group`` metadata must reference a single metadata name
9858 ``<empty_node>`` corresponding to a metadata node with no entries.
9859 See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
9860
9861 The optional ``!nonnull`` metadata must reference a single
9862 metadata name ``<empty_node>`` corresponding to a metadata node with no
9863 entries. The existence of the ``!nonnull`` metadata on the
9864 instruction tells the optimizer that the value loaded is known to
9865 never be null. If the value is null at runtime, the behavior is undefined.
9866 This is analogous to the ``nonnull`` attribute on parameters and return
9867 values. This metadata can only be applied to loads of a pointer type.
9868
9869 The optional ``!dereferenceable`` metadata must reference a single metadata
9870 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
9871 entry.
9872 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
9873
9874 The optional ``!dereferenceable_or_null`` metadata must reference a single
9875 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
9876 ``i64`` entry.
9877 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
9878 <md_dereferenceable_or_null>`.
9879
9880 The optional ``!align`` metadata must reference a single metadata name
9881 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
9882 The existence of the ``!align`` metadata on the instruction tells the
9883 optimizer that the value loaded is known to be aligned to a boundary specified
9884 by the integer value in the metadata node. The alignment must be a power of 2.
9885 This is analogous to the ``align`` attribute on parameters and return values.
9886 This metadata can only be applied to loads of a pointer type. If the returned
9887 value is not appropriately aligned at runtime, the behavior is undefined.
9888
9889 The optional ``!noundef`` metadata must reference a single metadata name
9890 ``<empty_node>`` corresponding to a node with no entries. The existence of
9891 ``!noundef`` metadata on the instruction tells the optimizer that the value
9892 loaded is known to be :ref:`well defined <welldefinedvalues>`.
9893 If the value isn't well defined, the behavior is undefined.
9894
9895 Semantics:
9896 """"""""""
9897
9898 The location of memory pointed to is loaded. If the value being loaded
9899 is of scalar type then the number of bytes read does not exceed the
9900 minimum number of bytes needed to hold all bits of the type. For
9901 example, loading an ``i24`` reads at most three bytes. When loading a
9902 value of a type like ``i20`` with a size that is not an integral number
9903 of bytes, the result is undefined if the value was not originally
9904 written using a store of the same type.
9905 If the value being loaded is of aggregate type, the bytes that correspond to
9906 padding may be accessed but are ignored, because it is impossible to observe
9907 padding from the loaded aggregate value.
9908 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9909
9910 Examples:
9911 """""""""
9912
9913 .. code-block:: llvm
9914
9915       %ptr = alloca i32                               ; yields i32*:ptr
9916       store i32 3, i32* %ptr                          ; yields void
9917       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
9918
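The optional parts of the syntax can be combined as in the sketch below,
which shows a pointer load annotated with ``!nonnull`` and ``!align``
metadata, a volatile load, and an atomic load. The function names and
metadata numbering are hypothetical, and the alignment values are assumptions
made for the sake of the example:

.. code-block:: llvm

      define i32* @load_ptr(i32** %pp) {
        ; The loaded pointer is asserted to be non-null and 8-byte aligned.
        %p = load i32*, i32** %pp, align 8, !nonnull !0, !align !1
        ret i32* %p
      }

      define i32 @load_volatile(i32* %p) {
        ; Not removed or reordered with respect to other volatile operations.
        %v = load volatile i32, i32* %p, align 4
        ret i32 %v
      }

      define i32 @load_acquire(i32* %p) {
        ; Atomic loads require an ordering and an explicit alignment.
        %v = load atomic i32, i32* %p acquire, align 4
        ret i32 %v
      }

      !0 = !{}
      !1 = !{ i64 8 }
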
9919 .. _i_store:
9920
9921 '``store``' Instruction
9922 ^^^^^^^^^^^^^^^^^^^^^^^
9923
9924 Syntax:
9925 """""""
9926
9927 ::
9928
9929       store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
9930       store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
9931       !<nontemp_node> = !{ i32 1 }
9932       !<empty_node> = !{}
9933
9934 Overview:
9935 """""""""
9936
9937 The '``store``' instruction is used to write to memory.
9938
9939 Arguments:
9940 """"""""""
9941
9942 There are two arguments to the ``store`` instruction: a value to store and an
9943 address at which to store it. The type of the ``<pointer>`` operand must be a
9944 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
9945 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
9946 allowed to modify the number or order of execution of this ``store`` with other
9947 :ref:`volatile operations <volatile>`.  Only values of :ref:`first class
9948 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
9949 structural type <t_opaque>`) can be stored.
9950
9951 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
9952 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9953 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
9954 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9955 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9956 floating-point type whose bit width is a power of two greater than or equal to
9957 eight and less than or equal to a target-specific size limit.  ``align`` must be
9958 explicitly specified on atomic stores, and the store has undefined behavior if
9959 the alignment is not set to a value which is at least the size in bytes of the
9960 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
9961
9962 The optional constant ``align`` argument specifies the alignment of the
9963 operation (that is, the alignment of the memory address). A value of 0
9964 or an omitted ``align`` argument means that the operation has the ABI
9965 alignment for the target. It is the responsibility of the code emitter
9966 to ensure that the alignment information is correct. Overestimating the
9967 alignment results in undefined behavior. Underestimating the
9968 alignment may produce less efficient code. An alignment of 1 is always
9969 safe. The maximum possible alignment is ``1 << 32``. An alignment
9970 value higher than the size of the stored type implies memory up to the
9971 alignment value bytes can be stored to without trapping in the default
9972 address space. Storing to the higher bytes however may result in data
9973 races if another thread can access the same address. Introducing a
9974 data race is not allowed. Storing to the extra bytes is not allowed
9975 even in situations where a data race is known to not exist if the
9976 function has the ``sanitize_address`` attribute.
9977
9978 The optional ``!nontemporal`` metadata must reference a single metadata
9979 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
9980 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
9982 be reused in the cache. The code generator may select special
9983 instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
9984 x86.
9985
9986 The optional ``!invariant.group`` metadata must reference a
9987 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
9988
9989 Semantics:
9990 """"""""""
9991
9992 The contents of memory are updated to contain ``<value>`` at the
9993 location specified by the ``<pointer>`` operand. If ``<value>`` is
9994 of scalar type then the number of bytes written does not exceed the
9995 minimum number of bytes needed to hold all bits of the type. For
9996 example, storing an ``i24`` writes at most three bytes. When writing a
9997 value of a type like ``i20`` with a size that is not an integral number
9998 of bytes, it is unspecified what happens to the extra bits that do not
9999 belong to the type, but they will typically be overwritten.
10000 If ``<value>`` is of aggregate type, padding is filled with
10001 :ref:`undef <undefvalues>`.
10002 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10003
10004 Example:
10005 """"""""
10006
10007 .. code-block:: llvm
10008
10009       %ptr = alloca i32                               ; yields i32*:ptr
10010       store i32 3, i32* %ptr                          ; yields void
10011       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
10012
10013 .. _i_fence:
10014
10015 '``fence``' Instruction
10016 ^^^^^^^^^^^^^^^^^^^^^^^
10017
10018 Syntax:
10019 """""""
10020
10021 ::
10022
10023       fence [syncscope("<target-scope>")] <ordering>  ; yields void
10024
10025 Overview:
10026 """""""""
10027
10028 The '``fence``' instruction is used to introduce happens-before edges
10029 between operations.
10030
10031 Arguments:
10032 """"""""""
10033
10034 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
10035 defines what *synchronizes-with* edges they add. They can only be given
10036 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10037
10038 Semantics:
10039 """"""""""
10040
10041 A fence A which has (at least) ``release`` ordering semantics
10042 *synchronizes with* a fence B with (at least) ``acquire`` ordering
10043 semantics if and only if there exist atomic operations X and Y, both
10044 operating on some atomic object M, such that A is sequenced before X, X
10045 modifies M (either directly or through some side effect of a sequence
10046 headed by X), Y is sequenced before B, and Y observes M. This provides a
10047 *happens-before* dependency between A and B. Rather than an explicit
10048 ``fence``, one (but not both) of the atomic operations X or Y might
10049 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10050 still *synchronize-with* the explicit ``fence`` and establish the
10051 *happens-before* edge.
10052
10053 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10054 ``acquire`` and ``release`` semantics specified above, participates in
10055 the global program order of other ``seq_cst`` operations and/or fences.
10056
10057 A ``fence`` instruction can also take an optional
10058 ":ref:`syncscope <syncscope>`" argument.
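
The fence-to-fence pattern above can be sketched with C11 fences, which Clang
typically lowers to ``fence release`` and ``fence acquire``. The names
``payload``, ``ready``, ``publish`` and ``try_consume`` are hypothetical, and
the flag ``ready`` plays the role of the atomic object M:

.. code-block:: c

    #include <stdatomic.h>

    static int payload;       /* plain, non-atomic data */
    static atomic_int ready;  /* the atomic object M; starts at 0 */

    /* Writer: fence A (release) is sequenced before the atomic store X. */
    void publish(int value) {
        payload = value;
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&ready, 1, memory_order_relaxed);
    }

    /* Reader: the atomic load Y is sequenced before fence B (acquire).
       Returns 1 and fills *out once the flag is observed, 0 otherwise. */
    int try_consume(int *out) {
        if (!atomic_load_explicit(&ready, memory_order_relaxed))
            return 0;
        atomic_thread_fence(memory_order_acquire);
        *out = payload;
        return 1;
    }

If ``try_consume`` observes the flag written by ``publish``, the two fences
synchronize, so the plain read of ``payload`` does not race with the write.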
10059
10060 Example:
10061 """"""""
10062
10063 .. code-block:: text
10064
10065       fence acquire                                        ; yields void
10066       fence syncscope("singlethread") seq_cst              ; yields void
10067       fence syncscope("agent") seq_cst                     ; yields void
10068
10069 .. _i_cmpxchg:
10070
10071 '``cmpxchg``' Instruction
10072 ^^^^^^^^^^^^^^^^^^^^^^^^^
10073
10074 Syntax:
10075 """""""
10076
10077 ::
10078
10079       cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
10080
10081 Overview:
10082 """""""""
10083
10084 The '``cmpxchg``' instruction is used to atomically modify memory. It
10085 loads a value in memory and compares it to a given value. If they are
10086 equal, it tries to store a new value into the memory.
10087
10088 Arguments:
10089 """"""""""
10090
There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10093 address, and a new value to place at that address if the compared values
10094 are equal. The type of '<cmp>' must be an integer or pointer type whose
10095 bit width is a power of two greater than or equal to eight and less
10096 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10097 have the same type, and the type of '<pointer>' must be a pointer to
10098 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10099 optimizer is not allowed to modify the number or order of execution of
10100 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10101
10102 The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, and the failure ordering cannot be either
``release`` or ``acq_rel``.
10106
10107 A ``cmpxchg`` instruction can also take an optional
10108 ":ref:`syncscope <syncscope>`" argument.
10109
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
``<cmp>`` type. If unspecified, the alignment is assumed to be equal to the
size of the ``<cmp>`` type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
10116
10117 The pointer passed into cmpxchg must have alignment greater than or
10118 equal to the size in memory of the operand.
10119
10120 Semantics:
10121 """"""""""
10122
The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10125 written to the location. The original value at the location is returned,
10126 together with a flag indicating success (true) or failure (false).
10127
10128 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10129 permitted: the operation may not write ``<new>`` even if the comparison
10130 matched.
10131
10132 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
10133 if the value loaded equals ``cmp``.
10134
10135 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10136 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
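
As a rough C11 model of the strong form, assuming an ``i32`` operand with
``acq_rel`` success ordering and ``monotonic`` failure ordering (C11 calls the
latter ``memory_order_relaxed``). The struct and function names are
hypothetical:

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Mirrors the { ty, i1 } result of a strong cmpxchg on i32. */
    struct cmpxchg_result {
        int32_t loaded;  /* original value at the location */
        bool success;    /* true if <new> was written */
    };

    struct cmpxchg_result cmpxchg_i32(_Atomic int32_t *ptr, int32_t cmp,
                                      int32_t new_val) {
        int32_t expected = cmp;
        /* On failure, C11 writes the observed value back into `expected`;
           on success `expected` still equals the original value. */
        bool ok = atomic_compare_exchange_strong_explicit(
            ptr, &expected, new_val,
            memory_order_acq_rel, memory_order_relaxed);
        struct cmpxchg_result r = { expected, ok };
        return r;
    }

In both outcomes ``loaded`` holds the value that was in memory before the
operation, matching the first element of the returned pair.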
10138
10139 Example:
10140 """"""""
10141
10142 .. code-block:: llvm
10143
10144     entry:
10145       %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
10146       br label %loop
10147
10148     loop:
10149       %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10150       %squared = mul i32 %cmp, %cmp
10151       %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
10152       %value_loaded = extractvalue { i32, i1 } %val_success, 0
10153       %success = extractvalue { i32, i1 } %val_success, 1
10154       br i1 %success, label %done, label %loop
10155
10156     done:
10157       ...
10158
10159 .. _i_atomicrmw:
10160
10161 '``atomicrmw``' Instruction
10162 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
10163
10164 Syntax:
10165 """""""
10166
10167 ::
10168
10169       atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
10170
10171 Overview:
10172 """""""""
10173
10174 The '``atomicrmw``' instruction is used to atomically modify memory.
10175
10176 Arguments:
10177 """"""""""
10178
There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:
10182
10183 -  xchg
10184 -  add
10185 -  sub
10186 -  and
10187 -  nand
10188 -  or
10189 -  xor
10190 -  max
10191 -  min
10192 -  umax
10193 -  umin
10194 -  fadd
10195 -  fsub
10196
10197 For most of these operations, the type of '<value>' must be an integer
10198 type whose bit width is a power of two greater than or equal to eight
10199 and less than or equal to a target-specific size limit. For xchg, this
10200 may also be a floating point type with the same size constraints as
10201 integers.  For fadd/fsub, this must be a floating point type.  The
10202 type of the '``<pointer>``' operand must be a pointer to that type. If
10203 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10204 allowed to modify the number or order of execution of this
10205 ``atomicrmw`` with other :ref:`volatile operations <volatile>`.
10206
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
``<value>`` type. If unspecified, the alignment is assumed to be equal to the
size of the ``<value>`` type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
10213
An ``atomicrmw`` instruction can also take an optional
10215 ":ref:`syncscope <syncscope>`" argument.
10216
10217 Semantics:
10218 """"""""""
10219
10220 The contents of memory at the location specified by the '``<pointer>``'
10221 operand are atomically read, modified, and written back. The original
10222 value at the location is returned. The modification is specified by the
10223 operation argument:
10224
10225 -  xchg: ``*ptr = val``
10226 -  add: ``*ptr = *ptr + val``
10227 -  sub: ``*ptr = *ptr - val``
10228 -  and: ``*ptr = *ptr & val``
10229 -  nand: ``*ptr = ~(*ptr & val)``
10230 -  or: ``*ptr = *ptr | val``
10231 -  xor: ``*ptr = *ptr ^ val``
10232 -  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10233 -  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10234 -  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10235 -  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
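
C11 has no direct library call for ``nand`` or ``max``, so the sketch below
models those two operations with a compare-exchange loop. This only
illustrates the read-modify-write semantics listed above; it is not how LLVM
lowers the instruction, and the function names are hypothetical:

.. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* nand: *ptr = ~(*ptr & val); returns the original value. */
    uint32_t atomicrmw_nand_u32(_Atomic uint32_t *ptr, uint32_t val) {
        uint32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
        /* Retry until the exchange succeeds; a failed exchange refreshes
           `old` with the current contents of *ptr. */
        while (!atomic_compare_exchange_weak_explicit(
                   ptr, &old, ~(old & val),
                   memory_order_seq_cst, memory_order_relaxed)) {
        }
        return old;
    }

    /* max: *ptr = *ptr > val ? *ptr : val, using a signed comparison. */
    int32_t atomicrmw_max_i32(_Atomic int32_t *ptr, int32_t val) {
        int32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
        while (!atomic_compare_exchange_weak_explicit(
                   ptr, &old, old > val ? old : val,
                   memory_order_seq_cst, memory_order_relaxed)) {
        }
        return old;
    }

Both helpers return the value that was in memory before the update, just as
``atomicrmw`` yields the original value at the location.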
10238
10239 Example:
10240 """"""""
10241
10242 .. code-block:: llvm
10243
10244       %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32
10245
10246 .. _i_getelementptr:
10247
10248 '``getelementptr``' Instruction
10249 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10250
10251 Syntax:
10252 """""""
10253
10254 ::
10255
10256       <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10257       <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10258       <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
10259
10260 Overview:
10261 """""""""
10262
10263 The '``getelementptr``' instruction is used to get the address of a
10264 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10265 address calculation only and does not access memory. The instruction can also
10266 be used to calculate a vector of such addresses.
10267
10268 Arguments:
10269 """"""""""
10270
10271 The first argument is always a type used as the basis for the calculations.
10272 The second argument is always a pointer or a vector of pointers, and is the
10273 base address to start from. The remaining arguments are indices
10274 that indicate which of the elements of the aggregate object are indexed.
10275 The interpretation of each index is dependent on the type being indexed
10276 into. The first index always indexes the pointer value given as the
10277 second argument, the second index indexes a value of the type pointed to
10278 (not necessarily the value directly pointed to, since the first index
10279 can be non-zero), etc. The first type indexed into must be a pointer
10280 value, subsequent types can be arrays, vectors, and structs. Note that
10281 subsequent types being indexed into can never be pointers, since that
10282 would require loading the pointer before continuing calculation.
10283
10284 The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
10286 **constants** are allowed (when using a vector of indices they must all
10287 be the **same** ``i32`` integer constant). When indexing into an array,
10288 pointer or vector, integers of any width are allowed, and they are not
10289 required to be constant. These integers are treated as signed values
10290 where relevant.
10291
10292 For example, let's consider a C code fragment and how it gets compiled
10293 to LLVM:
10294
10295 .. code-block:: c
10296
10297     struct RT {
10298       char A;
10299       int B[10][20];
10300       char C;
10301     };
10302     struct ST {
10303       int X;
10304       double Y;
10305       struct RT Z;
10306     };
10307
10308     int *foo(struct ST *s) {
10309       return &s[1].Z.B[5][13];
10310     }
10311
10312 The LLVM code generated by Clang is:
10313
10314 .. code-block:: llvm
10315
10316     %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10317     %struct.ST = type { i32, double, %struct.RT }
10318
10319     define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
10320     entry:
10321       %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
10322       ret i32* %arrayidx
10323     }
10324
10325 Semantics:
10326 """"""""""
10327
10328 In the example above, the first index is indexing into the
10329 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10330 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
10331 indexes into the third element of the structure, yielding a
10332 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10333 structure. The third index indexes into the second element of the
10334 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10335 dimensions of the array are subscripted into, yielding an '``i32``'
10336 type. The '``getelementptr``' instruction returns a pointer to this
10337 element, thus computing a value of '``i32*``' type.
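
The same walk can be written as byte-offset arithmetic in C. This sketch
assumes the C struct layout matches the data layout used for the IR types,
which is normally true for code Clang generates for the same target; the
function name ``foo_byte_offset`` is hypothetical:

.. code-block:: c

    #include <stddef.h>

    struct RT { char A; int B[10][20]; char C; };
    struct ST { int X; double Y; struct RT Z; };

    /* Byte offset that the five indices (1, 2, 1, 5, 13) add to %s. */
    size_t foo_byte_offset(void) {
        return 1 * sizeof(struct ST)   /* i64 1:  step over one struct ST  */
             + offsetof(struct ST, Z)  /* i32 2:  third field of struct ST */
             + offsetof(struct RT, B)  /* i32 1:  second field of RT       */
             + 5 * sizeof(int[20])     /* i64 5:  row 5 of the 2-D array   */
             + 13 * sizeof(int);       /* i64 13: element 13 of that row   */
    }

Each term corresponds to one index, which shows that the instruction only
adds offsets and never reads memory.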
10338
10339 Note that it is perfectly legal to index partially through a structure,
10340 returning a pointer to an inner element. Because of this, the LLVM code
10341 for the given testcase is equivalent to:
10342
10343 .. code-block:: llvm
10344
10345     define i32* @foo(%struct.ST* %s) {
10346       %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
10347       %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
10348       %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
10349       %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
10350       %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
10351       ret i32* %t5
10352     }
10353
10354 If the ``inbounds`` keyword is present, the result value of the
10355 ``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10356 following rules is violated:
10357
10358 *  The base pointer has an *in bounds* address of an allocated object, which
10359    means that it points into an allocated object, or to its end. The only
10360    *in bounds* address for a null pointer in the default address-space is the
10361    null pointer itself.
10362 *  If the type of an index is larger than the pointer index type, the
10363    truncation to the pointer index type preserves the signed value.
10364 *  The multiplication of an index by the type size does not wrap the pointer
10365    index type in a signed sense (``nsw``).
10366 *  The successive addition of offsets (without adding the base address) does
10367    not wrap the pointer index type in a signed sense (``nsw``).
10368 *  The successive addition of the current address, interpreted as an unsigned
10369    number, and an offset, interpreted as a signed number, does not wrap the
10370    unsigned address space and remains *in bounds* of the allocated object.
10371    As a corollary, if the added offset is non-negative, the addition does not
10372    wrap in an unsigned sense (``nuw``).
10373 *  In cases where the base is a vector of pointers, the ``inbounds`` keyword
10374    applies to each of the computations element-wise.
10375
10376 These rules are based on the assumption that no allocated object may cross
10377 the unsigned address space boundary, and no allocated object may be larger
10378 than half the pointer index type space.
10379
10380 If the ``inbounds`` keyword is not present, the offsets are added to the
10381 base address with silently-wrapping two's complement arithmetic. If the
10382 offsets have a different width from the pointer, they are sign-extended
10383 or truncated to the width of the pointer. The result value of the
10384 ``getelementptr`` may be outside the object pointed to by the base
10385 pointer. The result value may not necessarily be used to access memory
10386 though, even if it happens to point into allocated storage. See the
10387 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10388 information.
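
A minimal C model of the non-``inbounds`` address arithmetic, assuming a
64-bit pointer index type and a single array-style index. The helper name
``gep_wrapping`` is hypothetical:

.. code-block:: c

    #include <stdint.h>

    /* Address computed by a one-index getelementptr without inbounds. */
    uint64_t gep_wrapping(uint64_t base, int64_t index, uint64_t elem_size) {
        /* Unsigned arithmetic wraps modulo 2^64, which gives the same bits
           as silently-wrapping two's complement arithmetic. */
        return base + (uint64_t)index * elem_size;
    }

A narrower index is sign-extended by the implicit conversion to ``int64_t``
at the call site, so ``gep_wrapping(100, (int16_t)-2, 8)`` evaluates to 84.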
10389
10390 If the ``inrange`` keyword is present before any index, loading from or
10391 storing to any pointer derived from the ``getelementptr`` has undefined
10392 behavior if the load or store would access memory outside of the bounds of
10393 the element selected by the index marked as ``inrange``. The result of a
10394 pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10395 involving memory) involving a pointer derived from a ``getelementptr`` with
10396 the ``inrange`` keyword is undefined, with the exception of comparisons
10397 in the case where both operands are in the range of the element selected
10398 by the ``inrange`` keyword, inclusive of the address one past the end of
10399 that element. Note that the ``inrange`` keyword is currently only allowed
10400 in constant ``getelementptr`` expressions.
10401
The ``getelementptr`` instruction is often confusing. For some more insight
10403 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10404
10405 Example:
10406 """"""""
10407
10408 .. code-block:: llvm
10409
10410         ; yields [12 x i8]*:aptr
10411         %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
10412         ; yields i8*:vptr
10413         %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
10414         ; yields i8*:eptr
10415         %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
10416         ; yields i32*:iptr
10417         %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
10418
10419 Vector of pointers:
10420 """""""""""""""""""
10421
10422 The ``getelementptr`` returns a vector of pointers, instead of a single address,
10423 when one or more of its arguments is a vector. In such cases, all vector
10424 arguments should have the same number of elements, and every scalar argument
10425 will be effectively broadcast into a vector during address calculation.
10426
10427 .. code-block:: llvm
10428
10429      ; All arguments are vectors:
10430      ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
10431      %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
10432
10433      ; Add the same scalar offset to each pointer of a vector:
10434      ;   A[i] = ptrs[i] + offset*sizeof(i8)
10435      %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
10436
10437      ; Add distinct offsets to the same pointer:
10438      ;   A[i] = ptr + offsets[i]*sizeof(i8)
10439      %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
10440
10441      ; In all cases described above the type of the result is <4 x i8*>
10442
10443 The two following instructions are equivalent:
10444
10445 .. code-block:: llvm
10446
10447      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10448        <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
10449        <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
10450        <4 x i32> %ind4,
10451        <4 x i64> <i64 13, i64 13, i64 13, i64 13>
10452
10453      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10454        i32 2, i32 1, <4 x i32> %ind4, i64 13
10455
10456 Let's look at the C code, where the vector version of ``getelementptr``
10457 makes sense:
10458
10459 .. code-block:: c
10460
10461     // Let's assume that we vectorize the following loop:
10462     double *A, *B; int *C;
10463     for (int i = 0; i < size; ++i) {
10464       A[i] = B[C[i]];
10465     }
10466
10467 .. code-block:: llvm
10468
10469     ; get pointers for 8 elements from array B
10470     %ptrs = getelementptr double, double* %B, <8 x i32> %C
10471     ; load 8 elements from array B into A
10472     %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
10473          i32 8, <8 x i1> %mask, <8 x double> %passthru)
10474
10475 Conversion Operations
10476 ---------------------
10477
10478 The instructions in this category are the conversion instructions
10479 (casting) which all take a single operand and a type. They perform
10480 various bit conversions on the operand.
10481
10482 .. _i_trunc:
10483
10484 '``trunc .. to``' Instruction
10485 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10486
10487 Syntax:
10488 """""""
10489
10490 ::
10491
10492       <result> = trunc <ty> <value> to <ty2>             ; yields ty2
10493
10494 Overview:
10495 """""""""
10496
10497 The '``trunc``' instruction truncates its operand to the type ``ty2``.
10498
10499 Arguments:
10500 """"""""""
10501
The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
10504 of the same number of integers. The bit size of the ``value`` must be
10505 larger than the bit size of the destination type, ``ty2``. Equal sized
10506 types are not allowed.
10507
10508 Semantics:
10509 """"""""""
10510
10511 The '``trunc``' instruction truncates the high order bits in ``value``
10512 and converts the remaining bits to ``ty2``. Since the source size must
10513 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
10514 It will always truncate bits.
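
In C terms, truncation keeps only the low-order bits. The sketch below models
``i32`` to ``i8`` and ``i32`` to ``i1``; the function names are hypothetical
and the ``i1`` result is represented as an ``unsigned`` holding 0 or 1:

.. code-block:: c

    #include <stdint.h>

    /* trunc i32 -> i8: keep the low 8 bits. */
    uint8_t trunc_i32_to_i8(uint32_t value) {
        return (uint8_t)(value & 0xFFu);
    }

    /* trunc i32 -> i1: keep only the lowest bit. */
    unsigned trunc_i32_to_i1(uint32_t value) {
        return value & 1u;
    }

So 257 (``0x101``) truncates to 1 as an ``i8``, and an odd value such as 123
truncates to ``true`` as an ``i1``.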
10515
10516 Example:
10517 """"""""
10518
10519 .. code-block:: llvm
10520
10521       %X = trunc i32 257 to i8                        ; yields i8:1
10522       %Y = trunc i32 123 to i1                        ; yields i1:true
10523       %Z = trunc i32 122 to i1                        ; yields i1:false
10524       %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
10525
10526 .. _i_zext:
10527
10528 '``zext .. to``' Instruction
10529 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10530
10531 Syntax:
10532 """""""
10533
10534 ::
10535
10536       <result> = zext <ty> <value> to <ty2>             ; yields ty2
10537
10538 Overview:
10539 """""""""
10540
10541 The '``zext``' instruction zero extends its operand to type ``ty2``.
10542
10543 Arguments:
10544 """"""""""
10545
10546 The '``zext``' instruction takes a value to cast, and a type to cast it
10547 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
10548 the same number of integers. The bit size of the ``value`` must be
10549 smaller than the bit size of the destination type, ``ty2``.
10550
10551 Semantics:
10552 """"""""""
10553
10554 The ``zext`` fills the high order bits of the ``value`` with zero bits
10555 until it reaches the size of the destination type, ``ty2``.
10556
10557 When zero extending from i1, the result will always be either 0 or 1.
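
A small C model of zero extension, using unsigned types so that the
conversion fills the new high bits with zeros. The function names are
hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* zext i16 -> i32: new high bits are zero, the unsigned value is kept. */
    uint32_t zext_i16_to_i32(uint16_t value) {
        return (uint32_t)value;
    }

    /* zext i1 -> i32: the result is 0 or 1. */
    uint32_t zext_i1_to_i32(bool value) {
        return value ? 1u : 0u;
    }

An all-ones ``i16`` therefore becomes 65535 rather than -1 after extension.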
10558
10559 Example:
10560 """"""""
10561
10562 .. code-block:: llvm
10563
10564       %X = zext i32 257 to i64              ; yields i64:257
10565       %Y = zext i1 true to i32              ; yields i32:1
10566       %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10567
10568 .. _i_sext:
10569
10570 '``sext .. to``' Instruction
10571 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10572
10573 Syntax:
10574 """""""
10575
10576 ::
10577
10578       <result> = sext <ty> <value> to <ty2>             ; yields ty2
10579
10580 Overview:
10581 """""""""
10582
10583 The '``sext``' sign extends ``value`` to the type ``ty2``.
10584
10585 Arguments:
10586 """"""""""
10587
10588 The '``sext``' instruction takes a value to cast, and a type to cast it
10589 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
10590 the same number of integers. The bit size of the ``value`` must be
10591 smaller than the bit size of the destination type, ``ty2``.
10592
10593 Semantics:
10594 """"""""""
10595
10596 The '``sext``' instruction performs a sign extension by copying the sign
10597 bit (highest order bit) of the ``value`` until it reaches the bit size
10598 of the type ``ty2``.
10599
10600 When sign extending from i1, the extension always results in -1 or 0.
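
A small C model of sign extension, using signed fixed-width types so that the
conversion replicates the sign bit. The function names are hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* sext i8 -> i16: the sign bit is copied into the new high bits. */
    int16_t sext_i8_to_i16(int8_t value) {
        return (int16_t)value;
    }

    /* sext i1 -> i32: the single bit is the sign bit, so true becomes -1. */
    int32_t sext_i1_to_i32(bool value) {
        return value ? -1 : 0;
    }

Extending an ``i8`` holding -1 gives the 16-bit pattern ``0xFFFF``, which
reads as -1 when signed and as 65535 when reinterpreted as unsigned.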
10601
10602 Example:
10603 """"""""
10604
10605 .. code-block:: llvm
10606
      %X = sext i8  -1 to i16              ; yields i16:-1 (bit pattern 0xFFFF, 65535 if read as unsigned)
10608       %Y = sext i1 true to i32             ; yields i32:-1
10609       %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10610
10611 '``fptrunc .. to``' Instruction
10612 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10613
10614 Syntax:
10615 """""""
10616
10617 ::
10618
10619       <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
10620
10621 Overview:
10622 """""""""
10623
10624 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
10625
10626 Arguments:
10627 """"""""""
10628
10629 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
10630 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
10631 The size of ``value`` must be larger than the size of ``ty2``. This
10632 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
10633
10634 Semantics:
10635 """"""""""
10636
10637 The '``fptrunc``' instruction casts a ``value`` from a larger
10638 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
10639 <t_floating>` type.
10640 This instruction is assumed to execute in the default :ref:`floating-point
10641 environment <floatenv>`.
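
In C, a ``double`` to ``float`` cast behaves the same way when the default
round-to-nearest, ties-to-even mode is in effect. The function name below is
hypothetical:

.. code-block:: c

    /* fptrunc double -> float under the default rounding mode. */
    float fptrunc_double_to_float(double value) {
        return (float)value;
    }

The input 16777217.0 (2^24 + 1) lies exactly halfway between two adjacent
``float`` values, and the tie is resolved to the even neighbor 16777216.0.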
10642
10643 Example:
10644 """"""""
10645
10646 .. code-block:: llvm
10647
10648       %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
10649       %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
10650
10651 '``fpext .. to``' Instruction
10652 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10653
10654 Syntax:
10655 """""""
10656
10657 ::
10658
10659       <result> = fpext <ty> <value> to <ty2>             ; yields ty2
10660
10661 Overview:
10662 """""""""
10663
10664 The '``fpext``' extends a floating-point ``value`` to a larger floating-point
10665 value.
10666
10667 Arguments:
10668 """"""""""
10669
10670 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
10671 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
10672 to. The source type must be smaller than the destination type.
10673
10674 Semantics:
10675 """"""""""
10676
10677 The '``fpext``' instruction extends the ``value`` from a smaller
10678 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
10679 <t_floating>` type. The ``fpext`` cannot be used to make a
10680 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
10681 *no-op cast* for a floating-point cast.
10682
10683 Example:
10684 """"""""
10685
10686 .. code-block:: llvm
10687
10688       %X = fpext float 3.125 to double         ; yields double:3.125000e+00
10689       %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
10690
10691 '``fptoui .. to``' Instruction
10692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10693
10694 Syntax:
10695 """""""
10696
10697 ::
10698
10699       <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
10700
10701 Overview:
10702 """""""""
10703
10704 The '``fptoui``' converts a floating-point ``value`` to its unsigned
10705 integer equivalent of type ``ty2``.
10706
10707 Arguments:
10708 """"""""""
10709
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10715
10716 Semantics:
10717 """"""""""
10718
10719 The '``fptoui``' instruction converts its :ref:`floating-point
10720 <t_floating>` operand into the nearest (rounding towards zero)
10721 unsigned integer value. If the value cannot fit in ``ty2``, the result
10722 is a :ref:`poison value <poisonvalues>`.
10723
10724 Example:
10725 """"""""
10726
10727 .. code-block:: llvm
10728
10729       %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui double 1.0E+300 to i1    ; yields poison (value does not fit in i1)
      %Z = fptoui double 1.04E+17 to i8    ; yields poison (value does not fit in i8)
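
Since an out-of-range operand produces poison, code that cannot rule out such
operands may want to check the range before converting. The sketch below shows
one way this could look. The function name ``@checked_fptoui`` and the fallback
value ``0`` are assumptions made for illustration, not something this document
prescribes.

.. code-block:: llvm

      ; Hypothetical helper: convert to i8 only when the operand is in [0, 256).
      define i8 @checked_fptoui(double %d) {
      entry:
        %ge = fcmp oge double %d, 0.0       ; false if %d is a NaN
        %lt = fcmp olt double %d, 256.0     ; false if %d is a NaN
        %ok = and i1 %ge, %lt
        br i1 %ok, label %convert, label %fallback

      convert:
        %r = fptoui double %d to i8         ; rounds towards zero, always fits here
        ret i8 %r

      fallback:
        ret i8 0                            ; arbitrary choice for this sketch
      }

Both comparisons are ordered, so a NaN operand takes the ``%fallback`` path and
the ``fptoui`` never executes on a value that cannot fit in ``i8``.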
10732
10733 '``fptosi .. to``' Instruction
10734 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10735
10736 Syntax:
10737 """""""
10738
10739 ::
10740
10741       <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
10742
10743 Overview:
10744 """""""""
10745
10746 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
10747 ``value`` to type ``ty2``.
10748
10749 Arguments:
10750 """"""""""
10751
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10757
10758 Semantics:
10759 """"""""""
10760
10761 The '``fptosi``' instruction converts its :ref:`floating-point
10762 <t_floating>` operand into the nearest (rounding towards zero)
10763 signed integer value. If the value cannot fit in ``ty2``, the result
10764 is a :ref:`poison value <poisonvalues>`.
10765
10766 Example:
10767 """"""""
10768
10769 .. code-block:: llvm
10770
10771       %X = fptosi double -123.0 to i32      ; yields i32:-123
      %Y = fptosi double 1.0E-247 to i1     ; yields i1:0 (rounds towards zero)
      %Z = fptosi double 1.04E+17 to i8     ; yields poison (value does not fit in i8)
10774
10775 '``uitofp .. to``' Instruction
10776 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10777
10778 Syntax:
10779 """""""
10780
10781 ::
10782
10783       <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
10784
10785 Overview:
10786 """""""""
10787
10788 The '``uitofp``' instruction regards ``value`` as an unsigned integer
10789 and converts that value to the ``ty2`` type.
10790
10791 Arguments:
10792 """"""""""
10793
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10799
10800 Semantics:
10801 """"""""""
10802
10803 The '``uitofp``' instruction interprets its operand as an unsigned
10804 integer quantity and converts it to the corresponding floating-point
10805 value. If the value cannot be exactly represented, it is rounded using
10806 the default rounding mode.
10807
10808
10809 Example:
10810 """"""""
10811
10812 .. code-block:: llvm
10813
10814       %X = uitofp i32 257 to float         ; yields float:257.0
10815       %Y = uitofp i8 -1 to double          ; yields double:255.0
10816
10817 '``sitofp .. to``' Instruction
10818 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10819
10820 Syntax:
10821 """""""
10822
10823 ::
10824
10825       <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
10826
10827 Overview:
10828 """""""""
10829
10830 The '``sitofp``' instruction regards ``value`` as a signed integer and
10831 converts that value to the ``ty2`` type.
10832
10833 Arguments:
10834 """"""""""
10835
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10841
10842 Semantics:
10843 """"""""""
10844
10845 The '``sitofp``' instruction interprets its operand as a signed integer
10846 quantity and converts it to the corresponding floating-point value. If the
10847 value cannot be exactly represented, it is rounded using the default rounding
10848 mode.
10849
10850 Example:
10851 """"""""
10852
10853 .. code-block:: llvm
10854
10855       %X = sitofp i32 257 to float         ; yields float:257.0
10856       %Y = sitofp i8 -1 to double          ; yields double:-1.0
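
The difference between ``uitofp`` and ``sitofp`` only shows up when the sign
bit of the operand is set. The sketch below applies both conversions to the
same ``i32`` operand. The function names are hypothetical, and the values in
the comments assume that the default rounding mode is round-to-nearest.

.. code-block:: llvm

      ; Hypothetical helpers: convert the same bit pattern both ways.
      define float @as_unsigned(i32 %x) {
        %f = uitofp i32 %x to float   ; %x = -1 is read as 4294967295 and is
        ret float %f                  ; rounded to 4294967296.0
      }

      define float @as_signed(i32 %x) {
        %f = sitofp i32 %x to float   ; %x = -1 converts exactly to -1.0
        ret float %f
      }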
10857
10858 .. _i_ptrtoint:
10859
10860 '``ptrtoint .. to``' Instruction
10861 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10862
10863 Syntax:
10864 """""""
10865
10866 ::
10867
10868       <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
10869
10870 Overview:
10871 """""""""
10872
10873 The '``ptrtoint``' instruction converts the pointer or a vector of
10874 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
10875
10876 Arguments:
10877 """"""""""
10878
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type
or a vector of integers.
10883
10884 Semantics:
10885 """"""""""
10886
10887 The '``ptrtoint``' instruction converts ``value`` to integer type
10888 ``ty2`` by interpreting the pointer value as an integer and either
10889 truncating or zero extending that value to the size of the integer type.
10890 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
10891 ``value`` is larger than ``ty2`` then a truncation is done. If they are
10892 the same size, then nothing is done (*no-op cast*) other than a type
10893 change.
10894
10895 Example:
10896 """"""""
10897
10898 .. code-block:: llvm
10899
10900       %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
10901       %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
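
A common use of ``ptrtoint`` is to do integer arithmetic on addresses. The
sketch below computes the distance in bytes between two pointers. The function
name ``@byte_distance`` is hypothetical, and the comments assume a target with
64-bit pointers.

.. code-block:: llvm

      ; Hypothetical helper: distance in bytes from %a to %b.
      define i64 @byte_distance(i8* %a, i8* %b) {
        %ai = ptrtoint i8* %a to i64    ; no-op cast if pointers are 64 bits wide
        %bi = ptrtoint i8* %b to i64
        %d = sub i64 %bi, %ai
        ret i64 %d
      }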
10903
10904 .. _i_inttoptr:
10905
10906 '``inttoptr .. to``' Instruction
10907 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10908
10909 Syntax:
10910 """""""
10911
10912 ::
10913
10914       <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
10915
10916 Overview:
10917 """""""""
10918
10919 The '``inttoptr``' instruction converts an integer ``value`` to a
10920 pointer type, ``ty2``.
10921
10922 Arguments:
10923 """"""""""
10924
10925 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
10926 cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
10927 type.
10928
10929 The optional ``!dereferenceable`` metadata must reference a single metadata
10930 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
10931 entry.
10932 See ``dereferenceable`` metadata.
10933
10934 The optional ``!dereferenceable_or_null`` metadata must reference a single
10935 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
10936 ``i64`` entry.
10937 See ``dereferenceable_or_null`` metadata.
10938
10939 Semantics:
10940 """"""""""
10941
10942 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
10943 applying either a zero extension or a truncation depending on the size
10944 of the integer ``value``. If ``value`` is larger than the size of a
10945 pointer then a truncation is done. If ``value`` is smaller than the size
10946 of a pointer then a zero extension is done. If they are the same size,
10947 nothing is done (*no-op cast*).
10948
10949 Example:
10950 """"""""
10951
10952 .. code-block:: llvm
10953
10954       %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
10955       %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
10956       %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>; yields a vector of four pointers, converted element by element
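
The two pointer/integer casts are often used as a pair. The sketch below
converts a pointer to an integer and back. The function name
``@roundtrip_ptr`` is hypothetical, and the comments assume a target with
64-bit pointers, so neither cast changes any bits.

.. code-block:: llvm

      ; Hypothetical helper: pointer -> integer -> pointer.
      define i32* @roundtrip_ptr(i32* %p) {
        %i = ptrtoint i32* %p to i64     ; no-op cast if pointers are 64 bits wide
        %q = inttoptr i64 %i to i32*     ; converts the integer back to a pointer
        ret i32* %q
      }

On a target with narrower pointers, the ``ptrtoint`` would zero extend and the
``inttoptr`` would truncate, following the size rules described above.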
10958
10959 .. _i_bitcast:
10960
10961 '``bitcast .. to``' Instruction
10962 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10963
10964 Syntax:
10965 """""""
10966
10967 ::
10968
10969       <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
10970
10971 Overview:
10972 """""""""
10973
10974 The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
10975 changing any bits.
10976
10977 Arguments:
10978 """"""""""
10979
10980 The '``bitcast``' instruction takes a value to cast, which must be a
10981 non-aggregate first class value, and a type to cast it to, which must
10982 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
10983 bit sizes of ``value`` and the destination type, ``ty2``, must be
10984 identical. If the source type is a pointer, the destination type must
10985 also be a pointer of the same size. This instruction supports bitwise
10986 conversion of vectors to integers and to vectors of other types (as
10987 long as they have the same size).
10988
10989 Semantics:
10990 """"""""""
10991
10992 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
10993 is always a *no-op cast* because no bits change with this
10994 conversion. The conversion is done as if the ``value`` had been stored
10995 to memory and read back as type ``ty2``. Pointer (or vector of
10996 pointers) types may only be converted to other pointer (or vector of
10997 pointers) types with the same address space through this instruction.
10998 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
10999 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
11000
There is a caveat for bitcasts involving vector types in relation to
endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the ``i16`` on little-endian
targets, while element zero ends up in the most significant bits on big-endian
targets.
11005
11006 Example:
11007 """"""""
11008
11009 .. code-block:: text
11010
11011       %X = bitcast i8 255 to i8          ; yields i8 :-1
      %Y = bitcast i32* %x to i16*       ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64   ; yields i64: %V (depends on endianness)
      %W = bitcast <2 x i32*> %P to <2 x i64*> ; yields <2 x i64*>
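
To make the vector caveat concrete, the sketch below bitcasts a constant
``<2 x i8>`` vector to ``i16``. The function name ``@pack_bytes`` is
hypothetical. The values in the comments follow from the element placement
rule stated above.

.. code-block:: llvm

      ; Hypothetical helper: reinterpret two bytes as one 16-bit integer.
      define i16 @pack_bytes() {
        %r = bitcast <2 x i8> <i8 1, i8 2> to i16
        ; little-endian: element zero is in the low bits,  %r = 0x0201 (513)
        ; big-endian:    element zero is in the high bits, %r = 0x0102 (258)
        ret i16 %r
      }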
11015
11016 .. _i_addrspacecast:
11017
11018 '``addrspacecast .. to``' Instruction
11019 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11020
11021 Syntax:
11022 """""""
11023
11024 ::
11025
11026       <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
11027
11028 Overview:
11029 """""""""
11030
11031 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11032 address space ``n`` to type ``pty2`` in address space ``m``.
11033
11034 Arguments:
11035 """"""""""
11036
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer (or vector of pointers) type to cast it to, which
must have a different address space.
11040
11041 Semantics:
11042 """"""""""
11043
11044 The '``addrspacecast``' instruction converts the pointer value
11045 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11046 value modification, depending on the target and the address space
11047 pair. Pointer conversions within the same address space must be
11048 performed with the ``bitcast`` instruction. Note that if the address space
11049 conversion is legal then both result and operand refer to the same memory
11050 location.
11051
11052 Example:
11053 """"""""
11054
11055 .. code-block:: llvm
11056
11057       %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
11058       %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
11059       %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z
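
As a rough illustration, the sketch below casts a pointer from address space 1
to the default address space before loading through it. The function name
``@load_via_default_as`` is hypothetical. Whether this particular cast is
legal, and what address space 1 means, depends on the target.

.. code-block:: llvm

      ; Hypothetical helper: load through a pointer after changing its address space.
      define i32 @load_via_default_as(i32 addrspace(1)* %p) {
        %g = addrspacecast i32 addrspace(1)* %p to i32*  ; same memory location if legal
        %v = load i32, i32* %g
        ret i32 %v
      }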
11060
11061 .. _otherops:
11062
11063 Other Operations
11064 ----------------
11065
11066 The instructions in this category are the "miscellaneous" instructions,
11067 which defy better classification.
11068
11069 .. _i_icmp:
11070
11071 '``icmp``' Instruction
11072 ^^^^^^^^^^^^^^^^^^^^^^
11073
11074 Syntax:
11075 """""""
11076
11077 ::
11078
11079       <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
11080
11081 Overview:
11082 """""""""
11083
11084 The '``icmp``' instruction returns a boolean value or a vector of
11085 boolean values based on comparison of its two integer, integer vector,
11086 pointer, or pointer vector operands.
11087
11088 Arguments:
11089 """"""""""
11090
11091 The '``icmp``' instruction takes three operands. The first operand is
11092 the condition code indicating the kind of comparison to perform. It is
11093 not a value, just a keyword. The possible condition codes are:
11094
11095 #. ``eq``: equal
11096 #. ``ne``: not equal
11097 #. ``ugt``: unsigned greater than
11098 #. ``uge``: unsigned greater or equal
11099 #. ``ult``: unsigned less than
11100 #. ``ule``: unsigned less or equal
11101 #. ``sgt``: signed greater than
11102 #. ``sge``: signed greater or equal
11103 #. ``slt``: signed less than
11104 #. ``sle``: signed less or equal
11105
11106 The remaining two arguments must be :ref:`integer <t_integer>` or
11107 :ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
11108 must also be identical types.
11109
11110 Semantics:
11111 """"""""""
11112
11113 The '``icmp``' compares ``op1`` and ``op2`` according to the condition
11114 code given as ``cond``. The comparison performed always yields either an
11115 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11116
11117 #. ``eq``: yields ``true`` if the operands are equal, ``false``
11118    otherwise. No sign interpretation is necessary or performed.
11119 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
11120    otherwise. No sign interpretation is necessary or performed.
11121 #. ``ugt``: interprets the operands as unsigned values and yields
11122    ``true`` if ``op1`` is greater than ``op2``.
11123 #. ``uge``: interprets the operands as unsigned values and yields
11124    ``true`` if ``op1`` is greater than or equal to ``op2``.
11125 #. ``ult``: interprets the operands as unsigned values and yields
11126    ``true`` if ``op1`` is less than ``op2``.
11127 #. ``ule``: interprets the operands as unsigned values and yields
11128    ``true`` if ``op1`` is less than or equal to ``op2``.
11129 #. ``sgt``: interprets the operands as signed values and yields ``true``
11130    if ``op1`` is greater than ``op2``.
11131 #. ``sge``: interprets the operands as signed values and yields ``true``
11132    if ``op1`` is greater than or equal to ``op2``.
11133 #. ``slt``: interprets the operands as signed values and yields ``true``
11134    if ``op1`` is less than ``op2``.
11135 #. ``sle``: interprets the operands as signed values and yields ``true``
11136    if ``op1`` is less than or equal to ``op2``.
11137
11138 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11139 are compared as if they were integers.
11140
11141 If the operands are integer vectors, then they are compared element by
11142 element. The result is an ``i1`` vector with the same number of elements
11143 as the values being compared. Otherwise, the result is an ``i1``.
11144
11145 Example:
11146 """"""""
11147
11148 .. code-block:: text
11149
11150       <result> = icmp eq i32 4, 5          ; yields: result=false
11151       <result> = icmp ne float* %X, %X     ; yields: result=false
11152       <result> = icmp ult i16  4, 5        ; yields: result=true
11153       <result> = icmp sgt i16  4, 5        ; yields: result=false
11154       <result> = icmp ule i16 -4, 5        ; yields: result=false
11155       <result> = icmp sge i16  4, 5        ; yields: result=false
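
The sketch below places signed, unsigned, and vector comparisons in complete
functions. All function names are hypothetical. The comments describe the
result for the operand values ``-4`` and ``5`` used in the examples above.

.. code-block:: llvm

      ; Hypothetical helpers showing how the condition code changes the result.
      define i1 @below_unsigned(i16 %a, i16 %b) {
        %r = icmp ult i16 %a, %b        ; %a = -4 (65532 unsigned), %b = 5: false
        ret i1 %r
      }

      define i1 @below_signed(i16 %a, i16 %b) {
        %r = icmp slt i16 %a, %b        ; %a = -4, %b = 5: true
        ret i1 %r
      }

      define <4 x i1> @lanes_equal(<4 x i32> %a, <4 x i32> %b) {
        %r = icmp eq <4 x i32> %a, %b   ; one i1 per element
        ret <4 x i1> %r
      }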
11156
11157 .. _i_fcmp:
11158
11159 '``fcmp``' Instruction
11160 ^^^^^^^^^^^^^^^^^^^^^^
11161
11162 Syntax:
11163 """""""
11164
11165 ::
11166
11167       <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
11168
11169 Overview:
11170 """""""""
11171
11172 The '``fcmp``' instruction returns a boolean value or vector of boolean
11173 values based on comparison of its operands.
11174
11175 If the operands are floating-point scalars, then the result type is a
11176 boolean (:ref:`i1 <t_integer>`).
11177
11178 If the operands are floating-point vectors, then the result type is a
11179 vector of boolean with the same number of elements as the operands being
11180 compared.
11181
11182 Arguments:
11183 """"""""""
11184
11185 The '``fcmp``' instruction takes three operands. The first operand is
11186 the condition code indicating the kind of comparison to perform. It is
11187 not a value, just a keyword. The possible condition codes are:
11188
11189 #. ``false``: no comparison, always returns false
11190 #. ``oeq``: ordered and equal
11191 #. ``ogt``: ordered and greater than
11192 #. ``oge``: ordered and greater than or equal
11193 #. ``olt``: ordered and less than
11194 #. ``ole``: ordered and less than or equal
11195 #. ``one``: ordered and not equal
11196 #. ``ord``: ordered (no nans)
11197 #. ``ueq``: unordered or equal
11198 #. ``ugt``: unordered or greater than
11199 #. ``uge``: unordered or greater than or equal
11200 #. ``ult``: unordered or less than
11201 #. ``ule``: unordered or less than or equal
11202 #. ``une``: unordered or not equal
11203 #. ``uno``: unordered (either nans)
11204 #. ``true``: no comparison, always returns true
11205
11206 *Ordered* means that neither operand is a QNAN while *unordered* means
11207 that either operand may be a QNAN.
11208
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.
11212
11213 Semantics:
11214 """"""""""
11215
11216 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11217 condition code given as ``cond``. If the operands are vectors, then the
11218 vectors are compared element by element. Each comparison performed
11219 always yields an :ref:`i1 <t_integer>` result, as follows:
11220
11221 #. ``false``: always yields ``false``, regardless of operands.
11222 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11223    is equal to ``op2``.
11224 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11225    is greater than ``op2``.
11226 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11227    is greater than or equal to ``op2``.
11228 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11229    is less than ``op2``.
11230 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11231    is less than or equal to ``op2``.
11232 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11233    is not equal to ``op2``.
11234 #. ``ord``: yields ``true`` if both operands are not a QNAN.
11235 #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
11236    equal to ``op2``.
11237 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11238    greater than ``op2``.
11239 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11240    greater than or equal to ``op2``.
11241 #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
11242    less than ``op2``.
11243 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11244    less than or equal to ``op2``.
11245 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11246    not equal to ``op2``.
11247 #. ``uno``: yields ``true`` if either operand is a QNAN.
11248 #. ``true``: always yields ``true``, regardless of operands.
11249
11250 The ``fcmp`` instruction can also optionally take any number of
11251 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11252 otherwise unsafe floating-point optimizations.
11253
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
11258
11259 Example:
11260 """"""""
11261
11262 .. code-block:: text
11263
11264       <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
11265       <result> = fcmp one float 4.0, 5.0    ; yields: result=true
11266       <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
11267       <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
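
The ordered and unordered condition codes differ only when an operand is a
NaN. The sketch below shows two small uses of that distinction. The function
names are hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: true exactly when %x is a NaN.
      define i1 @is_nan(double %x) {
        %r = fcmp uno double %x, %x
        ret i1 %r
      }

      ; Hypothetical helper: the logical negation of 'fcmp olt %a, %b'.
      define i1 @not_less_than(double %a, double %b) {
        %r = fcmp uge double %a, %b     ; true if %a >= %b or either operand is a NaN
        ret i1 %r
      }

Note that ``oge`` would not be a correct negation of ``olt``, because both
yield ``false`` when an operand is a NaN.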
11268
11269 .. _i_phi:
11270
11271 '``phi``' Instruction
11272 ^^^^^^^^^^^^^^^^^^^^^
11273
11274 Syntax:
11275 """""""
11276
11277 ::
11278
11279       <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11280
11281 Overview:
11282 """""""""
11283
11284 The '``phi``' instruction is used to implement the φ node in the SSA
11285 graph representing the function.
11286
11287 Arguments:
11288 """"""""""
11289
11290 The type of the incoming values is specified with the first type field.
11291 After this, the '``phi``' instruction takes a list of pairs as
11292 arguments, with one pair for each predecessor basic block of the current
11293 block. Only values of :ref:`first class <t_firstclass>` type may be used as
11294 the value arguments to the PHI node. Only labels may be used as the
11295 label arguments.
11296
11297 There must be no non-phi instructions between the start of a basic block
11298 and the PHI instructions: i.e. PHI instructions must be first in a basic
11299 block.
11300
11301 For the purposes of the SSA form, the use of each incoming value is
11302 deemed to occur on the edge from the corresponding predecessor block to
11303 the current block (but after any definition of an '``invoke``'
11304 instruction's return value on the same edge).
11305
11306 The optional ``fast-math-flags`` marker indicates that the phi has one
11307 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11308 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11309 are only valid for phis that return a floating-point scalar or vector
11310 type, or an array (nested to any depth) of floating-point scalar or vector
11311 types.
11312
11313 Semantics:
11314 """"""""""
11315
11316 At runtime, the '``phi``' instruction logically takes on the value
11317 specified by the pair corresponding to the predecessor basic block that
11318 executed just prior to the current block.
11319
11320 Example:
11321 """"""""
11322
11323 .. code-block:: llvm
11324
11325     Loop:       ; Infinite loop that counts from 0 on up...
11326       %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11327       %nextindvar = add i32 %indvar, 1
11328       br label %Loop
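
The sketch below is a slightly larger, terminating example: a function that
sums the integers from ``0`` up to, but not including, ``%n``. The function
name ``@sum_below`` and the block names are hypothetical. Each ``phi`` has one
incoming pair per predecessor of the ``%loop`` block.

.. code-block:: llvm

      ; Hypothetical helper: returns 0 + 1 + ... + (%n - 1), or 0 if %n <= 0.
      define i32 @sum_below(i32 %n) {
      entry:
        br label %loop

      loop:
        ; Both phis come first in the block, before any non-phi instruction.
        %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
        %done = icmp sge i32 %i, %n
        br i1 %done, label %exit, label %body

      body:
        %acc.next = add i32 %acc, %i
        %i.next = add i32 %i, 1
        br label %loop

      exit:
        ret i32 %acc
      }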
11329
11330 .. _i_select:
11331
11332 '``select``' Instruction
11333 ^^^^^^^^^^^^^^^^^^^^^^^^
11334
11335 Syntax:
11336 """""""
11337
11338 ::
11339
11340       <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
11341
11342       selty is either i1 or {<N x i1>}
11343
11344 Overview:
11345 """""""""
11346
11347 The '``select``' instruction is used to choose one value based on a
11348 condition, without IR-level branching.
11349
11350 Arguments:
11351 """"""""""
11352
11353 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11354 values indicating the condition, and two values of the same :ref:`first
11355 class <t_firstclass>` type.
11356
11357 #. The optional ``fast-math flags`` marker indicates that the select has one or more
11358    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
11359    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11360    for selects that return a floating-point scalar or vector type, or an array
11361    (nested to any depth) of floating-point scalar or vector types.
11362
11363 Semantics:
11364 """"""""""
11365
11366 If the condition is an i1 and it evaluates to 1, the instruction returns
11367 the first value argument; otherwise, it returns the second value
11368 argument.
11369
11370 If the condition is a vector of i1, then the value arguments must be
11371 vectors of the same size, and the selection is done element by element.
11372
11373 If the condition is an i1 and the value arguments are vectors of the
11374 same size, then an entire vector is selected.
11375
11376 Example:
11377 """"""""
11378
11379 .. code-block:: llvm
11380
11381       %X = select i1 true, i8 17, i8 42          ; yields i8:17
11382
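
The sketch below shows the scalar and the element-wise forms in complete
functions. The function names ``@signed_max`` and ``@blend`` are hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: branch-free signed maximum.
      define i32 @signed_max(i32 %a, i32 %b) {
        %cmp = icmp sgt i32 %a, %b
        %max = select i1 %cmp, i32 %a, i32 %b   ; %a if %cmp is true, else %b
        ret i32 %max
      }

      ; Hypothetical helper: per-element choice driven by a vector condition.
      define <4 x i32> @blend(<4 x i1> %m, <4 x i32> %a, <4 x i32> %b) {
        %r = select <4 x i1> %m, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %r
      }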
11383
11384 .. _i_freeze:
11385
11386 '``freeze``' Instruction
11387 ^^^^^^^^^^^^^^^^^^^^^^^^
11388
11389 Syntax:
11390 """""""
11391
11392 ::
11393
11394       <result> = freeze ty <val>    ; yields ty:result
11395
11396 Overview:
11397 """""""""
11398
11399 The '``freeze``' instruction is used to stop propagation of
11400 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11401
11402 Arguments:
11403 """"""""""
11404
11405 The '``freeze``' instruction takes a single argument.
11406
11407 Semantics:
11408 """"""""""
11409
11410 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11411 arbitrary, but fixed, value of type '``ty``'.
11412 Otherwise, this instruction is a no-op and returns the input argument.
11413 All uses of a value returned by the same '``freeze``' instruction are
11414 guaranteed to always observe the same value, while different '``freeze``'
11415 instructions may yield different values.
11416
11417 While ``undef`` and ``poison`` pointers can be frozen, the result is a
11418 non-dereferenceable pointer. See the
11419 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11420 If an aggregate value or vector is frozen, the operand is frozen element-wise.
11421 The padding of an aggregate isn't considered, since it isn't visible
11422 without storing it into memory and loading it with a different type.
11423
11424
11425 Example:
11426 """"""""
11427
11428 .. code-block:: text
11429
11430       %w = i32 undef
11431       %x = freeze i32 %w
11432       %y = add i32 %w, %w         ; undef
11433       %z = add i32 %x, %x         ; even number because all uses of %x observe
11434                                   ; the same value
11435       %x2 = freeze i32 %w
11436       %cmp = icmp eq i32 %x, %x2  ; can be true or false
11437
11438       ; example with vectors
11439       %v = <2 x i32> <i32 undef, i32 poison>
11440       %a = extractelement <2 x i32> %v, i32 0    ; undef
11441       %b = extractelement <2 x i32> %v, i32 1    ; poison
11442       %add = add i32 %a, %a                      ; undef
11443
11444       %v.fr = freeze <2 x i32> %v                ; element-wise freeze
11445       %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
11446       %add.f = add i32 %d, %d                    ; even number
11447
11448       ; branching on frozen value
11449       %poison = add nsw i1 %k, undef   ; poison
11450       %c = freeze i1 %poison
11451       br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
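
The fragments above are not complete functions. The sketch below puts the
branching case into a form that can be parsed on its own. The function name
``@pick`` is hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: branch on a value that may be poison.
      define i32 @pick(i32 %x, i32 %a, i32 %b) {
      entry:
        %inc = add nsw i32 %x, 1        ; poison if %x is the largest signed i32
        %inc.fr = freeze i32 %inc       ; arbitrary but fixed i32 in that case
        %c = icmp sgt i32 %inc.fr, %x
        br i1 %c, label %then, label %else

      then:
        ret i32 %a

      else:
        ret i32 %b
      }

Without the ``freeze``, the branch condition could be poison. With it, the
branch is taken on a fixed value, so either successor may run but the choice is
made only once.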
11452
11453
11454 .. _i_call:
11455
11456 '``call``' Instruction
11457 ^^^^^^^^^^^^^^^^^^^^^^
11458
11459 Syntax:
11460 """""""
11461
11462 ::
11463
11464       <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
11465                  <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
11466
11467 Overview:
11468 """""""""
11469
11470 The '``call``' instruction represents a simple function call.
11471
11472 Arguments:
11473 """"""""""
11474
11475 This instruction requires several arguments:
11476
11477 #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
11478    should perform tail call optimization. The ``tail`` marker is a hint that
11479    `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
11480    means that the call must be tail call optimized in order for the program to
11481    be correct. The ``musttail`` marker provides these guarantees:
11482
11483    #. The call will not cause unbounded stack growth if it is part of a
11484       recursive cycle in the call graph.
11485    #. Arguments with the :ref:`inalloca <attr_inalloca>` or
11486       :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
11487    #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
11489       arguments in register or memory are forwarded to the callee. Similarly,
11490       the return value of the callee is returned to the caller's caller, even
11491       if a void return type is in use.
11492
11493    Both markers imply that the callee does not access allocas from the caller.
11494    The ``tail`` marker additionally implies that the callee does not access
11495    varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:
11497
11498    - The call must immediately precede a :ref:`ret <i_ret>` instruction,
11499      or a pointer bitcast followed by a ret instruction.
11500    - The ret instruction must return the (possibly bitcasted) value
11501      produced by the call, undef, or void.
11502    - The calling conventions of the caller and callee must match.
11503    - The callee must be varargs iff the caller is varargs. Bitcasting a
11504      non-varargs function to the appropriate varargs type is legal so
11505      long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.
11507
   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
11509
11510    - All ABI-impacting function attributes, such as sret, byval, inreg,
11511      returned, and inalloca, must match.
11512    - The caller and callee prototypes must match. Pointer types of parameters
11513      or return types may differ in pointee type, but not in address space.
11514
   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:
11516
   - Only these ABI-impacting attributes are allowed: sret, byval,
11518      swiftself, and swiftasync.
11519    - Prototypes are not required to match.
11520
11521    Tail call optimization for calls marked ``tail`` is guaranteed to occur if
11522    the following conditions are met:
11523
11524    -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
11525    -  The call is in tail position (ret immediately follows call and ret
11526       uses value of call or is void).
11527    -  Option ``-tailcallopt`` is enabled,
11528       ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
11529       is ``tailcc``
11530    -  `Platform-specific constraints are
11531       met. <CodeGenerator.html#tailcallopt>`_
11532
11533 #. The optional ``notail`` marker indicates that the optimizers should not add
11534    ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
11535    call optimization from being performed on the call.
11536
11537 #. The optional ``fast-math flags`` marker indicates that the call has one or more
11538    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11539    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11540    for calls that return a floating-point scalar or vector type, or an array
11541    (nested to any depth) of floating-point scalar or vector types.
11542
11543 #. The optional "cconv" marker indicates which :ref:`calling
11544    convention <callingconv>` the call should use. If none is
11545    specified, the call defaults to using C calling conventions. The
11546    calling convention of the call must match the calling convention of
11547    the target function, or else the behavior is undefined.
11548 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
11549    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
11550    are valid here.
11551 #. The optional addrspace attribute can be used to indicate the address space
11552    of the called function. If it is not specified, the program address space
11553    from the :ref:`datalayout string<langref_datalayout>` will be used.
11554 #. '``ty``': the type of the call instruction itself which is also the
11555    type of the return value. Functions that return no value are marked
11556    ``void``.
11557 #. '``fnty``': shall be the signature of the function being called. The
11558    argument types must match the types implied by this signature. This
11559    type can be omitted if the function is not varargs.
11560 #. '``fnptrval``': An LLVM value containing a pointer to a function to
11561    be called. In most cases, this is a direct function call, but
11562    indirect ``call``'s are just as possible, calling an arbitrary pointer
11563    to function value.
11564 #. '``function args``': argument list whose types match the function
11565    signature argument types and parameter attributes. All arguments must
11566    be of :ref:`first class <t_firstclass>` type. If the function signature
11567    indicates the function accepts a variable number of arguments, the
11568    extra arguments can be specified.
11569 #. The optional :ref:`function attributes <fnattrs>` list.
11570 #. The optional :ref:`operand bundles <opbundles>` list.
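
      The operands listed above can be combined as in the following sketch. It
      is illustrative only: the functions ``@narrow`` and ``@demo`` and all value
      names are hypothetical, and typed-pointer syntax is assumed so that it
      matches the other examples in this document.

      .. code-block:: llvm

          declare i32 @printf(i8*, ...)
          declare fastcc zeroext i8 @narrow(i32)      ; hypothetical callee
          declare float @llvm.sqrt.f32(float)

          define i32 @demo(i32 (i32)* %fp, i8* %fmt, i32 %x, float %v) {
          entry:
            ; Indirect call: 'fnptrval' is an arbitrary pointer-to-function value.
            %a = call i32 %fp(i32 %x)

            ; Varargs callee: 'fnty' is spelled out as the full function signature.
            %b = call i32 (i8*, ...) @printf(i8* %fmt, i32 %a)

            ; Non-default 'cconv' plus a return-value parameter attribute.
            %c = call fastcc zeroext i8 @narrow(i32 %x)

            ; Fast-math flags are accepted because the call returns a float.
            %d = call fast float @llvm.sqrt.f32(float %v)

            ; A function attribute follows the argument list.
            %e = call i32 %fp(i32 %b) nounwind

            ret i32 %e
          }

      Note that the ``fastcc`` on the call to ``@narrow`` matches the calling
      convention on its declaration, as the description of ``cconv`` requires.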
11571
11572 Semantics:
11573 """"""""""
11574
11575 The '``call``' instruction is used to cause control flow to transfer to
11576 a specified function, with its incoming arguments bound to the specified
11577 values. Upon a '``ret``' instruction in the called function, control
11578 flow continues with the instruction after the function call, and the
11579 return value of the function is bound to the result argument.
11580
11581 Example:
11582 """"""""
11583
11584 .. code-block:: llvm
11585
11586       %retval = call i32 @test(i32 %argc)
11587       call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
11588       %X = tail call i32 @foo()                                    ; yields i32
11589       %Y = tail call fastcc i32 @foo()                             ; yields i32
11590       call void %foo(i8 signext 97)
11591
11592       %struct.A = type { i32, i8 }
11593       %r = call %struct.A @foo()                        ; yields { i32, i8 }
11594       %gr = extractvalue %struct.A %r, 0                ; yields i32
11595       %gr1 = extractvalue %struct.A %r, 1               ; yields i8
11596       call void @foo() noreturn                         ; indicates that @foo never returns normally
11597       %ZZ = call zeroext i32 @bar()                     ; return value is zero-extended
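
      The ``tail``, ``musttail``, and ``notail`` markers can be sketched as
      follows. This is a minimal illustration under the rules stated above, not
      a statement about what any particular target will emit; all function names
      are hypothetical.

      .. code-block:: llvm

          declare i32 @callee(i32, i8*)
          declare fastcc i32 @fast_callee(i32)

          ; musttail: caller and callee prototypes and calling conventions match,
          ; and the call is immediately followed by a ret of its result.
          define i32 @forward(i32 %x, i8* %p) {
          entry:
            %r = musttail call i32 @callee(i32 %x, i8* %p)
            ret i32 %r
          }

          ; tail: a hint. With fastcc on both sides and the call in tail position,
          ; the guarantee described above additionally depends on -tailcallopt
          ; (or an equivalent setting) and on platform-specific constraints.
          define fastcc i32 @fast_forward(i32 %x) {
          entry:
            %r = tail call fastcc i32 @fast_callee(i32 %x)
            ret i32 %r
          }

          ; notail: optimizers should not add a tail or musttail marker here.
          define i32 @no_tail(i32 %x, i8* %p) {
          entry:
            %r = notail call i32 @callee(i32 %x, i8* %p)
            ret i32 %r
          }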
11598
11599 LLVM treats calls to some functions with names and arguments that match
11600 the standard C99 library as being the C99 library functions, and may
11601 perform optimizations or generate code for them under that assumption.
11602 This is something we'd like to change in the future to provide better
11603 support for freestanding environments and non-C-based languages.
11604
11605 .. _i_va_arg:
11606
11607 '``va_arg``' Instruction
11608 ^^^^^^^^^^^^^^^^^^^^^^^^
11609
11610 Syntax:
11611 """""""
11612
11613 ::
11614
11615       <resultval> = va_arg <va_list*> <arglist>, <argty>
11616
11617 Overview:
11618 """""""""
11619
11620 The '``va_arg``' instruction is used to access arguments passed through
11621 the "variable argument" area of a function call. It is used to implement
11622 the ``va_arg`` macro in C.
11623
11624 Arguments:
11625 """"""""""
11626
11627 This instruction takes a ``va_list*`` value and the type of the
11628 argument. It returns a value of the specified argument type and
11629 increments the ``va_list`` to point to the next argument. The actual
11630 type of ``va_list`` is target specific.
11631
11632 Semantics:
11633 """"""""""
11634
11635 The '``va_arg``' instruction loads an argument of the specified type
11636 from the specified ``va_list`` and causes the ``va_list`` to point to
11637 the next argument. For more information, see the variable argument
11638 handling :ref:`Intrinsic Functions <int_varargs>`.
11639
11640 It is legal for this instruction to be used in a function which does
11641 not take a variable number of arguments, for example, the ``vfprintf``
11642 function.
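
      A minimal sketch of such a function is shown below. It assumes a target
      whose ``va_list`` is a plain ``i8*``, as in the example in the variable
      argument handling section; the function name ``@next_int`` is hypothetical.

      .. code-block:: llvm

          ; Not a varargs function itself: it receives a pointer to a va_list
          ; that some varargs caller has already initialized with llvm.va_start.
          define i32 @next_int(i8* %ap) {
          entry:
            %v = va_arg i8* %ap, i32
            ret i32 %v
          }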
11643
11644 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
11645 function <intrinsics>` because it takes a type as an argument.
11646
11647 Example:
11648 """"""""
11649
11650 See the :ref:`variable argument processing <int_varargs>` section.
11651
11652 Note that the code generator does not yet fully support ``va_arg`` on many
11653 targets. Also, it does not currently support ``va_arg`` with aggregate
11654 types on any target.
11655
11656 .. _i_landingpad:
11657
11658 '``landingpad``' Instruction
11659 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11660
11661 Syntax:
11662 """""""
11663
11664 ::
11665
11666       <resultval> = landingpad <resultty> <clause>+
11667       <resultval> = landingpad <resultty> cleanup <clause>*
11668
11669       <clause> := catch <type> <value>
11670       <clause> := filter <array constant type> <array constant>
11671
11672 Overview:
11673 """""""""
11674
11675 The '``landingpad``' instruction is used by `LLVM's exception handling
11676 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11677 is a landing pad --- one where the exception lands, and corresponds to the
11678 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
11679 defines values supplied by the :ref:`personality function <personalityfn>` upon
11680 re-entry to the function. The ``resultval`` has the type ``resultty``.
11681
11682 Arguments:
11683 """"""""""
11684
11685 The optional ``cleanup`` flag indicates that the landing pad block is a
11686 cleanup.
11687
11688 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
11689 contains the global variable representing the "type" that may be caught
11690 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
11691 clause takes an array constant as its argument. Use
11692 "``[0 x i8**] undef``" for a filter which cannot throw. The
11693 '``landingpad``' instruction must contain *at least* one ``clause`` or
11694 the ``cleanup`` flag.
11695
11696 Semantics:
11697 """"""""""
11698
11699 The '``landingpad``' instruction defines the values which are set by the
11700 :ref:`personality function <personalityfn>` upon re-entry to the function, and
11701 therefore the "result type" of the ``landingpad`` instruction. As with
11702 calling conventions, how the personality function results are
11703 represented in LLVM IR is target specific.
11704
11705 The clauses are applied in order from top to bottom. If two
11706 ``landingpad`` instructions are merged together through inlining, the
11707 clauses from the calling function are appended to the list of clauses.
11708 When the call stack is being unwound due to an exception being thrown,
11709 the exception is compared against each ``clause`` in turn. If it doesn't
11710 match any of the clauses, and the ``cleanup`` flag is not set, then
11711 unwinding continues further up the call stack.
11712
11713 The ``landingpad`` instruction has several restrictions:
11714
11715 -  A landing pad block is a basic block which is the unwind destination
11716    of an '``invoke``' instruction.
11717 -  A landing pad block must have a '``landingpad``' instruction as its
11718    first non-PHI instruction.
11719 -  There can be only one '``landingpad``' instruction within the landing
11720    pad block.
11721 -  A basic block that is not a landing pad block may not include a
11722    '``landingpad``' instruction.
11723
11724 Example:
11725 """"""""
11726
11727 .. code-block:: llvm
11728
11729       ;; A landing pad which can catch an integer.
11730       %res = landingpad { i8*, i32 }
11731                catch i8** @_ZTIi
11732       ;; A landing pad that is a cleanup.
11733       %res = landingpad { i8*, i32 }
11734                cleanup
11735       ;; A landing pad which can catch an integer and can only throw a double.
11736       %res = landingpad { i8*, i32 }
11737                catch i8** @_ZTIi
11738                filter [1 x i8**] [@_ZTId]
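
      For context, the sketch below places a ``landingpad`` in a complete
      function. It assumes the Itanium C++ personality ``__gxx_personality_v0``
      and the ``int`` type-info symbol ``@_ZTIi``; ``@may_throw`` and ``@f`` are
      hypothetical names.

      .. code-block:: llvm

          @_ZTIi = external constant i8*
          declare void @may_throw()
          declare i32 @__gxx_personality_v0(...)

          define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
          entry:
            invoke void @may_throw()
                    to label %cont unwind label %lpad

          cont:
            ret void

          lpad:                                   ; unwind destination of the invoke
            %res = landingpad { i8*, i32 }        ; first non-PHI instruction
                     catch i8** @_ZTIi
            %exn = extractvalue { i8*, i32 } %res, 0   ; exception pointer
            %sel = extractvalue { i8*, i32 } %res, 1   ; selector value
            resume { i8*, i32 } %res              ; this sketch simply keeps unwinding
          }

      A real handler would compare ``%sel`` against the selector for the caught
      type before deciding whether to handle the exception or resume.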
11739
11740 .. _i_catchpad:
11741
11742 '``catchpad``' Instruction
11743 ^^^^^^^^^^^^^^^^^^^^^^^^^^
11744
11745 Syntax:
11746 """""""
11747
11748 ::
11749
11750       <resultval> = catchpad within <catchswitch> [<args>*]
11751
11752 Overview:
11753 """""""""
11754
11755 The '``catchpad``' instruction is used by `LLVM's exception handling
11756 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11757 begins a catch handler --- one where a personality routine attempts to transfer
11758 control to catch an exception.
11759
11760 Arguments:
11761 """"""""""
11762
11763 The ``catchswitch`` operand must always be a token produced by a
11764 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
11765 ensures that each ``catchpad`` has exactly one predecessor block, and it always
11766 terminates in a ``catchswitch``.
11767
11768 The ``args`` correspond to whatever information the personality routine
11769 requires to know if this is an appropriate handler for the exception. Control
11770 will transfer to the ``catchpad`` if this is the first appropriate handler for
11771 the exception.
11772
11773 The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
11774 ``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
11775 pads.
11776
11777 Semantics:
11778 """"""""""
11779
11780 When the call stack is being unwound due to an exception being thrown, the
11781 exception is compared against the ``args``. If it doesn't match, control will
11782 not reach the ``catchpad`` instruction.  The representation of ``args`` is
11783 entirely target and personality function-specific.
11784
11785 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
11786 instruction must be the first non-phi of its parent basic block.
11787
11788 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
11789 instructions is described in the
11790 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
11791
11792 When a ``catchpad`` has been "entered" but not yet "exited" (as
11793 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11794 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11795 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11796
11797 Example:
11798 """"""""
11799
11800 .. code-block:: text
11801
11802     dispatch:
11803       %cs = catchswitch within none [label %handler0] unwind to caller
11804       ;; A catch block which can catch an integer.
11805     handler0:
11806       %tok = catchpad within %cs [i8** @_ZTIi]
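
      A fuller sketch, showing how the token ties the ``catchpad`` to a
      ``catchret`` and to a ``"funclet"`` bundle, might look like this. It
      assumes the MSVC C++ personality ``__CxxFrameHandler3`` and uses the
      argument triple commonly seen for a catch-all handler with that
      personality; those arguments are personality-specific assumptions, and
      ``@may_throw``, ``@handle``, and ``@f`` are hypothetical names.

      .. code-block:: text

          declare void @may_throw()
          declare void @handle()
          declare i32 @__CxxFrameHandler3(...)

          define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
          entry:
            invoke void @may_throw()
                    to label %cont unwind label %dispatch

          dispatch:
            %cs = catchswitch within none [label %handler0] unwind to caller

          handler0:
            %tok = catchpad within %cs [i8* null, i32 64, i8* null]
            ; Calls made while the catchpad is entered carry a funclet bundle.
            call void @handle() [ "funclet"(token %tok) ]
            catchret from %tok to label %cont

          cont:
            ret void
          }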
11807
11808 .. _i_cleanuppad:
11809
11810 '``cleanuppad``' Instruction
11811 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11812
11813 Syntax:
11814 """""""
11815
11816 ::
11817
11818       <resultval> = cleanuppad within <parent> [<args>*]
11819
11820 Overview:
11821 """""""""
11822
11823 The '``cleanuppad``' instruction is used by `LLVM's exception handling
11824 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11825 is a cleanup block --- one where a personality routine attempts to
11826 transfer control to run cleanup actions.
11827 The ``args`` correspond to whatever additional
11828 information the :ref:`personality function <personalityfn>` requires to
11829 execute the cleanup.
11830 The ``resultval`` has the type :ref:`token <t_token>` and is used to
11831 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
11832 The ``parent`` argument is the token of the funclet that contains the
11833 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
11834 this operand may be the token ``none``.
11835
11836 Arguments:
11837 """"""""""
11838
11839 The instruction takes a list of arbitrary values which are interpreted
11840 by the :ref:`personality function <personalityfn>`.
11841
11842 Semantics:
11843 """"""""""
11844
11845 When the call stack is being unwound due to an exception being thrown,
11846 the :ref:`personality function <personalityfn>` transfers control to the
11847 ``cleanuppad`` with the aid of the personality-specific arguments.
11848 As with calling conventions, how the personality function results are
11849 represented in LLVM IR is target specific.
11850
11851 The ``cleanuppad`` instruction has several restrictions:
11852
11853 -  A cleanup block is a basic block which is the unwind destination of
11854    an exceptional instruction.
11855 -  A cleanup block must have a '``cleanuppad``' instruction as its
11856    first non-PHI instruction.
11857 -  There can be only one '``cleanuppad``' instruction within the
11858    cleanup block.
11859 -  A basic block that is not a cleanup block may not include a
11860    '``cleanuppad``' instruction.
11861
11862 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
11863 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11864 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11865 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11866
11867 Example:
11868 """"""""
11869
11870 .. code-block:: text
11871
11872       %tok = cleanuppad within %cs []
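
      As a sketch of a cleanup that is not nested inside another funclet, the
      function below uses ``none`` as the parent token and leaves the cleanup
      with a ``cleanupret``. The MSVC C++ personality ``__CxxFrameHandler3`` is
      assumed, and ``@may_throw``, ``@run_dtor``, and ``@g`` are hypothetical
      names.

      .. code-block:: text

          declare void @may_throw()
          declare void @run_dtor()
          declare i32 @__CxxFrameHandler3(...)

          define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
          entry:
            invoke void @may_throw()
                    to label %cont unwind label %cleanup

          cleanup:
            %tok = cleanuppad within none []
            ; Calls made while the cleanuppad is entered carry a funclet bundle.
            call void @run_dtor() [ "funclet"(token %tok) ]
            cleanupret from %tok unwind to caller

          cont:
            ret void
          }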
11873
11874 .. _intrinsics:
11875
11876 Intrinsic Functions
11877 ===================
11878
11879 LLVM supports the notion of an "intrinsic function". These functions
11880 have well known names and semantics and are required to follow certain
11881 restrictions. Overall, these intrinsics represent an extension mechanism
11882 for the LLVM language that does not require changing all of the
11883 transformations in LLVM when adding to the language (or the bitcode
11884 reader/writer, the parser, etc...).
11885
11886 Intrinsic function names must all start with an "``llvm.``" prefix. This
11887 prefix is reserved in LLVM for intrinsic names; thus, function names may
11888 not begin with this prefix. Intrinsic functions must always be external
11889 functions: you cannot define the body of intrinsic functions. Intrinsic
11890 functions may only be used in call or invoke instructions: it is illegal
11891 to take the address of an intrinsic function. Additionally, because
11892 intrinsic functions are part of the LLVM language, it is required if any
11893 are added that they be documented here.
11894
11895 Some intrinsic functions can be overloaded, i.e., the intrinsic
11896 represents a family of functions that perform the same operation but on
11897 different data types. Because LLVM can represent over 8 million
11898 different integer types, overloading is used commonly to allow an
11899 intrinsic function to operate on any integer type. One or more of the
11900 argument types or the result type can be overloaded to accept any
11901 integer type. Argument types may also be defined as exactly matching a
11902 previous argument's type or the result type. This allows an intrinsic
11903 function which accepts multiple arguments, but needs all of them to be
11904 of the same type, to only be overloaded with respect to a single
11905 argument or the result.
11906
11907 An overloaded intrinsic has the names of its overloaded argument
11908 types encoded into its function name, each preceded by a period. Only
11909 those types which are overloaded result in a name suffix. Arguments
11910 whose type is matched against another type do not. For example, the
11911 ``llvm.ctpop`` function can take an integer of any width and returns an
11912 integer of exactly the same integer width. This leads to a family of
11913 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
11914 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
11915 overloaded, and only one type suffix is required. Because the argument's
11916 type is matched against the return type, it does not require its own
11917 name suffix.
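
      As a brief illustration of the naming scheme, the sketch below declares
      several overloads. The helper ``@popcount8`` is a hypothetical name; the
      intrinsics themselves are documented elsewhere in this manual.

      .. code-block:: llvm

          ; One overloaded type (the result), so one suffix each.
          declare i8 @llvm.ctpop.i8(i8)
          declare i29 @llvm.ctpop.i29(i29)
          declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

          ; Three overloaded types (destination pointer, source pointer, and
          ; length), so three suffixes.
          declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

          define i8 @popcount8(i8 %v) {
            %n = call i8 @llvm.ctpop.i8(i8 %v)
            ret i8 %n
          }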
11918
11919 :ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
11920 that depend on an unnamed type in one of their overloaded argument types get an
11921 additional ``.<number>`` suffix. This allows differentiating intrinsics with
11922 different unnamed types as arguments (for example,
11923 ``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
11924 ensures unique names in the module. While linking together two modules, it is
11925 still possible to get a name clash. In that case, one of the names will be
11926 changed by getting a new number.
11927
11928 For target developers who are defining intrinsics for back-end code
11929 generation, any intrinsic overloads based solely on the distinction between
11930 integer or floating point types should not be relied upon for correct
11931 code generation. In such cases, the recommended approach for target
11932 maintainers when defining intrinsics is to create separate integer and
11933 FP intrinsics rather than rely on overloading. For example, if different
11934 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
11935 ``llvm.target.foo(<4 x float>)`` then these should be split into
11936 different intrinsics.
11937
11938 To learn how to add an intrinsic function, please see the `Extending
11939 LLVM Guide <ExtendingLLVM.html>`_.
11940
11941 .. _int_varargs:
11942
11943 Variable Argument Handling Intrinsics
11944 -------------------------------------
11945
11946 Variable argument support is defined in LLVM with the
11947 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
11948 functions. These functions are related to the similarly named macros
11949 defined in the ``<stdarg.h>`` header file.
11950
11951 All of these functions operate on arguments that use a target-specific
11952 value type "``va_list``". The LLVM assembly language reference manual
11953 does not define what this type is, so all transformations should be
11954 prepared to handle these functions regardless of the type used.
11955
11956 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
11957 variable argument handling intrinsic functions are used.
11958
11959 .. code-block:: llvm
11960
11961     ; This struct is different for every platform. For most platforms,
11962     ; it is merely an i8*.
11963     %struct.va_list = type { i8* }
11964
11965     ; For Unix x86_64 platforms, va_list is the following struct:
11966     ; %struct.va_list = type { i32, i32, i8*, i8* }
11967
11968     define i32 @test(i32 %X, ...) {
11969       ; Initialize variable argument processing
11970       %ap = alloca %struct.va_list
11971       %ap2 = bitcast %struct.va_list* %ap to i8*
11972       call void @llvm.va_start(i8* %ap2)
11973
11974       ; Read a single integer argument
11975       %tmp = va_arg i8* %ap2, i32
11976
11977       ; Demonstrate usage of llvm.va_copy and llvm.va_end
11978       %aq = alloca i8*
11979       %aq2 = bitcast i8** %aq to i8*
11980       call void @llvm.va_copy(i8* %aq2, i8* %ap2)
11981       call void @llvm.va_end(i8* %aq2)
11982
11983       ; Stop processing of arguments.
11984       call void @llvm.va_end(i8* %ap2)
11985       ret i32 %tmp
11986     }
11987
11988     declare void @llvm.va_start(i8*)
11989     declare void @llvm.va_copy(i8*, i8*)
11990     declare void @llvm.va_end(i8*)
11991
11992 .. _int_va_start:
11993
11994 '``llvm.va_start``' Intrinsic
11995 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11996
11997 Syntax:
11998 """""""
11999
12000 ::
12001
12002       declare void @llvm.va_start(i8* <arglist>)
12003
12004 Overview:
12005 """""""""
12006
12007 The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
12008 subsequent use by ``va_arg``.
12009
12010 Arguments:
12011 """"""""""
12012
12013 The argument is a pointer to a ``va_list`` element to initialize.
12014
12015 Semantics:
12016 """"""""""
12017
12018 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12019 available in C. In a target-dependent way, it initializes the
12020 ``va_list`` element to which the argument points, so that the next call
12021 to ``va_arg`` will produce the first variable argument passed to the
12022 function. Unlike the C ``va_start`` macro, this intrinsic does not need
12023 to know the last argument of the function as the compiler can figure
12024 that out.
12025
12026 '``llvm.va_end``' Intrinsic
12027 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12028
12029 Syntax:
12030 """""""
12031
12032 ::
12033
12034       declare void @llvm.va_end(i8* <arglist>)
12035
12036 Overview:
12037 """""""""
12038
12039 The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
12040 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12041
12042 Arguments:
12043 """"""""""
12044
12045 The argument is a pointer to a ``va_list`` to destroy.
12046
12047 Semantics:
12048 """"""""""
12049
12050 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12051 available in C. In a target-dependent way, it destroys the ``va_list``
12052 element to which the argument points. Calls to
12053 :ref:`llvm.va_start <int_va_start>` and
12054 :ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
12055 ``llvm.va_end``.
12056
12057 .. _int_va_copy:
12058
12059 '``llvm.va_copy``' Intrinsic
12060 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12061
12062 Syntax:
12063 """""""
12064
12065 ::
12066
12067       declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
12068
12069 Overview:
12070 """""""""
12071
12072 The '``llvm.va_copy``' intrinsic copies the current argument position
12073 from the source argument list to the destination argument list.
12074
12075 Arguments:
12076 """"""""""
12077
12078 The first argument is a pointer to a ``va_list`` element to initialize.
12079 The second argument is a pointer to a ``va_list`` element to copy from.
12080
12081 Semantics:
12082 """"""""""
12083
12084 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12085 available in C. In a target-dependent way, it copies the source
12086 ``va_list`` element into the destination ``va_list`` element. This
12087 intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12088 arbitrarily complex and require, for example, memory allocation.
12089
12090 Accurate Garbage Collection Intrinsics
12091 --------------------------------------
12092
12093 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12094 (GC) requires the frontend to generate code containing appropriate intrinsic
12095 calls and select an appropriate GC strategy which knows how to lower these
12096 intrinsics in a manner which is appropriate for the target collector.
12097
12098 These intrinsics allow identification of :ref:`GC roots on the
12099 stack <int_gcroot>`, as well as garbage collector implementations that
12100 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12101 Frontends for type-safe garbage collected languages should generate
12102 these intrinsics to make use of the LLVM garbage collectors. For more
12103 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
12104
12105 LLVM provides a second experimental set of intrinsics for describing garbage
12106 collection safepoints in compiled code. These intrinsics are an alternative
12107 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12108 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12109 differences in approach are covered in the `Garbage Collection with LLVM
12110 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
12111 described in :doc:`Statepoints`.
12112
12113 .. _int_gcroot:
12114
12115 '``llvm.gcroot``' Intrinsic
12116 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12117
12118 Syntax:
12119 """""""
12120
12121 ::
12122
12123       declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
12124
12125 Overview:
12126 """""""""
12127
12128 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12129 the code generator, and allows some metadata to be associated with it.
12130
12131 Arguments:
12132 """"""""""
12133
12134 The first argument specifies the address of a stack object that contains
12135 the root pointer. The second pointer (which must be either a constant or
12136 a global value address) contains the metadata to be associated with the
12137 root.
12138
12139 Semantics:
12140 """"""""""
12141
12142 At runtime, a call to this intrinsic stores a null pointer into the
12143 "ptrloc" location. At compile-time, the code generator generates
12144 information to allow the runtime to find the pointer at GC safe points.
12145 The '``llvm.gcroot``' intrinsic may only be used in a function which
12146 :ref:`specifies a GC algorithm <gc>`.
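
      A minimal sketch of registering a root follows. It assumes the built-in
      ``shadow-stack`` GC strategy and passes null metadata; ``@allocate`` and
      ``@keep_alive`` are hypothetical names standing in for a runtime allocator
      and a user function.

      .. code-block:: llvm

          declare void @llvm.gcroot(i8**, i8*)
          declare i8* @allocate(i64)              ; hypothetical runtime allocator

          define void @keep_alive() gc "shadow-stack" {
          entry:
            ; The root is a stack slot; gcroot initializes it to null at runtime.
            %root = alloca i8*
            call void @llvm.gcroot(i8** %root, i8* null)

            ; Store the live reference into the root so the collector can find it.
            %obj = call i8* @allocate(i64 16)
            store i8* %obj, i8** %root
            ret void
          }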
12147
12148 .. _int_gcread:
12149
12150 '``llvm.gcread``' Intrinsic
12151 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12152
12153 Syntax:
12154 """""""
12155
12156 ::
12157
12158       declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
12159
12160 Overview:
12161 """""""""
12162
12163 The '``llvm.gcread``' intrinsic identifies reads of references from heap
12164 locations, allowing garbage collector implementations that require read
12165 barriers.
12166
12167 Arguments:
12168 """"""""""
12169
12170 The second argument is the address to read from, which should be an
12171 address allocated from the garbage collector. The first argument is a
12172 pointer to the start of the referenced object, if needed by the language
12173 runtime (otherwise null).
12174
12175 Semantics:
12176 """"""""""
12177
12178 The '``llvm.gcread``' intrinsic has the same semantics as a load
12179 instruction, but may be replaced with substantially more complex code by
12180 the garbage collector runtime, as needed. The '``llvm.gcread``'
12181 intrinsic may only be used in a function which :ref:`specifies a GC
12182 algorithm <gc>`.
12183
12184 .. _int_gcwrite:
12185
12186 '``llvm.gcwrite``' Intrinsic
12187 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12188
12189 Syntax:
12190 """""""
12191
12192 ::
12193
12194       declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
12195
12196 Overview:
12197 """""""""
12198
12199 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12200 locations, allowing garbage collector implementations that require write
12201 barriers (such as generational or reference counting collectors).
12202
12203 Arguments:
12204 """"""""""
12205
12206 The first argument is the reference to store, the second is the start of
12207 the object to store it to, and the third is the address of the field of
12208 Obj to store to. If the runtime does not require a pointer to the
12209 object, Obj may be null.
12210
12211 Semantics:
12212 """"""""""
12213
12214 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12215 instruction, but may be replaced with substantially more complex code by
12216 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
12217 intrinsic may only be used in a function which :ref:`specifies a GC
12218 algorithm <gc>`.
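
      The two barrier intrinsics can be sketched together as below. The example
      assumes the built-in ``shadow-stack`` strategy only so that the function
      specifies a GC algorithm; how the barriers are lowered depends on the
      selected strategy. ``@copy_field`` is a hypothetical name.

      .. code-block:: llvm

          declare i8* @llvm.gcread(i8*, i8**)
          declare void @llvm.gcwrite(i8*, i8*, i8**)

          define i8* @copy_field(i8* %obj, i8** %src, i8** %dst) gc "shadow-stack" {
          entry:
            ; Read barrier: behaves like a load from %src, a field of %obj.
            %v = call i8* @llvm.gcread(i8* %obj, i8** %src)

            ; Write barrier: behaves like a store of %v to %dst, a field of %obj.
            call void @llvm.gcwrite(i8* %v, i8* %obj, i8** %dst)
            ret i8* %v
          }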
12219
12220
12221 .. _gc_statepoint:
12222
12223 'llvm.experimental.gc.statepoint' Intrinsic
12224 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12225
12226 Syntax:
12227 """""""
12228
12229 ::
12230
12231       declare token
12232         @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12233                        func_type <target>,
12234                        i64 <#call args>, i64 <flags>,
12235                        ... (call parameters),
12236                        i64 0, i64 0)
12237
12238 Overview:
12239 """""""""
12240
12241 The statepoint intrinsic represents a call which is parseable by the
12242 runtime.
12243
12244 Operands:
12245 """""""""
12246
12247 The 'id' operand is a constant integer that is reported as the ID
12248 field in the generated stackmap.  LLVM does not interpret this
12249 parameter in any way and its meaning is up to the statepoint user to
12250 decide.  Note that LLVM is free to duplicate code containing
12251 statepoint calls, and this may transform IR that had a unique 'id' per
12252 lexical call to statepoint to IR that does not.
12253
12254 If 'num patch bytes' is non-zero then the call instruction
12255 corresponding to the statepoint is not emitted and LLVM emits 'num
12256 patch bytes' bytes of nops in its place.  LLVM will emit code to
12257 prepare the function arguments and retrieve the function return value
12258 in accordance with the calling convention; the former before the nop
12259 sequence and the latter after the nop sequence.  It is expected that
12260 the user will patch over the 'num patch bytes' bytes of nops with a
12261 calling sequence specific to their runtime before executing the
12262 generated machine code.  There are no guarantees with respect to the
12263 alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
12264 not have a concept of shadow bytes.  Note that semantically the
12265 statepoint still represents a call or invoke to 'target', and the nop
12266 sequence after patching is expected to represent an operation
12267 equivalent to a call or invoke to 'target'.
12268
12269 The 'target' operand is the function actually being called.  The
12270 target can be specified as either a symbolic LLVM function, or as an
12271 arbitrary Value of appropriate function type.  Note that the function
12272 type must match the signature of the callee and the types of the 'call
12273 parameters' arguments.
12274
12275 The '#call args' operand is the number of arguments to the actual
12276 call.  It must exactly match the number of arguments passed in the
12277 'call parameters' variable length section.
12278
12279 The 'flags' operand is used to specify extra information about the
12280 statepoint. This is currently only used to mark certain statepoints
12281 as GC transitions. This operand is a 64-bit integer with the following
12282 layout, where bit 0 is the least significant bit:
12283
12284   +-------+---------------------------------------------------+
12285   | Bit # | Usage                                             |
12286   +=======+===================================================+
12287   |     0 | Set if the statepoint is a GC transition, cleared |
12288   |       | otherwise.                                        |
12289   +-------+---------------------------------------------------+
12290   |  1-63 | Reserved for future use; must be cleared.         |
12291   +-------+---------------------------------------------------+
12292
12293 The 'call parameters' arguments are simply the arguments which need to
12294 be passed to the call target.  They will be lowered according to the
12295 specified calling convention and otherwise handled like a normal call
12296 instruction.  The number of arguments must exactly match what is
specified in '#call args'.  The types must match the signature of
12298 'target'.
12299
The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for the 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced by the corresponding operand bundles.  In a future
revision, these now-redundant arguments will be removed.
12305
12306 Semantics:
12307 """"""""""
12308
12309 A statepoint is assumed to read and write all memory.  As a result,
memory operations cannot be reordered past a statepoint.  It is
12311 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
12312
Note that legal IR cannot perform any memory operation on a 'gc
12314 pointer' argument of the statepoint in a location statically reachable
12315 from the statepoint.  Instead, the explicitly relocated value (from a
12316 ``gc.relocate``) must be used.
12317
12318 'llvm.experimental.gc.result' Intrinsic
12319 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12320
12321 Syntax:
12322 """""""
12323
12324 ::
12325
12326       declare type*
12327         @llvm.experimental.gc.result(token %statepoint_token)
12328
12329 Overview:
12330 """""""""
12331
12332 ``gc.result`` extracts the result of the original call instruction
12333 which was replaced by the ``gc.statepoint``.  The ``gc.result``
12334 intrinsic is actually a family of three intrinsics due to an
12335 implementation limitation.  Other than the type of the return value,
12336 the semantics are the same.
12337
12338 Operands:
12339 """""""""
12340
12341 The first and only argument is the ``gc.statepoint`` which starts
12342 the safepoint sequence of which this ``gc.result`` is a part.
12343 Despite the typing of this as a generic token, *only* the value defined
12344 by a ``gc.statepoint`` is legal here.
12345
12346 Semantics:
12347 """"""""""
12348
12349 The ``gc.result`` represents the return value of the call target of
12350 the ``statepoint``.  The type of the ``gc.result`` must exactly match
the return type of the target.  If the call target returns void, there will
12352 be no ``gc.result``.
12353
12354 A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
12355 side effects since it is just a projection of the return value of the
12356 previous call represented by the ``gc.statepoint``.
12357
12358 'llvm.experimental.gc.relocate' Intrinsic
12359 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12360
12361 Syntax:
12362 """""""
12363
12364 ::
12365
12366       declare <pointer type>
12367         @llvm.experimental.gc.relocate(token %statepoint_token,
12368                                        i32 %base_offset,
12369                                        i32 %pointer_offset)
12370
12371 Overview:
12372 """""""""
12373
12374 A ``gc.relocate`` returns the potentially relocated value of a pointer
12375 at the safepoint.
12376
12377 Operands:
12378 """""""""
12379
12380 The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
12382 Despite the typing of this as a generic token, *only* the value defined
12383 by a ``gc.statepoint`` is legal here.
12384
12385 The second and third arguments are both indices into operands of the
12386 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12387
12388 The second argument is an index which specifies the allocation for the pointer
12389 being relocated. The associated value must be within the object with which the
12390 pointer being relocated is associated. The optimizer is free to change *which*
12391 interior derived pointer is reported, provided that it does not replace an
12392 actual base pointer with another interior derived pointer. Collectors are
12393 allowed to rely on the base pointer operand remaining an actual base pointer if
12394 so constructed.
12395
The third argument is an index which specifies the (potentially) derived pointer
12397 being relocated.  It is legal for this index to be the same as the second
12398 argument if-and-only-if a base pointer is being relocated.
12399
12400 Semantics:
12401 """"""""""
12402
12403 The return value of ``gc.relocate`` is the potentially relocated value
12404 of the pointer specified by its arguments.  It is unspecified how the
12405 value of the returned pointer relates to the argument to the
12406 ``gc.statepoint`` other than that a) it points to the same source
12407 language object with the same offset, and b) the 'based-on'
12408 relationship of the newly relocated pointers is a projection of the
12409 unrelocated pointers.  In particular, the integer value of the pointer
12410 returned is unspecified.
12411
12412 A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
12413 side effects since it is just a way to extract information about work
12414 done during the actual call modeled by the ``gc.statepoint``.
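
As a minimal sketch of how these operands fit together, the following
function keeps a base pointer and a pointer derived from it live across a
statepoint, reads the call result with ``gc.result``, and relocates both
pointers with ``gc.relocate``.  The callee ``@foo``, the collector name, the
mangled intrinsic suffixes, and the integer widths of the statepoint operands
are assumptions made for illustration and should be checked against the LLVM
version in use.

.. code-block:: llvm

      declare i32 @foo()
      declare token @llvm.experimental.gc.statepoint.p0f_i32f(i64, i32, i32 ()*, i32, i32, ...)
      declare i32 @llvm.experimental.gc.result.i32(token)
      declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

      define i8 addrspace(1)* @example(i8 addrspace(1)* %base) gc "statepoint-example" {
      entry:
        %derived = getelementptr i8, i8 addrspace(1)* %base, i64 8
        ; No patch bytes, no call arguments, flags cleared, then the two
        ; trailing zero constants.  The live pointers go in the "gc-live" bundle.
        %tok = call token (i64, i32, i32 ()*, i32, i32, ...)
                 @llvm.experimental.gc.statepoint.p0f_i32f(
                   i64 0, i32 0, i32 ()* @foo, i32 0, i32 0, i64 0, i64 0)
                 [ "gc-live"(i8 addrspace(1)* %base, i8 addrspace(1)* %derived) ]
        ; Return value of the wrapped call to @foo.
        %ret = call i32 @llvm.experimental.gc.result.i32(token %tok)
        ; Base pointer: both indices name the same "gc-live" operand (0).
        %base.new = call i8 addrspace(1)*
                      @llvm.experimental.gc.relocate.p1i8(token %tok, i32 0, i32 0)
        ; Derived pointer: base is operand 0, derived pointer is operand 1.
        %derived.new = call i8 addrspace(1)*
                         @llvm.experimental.gc.relocate.p1i8(token %tok, i32 0, i32 1)
        ret i8 addrspace(1)* %derived.new
      }

After the statepoint, only ``%base.new`` and ``%derived.new`` may be used to
access the object; ``%base`` and ``%derived`` refer to the unrelocated values.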
12415
12416 .. _gc.get.pointer.base:
12417
12418 'llvm.experimental.gc.get.pointer.base' Intrinsic
12419 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12420
12421 Syntax:
12422 """""""
12423
12424 ::
12425
12426       declare <pointer type>
12427         @llvm.experimental.gc.get.pointer.base(
12428           <pointer type> readnone nocapture %derived_ptr)
12429           nounwind readnone willreturn
12430
12431 Overview:
12432 """""""""
12433
12434 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
12435
12436 Operands:
12437 """""""""
12438
12439 The only argument is a pointer which is based on some object with
12440 an unknown offset from the base of said object.
12441
12442 Semantics:
12443 """"""""""
12444
12445 This intrinsic is used in the abstract machine model for GC to represent
12446 the base pointer for an arbitrary derived pointer.
12447
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the derived
pointer. The replacement is done as part of the lowering to the
explicit statepoint model.
12452
12453 The return pointer type must be the same as the type of the parameter.
12454
12455
12456 'llvm.experimental.gc.get.pointer.offset' Intrinsic
12457 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12458
12459 Syntax:
12460 """""""
12461
12462 ::
12463
12464       declare i64
12465         @llvm.experimental.gc.get.pointer.offset(
12466           <pointer type> readnone nocapture %derived_ptr)
12467           nounwind readnone willreturn
12468
12469 Overview:
12470 """""""""
12471
12472 ``gc.get.pointer.offset`` for a derived pointer returns the offset from its
12473 base pointer.
12474
12475 Operands:
12476 """""""""
12477
12478 The only argument is a pointer which is based on some object with
12479 an unknown offset from the base of said object.
12480
12481 Semantics:
12482 """"""""""
12483
12484 This intrinsic is used in the abstract machine model for GC to represent
12485 the offset of an arbitrary derived pointer from its base pointer.
12486
12487 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12488 replacing all uses of this callsite with the offset of a derived pointer from
12489 its base pointer value. The replacement is done as part of the lowering to the
12490 explicit statepoint model.
12491
Conceptually, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), with both converted to
integers via ``ptrtoint``. Performing these casts outside the
:ref:`RewriteStatepointsForGC` pass, however, could cause the pointers to be
lost for the later lowering from the abstract model to the explicit physical
one.
12497
12498 Code Generator Intrinsics
12499 -------------------------
12500
12501 These intrinsics are provided by LLVM to expose special features that
12502 may only be implemented with code generator support.
12503
12504 '``llvm.returnaddress``' Intrinsic
12505 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12506
12507 Syntax:
12508 """""""
12509
12510 ::
12511
12512       declare i8* @llvm.returnaddress(i32 <level>)
12513
12514 Overview:
12515 """""""""
12516
12517 The '``llvm.returnaddress``' intrinsic attempts to compute a
12518 target-specific value indicating the return address of the current
12519 function or one of its callers.
12520
12521 Arguments:
12522 """"""""""
12523
12524 The argument to this intrinsic indicates which function to return the
12525 address for. Zero indicates the calling function, one indicates its
12526 caller, etc. The argument is **required** to be a constant integer
12527 value.
12528
12529 Semantics:
12530 """"""""""
12531
12532 The '``llvm.returnaddress``' intrinsic either returns a pointer
12533 indicating the return address of the specified call frame, or zero if it
12534 cannot be identified. The value returned by this intrinsic is likely to
12535 be incorrect or 0 for arguments other than zero, so it should only be
12536 used for debugging purposes.
12537
12538 Note that calling this intrinsic does not prevent function inlining or
12539 other aggressive transformations, so the value returned may not be that
12540 of the obvious source-language caller.
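
A minimal sketch of the level-zero case, which is the only one the text above
treats as reliable.  The function name ``@get_return_address`` is hypothetical.

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      define i8* @get_return_address() {
      entry:
        ; The level must be a constant; 0 selects the function containing
        ; this call.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }

If ``@get_return_address`` is inlined, the value describes the function it was
inlined into, which is the caveat noted above.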
12541
12542 '``llvm.addressofreturnaddress``' Intrinsic
12543 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12544
12545 Syntax:
12546 """""""
12547
12548 ::
12549
12550       declare i8* @llvm.addressofreturnaddress()
12551
12552 Overview:
12553 """""""""
12554
12555 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
12556 pointer to the place in the stack frame where the return address of the
12557 current function is stored.
12558
12559 Semantics:
12560 """"""""""
12561
12562 Note that calling this intrinsic does not prevent function inlining or
12563 other aggressive transformations, so the value returned may not be that
12564 of the obvious source-language caller.
12565
12566 This intrinsic is only implemented for x86 and aarch64.
12567
12568 '``llvm.sponentry``' Intrinsic
12569 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12570
12571 Syntax:
12572 """""""
12573
12574 ::
12575
12576       declare i8* @llvm.sponentry()
12577
12578 Overview:
12579 """""""""
12580
12581 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
12582 the entry of the current function calling this intrinsic.
12583
12584 Semantics:
12585 """"""""""
12586
Note that this intrinsic has only been verified on AArch64.
12588
12589 '``llvm.frameaddress``' Intrinsic
12590 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12591
12592 Syntax:
12593 """""""
12594
12595 ::
12596
12597       declare i8* @llvm.frameaddress(i32 <level>)
12598
12599 Overview:
12600 """""""""
12601
12602 The '``llvm.frameaddress``' intrinsic attempts to return the
12603 target-specific frame pointer value for the specified stack frame.
12604
12605 Arguments:
12606 """"""""""
12607
12608 The argument to this intrinsic indicates which function to return the
12609 frame pointer for. Zero indicates the calling function, one indicates
12610 its caller, etc. The argument is **required** to be a constant integer
12611 value.
12612
12613 Semantics:
12614 """"""""""
12615
12616 The '``llvm.frameaddress``' intrinsic either returns a pointer
12617 indicating the frame address of the specified call frame, or zero if it
12618 cannot be identified. The value returned by this intrinsic is likely to
12619 be incorrect or 0 for arguments other than zero, so it should only be
12620 used for debugging purposes.
12621
12622 Note that calling this intrinsic does not prevent function inlining or
12623 other aggressive transformations, so the value returned may not be that
12624 of the obvious source-language caller.
12625
12626 '``llvm.swift.async.context.addr``' Intrinsic
12627 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12628
12629 Syntax:
12630 """""""
12631
12632 ::
12633
12634       declare i8** @llvm.swift.async.context.addr()
12635
12636 Overview:
12637 """""""""
12638
12639 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
12640 the part of the extended frame record containing the asynchronous
12641 context of a Swift execution.
12642
12643 Semantics:
12644 """"""""""
12645
12646 If the caller has a ``swiftasync`` parameter, that argument will initially
12647 be stored at the returned address. If not, it will be initialized to null.
12648
12649 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
12650 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12651
12652 Syntax:
12653 """""""
12654
12655 ::
12656
12657       declare void @llvm.localescape(...)
12658       declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
12659
12660 Overview:
12661 """""""""
12662
12663 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
12664 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
12665 live frame pointer to recover the address of the allocation. The offset is
12666 computed during frame layout of the caller of ``llvm.localescape``.
12667
12668 Arguments:
12669 """"""""""
12670
12671 All arguments to '``llvm.localescape``' must be pointers to static allocas or
12672 casts of static allocas. Each function can only call '``llvm.localescape``'
12673 once, and it can only do so from the entry block.
12674
12675 The ``func`` argument to '``llvm.localrecover``' must be a constant
12676 bitcasted pointer to a function defined in the current module. The code
12677 generator cannot determine the frame allocation offset of functions defined in
12678 other modules.
12679
12680 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
12681 call frame that is currently live. The return value of '``llvm.localaddress``'
12682 is one way to produce such a value, but various runtimes also expose a suitable
12683 pointer in platform-specific ways.
12684
12685 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
12686 '``llvm.localescape``' to recover. It is zero-indexed.
12687
12688 Semantics:
12689 """"""""""
12690
12691 These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
12693 '``llvm.localescape``' intrinsic once from the function entry block, and the
12694 child functions can use '``llvm.localrecover``' to access the escaped allocas.
12695 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
12696 the escaped allocas are allocated, which would break attempts to use
12697 '``llvm.localrecover``'.
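
The following sketch shows one way the two intrinsics can be paired.  The
function names ``@parent`` and ``@child`` are hypothetical, and the frame
pointer is obtained with ``llvm.localaddress`` as suggested above.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8*, i8*, i32)
      declare i8* @llvm.localaddress()

      define void @parent() {
      entry:
        %x = alloca i32
        ; Escape the static alloca; it becomes index 0.
        call void (...) @llvm.localescape(i32* %x)
        store i32 0, i32* %x
        %fp = call i8* @llvm.localaddress()
        call void @child(i8* %fp)
        ret void
      }

      define void @child(i8* %fp) {
      entry:
        ; Recover the address of alloca 0 of @parent from its live frame.
        %x.i8 = call i8* @llvm.localrecover(
                  i8* bitcast (void ()* @parent to i8*), i8* %fp, i32 0)
        %x = bitcast i8* %x.i8 to i32*
        store i32 42, i32* %x
        ret void
      }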
12698
12699 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
12700 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12701
12702 Syntax:
12703 """""""
12704
12705 ::
12706
12707       declare void @llvm.seh.try.begin()
12708       declare void @llvm.seh.try.end()
12709
12710 Overview:
12711 """""""""
12712
12713 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
12715
12716 Semantics:
12717 """"""""""
12718
When a C function is compiled with the Windows SEH Asynchronous Exception option,
-feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
boundary and to prevent potential exceptions from being moved across the boundary.
12722 Any set of operations can then be confined to the region by reading their leaf
12723 inputs via volatile loads and writing their root outputs via volatile stores.
12724
12725 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
12726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12727
12728 Syntax:
12729 """""""
12730
12731 ::
12732
12733       declare void @llvm.seh.scope.begin()
12734       declare void @llvm.seh.scope.end()
12735
12736 Overview:
12737 """""""""
12738
12739 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
12741 Handling (MSVC option -EHa).
12742
12743 Semantics:
12744 """"""""""
12745
12746 LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke``\ s, which normally correspond only to call sites.  To
12748 support arbitrary faulting instructions, it must be possible to recover the current
12749 EH scope for any instruction.  Turning every operation in LLVM that could fault
12750 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
12751 large number of intrinsics, impede optimization of those operations, and make
12752 compilation slower by introducing many extra basic blocks.  These intrinsics can
12753 be used instead to mark the region protected by a cleanup, such as for a local
12754 C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
12755 the start of the region; it is always called with ``invoke``, with the unwind block
12756 being the desired unwind destination for any potentially-throwing instructions
within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
12758 and the EH cleanup is no longer required (e.g. because the destructor is being
12759 called).
12760
12761 .. _int_read_register:
12762 .. _int_read_volatile_register:
12763 .. _int_write_register:
12764
12765 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
12766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12767
12768 Syntax:
12769 """""""
12770
12771 ::
12772
12773       declare i32 @llvm.read_register.i32(metadata)
12774       declare i64 @llvm.read_register.i64(metadata)
12775       declare i32 @llvm.read_volatile_register.i32(metadata)
12776       declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
12779       !0 = !{!"sp\00"}
12780
12781 Overview:
12782 """""""""
12783
12784 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
12785 '``llvm.write_register``' intrinsics provide access to the named register.
12786 The register must be valid on the architecture being compiled to. The type
12787 needs to be compatible with the register being read.
12788
12789 Semantics:
12790 """"""""""
12791
12792 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
12793 return the current value of the register, where possible. The
12794 '``llvm.write_register``' intrinsic sets the current value of the register,
12795 where possible.
12796
12797 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
12798 and possibly return a different value each time (e.g. for a timer register).
12799
12800 This is useful to implement named register global variables that need
12801 to always be mapped to a specific register, as is common practice on
12802 bare-metal programs including OS kernels.
12803
12804 The compiler doesn't check for register availability or use of the used
12805 register in surrounding code, including inline assembly. Because of that,
12806 allocatable registers are not supported.
12807
12808 Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
12810 work is needed to support other registers and even more so, allocatable
12811 registers.
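
A minimal sketch of reading the stack pointer, assuming an x86_64 target.  The
register name ``rsp`` and the function name ``@get_stack_pointer`` are
assumptions for this example; other targets use different register names.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_stack_pointer() {
      entry:
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"rsp\00"}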
12812
12813 .. _int_stacksave:
12814
12815 '``llvm.stacksave``' Intrinsic
12816 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12817
12818 Syntax:
12819 """""""
12820
12821 ::
12822
12823       declare i8* @llvm.stacksave()
12824
12825 Overview:
12826 """""""""
12827
12828 The '``llvm.stacksave``' intrinsic is used to remember the current state
12829 of the function stack, for use with
12830 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
12831 implementing language features like scoped automatic variable sized
12832 arrays in C99.
12833
12834 Semantics:
12835 """"""""""
12836
12837 This intrinsic returns an opaque pointer value that can be passed to
12838 :ref:`llvm.stackrestore <int_stackrestore>`. When an
12839 ``llvm.stackrestore`` intrinsic is executed with a value saved from
12840 ``llvm.stacksave``, it effectively restores the state of the stack to
12841 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
12842 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
12843 were allocated after the ``llvm.stacksave`` was executed.
12844
12845 .. _int_stackrestore:
12846
12847 '``llvm.stackrestore``' Intrinsic
12848 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12849
12850 Syntax:
12851 """""""
12852
12853 ::
12854
12855       declare void @llvm.stackrestore(i8* %ptr)
12856
12857 Overview:
12858 """""""""
12859
12860 The '``llvm.stackrestore``' intrinsic is used to restore the state of
12861 the function stack to the state it was in when the corresponding
12862 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
12863 useful for implementing language features like scoped automatic variable
12864 sized arrays in C99.
12865
12866 Semantics:
12867 """"""""""
12868
12869 See the description for :ref:`llvm.stacksave <int_stacksave>`.
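
As a sketch of the scoped variable-sized array use case, the following function
brackets a dynamic ``alloca`` with a save/restore pair.  The function
``@use`` is a hypothetical consumer of the buffer.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)

      define void @scoped_array(i32 %n) {
      entry:
        ; Remember the stack state on entry to the scope.
        %saved = call i8* @llvm.stacksave()
        %buf = alloca i32, i32 %n
        call void @use(i32* %buf)
        ; Leaving the scope pops %buf from the stack.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }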
12870
12871 .. _int_get_dynamic_area_offset:
12872
12873 '``llvm.get.dynamic.area.offset``' Intrinsic
12874 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12875
12876 Syntax:
12877 """""""
12878
12879 ::
12880
12881       declare i32 @llvm.get.dynamic.area.offset.i32()
12882       declare i64 @llvm.get.dynamic.area.offset.i64()
12883
12884 Overview:
12885 """""""""
12886
The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.
12894
12895 Semantics:
12896 """"""""""
12897
These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack pointer
yields the address of the most recent dynamic alloca. For targets where the
stack grows upwards, the situation is a bit more complicated, because
subtracting this value from the stack pointer yields the address one past the
end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` returns just
zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` must match
the pointer type of the target's default address space (address space 0).
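
A minimal sketch of the combination with ``llvm.stacksave`` described above,
assuming a target with 64-bit pointers whose stack grows downwards.  The
function name is hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define i8* @most_recent_dynamic_alloca() {
      entry:
        ; Current native stack pointer.
        %sp = call i8* @llvm.stacksave()
        ; Offset from the stack pointer to the dynamic area.
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: add the offset to the stack pointer.
        %addr = getelementptr i8, i8* %sp, i64 %off
        ret i8* %addr
      }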
12912
12913 '``llvm.prefetch``' Intrinsic
12914 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12915
12916 Syntax:
12917 """""""
12918
12919 ::
12920
12921       declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
12922
12923 Overview:
12924 """""""""
12925
12926 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
12927 insert a prefetch instruction if supported; otherwise, it is a noop.
12928 Prefetches have no effect on the behavior of the program but can change
12929 its performance characteristics.
12930
12931 Arguments:
12932 """"""""""
12933
12934 ``address`` is the address to be prefetched, ``rw`` is the specifier
12935 determining if the fetch should be for a read (0) or write (1), and
12936 ``locality`` is a temporal locality specifier ranging from (0) - no
12937 locality, to (3) - extremely local keep in cache. The ``cache type``
12938 specifies whether the prefetch is performed on the data (1) or
12939 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
12940 arguments must be constant integers.
12941
12942 Semantics:
12943 """"""""""
12944
12945 This intrinsic does not modify the behavior of the program. In
12946 particular, prefetches cannot trap and do not produce a value. On
12947 targets that support this intrinsic, the prefetch can provide hints to
12948 the processor cache for better performance.
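
A minimal sketch of a read prefetch into the data cache with the highest
locality.  The function name and the prefetch distance of 16 elements are
arbitrary choices for illustration.

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      define i32 @load_with_prefetch(i32* %p) {
      entry:
        %next = getelementptr i32, i32* %p, i64 16
        %next.i8 = bitcast i32* %next to i8*
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
        call void @llvm.prefetch(i8* %next.i8, i32 0, i32 3, i32 1)
        %v = load i32, i32* %p
        ret i32 %v
      }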
12949
12950 '``llvm.pcmarker``' Intrinsic
12951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12952
12953 Syntax:
12954 """""""
12955
12956 ::
12957
12958       declare void @llvm.pcmarker(i32 <id>)
12959
12960 Overview:
12961 """""""""
12962
12963 The '``llvm.pcmarker``' intrinsic is a method to export a Program
12964 Counter (PC) in a region of code to simulators and other tools. The
12965 method is target specific, but it is expected that the marker will use
12966 exported symbols to transmit the PC of the marker. The marker makes no
12967 guarantees that it will remain with any specific instruction after
12968 optimizations. It is possible that the presence of a marker will inhibit
12969 optimizations. The intended use is to be inserted after optimizations to
12970 allow correlations of simulation runs.
12971
12972 Arguments:
12973 """"""""""
12974
12975 ``id`` is a numerical id identifying the marker.
12976
12977 Semantics:
12978 """"""""""
12979
12980 This intrinsic does not modify the behavior of the program. Backends
12981 that do not support this intrinsic may ignore it.
12982
12983 '``llvm.readcyclecounter``' Intrinsic
12984 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12985
12986 Syntax:
12987 """""""
12988
12989 ::
12990
12991       declare i64 @llvm.readcyclecounter()
12992
12993 Overview:
12994 """""""""
12995
12996 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
12997 counter register (or similar low latency, high accuracy clocks) on those
12998 targets that support it. On X86, it should map to RDTSC. On Alpha, it
12999 should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
13001 timings.
13002
13003 Semantics:
13004 """"""""""
13005
13006 When directly supported, reading the cycle counter should not modify any
13007 memory. Implementations are allowed to either return an application
13008 specific value or a system wide value. On backends without support, this
13009 is lowered to a constant 0.
13010
Note that runtime support may be conditional on the privilege level the code is
running at and on the host platform.
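
A minimal sketch of a short timing measurement.  The function ``@work`` is a
hypothetical stand-in for the code being timed.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed count; on backends without support both reads are 0.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }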
13013
13014 '``llvm.clear_cache``' Intrinsic
13015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13016
13017 Syntax:
13018 """""""
13019
13020 ::
13021
13022       declare void @llvm.clear_cache(i8*, i8*)
13023
13024 Overview:
13025 """""""""
13026
13027 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13028 in the specified range to the execution unit of the processor. On
13029 targets with non-unified instruction and data cache, the implementation
13030 flushes the instruction cache.
13031
13032 Semantics:
13033 """"""""""
13034
13035 On platforms with coherent instruction and data caches (e.g. x86), this
13036 intrinsic is a nop. On platforms with non-coherent instruction and data
13037 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13038 instructions or a system call, if cache flushing requires special
13039 privileges.
13040
13041 The default behavior is to emit a call to ``__clear_cache`` from the run
13042 time library.
13043
13044 This intrinsic does *not* empty the instruction pipeline. Modifications
13045 of the current function are outside the scope of the intrinsic.
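
A minimal sketch of making freshly written code visible before it is executed.
The function name and its parameters are hypothetical; the two operands are
taken here to be the start and end of the modified range.

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      define void @publish_code(i8* %begin, i64 %len) {
      entry:
        ; Compute the end of the range that was written.
        %end = getelementptr i8, i8* %begin, i64 %len
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }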
13046
13047 '``llvm.instrprof.increment``' Intrinsic
13048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13049
13050 Syntax:
13051 """""""
13052
13053 ::
13054
13055       declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
13056                                              i32 <num-counters>, i32 <index>)
13057
13058 Overview:
13059 """""""""
13060
13061 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13062 frontend for use with instrumentation based profiling. These will be
13063 lowered by the ``-instrprof`` pass to generate execution counts of a
13064 program at runtime.
13065
13066 Arguments:
13067 """"""""""
13068
13069 The first argument is a pointer to a global variable containing the
13070 name of the entity being instrumented. This should generally be the
13071 (mangled) function name for a set of counters.
13072
13073 The second argument is a hash value that can be used by the consumer
13074 of the profile data to detect changes to the instrumented source, and
13075 the third is the number of counters associated with ``name``. It is an
13076 error if ``hash`` or ``num-counters`` differ between two instances of
13077 ``instrprof.increment`` that refer to the same name.
13078
13079 The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.
13081
13082 Semantics:
13083 """"""""""
13084
13085 This intrinsic represents an increment of a profiling counter. It will
13086 cause the ``-instrprof`` pass to generate the appropriate data
13087 structures and the code to increment the appropriate value, in a
13088 format that can be written out by a compiler runtime and consumed via
13089 the ``llvm-profdata`` tool.
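
As a sketch of what a frontend might emit, the following function uses two
counters: one for function entry and one for a conditional block.  The name
global ``@__profn_foo`` and the hash value ``0`` are placeholders chosen for
this example.

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo(i1 %cond) {
      entry:
        ; Counter 0 of 2: function entry.
        call void @llvm.instrprof.increment(
          i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
          i64 0, i32 2, i32 0)
        br i1 %cond, label %then, label %exit

      then:
        ; Counter 1 of 2: the conditional block.  Name, hash, and
        ; num-counters are identical to the first call.
        call void @llvm.instrprof.increment(
          i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
          i64 0, i32 2, i32 1)
        br label %exit

      exit:
        ret void
      }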
13090
13091 '``llvm.instrprof.increment.step``' Intrinsic
13092 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13093
13094 Syntax:
13095 """""""
13096
13097 ::
13098
13099       declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
13100                                                   i32 <num-counters>,
13101                                                   i32 <index>, i64 <step>)
13102
13103 Overview:
13104 """""""""
13105
13106 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13107 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13108 argument to specify the step of the increment.
13109
13110 Arguments:
13111 """"""""""

The first four arguments are the same as for the '``llvm.instrprof.increment``'
intrinsic.
13114
13115 The last argument specifies the value of the increment of the counter variable.
13116
13117 Semantics:
13118 """"""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.
13120
13121
13122 '``llvm.instrprof.value.profile``' Intrinsic
13123 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13124
13125 Syntax:
13126 """""""
13127
13128 ::
13129
13130       declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
13131                                                  i64 <value>, i32 <value_kind>,
13132                                                  i32 <index>)
13133
13134 Overview:
13135 """""""""
13136
13137 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
13138 frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
13141
13142 Arguments:
13143 """"""""""
13144
13145 The first argument is a pointer to a global variable containing the
13146 name of the entity being instrumented. ``name`` should generally be the
13147 (mangled) function name for a set of counters.
13148
13149 The second argument is a hash value that can be used by the consumer
13150 of the profile data to detect changes to the instrumented source. It
13151 is an error if ``hash`` differs between two instances of
13152 ``llvm.instrprof.*`` that refer to the same name.
13153
13154 The third argument is the value of the expression being profiled. The profiled
13155 expression's value should be representable as an unsigned 64-bit value. The
13156 fourth argument represents the kind of value profiling that is being done. The
13157 supported value profiling kinds are enumerated through the
13158 ``InstrProfValueKind`` type declared in the
13159 ``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13160 index of the instrumented expression within ``name``. It should be >= 0.
13161
13162 Semantics:
13163 """"""""""
13164
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass generates the appropriate data structures and replaces
the ``llvm.instrprof.value.profile`` intrinsic with a call to the profile
runtime library with the proper arguments.
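
Example:
""""""""

The following sketch profiles the target of an indirect call. The names
``@caller`` and ``@__profn_caller`` and the hash value ``0`` are hypothetical,
and the value kind ``0`` is assumed here to denote indirect call targets;
consult ``InstrProfValueKind`` for the actual enumerators.

.. code-block:: llvm

      @__profn_caller = private constant [6 x i8] c"caller"

      declare void @llvm.instrprof.value.profile(i8*, i64, i64, i32, i32)

      define void @caller(void ()* %fp) {
      entry:
        ; The profiled value is passed as an unsigned 64-bit integer.
        %target = ptrtoint void ()* %fp to i64
        ; value_kind 0 (assumed: indirect call target), value site index 0.
        call void @llvm.instrprof.value.profile(
            i8* getelementptr inbounds ([6 x i8], [6 x i8]* @__profn_caller, i32 0, i32 0),
            i64 0, i64 %target, i32 0, i32 0)
        call void %fp()
        ret void
      }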
13170
13171 '``llvm.thread.pointer``' Intrinsic
13172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13173
13174 Syntax:
13175 """""""
13176
13177 ::
13178
13179       declare i8* @llvm.thread.pointer()
13180
13181 Overview:
13182 """""""""
13183
13184 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13185 pointer.
13186
13187 Semantics:
13188 """"""""""
13189
13190 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13191 for the current thread.  The exact semantics of this value are target
13192 specific: it may point to the start of TLS area, to the end, or somewhere
13193 in the middle.  Depending on the target, this intrinsic may read a register,
13194 call a helper function, read from an alternate memory space, or perform
13195 other operations necessary to locate the TLS area.  Not all targets support
13196 this intrinsic.
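
Example:
""""""""

A minimal sketch of reading the thread pointer; the wrapper function
``@get_tp`` is a hypothetical name used only for illustration.

.. code-block:: llvm

      declare i8* @llvm.thread.pointer()

      define i8* @get_tp() {
      entry:
        ; Where this pointer lands within the TLS area is target specific.
        %tp = call i8* @llvm.thread.pointer()
        ret i8* %tp
      }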
13197
13198 '``llvm.call.preallocated.setup``' Intrinsic
13199 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13200
13201 Syntax:
13202 """""""
13203
13204 ::
13205
13206       declare token @llvm.call.preallocated.setup(i32 %num_args)
13207
13208 Overview:
13209 """""""""
13210
13211 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13212 be used with a call's ``"preallocated"`` operand bundle to indicate that
13213 certain arguments are allocated and initialized before the call.
13214
13215 Semantics:
13216 """"""""""
13217
The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the corresponding
argument. The token must be the parameter to a ``"preallocated"`` operand
bundle for the corresponding call.
13223
Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
13242
13243 .. _int_call_preallocated_arg:
13244
13245 '``llvm.call.preallocated.arg``' Intrinsic
13246 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13247
13248 Syntax:
13249 """""""
13250
13251 ::
13252
13253       declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13254
13255 Overview:
13256 """""""""
13257
13258 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13259 corresponding preallocated argument for the preallocated call.
13260
13261 Semantics:
13262 """"""""""
13263
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``\ th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.
13268
13269 A call to '``llvm.call.preallocated.arg``' must have a call site
13270 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
13271 match the type used by the ``preallocated`` attribute of the corresponding
13272 argument at the preallocated call. The type is used in the case that an
13273 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13274 to DCE), where otherwise we cannot know how large the arguments are.
13275
It is undefined behavior to call this intrinsic with a token from an
'``llvm.call.preallocated.setup``' after another
'``llvm.call.preallocated.setup``' has been called, or after the
preallocated call corresponding to that '``llvm.call.preallocated.setup``'
has been called.
13281
13282 .. _int_call_preallocated_teardown:
13283
13284 '``llvm.call.preallocated.teardown``' Intrinsic
13285 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13286
13287 Syntax:
13288 """""""
13289
13290 ::
13291
      declare void @llvm.call.preallocated.teardown(token %setup_token)
13293
13294 Overview:
13295 """""""""
13296
13297 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13298 created by a '``llvm.call.preallocated.setup``'.
13299
13300 Semantics:
13301 """"""""""
13302
The token argument must be the result of a call to
'``llvm.call.preallocated.setup``'.
13304
13305 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13306 allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
13307 one of this or the preallocated call must be called to prevent stack leaks.
13308 It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
13309 and the preallocated call for a given '``llvm.call.preallocated.setup``'.
13310
For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``' and an initializer function called on an
allocated argument then throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.
13316
13317 Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13318 calls to '``llvm.call.preallocated.setup``' and
13319 '``llvm.call.preallocated.teardown``' are allowed but must be properly
13320 nested.
13321
13322 Example:
13323 """"""""
13324
13325 .. code-block:: llvm
13326
13327         %cs = call token @llvm.call.preallocated.setup(i32 1)
13328         %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13329         %y = bitcast i8* %x to i32*
13330         invoke void @constructor(i32* %y) to label %conta unwind label %contb
13331     conta:
13332         call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
13333         ret void
13334     contb:
13335         %s = catchswitch within none [label %catch] unwind to caller
13336     catch:
13337         %p = catchpad within %s []
13338         call void @llvm.call.preallocated.teardown(token %cs)
13339         ret void
13340
13341 Standard C/C++ Library Intrinsics
13342 ---------------------------------
13343
13344 LLVM provides intrinsics for a few important standard C/C++ library
13345 functions. These intrinsics allow source-language front-ends to pass
13346 information about the alignment of the pointer arguments to the code
13347 generator, providing opportunity for more efficient code generation.
13348
13349
13350 '``llvm.abs.*``' Intrinsic
13351 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13352
13353 Syntax:
13354 """""""
13355
13356 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13357 integer bit width or any vector of integer elements.
13358
13359 ::
13360
13361       declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13362       declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13363
13364 Overview:
13365 """""""""
13366
13367 The '``llvm.abs``' family of intrinsic functions returns the absolute value
13368 of an argument.
13369
13370 Arguments:
13371 """"""""""
13372
13373 The first argument is the value for which the absolute value is to be returned.
13374 This argument may be of any integer type or a vector with integer element type.
13375 The return type must match the first argument type.
13376
13377 The second argument must be a constant and is a flag to indicate whether the
13378 result value of the '``llvm.abs``' intrinsic is a
13379 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
13380 an ``INT_MIN`` value.
13381
13382 Semantics:
13383 """"""""""
13384
The '``llvm.abs``' intrinsic returns the magnitude (a non-negative value) of
the argument or each element of a vector argument. If the argument is
``INT_MIN``, then the result is also ``INT_MIN`` if
``is_int_min_poison == 0`` and ``poison`` otherwise.
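
Example:
""""""""

The two settings of the flag can be sketched as follows; the function names
``@abs_wrapping`` and ``@abs_poison`` are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_wrapping(i32 %x) {
        ; is_int_min_poison == 0: an INT_MIN input yields INT_MIN.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @abs_poison(i32 %x) {
        ; is_int_min_poison == 1: an INT_MIN input yields poison.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }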
13389
13390
13391 '``llvm.smax.*``' Intrinsic
13392 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13393
13394 Syntax:
13395 """""""
13396
13397 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
13398 integer bit width or any vector of integer elements.
13399
13400 ::
13401
13402       declare i32 @llvm.smax.i32(i32 %a, i32 %b)
13403       declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
13404
13405 Overview:
13406 """""""""
13407
13408 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
13409 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
13410 and ``%b`` at a given index is returned for that index.
13411
13412 Arguments:
13413 """"""""""
13414
13415 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13416 integer element type. The argument types must match each other, and the return
13417 type must match the argument type.
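
Example:
""""""""

A short sketch of the scalar and vector forms; the function names
``@clamp_nonneg`` and ``@vec_smax`` are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

      define i32 @clamp_nonneg(i32 %x) {
        ; smax(%x, 0): negative inputs become 0, other inputs pass through.
        %r = call i32 @llvm.smax.i32(i32 %x, i32 0)
        ret i32 %r
      }

      define <4 x i32> @vec_smax() {
        ; Per-element result: <i32 1, i32 5, i32 3, i32 0>.
        %r = call <4 x i32> @llvm.smax.v4i32(<4 x i32> <i32 1, i32 -2, i32 3, i32 -4>,
                                             <4 x i32> <i32 -1, i32 5, i32 2, i32 0>)
        ret <4 x i32> %r
      }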
13418
13419
13420 '``llvm.smin.*``' Intrinsic
13421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13422
13423 Syntax:
13424 """""""
13425
13426 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
13427 integer bit width or any vector of integer elements.
13428
13429 ::
13430
13431       declare i32 @llvm.smin.i32(i32 %a, i32 %b)
13432       declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
13433
13434 Overview:
13435 """""""""
13436
13437 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
13438 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
13439 and ``%b`` at a given index is returned for that index.
13440
13441 Arguments:
13442 """"""""""
13443
13444 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13445 integer element type. The argument types must match each other, and the return
13446 type must match the argument type.
13447
13448
13449 '``llvm.umax.*``' Intrinsic
13450 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13451
13452 Syntax:
13453 """""""
13454
13455 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
13456 integer bit width or any vector of integer elements.
13457
13458 ::
13459
13460       declare i32 @llvm.umax.i32(i32 %a, i32 %b)
13461       declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
13462
13463 Overview:
13464 """""""""
13465
13466 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
13467 integers. Vector intrinsics operate on a per-element basis. The larger element
13468 of ``%a`` and ``%b`` at a given index is returned for that index.
13469
13470 Arguments:
13471 """"""""""
13472
13473 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13474 integer element type. The argument types must match each other, and the return
13475 type must match the argument type.
13476
13477
13478 '``llvm.umin.*``' Intrinsic
13479 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13480
13481 Syntax:
13482 """""""
13483
13484 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
13485 integer bit width or any vector of integer elements.
13486
13487 ::
13488
13489       declare i32 @llvm.umin.i32(i32 %a, i32 %b)
13490       declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
13491
13492 Overview:
13493 """""""""
13494
13495 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
13496 integers. Vector intrinsics operate on a per-element basis. The smaller element
13497 of ``%a`` and ``%b`` at a given index is returned for that index.
13498
13499 Arguments:
13500 """"""""""
13501
13502 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13503 integer element type. The argument types must match each other, and the return
13504 type must match the argument type.
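
Example:
""""""""

The difference between the unsigned and signed comparisons can be sketched by
applying both to the same operands; the function names below are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.umin.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      define i32 @umin_example() {
        ; As an unsigned value, i32 -1 is 4294967295, so the result is 1.
        %r = call i32 @llvm.umin.i32(i32 -1, i32 1)
        ret i32 %r
      }

      define i32 @smin_example() {
        ; Compared as signed values, the result is -1.
        %r = call i32 @llvm.smin.i32(i32 -1, i32 1)
        ret i32 %r
      }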
13505
13506
13507 .. _int_memcpy:
13508
13509 '``llvm.memcpy``' Intrinsic
13510 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13511
13512 Syntax:
13513 """""""
13514
13515 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
13516 integer bit width and for different address spaces. Not all targets
13517 support all bit widths however.
13518
13519 ::
13520
13521       declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13522                                               i32 <len>, i1 <isvolatile>)
13523       declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13524                                               i64 <len>, i1 <isvolatile>)
13525
13526 Overview:
13527 """""""""
13528
13529 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
13530 source location to the destination location.
13531
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13535
13536 Arguments:
13537 """"""""""
13538
13539 The first argument is a pointer to the destination, the second is a
13540 pointer to the source. The third argument is an integer argument
13541 specifying the number of bytes to copy, and the fourth is a
13542 boolean indicating a volatile access.
13543
13544 The :ref:`align <attr_align>` parameter attribute can be provided
13545 for the first and second arguments.
13546
13547 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
13548 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13549 very cleanly specified and it is unwise to depend on it.
13550
13551 Semantics:
13552 """"""""""
13553
13554 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
13555 location to the destination location, which must either be equal or
13556 non-overlapping. It copies "len" bytes of memory over. If the argument is known
13557 to be aligned to some boundary, this can be specified as an attribute on the
13558 argument.
13559
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13561 the arguments.
13562 If ``<len>`` is not a well-defined value, the behavior is undefined.
13563 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13564 otherwise the behavior is undefined.
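
Example:
""""""""

A minimal sketch of a fixed-size copy with alignment information attached to
the pointer arguments. The function name ``@copy16`` is hypothetical, and the
caller is assumed to pass pointers that are either equal or non-overlapping.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy16(i8* %dst, i8* %src) {
        ; Copy 16 bytes, non-volatile; both pointers are 4-byte aligned.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                             i64 16, i1 false)
        ret void
      }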
13565
13566 .. _int_memcpy_inline:
13567
13568 '``llvm.memcpy.inline``' Intrinsic
13569 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13570
13571 Syntax:
13572 """""""
13573
13574 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
13575 integer bit width and for different address spaces. Not all targets
13576 support all bit widths however.
13577
13578 ::
13579
13580       declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13581                                                      i32 <len>, i1 <isvolatile>)
13582       declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13583                                                      i64 <len>, i1 <isvolatile>)
13584
13585 Overview:
13586 """""""""
13587
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13595
13596 Arguments:
13597 """"""""""
13598
13599 The first argument is a pointer to the destination, the second is a
13600 pointer to the source. The third argument is a constant integer argument
13601 specifying the number of bytes to copy, and the fourth is a
13602 boolean indicating a volatile access.
13603
13604 The :ref:`align <attr_align>` parameter attribute can be provided
13605 for the first and second arguments.
13606
13607 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
13608 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13609 very cleanly specified and it is unwise to depend on it.
13610
13611 Semantics:
13612 """"""""""
13613
13614 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
13615 source location to the destination location, which are not allowed to
13616 overlap. It copies "len" bytes of memory over. If the argument is known
13617 to be aligned to some boundary, this can be specified as an attribute on
13618 the argument.
13619 The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
13620 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
13621 external functions.
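
Example:
""""""""

A minimal sketch with a constant length, as the third argument requires; the
function name ``@copy8_inline`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8*, i8*, i32, i1)

      define void @copy8_inline(i8* %dst, i8* %src) {
        ; The length (8) is a constant; no external function is called.
        call void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* %dst, i8* %src,
                                                    i32 8, i1 false)
        ret void
      }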
13622
13623 .. _int_memmove:
13624
13625 '``llvm.memmove``' Intrinsic
13626 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13627
13628 Syntax:
13629 """""""
13630
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
13634
13635 ::
13636
13637       declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13638                                                i32 <len>, i1 <isvolatile>)
13639       declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13640                                                i64 <len>, i1 <isvolatile>)
13641
13642 Overview:
13643 """""""""
13644
13645 The '``llvm.memmove.*``' intrinsics move a block of memory from the
13646 source location to the destination location. It is similar to the
13647 '``llvm.memcpy``' intrinsic but allows the two memory locations to
13648 overlap.
13649
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13653
13654 Arguments:
13655 """"""""""
13656
13657 The first argument is a pointer to the destination, the second is a
13658 pointer to the source. The third argument is an integer argument
13659 specifying the number of bytes to copy, and the fourth is a
13660 boolean indicating a volatile access.
13661
13662 The :ref:`align <attr_align>` parameter attribute can be provided
13663 for the first and second arguments.
13664
13665 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
13666 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
13667 not very cleanly specified and it is unwise to depend on it.
13668
13669 Semantics:
13670 """"""""""
13671
13672 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
13673 source location to the destination location, which may overlap. It
13674 copies "len" bytes of memory over. If the argument is known to be
13675 aligned to some boundary, this can be specified as an attribute on
13676 the argument.
13677
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13679 the arguments.
13680 If ``<len>`` is not a well-defined value, the behavior is undefined.
13681 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13682 otherwise the behavior is undefined.
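
Example:
""""""""

A minimal sketch of a move between overlapping ranges, which
'``llvm.memcpy``' would not permit; the function name ``@shift_down_one`` is
hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @shift_down_one(i8* %buf) {
        ; Move bytes 1..8 of %buf to bytes 0..7; source and destination overlap.
        %src = getelementptr inbounds i8, i8* %buf, i64 1
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src, i64 8, i1 false)
        ret void
      }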
13683
13684 .. _int_memset:
13685
13686 '``llvm.memset.*``' Intrinsics
13687 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13688
13689 Syntax:
13690 """""""
13691
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
13695
13696 ::
13697
13698       declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
13699                                          i32 <len>, i1 <isvolatile>)
13700       declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
13701                                          i64 <len>, i1 <isvolatile>)
13702
13703 Overview:
13704 """""""""
13705
13706 The '``llvm.memset.*``' intrinsics fill a block of memory with a
13707 particular byte value.
13708
13709 Note that, unlike the standard libc function, the ``llvm.memset``
13710 intrinsic does not return a value and takes an extra volatile
13711 argument. Also, the destination can be in an arbitrary address space.
13712
13713 Arguments:
13714 """"""""""
13715
13716 The first argument is a pointer to the destination to fill, the second
13717 is the byte value with which to fill it, the third argument is an
13718 integer argument specifying the number of bytes to fill, and the fourth
13719 is a boolean indicating a volatile access.
13720
13721 The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
13723
13724 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
13725 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13726 very cleanly specified and it is unwise to depend on it.
13727
13728 Semantics:
13729 """"""""""
13730
13731 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
13732 at the destination location. If the argument is known to be
13733 aligned to some boundary, this can be specified as an attribute on
13734 the argument.
13735
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.
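
Example:
""""""""

A minimal sketch that zero-fills a buffer; the function name ``@zero32`` and
the 8-byte alignment of ``%dst`` are assumptions made for illustration.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      define void @zero32(i8* %dst) {
        ; Fill 32 bytes starting at %dst with the byte value 0, non-volatile.
        call void @llvm.memset.p0i8.i64(i8* align 8 %dst, i8 0, i64 32, i1 false)
        ret void
      }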
13741
13742 '``llvm.sqrt.*``' Intrinsic
13743 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13744
13745 Syntax:
13746 """""""
13747
13748 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
13749 floating-point or vector of floating-point type. Not all targets support
13750 all types however.
13751
13752 ::
13753
13754       declare float     @llvm.sqrt.f32(float %Val)
13755       declare double    @llvm.sqrt.f64(double %Val)
13756       declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
13757       declare fp128     @llvm.sqrt.f128(fp128 %Val)
13758       declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
13759
13760 Overview:
13761 """""""""
13762
13763 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
13764
13765 Arguments:
13766 """"""""""
13767
13768 The argument and return value are floating-point numbers of the same type.
13769
13770 Semantics:
13771 """"""""""
13772
13773 Return the same value as a corresponding libm '``sqrt``' function but without
13774 trapping or setting ``errno``. For types specified by IEEE-754, the result
13775 matches a conforming libm implementation.
13776
13777 When specified with the fast-math-flag 'afn', the result may be approximated
13778 using a less accurate calculation.
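
Example:
""""""""

The effect of the '``afn``' flag on a call can be sketched as follows; the
function names ``@sqrt_exact`` and ``@sqrt_approx`` are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @sqrt_exact(float %x) {
        ; No fast-math flags: the result matches a conforming libm sqrt.
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      define float @sqrt_approx(float %x) {
        ; 'afn' permits a less accurate approximation of the result.
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }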
13779
13780 '``llvm.powi.*``' Intrinsic
13781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13782
13783 Syntax:
13784 """""""
13785
13786 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
13787 floating-point or vector of floating-point type. Not all targets support
13788 all types however.
13789
13790 Generally, the only supported type for the exponent is the one matching
13791 with the C type ``int``.
13792
13793 ::
13794
13795       declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
13796       declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
13797       declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
13798       declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
13799       declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
13800
13801 Overview:
13802 """""""""
13803
13804 The '``llvm.powi.*``' intrinsics return the first operand raised to the
13805 specified (positive or negative) power. The order of evaluation of
13806 multiplications is not defined. When a vector of floating-point type is
13807 used, the second argument remains a scalar integer value.
13808
13809 Arguments:
13810 """"""""""
13811
13812 The second argument is an integer power, and the first is a value to
13813 raise to that power.
13814
13815 Semantics:
13816 """"""""""
13817
This function returns the first argument raised to the power given by the
second argument, with an unspecified sequence of rounding operations.
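
Example:
""""""""

A short sketch of the scalar and vector forms, assuming an ``i32`` exponent;
the function names ``@cube`` and ``@vec_inv_square`` are hypothetical. In the
vector form the exponent stays a scalar integer.

.. code-block:: llvm

      declare float @llvm.powi.f32.i32(float, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define float @cube(float %x) {
        ; Raise %x to the integer power 3.
        %r = call float @llvm.powi.f32.i32(float %x, i32 3)
        ret float %r
      }

      define <4 x float> @vec_inv_square(<4 x float> %v) {
        ; The same scalar power (-2) applies to every element of %v.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }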
13820
13821 '``llvm.sin.*``' Intrinsic
13822 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13823
13824 Syntax:
13825 """""""
13826
13827 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
13828 floating-point or vector of floating-point type. Not all targets support
13829 all types however.
13830
13831 ::
13832
13833       declare float     @llvm.sin.f32(float  %Val)
13834       declare double    @llvm.sin.f64(double %Val)
13835       declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
13836       declare fp128     @llvm.sin.f128(fp128 %Val)
13837       declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
13838
13839 Overview:
13840 """""""""
13841
13842 The '``llvm.sin.*``' intrinsics return the sine of the operand.
13843
13844 Arguments:
13845 """"""""""
13846
13847 The argument and return value are floating-point numbers of the same type.
13848
13849 Semantics:
13850 """"""""""
13851
13852 Return the same value as a corresponding libm '``sin``' function but without
13853 trapping or setting ``errno``.
13854
13855 When specified with the fast-math-flag 'afn', the result may be approximated
13856 using a less accurate calculation.
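
Example:
""""""""

A minimal sketch of the vector form, which computes the sine of each
element; the function name ``@vec_sin`` is hypothetical.

.. code-block:: llvm

      declare <2 x double> @llvm.sin.v2f64(<2 x double>)

      define <2 x double> @vec_sin(<2 x double> %v) {
        ; Computes the sine of each element of %v.
        %r = call <2 x double> @llvm.sin.v2f64(<2 x double> %v)
        ret <2 x double> %r
      }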
13857
13858 '``llvm.cos.*``' Intrinsic
13859 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13860
13861 Syntax:
13862 """""""
13863
13864 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
13865 floating-point or vector of floating-point type. Not all targets support
13866 all types however.
13867
13868 ::
13869
13870       declare float     @llvm.cos.f32(float  %Val)
13871       declare double    @llvm.cos.f64(double %Val)
13872       declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
13873       declare fp128     @llvm.cos.f128(fp128 %Val)
13874       declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
13875
13876 Overview:
13877 """""""""
13878
13879 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
13880
13881 Arguments:
13882 """"""""""
13883
13884 The argument and return value are floating-point numbers of the same type.
13885
13886 Semantics:
13887 """"""""""
13888
13889 Return the same value as a corresponding libm '``cos``' function but without
13890 trapping or setting ``errno``.
13891
13892 When specified with the fast-math-flag 'afn', the result may be approximated
13893 using a less accurate calculation.
13894
13895 '``llvm.pow.*``' Intrinsic
13896 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13897
13898 Syntax:
13899 """""""
13900
13901 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
13902 floating-point or vector of floating-point type. Not all targets support
13903 all types however.
13904
13905 ::
13906
13907       declare float     @llvm.pow.f32(float  %Val, float %Power)
13908       declare double    @llvm.pow.f64(double %Val, double %Power)
13909       declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
13910       declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
13912
13913 Overview:
13914 """""""""
13915
13916 The '``llvm.pow.*``' intrinsics return the first operand raised to the
13917 specified (positive or negative) power.
13918
13919 Arguments:
13920 """"""""""
13921
13922 The arguments and return value are floating-point numbers of the same type.
13923
13924 Semantics:
13925 """"""""""
13926
13927 Return the same value as a corresponding libm '``pow``' function but without
13928 trapping or setting ``errno``.
13929
13930 When specified with the fast-math-flag 'afn', the result may be approximated
13931 using a less accurate calculation.
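
Example:
""""""""

A minimal sketch with a non-integer exponent, which '``llvm.powi``' cannot
express; the function name ``@pow_2_5`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      define double @pow_2_5(double %x) {
        ; Raise %x to the power 2.5; both operands and the result are doubles.
        %r = call double @llvm.pow.f64(double %x, double 2.500000e+00)
        ret double %r
      }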
13932
13933 '``llvm.exp.*``' Intrinsic
13934 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13935
13936 Syntax:
13937 """""""
13938
13939 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
13940 floating-point or vector of floating-point type. Not all targets support
13941 all types however.
13942
13943 ::
13944
13945       declare float     @llvm.exp.f32(float  %Val)
13946       declare double    @llvm.exp.f64(double %Val)
13947       declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
13948       declare fp128     @llvm.exp.f128(fp128 %Val)
13949       declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
13950
13951 Overview:
13952 """""""""
13953
13954 The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
13955 value.
13956
13957 Arguments:
13958 """"""""""
13959
13960 The argument and return value are floating-point numbers of the same type.
13961
13962 Semantics:
13963 """"""""""
13964
13965 Return the same value as a corresponding libm '``exp``' function but without
13966 trapping or setting ``errno``.
13967
13968 When specified with the fast-math-flag 'afn', the result may be approximated
13969 using a less accurate calculation.
13970
13971 '``llvm.exp2.*``' Intrinsic
13972 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13973
13974 Syntax:
13975 """""""
13976
13977 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
13978 floating-point or vector of floating-point type. Not all targets support
13979 all types however.
13980
13981 ::
13982
13983       declare float     @llvm.exp2.f32(float  %Val)
13984       declare double    @llvm.exp2.f64(double %Val)
13985       declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
13986       declare fp128     @llvm.exp2.f128(fp128 %Val)
13987       declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
13988
13989 Overview:
13990 """""""""
13991
13992 The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
13993 specified value.
13994
13995 Arguments:
13996 """"""""""
13997
13998 The argument and return value are floating-point numbers of the same type.
13999
14000 Semantics:
14001 """"""""""
14002
14003 Return the same value as a corresponding libm '``exp2``' function but without
14004 trapping or setting ``errno``.
14005
14006 When specified with the fast-math-flag 'afn', the result may be approximated
14007 using a less accurate calculation.
14008
14009 '``llvm.log.*``' Intrinsic
14010 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14011
14012 Syntax:
14013 """""""
14014
14015 This is an overloaded intrinsic. You can use ``llvm.log`` on any
14016 floating-point or vector of floating-point type. Not all targets support
14017 all types however.
14018
14019 ::
14020
14021       declare float     @llvm.log.f32(float  %Val)
14022       declare double    @llvm.log.f64(double %Val)
14023       declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
14024       declare fp128     @llvm.log.f128(fp128 %Val)
14025       declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
14026
14027 Overview:
14028 """""""""
14029
14030 The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
14031 value.
14032
14033 Arguments:
14034 """"""""""
14035
14036 The argument and return value are floating-point numbers of the same type.
14037
14038 Semantics:
14039 """"""""""
14040
14041 Return the same value as a corresponding libm '``log``' function but without
14042 trapping or setting ``errno``.
14043
14044 When specified with the fast-math-flag 'afn', the result may be approximated
14045 using a less accurate calculation.
14046
14047 '``llvm.log10.*``' Intrinsic
14048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14049
14050 Syntax:
14051 """""""
14052
14053 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14054 floating-point or vector of floating-point type. Not all targets support
14055 all types however.
14056
14057 ::
14058
14059       declare float     @llvm.log10.f32(float  %Val)
14060       declare double    @llvm.log10.f64(double %Val)
14061       declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
14062       declare fp128     @llvm.log10.f128(fp128 %Val)
14063       declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
14064
14065 Overview:
14066 """""""""
14067
14068 The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14069 specified value.
14070
14071 Arguments:
14072 """"""""""
14073
14074 The argument and return value are floating-point numbers of the same type.
14075
14076 Semantics:
14077 """"""""""
14078
14079 Return the same value as a corresponding libm '``log10``' function but without
14080 trapping or setting ``errno``.
14081
14082 When specified with the fast-math-flag 'afn', the result may be approximated
14083 using a less accurate calculation.
14084
14085 '``llvm.log2.*``' Intrinsic
14086 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14087
14088 Syntax:
14089 """""""
14090
14091 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14092 floating-point or vector of floating-point type. Not all targets support
14093 all types however.
14094
14095 ::
14096
14097       declare float     @llvm.log2.f32(float  %Val)
14098       declare double    @llvm.log2.f64(double %Val)
14099       declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
14100       declare fp128     @llvm.log2.f128(fp128 %Val)
14101       declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
14102
14103 Overview:
14104 """""""""
14105
14106 The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14107 value.
14108
14109 Arguments:
14110 """"""""""
14111
14112 The argument and return value are floating-point numbers of the same type.
14113
14114 Semantics:
14115 """"""""""
14116
14117 Return the same value as a corresponding libm '``log2``' function but without
14118 trapping or setting ``errno``.
14119
14120 When specified with the fast-math-flag 'afn', the result may be approximated
14121 using a less accurate calculation.
14122
14123 .. _int_fma:
14124
14125 '``llvm.fma.*``' Intrinsic
14126 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14127
14128 Syntax:
14129 """""""
14130
14131 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14132 floating-point or vector of floating-point type. Not all targets support
14133 all types however.
14134
14135 ::
14136
14137       declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
14138       declare double    @llvm.fma.f64(double %a, double %b, double %c)
14139       declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14140       declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14141       declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14142
14143 Overview:
14144 """""""""
14145
14146 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14147
14148 Arguments:
14149 """"""""""
14150
14151 The arguments and return value are floating-point numbers of the same type.
14152
14153 Semantics:
14154 """"""""""
14155
14156 Return the same value as a corresponding libm '``fma``' function but without
14157 trapping or setting ``errno``.
14158
14159 When specified with the fast-math-flag 'afn', the result may be approximated
14160 using a less accurate calculation.
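
The following sketch shows one way to call the intrinsic. The function name
``@fma_example`` is hypothetical; the single-rounding comment reflects the
behavior of the libm ``fma`` function that the intrinsic is defined to match.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      define float @fma_example(float %a, float %b, float %c) {
        ; Computes (%a * %b) + %c as one fused operation with a single rounding.
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }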
14161
14162 '``llvm.fabs.*``' Intrinsic
14163 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14164
14165 Syntax:
14166 """""""
14167
14168 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14169 floating-point or vector of floating-point type. Not all targets support
14170 all types however.
14171
14172 ::
14173
14174       declare float     @llvm.fabs.f32(float  %Val)
14175       declare double    @llvm.fabs.f64(double %Val)
14176       declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
14177       declare fp128     @llvm.fabs.f128(fp128 %Val)
14178       declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14179
14180 Overview:
14181 """""""""
14182
14183 The '``llvm.fabs.*``' intrinsics return the absolute value of the
14184 operand.
14185
14186 Arguments:
14187 """"""""""
14188
14189 The argument and return value are floating-point numbers of the same
14190 type.
14191
14192 Semantics:
14193 """"""""""
14194
14195 This function returns the same values as the libm ``fabs`` functions
14196 would, and handles error conditions in the same way.
14197
14198 '``llvm.minnum.*``' Intrinsic
14199 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14200
14201 Syntax:
14202 """""""
14203
14204 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14205 floating-point or vector of floating-point type. Not all targets support
14206 all types however.
14207
14208 ::
14209
14210       declare float     @llvm.minnum.f32(float %Val0, float %Val1)
14211       declare double    @llvm.minnum.f64(double %Val0, double %Val1)
14212       declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14213       declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14214       declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14215
14216 Overview:
14217 """""""""
14218
14219 The '``llvm.minnum.*``' intrinsics return the minimum of the two
14220 arguments.
14221
14222
14223 Arguments:
14224 """"""""""
14225
14226 The arguments and return value are floating-point numbers of the same
14227 type.
14228
14229 Semantics:
14230 """"""""""
14231
Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.
14234
14235 If either operand is a NaN, returns the other non-NaN operand. Returns
14236 NaN only if both operands are NaN. The returned NaN is always
14237 quiet. If the operands compare equal, returns a value that compares
14238 equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
14239 return either -0.0 or 0.0.
14240
14241 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14242 signaling and quiet NaN inputs. If a target's implementation follows
14243 the standard and returns a quiet NaN if either input is a signaling
14244 NaN, the intrinsic lowering is responsible for quieting the inputs to
14245 correctly return the non-NaN input (e.g. by using the equivalent of
14246 ``llvm.canonicalize``).
14247
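As an illustration of the NaN handling described above, the sketch below passes
a quiet NaN as one operand. The function name ``@minnum_nan`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @minnum_nan() {
        ; The first operand is a quiet NaN, so the non-NaN operand (1.0) is
        ; returned.
        %r = call float @llvm.minnum.f32(float 0x7FF8000000000000, float 1.0)
        ret float %r
      }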
14248
14249 '``llvm.maxnum.*``' Intrinsic
14250 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14251
14252 Syntax:
14253 """""""
14254
14255 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14256 floating-point or vector of floating-point type. Not all targets support
14257 all types however.
14258
14259 ::
14260
14261       declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
14262       declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
14263       declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
14264       declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14265       declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
14266
14267 Overview:
14268 """""""""
14269
14270 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14271 arguments.
14272
14273
14274 Arguments:
14275 """"""""""
14276
14277 The arguments and return value are floating-point numbers of the same
14278 type.
14279
14280 Semantics:
14281 """"""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
14283 signaling NaNs. This matches the behavior of libm's fmax.
14284
14285 If either operand is a NaN, returns the other non-NaN operand. Returns
14286 NaN only if both operands are NaN. The returned NaN is always
14287 quiet. If the operands compare equal, returns a value that compares
14288 equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
14289 return either -0.0 or 0.0.
14290
14291 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14292 signaling and quiet NaN inputs. If a target's implementation follows
14293 the standard and returns a quiet NaN if either input is a signaling
14294 NaN, the intrinsic lowering is responsible for quieting the inputs to
14295 correctly return the non-NaN input (e.g. by using the equivalent of
14296 ``llvm.canonicalize``).
14297
14298 '``llvm.minimum.*``' Intrinsic
14299 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14300
14301 Syntax:
14302 """""""
14303
14304 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
14305 floating-point or vector of floating-point type. Not all targets support
14306 all types however.
14307
14308 ::
14309
14310       declare float     @llvm.minimum.f32(float %Val0, float %Val1)
14311       declare double    @llvm.minimum.f64(double %Val0, double %Val1)
14312       declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14313       declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
14314       declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14315
14316 Overview:
14317 """""""""
14318
14319 The '``llvm.minimum.*``' intrinsics return the minimum of the two
14320 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14321
14322
14323 Arguments:
14324 """"""""""
14325
14326 The arguments and return value are floating-point numbers of the same
14327 type.
14328
14329 Semantics:
14330 """"""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
14332 of the two arguments. -0.0 is considered to be less than +0.0 for this
14333 intrinsic. Note that these are the semantics specified in the draft of
14334 IEEE 754-2018.
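
A short sketch of the two cases that distinguish this intrinsic from
'``llvm.minnum.*``': NaN propagation and the ordering of signed zeros. The
function name ``@minimum_example`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.minimum.f32(float, float)

      define float @minimum_example() {
        ; A NaN operand is propagated: the result is NaN.
        %a = call float @llvm.minimum.f32(float 0x7FF8000000000000, float 1.0)
        ; -0.0 is treated as less than +0.0: the result is -0.0.
        %b = call float @llvm.minimum.f32(float -0.0, float 0.0)
        ret float %b
      }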
14335
14336 '``llvm.maximum.*``' Intrinsic
14337 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14338
14339 Syntax:
14340 """""""
14341
14342 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
14343 floating-point or vector of floating-point type. Not all targets support
14344 all types however.
14345
14346 ::
14347
14348       declare float     @llvm.maximum.f32(float %Val0, float %Val1)
14349       declare double    @llvm.maximum.f64(double %Val0, double %Val1)
14350       declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14351       declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
14352       declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14353
14354 Overview:
14355 """""""""
14356
14357 The '``llvm.maximum.*``' intrinsics return the maximum of the two
14358 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14359
14360
14361 Arguments:
14362 """"""""""
14363
14364 The arguments and return value are floating-point numbers of the same
14365 type.
14366
14367 Semantics:
14368 """"""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
14370 of the two arguments. -0.0 is considered to be less than +0.0 for this
14371 intrinsic. Note that these are the semantics specified in the draft of
14372 IEEE 754-2018.
14373
14374 '``llvm.copysign.*``' Intrinsic
14375 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14376
14377 Syntax:
14378 """""""
14379
14380 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
14381 floating-point or vector of floating-point type. Not all targets support
14382 all types however.
14383
14384 ::
14385
14386       declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
14387       declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
14388       declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
14389       declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
14390       declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
14391
14392 Overview:
14393 """""""""
14394
14395 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
14396 first operand and the sign of the second operand.
14397
14398 Arguments:
14399 """"""""""
14400
14401 The arguments and return value are floating-point numbers of the same
14402 type.
14403
14404 Semantics:
14405 """"""""""
14406
14407 This function returns the same values as the libm ``copysign``
14408 functions would, and handles error conditions in the same way.
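
For example, the sign of a negative zero is honored, as it is by the libm
``copysign`` function. The function name ``@copysign_example`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.copysign.f32(float, float)

      define float @copysign_example() {
        ; Magnitude 3.0 combined with the sign of -0.0 gives -3.0.
        %r = call float @llvm.copysign.f32(float 3.0, float -0.0)
        ret float %r
      }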
14409
14410 '``llvm.floor.*``' Intrinsic
14411 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14412
14413 Syntax:
14414 """""""
14415
14416 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
14417 floating-point or vector of floating-point type. Not all targets support
14418 all types however.
14419
14420 ::
14421
14422       declare float     @llvm.floor.f32(float  %Val)
14423       declare double    @llvm.floor.f64(double %Val)
14424       declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
14425       declare fp128     @llvm.floor.f128(fp128 %Val)
14426       declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
14427
14428 Overview:
14429 """""""""
14430
14431 The '``llvm.floor.*``' intrinsics return the floor of the operand.
14432
14433 Arguments:
14434 """"""""""
14435
14436 The argument and return value are floating-point numbers of the same
14437 type.
14438
14439 Semantics:
14440 """"""""""
14441
14442 This function returns the same values as the libm ``floor`` functions
14443 would, and handles error conditions in the same way.
14444
14445 '``llvm.ceil.*``' Intrinsic
14446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14447
14448 Syntax:
14449 """""""
14450
14451 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
14452 floating-point or vector of floating-point type. Not all targets support
14453 all types however.
14454
14455 ::
14456
14457       declare float     @llvm.ceil.f32(float  %Val)
14458       declare double    @llvm.ceil.f64(double %Val)
14459       declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
14460       declare fp128     @llvm.ceil.f128(fp128 %Val)
14461       declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
14462
14463 Overview:
14464 """""""""
14465
14466 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
14467
14468 Arguments:
14469 """"""""""
14470
14471 The argument and return value are floating-point numbers of the same
14472 type.
14473
14474 Semantics:
14475 """"""""""
14476
14477 This function returns the same values as the libm ``ceil`` functions
14478 would, and handles error conditions in the same way.
14479
14480 '``llvm.trunc.*``' Intrinsic
14481 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14482
14483 Syntax:
14484 """""""
14485
14486 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
14487 floating-point or vector of floating-point type. Not all targets support
14488 all types however.
14489
14490 ::
14491
14492       declare float     @llvm.trunc.f32(float  %Val)
14493       declare double    @llvm.trunc.f64(double %Val)
14494       declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
14495       declare fp128     @llvm.trunc.f128(fp128 %Val)
14496       declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
14497
14498 Overview:
14499 """""""""
14500
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
14503
14504 Arguments:
14505 """"""""""
14506
14507 The argument and return value are floating-point numbers of the same
14508 type.
14509
14510 Semantics:
14511 """"""""""
14512
14513 This function returns the same values as the libm ``trunc`` functions
14514 would, and handles error conditions in the same way.
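
The sketch below contrasts '``llvm.floor.*``', '``llvm.ceil.*``' and
'``llvm.trunc.*``' on a negative, non-integral input. The function name
``@rounding_example`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)

      define double @rounding_example() {
        %f = call double @llvm.floor.f64(double -2.5)   ; -3.0 (toward -infinity)
        %c = call double @llvm.ceil.f64(double -2.5)    ; -2.0 (toward +infinity)
        %t = call double @llvm.trunc.f64(double -2.5)   ; -2.0 (toward zero)
        ret double %t
      }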
14515
14516 '``llvm.rint.*``' Intrinsic
14517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14518
14519 Syntax:
14520 """""""
14521
14522 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
14523 floating-point or vector of floating-point type. Not all targets support
14524 all types however.
14525
14526 ::
14527
14528       declare float     @llvm.rint.f32(float  %Val)
14529       declare double    @llvm.rint.f64(double %Val)
14530       declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
14531       declare fp128     @llvm.rint.f128(fp128 %Val)
14532       declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
14533
14534 Overview:
14535 """""""""
14536
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.
14540
14541 Arguments:
14542 """"""""""
14543
14544 The argument and return value are floating-point numbers of the same
14545 type.
14546
14547 Semantics:
14548 """"""""""
14549
14550 This function returns the same values as the libm ``rint`` functions
14551 would, and handles error conditions in the same way.
14552
14553 '``llvm.nearbyint.*``' Intrinsic
14554 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14555
14556 Syntax:
14557 """""""
14558
14559 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
14560 floating-point or vector of floating-point type. Not all targets support
14561 all types however.
14562
14563 ::
14564
14565       declare float     @llvm.nearbyint.f32(float  %Val)
14566       declare double    @llvm.nearbyint.f64(double %Val)
14567       declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
14568       declare fp128     @llvm.nearbyint.f128(fp128 %Val)
14569       declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
14570
14571 Overview:
14572 """""""""
14573
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.
14576
14577 Arguments:
14578 """"""""""
14579
14580 The argument and return value are floating-point numbers of the same
14581 type.
14582
14583 Semantics:
14584 """"""""""
14585
14586 This function returns the same values as the libm ``nearbyint``
14587 functions would, and handles error conditions in the same way.
14588
14589 '``llvm.round.*``' Intrinsic
14590 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14591
14592 Syntax:
14593 """""""
14594
14595 This is an overloaded intrinsic. You can use ``llvm.round`` on any
14596 floating-point or vector of floating-point type. Not all targets support
14597 all types however.
14598
14599 ::
14600
14601       declare float     @llvm.round.f32(float  %Val)
14602       declare double    @llvm.round.f64(double %Val)
14603       declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
14604       declare fp128     @llvm.round.f128(fp128 %Val)
14605       declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
14606
14607 Overview:
14608 """""""""
14609
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero.
14612
14613 Arguments:
14614 """"""""""
14615
14616 The argument and return value are floating-point numbers of the same
14617 type.
14618
14619 Semantics:
14620 """"""""""
14621
14622 This function returns the same values as the libm ``round``
14623 functions would, and handles error conditions in the same way.
14624
14625 '``llvm.roundeven.*``' Intrinsic
14626 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14627
14628 Syntax:
14629 """""""
14630
14631 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
14632 floating-point or vector of floating-point type. Not all targets support
14633 all types however.
14634
14635 ::
14636
14637       declare float     @llvm.roundeven.f32(float  %Val)
14638       declare double    @llvm.roundeven.f64(double %Val)
14639       declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
14640       declare fp128     @llvm.roundeven.f128(fp128 %Val)
14641       declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
14642
14643 Overview:
14644 """""""""
14645
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to the
nearest value that is an even integer).
14649
14650 Arguments:
14651 """"""""""
14652
14653 The argument and return value are floating-point numbers of the same type.
14654
14655 Semantics:
14656 """"""""""
14657
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as the C standard function ``roundeven``, except that
it does not raise floating-point exceptions.
14661
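To illustrate how halfway cases are handled, the sketch below compares
'``llvm.roundeven.*``' with '``llvm.round.*``', which follows the libm ``round``
function and rounds ties away from zero. The function name
``@tie_rounding_example`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.round.f32(float)
      declare float @llvm.roundeven.f32(float)

      define float @tie_rounding_example() {
        %a = call float @llvm.round.f32(float 2.5)       ; 3.0 (ties away from zero)
        %b = call float @llvm.roundeven.f32(float 2.5)   ; 2.0 (ties to even)
        ret float %b
      }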
14662
14663 '``llvm.lround.*``' Intrinsic
14664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14665
14666 Syntax:
14667 """""""
14668
14669 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
14670 floating-point type. Not all targets support all types however.
14671
14672 ::
14673
      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
14685
14686 Overview:
14687 """""""""
14688
14689 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
14690 integer with ties away from zero.
14691
14692
14693 Arguments:
14694 """"""""""
14695
14696 The argument is a floating-point number and the return value is an integer
14697 type.
14698
14699 Semantics:
14700 """"""""""
14701
14702 This function returns the same values as the libm ``lround``
14703 functions would, but without setting errno.
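
A minimal sketch of the ties-away-from-zero behavior, using inputs whose results
fit easily in the return type. The function name ``@lround_example`` is
hypothetical.

.. code-block:: llvm

      declare i64 @llvm.lround.i64.f32(float)

      define i64 @lround_example() {
        %a = call i64 @llvm.lround.i64.f32(float 2.5)    ; 3
        %b = call i64 @llvm.lround.i64.f32(float -2.5)   ; -3
        ret i64 %a
      }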
14704
14705 '``llvm.llround.*``' Intrinsic
14706 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14707
14708 Syntax:
14709 """""""
14710
14711 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
14712 floating-point type. Not all targets support all types however.
14713
14714 ::
14715
      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
14721
14722 Overview:
14723 """""""""
14724
14725 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
14726 integer with ties away from zero.
14727
14728 Arguments:
14729 """"""""""
14730
14731 The argument is a floating-point number and the return value is an integer
14732 type.
14733
14734 Semantics:
14735 """"""""""
14736
14737 This function returns the same values as the libm ``llround``
14738 functions would, but without setting errno.
14739
14740 '``llvm.lrint.*``' Intrinsic
14741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14742
14743 Syntax:
14744 """""""
14745
14746 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
14747 floating-point type. Not all targets support all types however.
14748
14749 ::
14750
      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
14762
14763 Overview:
14764 """""""""
14765
14766 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
14767 integer.
14768
14769
14770 Arguments:
14771 """"""""""
14772
14773 The argument is a floating-point number and the return value is an integer
14774 type.
14775
14776 Semantics:
14777 """"""""""
14778
14779 This function returns the same values as the libm ``lrint``
14780 functions would, but without setting errno.
14781
14782 '``llvm.llrint.*``' Intrinsic
14783 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14784
14785 Syntax:
14786 """""""
14787
14788 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
14789 floating-point type. Not all targets support all types however.
14790
14791 ::
14792
      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
14798
14799 Overview:
14800 """""""""
14801
14802 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
14803 integer.
14804
14805 Arguments:
14806 """"""""""
14807
14808 The argument is a floating-point number and the return value is an integer
14809 type.
14810
14811 Semantics:
14812 """"""""""
14813
14814 This function returns the same values as the libm ``llrint``
14815 functions would, but without setting errno.
14816
14817 Bit Manipulation Intrinsics
14818 ---------------------------
14819
14820 LLVM provides intrinsics for a few important bit manipulation
14821 operations. These allow efficient code generation for some algorithms.
14822
14823 '``llvm.bitreverse.*``' Intrinsics
14824 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14825
14826 Syntax:
14827 """""""
14828
14829 This is an overloaded intrinsic function. You can use bitreverse on any
14830 integer type.
14831
14832 ::
14833
14834       declare i16 @llvm.bitreverse.i16(i16 <id>)
14835       declare i32 @llvm.bitreverse.i32(i32 <id>)
14836       declare i64 @llvm.bitreverse.i64(i64 <id>)
14837       declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
14838
14839 Overview:
14840 """""""""
14841
14842 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
14843 bitpattern of an integer value or vector of integer values; for example
14844 ``0b10110110`` becomes ``0b01101101``.
14845
14846 Semantics:
14847 """"""""""
14848
The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output. The vector
14851 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
14852 basis and the element order is not affected.
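
As a small worked case, with bit positions counted from zero at the least
significant bit, the sketch below reverses an 8-bit value. The function name
``@bitreverse_example`` is hypothetical.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @bitreverse_example() {
        ; 6 is 0b00000110. Bits 1 and 2 move to bits 6 and 5, giving
        ; 0b01100000, which is 96.
        %r = call i8 @llvm.bitreverse.i8(i8 6)
        ret i8 %r
      }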
14853
14854 '``llvm.bswap.*``' Intrinsics
14855 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14856
14857 Syntax:
14858 """""""
14859
14860 This is an overloaded intrinsic function. You can use bswap on any
14861 integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
14862
14863 ::
14864
14865       declare i16 @llvm.bswap.i16(i16 <id>)
14866       declare i32 @llvm.bswap.i32(i32 <id>)
14867       declare i64 @llvm.bswap.i64(i64 <id>)
14868       declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
14869
14870 Overview:
14871 """""""""
14872
14873 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
14874 value or vector of integer values with an even number of bytes (positive
14875 multiple of 16 bits).
14876
14877 Semantics:
14878 """"""""""
14879
14880 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
14881 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
14882 intrinsic returns an i32 value that has the four bytes of the input i32
14883 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
14884 returned i32 will have its bytes in 3, 2, 1, 0 order. The
14885 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
14886 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
14887 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
14888 operate on a per-element basis and the element order is not affected.
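
The byte reordering can be sketched with constant inputs; the hexadecimal values
in the comments are the decimal constants written out for readability. The
function name ``@bswap_example`` is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.bswap.i16(i16)
      declare i32 @llvm.bswap.i32(i32)

      define i32 @bswap_example() {
        ; 4660 is 0x1234; the result is 0x3412, which is 13330.
        %h = call i16 @llvm.bswap.i16(i16 4660)
        ; 305419896 is 0x12345678; the result is 0x78563412, which is 2018915346.
        %w = call i32 @llvm.bswap.i32(i32 305419896)
        ret i32 %w
      }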
14889
14890 '``llvm.ctpop.*``' Intrinsic
14891 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14892
14893 Syntax:
14894 """""""
14895
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
14899
14900 ::
14901
14902       declare i8 @llvm.ctpop.i8(i8  <src>)
14903       declare i16 @llvm.ctpop.i16(i16 <src>)
14904       declare i32 @llvm.ctpop.i32(i32 <src>)
14905       declare i64 @llvm.ctpop.i64(i64 <src>)
14906       declare i256 @llvm.ctpop.i256(i256 <src>)
14907       declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
14908
14909 Overview:
14910 """""""""
14911
14912 The '``llvm.ctpop``' family of intrinsics counts the number of bits set
14913 in a value.
14914
14915 Arguments:
14916 """"""""""
14917
14918 The only argument is the value to be counted. The argument may be of any
14919 integer type, or a vector with integer elements. The return type must
14920 match the argument type.
14921
14922 Semantics:
14923 """"""""""
14924
14925 The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
14926 each element of a vector.
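
A brief sketch covering both the scalar and the per-element vector form. The
function name ``@ctpop_example`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)

      define i32 @ctpop_example() {
        ; 255 is 0b11111111, so eight bits are set: the result is 8.
        %s = call i32 @llvm.ctpop.i32(i32 255)
        ; Each element is counted independently: the result is <i32 1, i32 3>.
        %v = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 1, i32 7>)
        ret i32 %s
      }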
14927
14928 '``llvm.ctlz.*``' Intrinsic
14929 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14930
14931 Syntax:
14932 """""""
14933
14934 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
14935 integer bit width, or any vector whose elements are integers. Not all
14936 targets support all bit widths or vector types, however.
14937
14938 ::
14939
14940       declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
14941       declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
14942       declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
14943       declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
14944       declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
14945       declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14946
14947 Overview:
14948 """""""""
14949
14950 The '``llvm.ctlz``' family of intrinsic functions counts the number of
14951 leading zeros in a variable.
14952
14953 Arguments:
14954 """"""""""
14955
14956 The first argument is the value to be counted. This argument may be of
14957 any integer type, or a vector with integer element type. The return
14958 type must match the first argument type.
14959
14960 The second argument must be a constant and is a flag to indicate whether
14961 the intrinsic should ensure that a zero as the first argument produces a
14962 defined result. Historically some architectures did not provide a
14963 defined result for zero values as efficiently, and many algorithms are
14964 now predicated on avoiding zero-value inputs.
14965
14966 Semantics:
14967 """"""""""
14968
14969 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
14970 zeros in a variable, or within each element of the vector. If
14971 ``src == 0`` then the result is the size in bits of the type of ``src``
14972 if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
14973 ``llvm.ctlz(i32 2) = 30``.
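
The effect of the ``is_zero_undef`` flag can be sketched as follows. The
function name ``@ctlz_example`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @ctlz_example() {
        %a = call i32 @llvm.ctlz.i32(i32 2, i1 false)   ; 30
        %b = call i32 @llvm.ctlz.i32(i32 0, i1 false)   ; 32, the bit width of i32
        %c = call i32 @llvm.ctlz.i32(i32 0, i1 true)    ; undef
        ret i32 %a
      }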

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.
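
For the vector form, the count is taken independently in each element. A
minimal sketch with illustrative constants (chosen for this example only):

.. code-block:: llvm

      %v = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> <i32 8, i32 0>, i1 false)
      ; %v = <i32 3, i32 32>: 8 is 0b1000, and the zero element yields the bit width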

.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first two arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
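
The concatenate, shift, and extract steps from the Overview can be spelled out
with ordinary instructions. The following is only a sketch for ``i8``: the
function name ``@fshl_i8_expanded`` and the local value names are hypothetical,
and targets are not required to lower the intrinsic this way.

.. code-block:: llvm

      define i8 @fshl_i8_expanded(i8 %a, i8 %b, i8 %c) {
        ; Build the 16-bit value { %a : %b } with %a in the high byte.
        %a.w = zext i8 %a to i16
        %b.w = zext i8 %b to i16
        %a.hi = shl i16 %a.w, 8
        %concat = or i16 %a.hi, %b.w
        ; The shift amount is taken modulo the element size (8).
        %amt = and i8 %c, 7
        %amt.w = zext i8 %amt to i16
        %shifted = shl i16 %concat, %amt.w
        ; Extract the most significant 8 bits of the shifted value.
        %top = lshr i16 %shifted, 8
        %r = trunc i16 %top to i8
        ret i8 %r
      }

With ``%a = 255``, ``%b = 0`` and ``%c = 15`` the shift amount is 7, the
shifted 16-bit value is ``0x8000``, and the result is 128, matching the
example above.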

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first two arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
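
Passing the same value as both of the first two arguments gives the rotate
right mentioned in the Overview. A brief illustration, with constants chosen
here for the example:

.. code-block:: llvm

      %r1 = call i8 @llvm.fshr.i8(i8 19, i8 19, i8 3)   ; %r1 = i8: 98 (0b01100010), 0b00010011 rotated right by 3
      %r2 = call i8 @llvm.fshr.i8(i8 19, i8 19, i8 11)  ; %r2 = i8: 98, because 11 % 8 = 3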

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.
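
The overflow definition above can be written out directly for one concrete
case. This is a sketch only, for signed addition on ``i8`` with ``N = 16``;
the function name ``@sadd_overflows_i8`` is hypothetical and the intrinsics
are not required to be implemented this way.

.. code-block:: llvm

      define i1 @sadd_overflows_i8(i8 %a, i8 %b) {
        ; ext(A op B) to iN: add in the narrow type, then sign-extend.
        %narrow = add i8 %a, %b
        %narrow.ext = sext i8 %narrow to i16
        ; (ext(A) to iN) op (ext(B) to iN): sign-extend first, then add.
        %a.ext = sext i8 %a to i16
        %b.ext = sext i8 %b to i16
        %wide = add i16 %a.ext, %b.ext
        ; The addition overflowed exactly when the two results differ.
        %ov = icmp ne i16 %narrow.ext, %wide
        ret i1 %ov
      }

For ``%a = 100`` and ``%b = 100`` the narrow sum wraps to -56 while the wide
sum is 200, so the function returns 1.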

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments, and indicates whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. Each intrinsic returns a structure
whose first element is the signed sum and whose second element is a bit
specifying whether the signed addition overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
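
The vector overload listed in the Syntax section is used the same way, with
one overflow bit per element. A minimal sketch (the value names are
illustrative):

.. code-block:: llvm

      %res = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
      %sums = extractvalue {<4 x i32>, <4 x i1>} %res, 0
      %obits = extractvalue {<4 x i32>, <4 x i1>} %res, 1
      %obit0 = extractelement <4 x i1> %obits, i32 0  ; overflow bit for element 0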

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments, and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. Each intrinsic returns a structure
whose first element is the sum and whose second element is a bit specifying
whether the unsigned addition resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments, and indicates whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments. Each intrinsic returns a structure
whose first element is the difference and whose second element is a bit
specifying whether the signed subtraction overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments, and indicates whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. Each intrinsic returns a
structure whose first element is the difference and whose second element is a
bit specifying whether the unsigned subtraction overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
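
With constant operands the two result elements can be worked out by hand. The
following illustration uses an ``i8`` overload and constants chosen for this
example:

.. code-block:: llvm

      %res = call {i8, i1} @llvm.usub.with.overflow.i8(i8 3, i8 5)
      %diff = extractvalue {i8, i1} %res, 0  ; %diff = 254 (3 - 5 modulo 2^8)
      %obit = extractvalue {i8, i1} %res, 1  ; %obit = 1, since 3 is less than 5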

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments, and indicates whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. Each intrinsic returns a
structure whose first element is the product and whose second element is a
bit specifying whether the signed multiplication overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width or vectors of integers.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments, and indicates whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. Each intrinsic returns a
structure whose first element is the product and whose second element is a
bit specifying whether the unsigned multiplication overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Saturation Arithmetic Intrinsics
--------------------------------

Saturation arithmetic is a version of arithmetic in which operations are
limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.


'``llvm.sadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sadd.sat``' family of intrinsic functions performs signed
saturating addition on the two arguments.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
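
The clamping rule can also be expressed with the overflow intrinsic described
earlier. This is a sketch for ``i32`` only: the function name
``@sadd_sat_expanded`` is hypothetical, and the sequence is one possible
expansion rather than a required lowering.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @sadd_sat_expanded(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ov = extractvalue {i32, i1} %res, 1
        ; A signed addition can only overflow when both operands have the
        ; same sign, so the sign of %a selects the limit to clamp to.
        %neg = icmp slt i32 %a, 0
        %limit = select i1 %neg, i32 -2147483648, i32 2147483647
        %r = select i1 %ov, i32 %limit, i32 %sum
        ret i32 %r
      }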


'``llvm.uadd.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.uadd.sat``' family of intrinsic functions performs unsigned
saturating addition on the two arguments.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned addition.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments. Because this is an unsigned
operation, the result will never saturate towards zero.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
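
Because only the upper bound can be reached, an unsigned saturating addition
reduces to a wrapping add plus one comparison. A sketch for ``i32``, with the
hypothetical function name ``@uadd_sat_expanded`` (one possible expansion, not
a required lowering):

.. code-block:: llvm

      define i32 @uadd_sat_expanded(i32 %a, i32 %b) {
        %sum = add i32 %a, %b
        ; The wrapped sum is smaller than %a exactly when the add carried out.
        %wrapped = icmp ult i32 %sum, %a
        ; i32 -1 is the all-ones pattern, the largest unsigned value.
        %r = select i1 %wrapped, i32 -1, i32 %sum
        ret i32 %r
      }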


'``llvm.ssub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ssub.sat``' family of intrinsic functions performs signed
saturating subtraction on the two arguments.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed subtraction.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7


'``llvm.usub.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.sat``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.usub.sat``' family of intrinsic functions performs unsigned
saturating subtraction on the two arguments.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned subtraction.

Semantics:
""""""""""

The minimum value this operation can clamp to is 0, which is the smallest
unsigned value representable by the bit width of the unsigned arguments.
Because this is an unsigned operation, the result will never saturate towards
the largest possible value representable by this bit width.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0


'``llvm.sshl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.sshl.sat``' family of intrinsic functions performs a signed
saturating left shift on the first argument.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.


Semantics:
""""""""""

The maximum value this operation can clamp to is the largest signed value
representable by the bit width of the arguments. The minimum value is the
smallest signed value representable by this bit width.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
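
One way to see when the shift saturates is to shift back and compare with the
original value. The following is a sketch for ``i8`` that assumes ``%b`` is
smaller than the bit width; the function name ``@sshl_sat_expanded`` is
hypothetical, and this is one possible expansion rather than a required
lowering.

.. code-block:: llvm

      define i8 @sshl_sat_expanded(i8 %a, i8 %b) {
        %shl = shl i8 %a, %b
        ; Shifting back arithmetically recovers %a only if no significant
        ; bits were shifted out.
        %back = ashr i8 %shl, %b
        %lossless = icmp eq i8 %back, %a
        ; Otherwise clamp toward the limit that matches the sign of %a.
        %neg = icmp slt i8 %a, 0
        %limit = select i1 %neg, i8 -128, i8 127
        %r = select i1 %lossless, i8 %shl, i8 %limit
        ret i8 %r
      }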


'``llvm.ushl.sat.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
on integers or vectors of integers of any bit width.

::

      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)

Overview:
"""""""""

The '``llvm.ushl.sat``' family of intrinsic functions performs an unsigned
saturating left shift on the first argument.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.

Semantics:
""""""""""

The maximum value this operation can clamp to is the largest unsigned value
representable by the bit width of the arguments.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15


Fixed Point Arithmetic Intrinsics
---------------------------------

A fixed point number represents a real number with a fixed number of digits
after the radix point (equivalent to the decimal point '.'). The number of
digits after the radix point is referred to as the *scale*. Fixed point numbers
are useful for representing fractional values to a specific precision. The
following intrinsics perform fixed point arithmetic operations on two operands
of the same scale, specified as the third argument.

The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Therefore, fixed point
multiplication can be represented as:

.. code-block:: llvm

        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
        %result = trunc i8 %r to i4

The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

.. code-block:: llvm

        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %scale2 = trunc i32 %scale to i8
        %a3 = shl i8 %a2, %scale2
        %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
        %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. Rounding is unspecified since
preferred rounding may vary for different targets. Rounding is specified
through a target hook. Different pipelines should legalize or optimize this
using the rounding specified by this hook if it is provided. Operations like
constant folding, instruction combining, KnownBits, and ValueTracking should
also use this hook, if provided, and not assume the direction of rounding. A
rounded result must always be within one unit of precision from the true
result. That is, the error between the returned result and the true result must
be less than 1/2^(scale).


'``llvm.smul.fix.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.fix``
on any integer bit width or vectors of integers.

::

      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

Overview:
"""""""""

The '``llvm.smul.fix``' family of intrinsic functions performs signed
fixed point multiplication on two arguments of the same scale.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
vectors of integers with the same length and element width. ``%a`` and ``%b``
are the two values that will undergo signed fixed point multiplication. The
argument ``%scale`` represents the scale of both operands, and must be a
constant integer.

Semantics:
""""""""""

This operation performs fixed point multiplication on the two arguments of a
specified scale. The result will also be returned in the same scale specified
in the third argument.

If the result value cannot be precisely represented in the given scale, the
value is rounded up or down to the closest representable value. The rounding
direction is unspecified.

It is undefined behavior if the result value does not fit within the range of
the fixed point type.


Examples:
"""""""""

.. code-block:: llvm

      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (3 x 2 = 6)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)

      ; The result in the following could be rounded up to -2 or down to -2.5
      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15859
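The vector form applies the same operation to each pair of elements. A minimal
sketch, assuming a Q16.16 layout (a scale of 16 in an ``i32``) that is chosen
here purely for illustration:

.. code-block:: llvm

      ; Element-wise: 1.5 x 2.0, -1.5 x 2.0, 0.5 x 2.0, 0.0 x 2.0
      %res = call <4 x i32> @llvm.smul.fix.v4i32(
                 <4 x i32> <i32 98304, i32 -98304, i32 32768, i32 0>,
                 <4 x i32> <i32 131072, i32 131072, i32 131072, i32 131072>,
                 i32 16)
      ; %res = <196608, -196608, 65536, 0> (3.0, -3.0, 1.0, 0.0)

Every product in this sketch is exactly representable, so no rounding occurs.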
15860
15861 '``llvm.umul.fix.*``' Intrinsics
15862 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15863
15864 Syntax
15865 """""""
15866
15867 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
15868 on any integer bit width or vectors of integers.
15869
15870 ::
15871
15872       declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
15873       declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
15874       declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
15875       declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15876
15877 Overview
15878 """""""""
15879
15880 The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
15881 fixed point multiplication on 2 arguments of the same scale.
15882
15883 Arguments
15884 """"""""""
15885
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, provided that all vectors have the same length and element type.
``%a`` and ``%b`` are the two values that will undergo unsigned fixed point
multiplication. The argument ``%scale`` represents the scale of both operands,
and must be a constant integer.
15892
15893 Semantics:
15894 """"""""""
15895
15896 This operation performs unsigned fixed point multiplication on the 2 arguments of a
15897 specified scale. The result will also be returned in the same scale specified
15898 in the third argument.
15899
15900 If the result value cannot be precisely represented in the given scale, the
15901 value is rounded up or down to the closest representable value. The rounding
15902 direction is unspecified.
15903
15904 It is undefined behavior if the result value does not fit within the range of
15905 the fixed point type.
15906
15907
15908 Examples
15909 """""""""
15910
15911 .. code-block:: llvm
15912
15913       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15914       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15915
15916       ; The result in the following could be rounded down to 3.5 or up to 4
15917       %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
15918
15919
15920 '``llvm.smul.fix.sat.*``' Intrinsics
15921 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15922
15923 Syntax
15924 """""""
15925
15926 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
15927 on any integer bit width or vectors of integers.
15928
15929 ::
15930
15931       declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15932       declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15933       declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15934       declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15935
15936 Overview
15937 """""""""
15938
15939 The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
15940 fixed point saturating multiplication on 2 arguments of the same scale.
15941
15942 Arguments
15943 """"""""""
15944
15945 The arguments (%a and %b) and the result may be of integer types of any bit
15946 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15947 values that will undergo signed fixed point multiplication. The argument
15948 ``%scale`` represents the scale of both operands, and must be a constant
15949 integer.
15950
15951 Semantics:
15952 """"""""""
15953
15954 This operation performs fixed point multiplication on the 2 arguments of a
15955 specified scale. The result will also be returned in the same scale specified
15956 in the third argument.
15957
15958 If the result value cannot be precisely represented in the given scale, the
15959 value is rounded up or down to the closest representable value. The rounding
15960 direction is unspecified.
15961
15962 The maximum value this operation can clamp to is the largest signed value
15963 representable by the bit width of the first 2 arguments. The minimum value is the
15964 smallest signed value representable by this bit width.
15965
15966
15967 Examples
15968 """""""""
15969
15970 .. code-block:: llvm
15971
15972       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15973       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15974       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15975
15976       ; The result in the following could be rounded up to -2 or down to -2.5
15977       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15978
15979       ; Saturation
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7 (7 x 2 = 14 -> clamped to 7)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7 (1.75 x 1 = 1.75, the largest representable value)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8 (-2 x 1.25 = -2.5 -> clamped to -2)
      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7 (-4 x -1 = 4 -> clamped to 3.5)
15984
15985       ; Scale can affect the saturation result
15986       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
15987       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
15988
15989
15990 '``llvm.umul.fix.sat.*``' Intrinsics
15991 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15992
15993 Syntax
15994 """""""
15995
15996 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
15997 on any integer bit width or vectors of integers.
15998
15999 ::
16000
16001       declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16002       declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16003       declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16004       declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16005
16006 Overview
16007 """""""""
16008
16009 The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
16010 fixed point saturating multiplication on 2 arguments of the same scale.
16011
16012 Arguments
16013 """"""""""
16014
16015 The arguments (%a and %b) and the result may be of integer types of any bit
16016 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16017 values that will undergo unsigned fixed point multiplication. The argument
16018 ``%scale`` represents the scale of both operands, and must be a constant
16019 integer.
16020
16021 Semantics:
16022 """"""""""
16023
16024 This operation performs fixed point multiplication on the 2 arguments of a
16025 specified scale. The result will also be returned in the same scale specified
16026 in the third argument.
16027
16028 If the result value cannot be precisely represented in the given scale, the
16029 value is rounded up or down to the closest representable value. The rounding
16030 direction is unspecified.
16031
16032 The maximum value this operation can clamp to is the largest unsigned value
16033 representable by the bit width of the first 2 arguments. The minimum value is the
16034 smallest unsigned value representable by this bit width (zero).
16035
16036
16037 Examples
16038 """""""""
16039
16040 .. code-block:: llvm
16041
16042       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16043       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16044
16045       ; The result in the following could be rounded down to 2 or up to 2.5
16046       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16047
16048       ; Saturation
16049       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
16050       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
16051
16052       ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16055
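One possible expansion of the saturating form, sketched for ``i4`` operands
with a scale of 1, widens the operands, shifts the product, and clamps it
before truncating. The function name and the choice to round down with
``lshr`` are assumptions made for this illustration, not a required lowering:

.. code-block:: llvm

      define i4 @umul_fix_sat_i4_scale1(i4 %a, i4 %b) {
        %a2   = zext i4 %a to i8
        %b2   = zext i4 %b to i8
        %mul  = mul nuw i8 %a2, %b2          ; at most 15 x 15 = 225, so no overflow
        %r    = lshr i8 %mul, 1              ; drop one fractional bit, rounding down
        %over = icmp ugt i8 %r, 15           ; does the result exceed the i4 maximum?
        %sat  = select i1 %over, i8 15, i8 %r
        %res  = trunc i8 %sat to i4
        ret i4 %res
      }

With ``%a = 8`` and ``%b = 8`` (4 x 4 at scale 1), ``%r`` is 32 and the
function returns 15 (7.5).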
16056
16057 '``llvm.sdiv.fix.*``' Intrinsics
16058 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16059
16060 Syntax
16061 """""""
16062
16063 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16064 on any integer bit width or vectors of integers.
16065
16066 ::
16067
16068       declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16069       declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16070       declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16071       declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16072
16073 Overview
16074 """""""""
16075
16076 The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
16077 fixed point division on 2 arguments of the same scale.
16078
16079 Arguments
16080 """"""""""
16081
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, provided that all vectors have the same length and element type.
``%a`` and ``%b`` are the two values that will undergo signed fixed point
division. The argument ``%scale`` represents the scale of both operands, and
must be a constant integer.
16088
16089 Semantics:
16090 """"""""""
16091
16092 This operation performs fixed point division on the 2 arguments of a
16093 specified scale. The result will also be returned in the same scale specified
16094 in the third argument.
16095
16096 If the result value cannot be precisely represented in the given scale, the
16097 value is rounded up or down to the closest representable value. The rounding
16098 direction is unspecified.
16099
16100 It is undefined behavior if the result value does not fit within the range of
16101 the fixed point type, or if the second argument is zero.
16102
16103
16104 Examples
16105 """""""""
16106
16107 .. code-block:: llvm
16108
16109       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16110       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16111       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16112
16113       ; The result in the following could be rounded up to 1 or down to 0.5
16114       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16115
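Division can be sketched in a similar way: the dividend is widened and shifted
left by the scale so that the quotient comes out at the original scale. The
following is one possible expansion for ``i4`` operands with a scale of 1. The
function name is hypothetical, and the ``sdiv`` used here rounds toward zero,
which is only one of the permitted rounding directions:

.. code-block:: llvm

      define i4 @sdiv_fix_i4_scale1(i4 %a, i4 %b) {
        %a2  = sext i4 %a to i8
        %b2  = sext i4 %b to i8
        %num = shl nsw i8 %a2, 1      ; the dividend at scale 2
        %q   = sdiv i8 %num, %b2      ; the quotient at scale 1, rounded toward zero
        %res = trunc i8 %q to i4      ; the intrinsic is undefined if %q does not fit in i4
        ret i4 %res
      }

With ``%a = 3`` and ``%b = 4`` (1.5 / 2) this returns 1 (0.5), the
rounded-down alternative in the last example above.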
16116
16117 '``llvm.udiv.fix.*``' Intrinsics
16118 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16119
16120 Syntax
16121 """""""
16122
16123 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16124 on any integer bit width or vectors of integers.
16125
16126 ::
16127
16128       declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16129       declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16130       declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16131       declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16132
16133 Overview
16134 """""""""
16135
16136 The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
16137 fixed point division on 2 arguments of the same scale.
16138
16139 Arguments
16140 """"""""""
16141
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, provided that all vectors have the same length and element type.
``%a`` and ``%b`` are the two values that will undergo unsigned fixed point
division. The argument ``%scale`` represents the scale of both operands, and
must be a constant integer.
16148
16149 Semantics:
16150 """"""""""
16151
16152 This operation performs fixed point division on the 2 arguments of a
16153 specified scale. The result will also be returned in the same scale specified
16154 in the third argument.
16155
16156 If the result value cannot be precisely represented in the given scale, the
16157 value is rounded up or down to the closest representable value. The rounding
16158 direction is unspecified.
16159
16160 It is undefined behavior if the result value does not fit within the range of
16161 the fixed point type, or if the second argument is zero.
16162
16163
16164 Examples
16165 """""""""
16166
16167 .. code-block:: llvm
16168
16169       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16170       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16171       %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
16172
16173       ; The result in the following could be rounded up to 1 or down to 0.5
16174       %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16175
16176
16177 '``llvm.sdiv.fix.sat.*``' Intrinsics
16178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16179
16180 Syntax
16181 """""""
16182
16183 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16184 on any integer bit width or vectors of integers.
16185
16186 ::
16187
16188       declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16189       declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16190       declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16191       declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16192
16193 Overview
16194 """""""""
16195
16196 The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
16197 fixed point saturating division on 2 arguments of the same scale.
16198
16199 Arguments
16200 """"""""""
16201
16202 The arguments (%a and %b) and the result may be of integer types of any bit
16203 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16204 values that will undergo signed fixed point division. The argument
16205 ``%scale`` represents the scale of both operands, and must be a constant
16206 integer.
16207
16208 Semantics:
16209 """"""""""
16210
16211 This operation performs fixed point division on the 2 arguments of a
16212 specified scale. The result will also be returned in the same scale specified
16213 in the third argument.
16214
16215 If the result value cannot be precisely represented in the given scale, the
16216 value is rounded up or down to the closest representable value. The rounding
16217 direction is unspecified.
16218
16219 The maximum value this operation can clamp to is the largest signed value
16220 representable by the bit width of the first 2 arguments. The minimum value is the
16221 smallest signed value representable by this bit width.
16222
16223 It is undefined behavior if the second argument is zero.
16224
16225
16226 Examples
16227 """""""""
16228
16229 .. code-block:: llvm
16230
16231       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16232       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16233       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16234
16235       ; The result in the following could be rounded up to 1 or down to 0.5
16236       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16237
16238       ; Saturation
16239       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
16240       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
16241       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
16242
16243
16244 '``llvm.udiv.fix.sat.*``' Intrinsics
16245 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16246
16247 Syntax
16248 """""""
16249
16250 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16251 on any integer bit width or vectors of integers.
16252
16253 ::
16254
16255       declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16256       declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16257       declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16258       declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16259
16260 Overview
16261 """""""""
16262
16263 The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
16264 fixed point saturating division on 2 arguments of the same scale.
16265
16266 Arguments
16267 """"""""""
16268
16269 The arguments (%a and %b) and the result may be of integer types of any bit
16270 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16271 values that will undergo unsigned fixed point division. The argument
16272 ``%scale`` represents the scale of both operands, and must be a constant
16273 integer.
16274
16275 Semantics:
16276 """"""""""
16277
16278 This operation performs fixed point division on the 2 arguments of a
16279 specified scale. The result will also be returned in the same scale specified
16280 in the third argument.
16281
16282 If the result value cannot be precisely represented in the given scale, the
16283 value is rounded up or down to the closest representable value. The rounding
16284 direction is unspecified.
16285
16286 The maximum value this operation can clamp to is the largest unsigned value
16287 representable by the bit width of the first 2 arguments. The minimum value is the
16288 smallest unsigned value representable by this bit width (zero).
16289
16290 It is undefined behavior if the second argument is zero.
16291
16292 Examples
16293 """""""""
16294
16295 .. code-block:: llvm
16296
16297       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16298       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16299
16300       ; The result in the following could be rounded down to 0.5 or up to 1
16301       %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)
16302
16303       ; Saturation
16304       %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
16305
16306
16307 Specialised Arithmetic Intrinsics
16308 ---------------------------------
16309
16310 .. _i_intr_llvm_canonicalize:
16311
16312 '``llvm.canonicalize.*``' Intrinsic
16313 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16314
16315 Syntax:
16316 """""""
16317
16318 ::
16319
16320       declare float @llvm.canonicalize.f32(float %a)
16321       declare double @llvm.canonicalize.f64(double %b)
16322
16323 Overview:
16324 """""""""
16325
16326 The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
16327 encoding of a floating-point number. This canonicalization is useful for
16328 implementing certain numeric primitives such as frexp. The canonical encoding is
16329 defined by IEEE-754-2008 to be:
16330
16331 ::
16332
16333       2.1.8 canonical encoding: The preferred encoding of a floating-point
16334       representation in a format. Applied to declets, significands of finite
16335       numbers, infinities, and NaNs, especially in decimal formats.
16336
16337 This operation can also be considered equivalent to the IEEE-754-2008
16338 conversion of a floating-point value to the same format. NaNs are handled
16339 according to section 6.2.
16340
16341 Examples of non-canonical encodings:
16342
16343 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
16344   converted to a canonical representation per hardware-specific protocol.
16345 - Many normal decimal floating-point numbers have non-canonical alternative
16346   encodings.
16347 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
16348   These are treated as non-canonical encodings of zero and will be flushed to
16349   a zero of the same sign by this operation.
16350
16351 Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
16352 default exception handling must signal an invalid exception, and produce a
16353 quiet NaN result.
16354
16355 This function should always be implementable as multiplication by 1.0, provided
16356 that the compiler does not constant fold the operation. Likewise, division by
16357 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
16358 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
16359
16360 ``@llvm.canonicalize`` must preserve the equality relation. That is:
16361
16362 - ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
16365
16366 Additionally, the sign of zero must be conserved:
16367 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
16368
16369 The payload bits of a NaN must be conserved, with two exceptions.
16370 First, environments which use only a single canonical representation of NaN
16371 must perform said canonicalization. Second, SNaNs must be quieted per the
16372 usual methods.
16373
16374 The canonicalization operation may be optimized away if:
16375
16376 - The input is known to be canonical. For example, it was produced by a
16377   floating-point operation that is required by the standard to be canonical.
16378 - The result is consumed only by (or fused with) other floating-point
16379   operations. That is, the bits of the floating-point value are not examined.
16380
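A minimal sketch of a use whose result encoding is observable (the function
name ``@canonical_bits`` is hypothetical). Because the result is reinterpreted
as an integer, the call can only be removed if ``%x`` is already known to be
canonical:

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        ; The bit pattern of %c is examined below, so the second condition
        ; for removing the call does not apply.
        %bits = bitcast float %c to i32
        ret i32 %bits
      }
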
16381 '``llvm.fmuladd.*``' Intrinsic
16382 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16383
16384 Syntax:
16385 """""""
16386
16387 ::
16388
16389       declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
16390       declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
16391
16392 Overview:
16393 """""""""
16394
16395 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
16396 expressions that can be fused if the code generator determines that (a) the
16397 target instruction set has support for a fused operation, and (b) that the
16398 fused operation is more efficient than the equivalent, separate pair of mul
16399 and add instructions.
16400
16401 Arguments:
16402 """"""""""
16403
16404 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
16405 multiplicands, a and b, and an addend c.
16406
16407 Semantics:
16408 """"""""""
16409
16410 The expression:
16411
16412 ::
16413
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
16415
16416 is equivalent to the expression a \* b + c, except that it is unspecified
16417 whether rounding will be performed between the multiplication and addition
16418 steps. Fusion is not guaranteed, even if the target platform supports it.
16419 If a fused multiply-add is required, the corresponding
16420 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets errno.
16422
16423 Examples:
16424 """""""""
16425
16426 .. code-block:: llvm
16427
16428       %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
16429
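For comparison, the three ways of writing a multiply-add differ only in where
rounding may occur. The value names below are illustrative:

.. code-block:: llvm

      ; May be fused or not, at the code generator's discretion.
      %r.maybe = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

      ; Always computed as a fused operation with a single rounding.
      %r.fused = call float @llvm.fma.f32(float %a, float %b, float %c)

      ; Written as two operations, each with its own rounding.
      %m = fmul float %a, %b
      %r.split = fadd float %m, %c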
16430
16431 Hardware-Loop Intrinsics
16432 ------------------------
16433
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required either to lower these intrinsics
further to target-specific instructions, or to revert the hardware-loop to a
normal loop if target-specific restrictions are not met and a hardware-loop
cannot be generated.
16438
16439 These intrinsics may be modified in the future and are not intended to be used
16440 outside the backend. Thus, front-end and mid-level optimizations should not be
16441 generating these intrinsics.
16442
16443
16444 '``llvm.set.loop.iterations.*``' Intrinsic
16445 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16446
16447 Syntax:
16448 """""""
16449
16450 This is an overloaded intrinsic.
16451
16452 ::
16453
16454       declare void @llvm.set.loop.iterations.i32(i32)
16455       declare void @llvm.set.loop.iterations.i64(i64)
16456
16457 Overview:
16458 """""""""
16459
16460 The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
16461 hardware-loop trip count. They are placed in the loop preheader basic block and
16462 are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
16463 instructions.
16464
16465 Arguments:
16466 """"""""""
16467
16468 The integer operand is the loop trip count of the hardware-loop, and thus
16469 not e.g. the loop back-edge taken count.
16470
16471 Semantics:
16472 """"""""""
16473
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16478
16479
16480 '``llvm.start.loop.iterations.*``' Intrinsic
16481 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16482
16483 Syntax:
16484 """""""
16485
16486 This is an overloaded intrinsic.
16487
16488 ::
16489
16490       declare i32 @llvm.start.loop.iterations.i32(i32)
16491       declare i64 @llvm.start.loop.iterations.i64(i64)
16492
16493 Overview:
16494 """""""""
16495
Like the '``llvm.set.loop.iterations.*``' intrinsics, the
'``llvm.start.loop.iterations.*``' intrinsics specify the hardware-loop trip
count, but they also produce a value identical to the input that can be used
as the input to the loop. They are placed in the loop preheader basic block,
and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by
'``llvm.loop.decrement.reg.*``'.
16503
16504 Arguments:
16505 """"""""""
16506
16507 The integer operand is the loop trip count of the hardware-loop, and thus
16508 not e.g. the loop back-edge taken count.
16509
16510 Semantics:
16511 """"""""""
16512
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16517
16518 '``llvm.test.set.loop.iterations.*``' Intrinsic
16519 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16520
16521 Syntax:
16522 """""""
16523
16524 This is an overloaded intrinsic.
16525
16526 ::
16527
16528       declare i1 @llvm.test.set.loop.iterations.i32(i32)
16529       declare i1 @llvm.test.set.loop.iterations.i64(i64)
16530
16531 Overview:
16532 """""""""
16533
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
it to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16539
16540 Arguments:
16541 """"""""""
16542
16543 The integer operand is the loop trip count of the hardware-loop, and thus
16544 not e.g. the loop back-edge taken count.
16545
16546 Semantics:
16547 """"""""""
16548
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is the conditional value of whether the given count is
not zero.
16554
16555
16556 '``llvm.test.start.loop.iterations.*``' Intrinsic
16557 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16558
16559 Syntax:
16560 """""""
16561
16562 This is an overloaded intrinsic.
16563
16564 ::
16565
16566       declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
16567       declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
16568
16569 Overview:
16570 """""""""
16571
The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but also produce a
value identical to the input that can be used as the input to the loop. The
second output, an ``i1``, controls entry to a while-loop.
16577
16578 Arguments:
16579 """"""""""
16580
16581 The integer operand is the loop trip count of the hardware-loop, and thus
16582 not e.g. the loop back-edge taken count.
16583
16584 Semantics:
16585 """"""""""
16586
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a conditional value of
whether the given count is not zero.
16593
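The following sketch shows how the two results might guard a while-loop. It is
illustrative only: the function and value names are hypothetical, and, as noted
above, these intrinsics are meant to be generated by the backend rather than
written by front ends.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define i32 @count_iterations(i32 %n) {
      entry:
        %pair = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %start = extractvalue { i32, i1 } %pair, 0     ; identical to %n
        %enter = extractvalue { i32, i1 } %pair, 1     ; true if %n is not zero
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %remaining = phi i32 [ %start, %preheader ], [ %next, %loop ]
        %iters = phi i32 [ 0, %preheader ], [ %iters.next, %loop ]
        %iters.next = add i32 %iters, 1
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %remaining, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        %res = phi i32 [ 0, %entry ], [ %iters.next, %loop ]
        ret i32 %res
      }

When ``%n`` is zero, the ``i1`` result is false and the loop is skipped
entirely.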
16594
16595 '``llvm.loop.decrement.reg.*``' Intrinsic
16596 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16597
16598 Syntax:
16599 """""""
16600
16601 This is an overloaded intrinsic.
16602
16603 ::
16604
16605       declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
16606       declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
16607
16608 Overview:
16609 """""""""
16610
16611 The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
16612 iteration counter and return an updated value that will be used in the next
16613 loop test check.
16614
16615 Arguments:
16616 """"""""""
16617
16618 Both arguments must have identical integer types. The first operand is the
16619 loop iteration counter. The second operand is the maximum number of elements
16620 processed in an iteration.
16621
16622 Semantics:
16623 """"""""""
16624
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisation is allowed to treat the intrinsic as a ``SUB``, and it is
supported by SCEV, so it is the backend's responsibility to handle cases where
it may be optimised. These intrinsics are marked as ``IntrNoDuplicate`` to
avoid optimizers duplicating these instructions.
16633
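A sketch of the ``PHI``, ``ICMP`` and ``BR`` pattern described above, with the
counter seeded by '``llvm.start.loop.iterations.*``'. The function name and
loop body are hypothetical, and the trip count ``%n`` is assumed to be
non-zero:

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define i32 @sum_countdown(i32 %n) {
      entry:
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %remaining = phi i32 [ %start, %entry ], [ %next, %loop ]
        %sum = phi i32 [ 0, %entry ], [ %sum.next, %loop ]
        %sum.next = add i32 %sum, %remaining
        ; One element is processed per iteration, so the counter drops by 1.
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %remaining, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret i32 %sum.next
      }

For ``%n = 3`` the loop body runs three times and the function returns
3 + 2 + 1 = 6.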
16634
16635 '``llvm.loop.decrement.*``' Intrinsic
16636 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16637
16638 Syntax:
16639 """""""
16640
16641 This is an overloaded intrinsic.
16642
16643 ::
16644
16645       declare i1 @llvm.loop.decrement.i32(i32)
16646       declare i1 @llvm.loop.decrement.i64(i64)
16647
16648 Overview:
16649 """""""""
16650
16651 The HardwareLoops pass allows the loop decrement value to be specified with an
16652 option. It defaults to a loop decrement value of 1, but it can be an unsigned
16653 integer value provided by this option.  The '``llvm.loop.decrement.*``'
16654 intrinsics decrement the loop iteration counter with this value, and return a
16655 false predicate if the loop should exit, and true otherwise.
16656 This is emitted if the loop counter is not updated via a ``PHI`` node, which
16657 can also be controlled with an option.
16658
16659 Arguments:
16660 """"""""""
16661
16662 The integer argument is the loop decrement value used to decrement the loop
16663 iteration counter.
16664
16665 Semantics:
16666 """"""""""
16667
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
16672
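A minimal sketch of the resulting loop shape, assuming the iteration count has
already been set up with ``llvm.set.loop.iterations`` and that ``%n`` is at
least 1. The function and value names are hypothetical.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @count_down(i32 %n) {
      entry:
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        ; ... loop body ...
        ; Decrement the loop iteration counter by 1; false means "exit".
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }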
16673
16674 Vector Reduction Intrinsics
16675 ---------------------------
16676
16677 Horizontal reductions of vectors can be expressed using the following
16678 intrinsics. Each one takes a vector operand as an input and applies its
16679 respective operation across all elements of the vector, returning a single
16680 scalar result of the same element type.
16681
16682 .. _int_vector_reduce_add:
16683
16684 '``llvm.vector.reduce.add.*``' Intrinsic
16685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16686
16687 Syntax:
16688 """""""
16689
16690 ::
16691
16692       declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
16693       declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
16694
16695 Overview:
16696 """""""""
16697
16698 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
16699 reduction of a vector, returning the result as a scalar. The return type matches
16700 the element-type of the vector input.
16701
16702 Arguments:
16703 """"""""""
16704 The argument to this intrinsic must be a vector of integer values.
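
For illustration, a small self-contained sketch of a call on a constant vector
(the function name ``@sum4`` is hypothetical):

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      define i32 @sum4() {
        ; 1 + 2 + 3 + 4
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>) ; yields i32: 10
        ret i32 %sum
      }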
16705
16706 .. _int_vector_reduce_fadd:
16707
16708 '``llvm.vector.reduce.fadd.*``' Intrinsic
16709 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16710
16711 Syntax:
16712 """""""
16713
16714 ::
16715
16716       declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
16717       declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
16718
16719 Overview:
16720 """""""""
16721
16722 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
16723 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
16724 matches the element-type of the vector input.
16725
16726 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16727 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16728 the reduction will be *sequential*, thus implying that the operation respects
16729 the associativity of a scalarized reduction. That is, the reduction begins with
16730 the start value and performs an fadd operation with consecutively increasing
16731 vector element indices. See the following pseudocode:
16732
16733 ::
16734
16735     float sequential_fadd(start_value, input_vector)
16736       result = start_value
16737       for i = 0 to length(input_vector)
16738         result = result + input_vector[i]
16739       return result
16740
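For a four-element vector, the sequential form corresponds to the following
scalarized sketch. The function name is hypothetical, and the expansion is
meant only to illustrate the evaluation order, not how any particular target
lowers the intrinsic.

.. code-block:: llvm

      define float @sequential_fadd_v4f32(float %start_value, <4 x float> %input) {
        %e0 = extractelement <4 x float> %input, i32 0
        %e1 = extractelement <4 x float> %input, i32 1
        %e2 = extractelement <4 x float> %input, i32 2
        %e3 = extractelement <4 x float> %input, i32 3
        ; Evaluated strictly left to right: (((start + e0) + e1) + e2) + e3
        %r0 = fadd float %start_value, %e0
        %r1 = fadd float %r0, %e1
        %r2 = fadd float %r1, %e2
        %r3 = fadd float %r2, %e3
        ret float %r3
      }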
16741
16742 Arguments:
16743 """"""""""
16744 The first argument to this intrinsic is a scalar start value for the reduction.
16745 The type of the start value matches the element-type of the vector input.
16746 The second argument must be a vector of floating-point values.
16747
To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating-point addition.
16750
16751 Examples:
16752 """""""""
16753
16754 ::
16755
16756       %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
16757       %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16758
16759
16760 .. _int_vector_reduce_mul:
16761
16762 '``llvm.vector.reduce.mul.*``' Intrinsic
16763 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16764
16765 Syntax:
16766 """""""
16767
16768 ::
16769
16770       declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
16771       declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
16772
16773 Overview:
16774 """""""""
16775
16776 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
16777 reduction of a vector, returning the result as a scalar. The return type matches
16778 the element-type of the vector input.
16779
16780 Arguments:
16781 """"""""""
16782 The argument to this intrinsic must be a vector of integer values.
16783
16784 .. _int_vector_reduce_fmul:
16785
16786 '``llvm.vector.reduce.fmul.*``' Intrinsic
16787 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16788
16789 Syntax:
16790 """""""
16791
16792 ::
16793
16794       declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
16795       declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
16796
16797 Overview:
16798 """""""""
16799
16800 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
16801 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
16802 matches the element-type of the vector input.
16803
16804 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16805 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16806 the reduction will be *sequential*, thus implying that the operation respects
16807 the associativity of a scalarized reduction. That is, the reduction begins with
16808 the start value and performs an fmul operation with consecutively increasing
16809 vector element indices. See the following pseudocode:
16810
16811 ::
16812
16813     float sequential_fmul(start_value, input_vector)
16814       result = start_value
16815       for i = 0 to length(input_vector)
16816         result = result * input_vector[i]
16817       return result
16818
16819
16820 Arguments:
16821 """"""""""
16822 The first argument to this intrinsic is a scalar start value for the reduction.
16823 The type of the start value matches the element-type of the vector input.
16824 The second argument must be a vector of floating-point values.
16825
To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating-point multiplication.
16828
16829 Examples:
16830 """""""""
16831
16832 ::
16833
16834       %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
16835       %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16836
16837 .. _int_vector_reduce_and:
16838
16839 '``llvm.vector.reduce.and.*``' Intrinsic
16840 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16841
16842 Syntax:
16843 """""""
16844
16845 ::
16846
16847       declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
16848
16849 Overview:
16850 """""""""
16851
16852 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
16853 reduction of a vector, returning the result as a scalar. The return type matches
16854 the element-type of the vector input.
16855
16856 Arguments:
16857 """"""""""
16858 The argument to this intrinsic must be a vector of integer values.
16859
16860 .. _int_vector_reduce_or:
16861
16862 '``llvm.vector.reduce.or.*``' Intrinsic
16863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16864
16865 Syntax:
16866 """""""
16867
16868 ::
16869
16870       declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
16871
16872 Overview:
16873 """""""""
16874
16875 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
16876 of a vector, returning the result as a scalar. The return type matches the
16877 element-type of the vector input.
16878
16879 Arguments:
16880 """"""""""
16881 The argument to this intrinsic must be a vector of integer values.
16882
16883 .. _int_vector_reduce_xor:
16884
16885 '``llvm.vector.reduce.xor.*``' Intrinsic
16886 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16887
16888 Syntax:
16889 """""""
16890
16891 ::
16892
16893       declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
16894
16895 Overview:
16896 """""""""
16897
16898 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
16899 reduction of a vector, returning the result as a scalar. The return type matches
16900 the element-type of the vector input.
16901
16902 Arguments:
16903 """"""""""
16904 The argument to this intrinsic must be a vector of integer values.
16905
16906 .. _int_vector_reduce_smax:
16907
16908 '``llvm.vector.reduce.smax.*``' Intrinsic
16909 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16910
16911 Syntax:
16912 """""""
16913
16914 ::
16915
16916       declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
16917
16918 Overview:
16919 """""""""
16920
16921 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
16922 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
16923 matches the element-type of the vector input.
16924
16925 Arguments:
16926 """"""""""
16927 The argument to this intrinsic must be a vector of integer values.
16928
16929 .. _int_vector_reduce_smin:
16930
16931 '``llvm.vector.reduce.smin.*``' Intrinsic
16932 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16933
16934 Syntax:
16935 """""""
16936
16937 ::
16938
16939       declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
16940
16941 Overview:
16942 """""""""
16943
16944 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
16945 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
16946 matches the element-type of the vector input.
16947
16948 Arguments:
16949 """"""""""
16950 The argument to this intrinsic must be a vector of integer values.
16951
16952 .. _int_vector_reduce_umax:
16953
16954 '``llvm.vector.reduce.umax.*``' Intrinsic
16955 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16956
16957 Syntax:
16958 """""""
16959
16960 ::
16961
16962       declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
16963
16964 Overview:
16965 """""""""
16966
16967 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
16968 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
16969 return type matches the element-type of the vector input.
16970
16971 Arguments:
16972 """"""""""
16973 The argument to this intrinsic must be a vector of integer values.
16974
16975 .. _int_vector_reduce_umin:
16976
16977 '``llvm.vector.reduce.umin.*``' Intrinsic
16978 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16979
16980 Syntax:
16981 """""""
16982
16983 ::
16984
16985       declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
16986
16987 Overview:
16988 """""""""
16989
16990 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
16991 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
16992 return type matches the element-type of the vector input.
16993
16994 Arguments:
16995 """"""""""
16996 The argument to this intrinsic must be a vector of integer values.
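
The signed and unsigned integer minimum and maximum reductions differ only in
how the element bits are interpreted. The sketch below applies ``smax``,
``smin``, ``umax`` and ``umin`` to the same constant vector (the function name
is hypothetical):

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32>)
      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32>)

      define void @minmax_demo() {
        %smax = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32: 2
        %smin = call i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32: -1
        ; As an unsigned value, -1 is 0xFFFFFFFF, the largest i32.
        %umax = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32: -1
        %umin = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32: 0
        ret void
      }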
16997
16998 .. _int_vector_reduce_fmax:
16999
17000 '``llvm.vector.reduce.fmax.*``' Intrinsic
17001 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17002
17003 Syntax:
17004 """""""
17005
17006 ::
17007
17008       declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
17009       declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
17010
17011 Overview:
17012 """""""""
17013
17014 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
17015 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
17016 matches the element-type of the vector input.
17017
17018 This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
17019 intrinsic. That is, the result will always be a number unless all elements of
17020 the vector are NaN. For a vector with maximum element magnitude 0.0 and
17021 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17022
17023 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17024 assume that NaNs are not present in the input vector.
17025
17026 Arguments:
17027 """"""""""
17028 The argument to this intrinsic must be a vector of floating-point values.
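
A small sketch of the NaN behavior described above, using a constant vector
that contains one quiet NaN (the function name is hypothetical):

.. code-block:: llvm

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float>)

      define float @fmax_demo() {
        ; 0x7FF8000000000000 is a quiet NaN. Because the other elements are
        ; numbers, the result is a number.
        %max = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>) ; yields float: 3.0
        ret float %max
      }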
17029
17030 .. _int_vector_reduce_fmin:
17031
17032 '``llvm.vector.reduce.fmin.*``' Intrinsic
17033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17034
17035 Syntax:
17036 """""""
17037 This is an overloaded intrinsic.
17038
17039 ::
17040
17041       declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17042       declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17043
17044 Overview:
17045 """""""""
17046
17047 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17048 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
17049 matches the element-type of the vector input.
17050
17051 This instruction has the same comparison semantics as the '``llvm.minnum.*``'
17052 intrinsic. That is, the result will always be a number unless all elements of
17053 the vector are NaN. For a vector with minimum element magnitude 0.0 and
17054 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17055
17056 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17057 assume that NaNs are not present in the input vector.
17058
17059 Arguments:
17060 """"""""""
17061 The argument to this intrinsic must be a vector of floating-point values.
17062
17063 '``llvm.experimental.vector.insert``' Intrinsic
17064 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17065
17066 Syntax:
17067 """""""
17068 This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
17069 to insert a fixed-width vector into a scalable vector, but not the other way
17070 around.
17071
17072 ::
17073
17074       declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
17075       declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)
17076
17077 Overview:
17078 """""""""
17079
17080 The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into another vector
17081 starting from a given index. The return type matches the type of the vector we
17082 insert into. Conceptually, this can be used to build a scalable vector out of
17083 non-scalable vectors.
17084
17085 Arguments:
17086 """"""""""
17087
17088 The ``vec`` is the vector which ``subvec`` will be inserted into.
17089 The ``subvec`` is the vector that will be inserted.
17090
17091 ``idx`` represents the starting element number at which ``subvec`` will be
17092 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17093 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17094 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17095 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17096 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17097 cannot be determined statically but is false at runtime, then the result vector
17098 is undefined.
17099
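A minimal sketch using the first declaration from the Syntax section above
(the function name is hypothetical, and the intrinsic name suffix simply
follows that declaration). Index 0 is always valid here because
``<vscale x 4 x float>`` holds at least four elements.

.. code-block:: llvm

      declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float>, <4 x float>, i64)

      define <vscale x 4 x float> @insert_low(<vscale x 4 x float> %vec, <4 x float> %subvec) {
        ; Overwrite elements 0 through 3 of %vec with %subvec.
        %res = call <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 0)
        ret <vscale x 4 x float> %res
      }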
17100
17101 '``llvm.experimental.vector.extract``' Intrinsic
17102 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17103
17104 Syntax:
17105 """""""
17106 This is an overloaded intrinsic. You can use
17107 ``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
17108 scalable vector, but not the other way around.
17109
17110 ::
17111
17112       declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
17113       declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)
17114
17115 Overview:
17116 """""""""
17117
17118 The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
17119 within another vector starting from a given index. The return type must be
17120 explicitly specified. Conceptually, this can be used to decompose a scalable
17121 vector into non-scalable parts.
17122
17123 Arguments:
17124 """"""""""
17125
17126 The ``vec`` is the vector from which we will extract a subvector.
17127
17128 The ``idx`` specifies the starting element number within ``vec`` from which a
17129 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17130 vector length of the result type. If the result type is a scalable vector,
17131 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
17132 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17133 indices. If this condition cannot be determined statically but is false at
17134 runtime, then the result vector is undefined. The ``idx`` parameter must be a
17135 vector index constant type (for most targets this will be an integer pointer
17136 type).
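
A minimal sketch using the first declaration from the Syntax section above
(the function names are hypothetical, and the intrinsic name suffix simply
follows that declaration). Extracting at index 0 is always in range, whereas
index 4 is in range only when the runtime scaling factor is at least 2;
otherwise the result vector is undefined.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float>, i64)

      define <4 x float> @extract_low(<vscale x 4 x float> %vec) {
        ; Elements 0 through 3: always valid.
        %lo = call <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 0)
        ret <4 x float> %lo
      }

      define <4 x float> @extract_next(<vscale x 4 x float> %vec) {
        ; Elements 4 through 7: valid only if vscale >= 2 at runtime.
        %hi = call <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 4)
        ret <4 x float> %hi
      }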
17137
17138 '``llvm.experimental.vector.reverse``' Intrinsic
17139 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17140
17141 Syntax:
17142 """""""
17143 This is an overloaded intrinsic.
17144
17145 ::
17146
17147       declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17148       declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17149
17150 Overview:
17151 """""""""
17152
17153 The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
17154 The intrinsic takes a single vector and returns a vector of matching type but
17155 with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
17157 recommended way to express reverse operations for fixed-width vectors is still
17158 to use a shufflevector, as that may allow for more optimization opportunities.
17159
17160 Arguments:
17161 """"""""""
17162
17163 The argument to this intrinsic must be a vector.
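
For a fixed-width vector, the intrinsic and the recommended ``shufflevector``
form can be compared side by side. This is a sketch with hypothetical function
names; both functions return the lanes of ``%a`` in reverse order.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32>)

      define <4 x i32> @reverse_intrinsic(<4 x i32> %a) {
        %rev = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %a)
        ret <4 x i32> %rev
      }

      define <4 x i32> @reverse_shuffle(<4 x i32> %a) {
        ; Lane i of the result is lane (3 - i) of %a.
        %rev = shufflevector <4 x i32> %a, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %rev
      }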
17164
17165 '``llvm.experimental.vector.splice``' Intrinsic
17166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17167
17168 Syntax:
17169 """""""
17170 This is an overloaded intrinsic.
17171
17172 ::
17173
17174       declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17175       declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17176
17177 Overview:
17178 """""""""
17179
17180 The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
17181 concatenating elements from the first input vector with elements of the second
17182 input vector, returning a vector of the same type as the input vectors. The
17183 signed immediate, modulo the number of elements in the vector, is the index
17184 into the first vector from which to extract the result value. This means
17185 conceptually that for a positive immediate, a vector is extracted from
17186 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
17187 immediate, it extracts ``-imm`` trailing elements from the first vector, and
17188 the remaining elements from ``%vec2``.
17189
17190 These intrinsics work for both fixed and scalable vectors. While this intrinsic
17191 is marked as experimental, the recommended way to express this operation for
17192 fixed-width vectors is still to use a shufflevector, as that may allow for more
17193 optimization opportunities.
17194
17195 For example:
17196
17197 .. code-block:: text
17198
17199  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
17200  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
17201
17202
17203 Arguments:
17204 """"""""""
17205
17206 The first two operands are vectors with the same type. The third argument
17207 ``imm`` is the start index, modulo VL, where VL is the runtime vector length of
17208 the source/result vector. The ``imm`` is a signed integer constant in the range
17209 ``-VL <= imm < VL``. For values outside of this range the result is poison.
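
For four-element fixed-width vectors, an immediate of ``1`` and an immediate
of ``-3`` select the same lanes, which can also be written as a single
``shufflevector``. The following is a sketch with hypothetical function names;
all three functions return ``<B, C, D, E>`` for inputs ``<A, B, C, D>`` and
``<E, F, G, H>``.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32>, <4 x i32>, i32)

      define <4 x i32> @splice_index(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; Start at index 1 of concat(%vec1, %vec2).
        %res = call <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1)
        ret <4 x i32> %res
      }

      define <4 x i32> @splice_trailing(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; Take the 3 trailing elements of %vec1, then fill from %vec2.
        %res = call <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 -3)
        ret <4 x i32> %res
      }

      define <4 x i32> @splice_shuffle(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; Mask indices 0..3 select from %vec1 and 4..7 select from %vec2.
        %res = shufflevector <4 x i32> %vec1, <4 x i32> %vec2, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
        ret <4 x i32> %res
      }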
17210
17211 '``llvm.experimental.stepvector``' Intrinsic
17212 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17213
17214 This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
17215 to generate a vector whose lane values comprise the linear sequence
17216 <0, 1, 2, ...>. It is primarily intended for scalable vectors.
17217
17218 ::
17219
17220       declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
17221       declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
17222
17223 The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
17224 of integers whose elements contain a linear sequence of values starting from 0
17225 with a step of 1.  This experimental intrinsic can only be used for vectors
17226 with integer elements that are at least 8 bits in size. If the sequence value
17227 exceeds the allowed limit for the element type then the result for that lane is
17228 undefined.
17229
17230 These intrinsics work for both fixed and scalable vectors. While this intrinsic
17231 is marked as experimental, the recommended way to express this operation for
17232 fixed-width vectors is still to generate a constant vector instead.
17233
17234
17235 Arguments:
17236 """"""""""
17237
17238 None.
17239
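One possible use, sketched here with hypothetical function and value names, is
to build a vector of strided offsets for a scalable vector by multiplying the
step vector with a broadcast stride:

.. code-block:: llvm

      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()

      define <vscale x 4 x i32> @strided_offsets(i32 %stride) {
        ; <0, 1, 2, ...>
        %step = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
        ; Broadcast %stride to every lane.
        %ins = insertelement <vscale x 4 x i32> poison, i32 %stride, i64 0
        %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
        ; Lane i holds i * %stride.
        %offsets = mul <vscale x 4 x i32> %step, %splat
        ret <vscale x 4 x i32> %offsets
      }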
17240
17241 Matrix Intrinsics
17242 -----------------
17243
17244 Operations on matrixes requiring shape information (like number of rows/columns
17245 or the memory layout) can be expressed using the matrix intrinsics. These
17246 intrinsics require matrix dimensions to be passed as immediate arguments, and
17247 matrixes are passed and returned as vectors. This means that for a ``R`` x
17248 ``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
17249 corresponding vector, with indices starting at 0. Currently column-major layout
17250 is assumed.  The intrinsics support both integer and floating point matrixes.
17251
17252
17253 '``llvm.matrix.transpose.*``' Intrinsic
17254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17255
17256 Syntax:
17257 """""""
17258 This is an overloaded intrinsic.
17259
17260 ::
17261
17262       declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
17263
17264 Overview:
17265 """""""""
17266
17267 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
17268 <Cols>`` matrix and return the transposed matrix in the result vector.
17269
17270 Arguments:
17271 """"""""""
17272
17273 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17274 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
17275 number of rows and columns, respectively, and must be positive, constant
17276 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
17277 the same float or integer element type as ``%In``.
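
As a worked sketch of the column-major layout, the following transposes a
constant 2 x 3 integer matrix (the function name is hypothetical):

.. code-block:: llvm

      declare <6 x i32> @llvm.matrix.transpose.v6i32(<6 x i32>, i32, i32)

      define <6 x i32> @transpose_2x3() {
        ; The input vector <1, 2, 3, 4, 5, 6> is the 2 x 3 matrix
        ;   | 1 3 5 |
        ;   | 2 4 6 |
        ; so the 3 x 2 result has columns (1, 3, 5) and (2, 4, 6).
        %t = call <6 x i32> @llvm.matrix.transpose.v6i32(<6 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6>, i32 2, i32 3) ; yields <6 x i32>: <1, 3, 5, 2, 4, 6>
        ret <6 x i32> %t
      }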
17278
17279 '``llvm.matrix.multiply.*``' Intrinsic
17280 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17281
17282 Syntax:
17283 """""""
17284 This is an overloaded intrinsic.
17285
17286 ::
17287
17288       declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
17289
17290 Overview:
17291 """""""""
17292
17293 The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
17294 <Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
17296
17297 Arguments:
17298 """"""""""
17299
17300 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
17301 <Inner>`` elements, and the second argument ``%B`` to a matrix with
17302 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
17303 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
17304 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
17305 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
17306 integer element type.
17307
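As a worked sketch, the following multiplies two constant 2 x 2 integer
matrices given in column-major order (the function name is hypothetical):

.. code-block:: llvm

      declare <4 x i32> @llvm.matrix.multiply.v4i32.v4i32.v4i32(<4 x i32>, <4 x i32>, i32, i32, i32)

      define <4 x i32> @multiply_2x2() {
        ; %A = <1, 2, 3, 4> is | 1 3 |    %B = <5, 6, 7, 8> is | 5 7 |
        ;                      | 2 4 |                         | 6 8 |
        ; A * B = | 1*5+3*6  1*7+3*8 | = | 23 31 |
        ;         | 2*5+4*6  2*7+4*8 |   | 34 46 |
        %c = call <4 x i32> @llvm.matrix.multiply.v4i32.v4i32.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>, <4 x i32> <i32 5, i32 6, i32 7, i32 8>, i32 2, i32 2, i32 2) ; yields <4 x i32>: <23, 34, 31, 46>
        ret <4 x i32> %c
      }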
17308
17309 '``llvm.matrix.column.major.load.*``' Intrinsic
17310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17311
17312 Syntax:
17313 """""""
17314 This is an overloaded intrinsic.
17315
17316 ::
17317
17318       declare vectorty @llvm.matrix.column.major.load.*(
17319           ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17320
17321 Overview:
17322 """""""""
17323
17324 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
17325 matrix using a stride of ``%Stride`` to compute the start address of the
17326 different columns.  The offset is computed using ``%Stride``'s bitwidth. This
17327 allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
17328 intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
17329 matrix is returned in the result vector. If the ``%Ptr`` argument is known to
17330 be aligned to some boundary, this can be specified as an attribute on the
17331 argument.
17332
17333 Arguments:
17334 """"""""""
17335
17336 The first argument ``%Ptr`` is a pointer type to the returned vector type, and
17337 corresponds to the start address to load from. The second argument ``%Stride``
17338 is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
17342 ``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
17343 respectively, and must be positive, constant integers. The returned vector must
17344 have ``<Rows> * <Cols>`` elements.
17345
The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
17348
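As a sketch of how ``%Stride`` selects a sub matrix: assuming a 4 x 4
column-major matrix of ``double`` values stored contiguously at ``%base``, the
call below loads its top-left 2 x 2 block, with column ``C`` of the block
starting at ``%base + C * 4``. The function name is hypothetical, and the
``.v4f64`` name suffix and the ``double*`` pointer type are assumptions that
may differ between LLVM versions.

.. code-block:: llvm

      ; Assumed declaration; the exact name mangling and pointer type
      ; depend on the LLVM version in use.
      declare <4 x double> @llvm.matrix.column.major.load.v4f64(double*, i64, i1, i32, i32)

      define <4 x double> @load_top_left_2x2(double* %base) {
        ; A stride of 4 is the number of rows of the enclosing 4 x 4 matrix.
        %m = call <4 x double> @llvm.matrix.column.major.load.v4f64(double* align 8 %base, i64 4, i1 false, i32 2, i32 2)
        ret <4 x double> %m
      }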
17349
17350 '``llvm.matrix.column.major.store.*``' Intrinsic
17351 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17352
17353 Syntax:
17354 """""""
17355
17356 ::
17357
17358       declare void @llvm.matrix.column.major.store.*(
17359           vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17360
17361 Overview:
17362 """""""""
17363
17364 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
17365 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
17366 columns. The offset is computed using ``%Stride``'s bitwidth. If
17367 ``<IsVolatile>`` is true, the intrinsic is considered a
17368 :ref:`volatile memory access <volatile>`.
17369
17370 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
17371 specified as an attribute on the argument.
17372
17373 Arguments:
17374 """"""""""
17375
17376 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17377 <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
17378 pointer to the vector type of ``%In``, and is the start address of the matrix
17379 in memory. The third argument ``%Stride`` is a positive, constant integer with
17380 ``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
17382 with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
17383 value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
17384 and columns, respectively, and must be positive, constant integers.
17385
The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
17388
17389
17390 Half Precision Floating-Point Intrinsics
17391 ----------------------------------------
17392
17393 For most target platforms, half precision floating-point is a
17394 storage-only format. This means that it is a dense encoding (in memory)
17395 but does not support computation in the format.
17396
17397 This means that code must first load the half-precision floating-point
17398 value as an i16, then convert it to float with
17399 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
17400 then be performed on the float value (including extending to double
17401 etc). To store the value back to memory, it is first converted to float
17402 if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
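
The load, convert, compute, convert and store sequence can be sketched as
follows. The global ``@x`` and the function ``@double_x`` are hypothetical,
and the computation (doubling the value) is an arbitrary choice.

.. code-block:: llvm

      @x = global i16 0, align 2

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_x() {
        ; Load the half-precision bits and widen them to float.
        %bits = load i16, i16* @x, align 2
        %val = call float @llvm.convert.from.fp16.f32(i16 %bits)
        ; Compute in single precision.
        %twice = fadd float %val, %val
        ; Narrow back to half precision and store the bits.
        %res = call i16 @llvm.convert.to.fp16.f32(float %twice)
        store i16 %res, i16* @x, align 2
        ret void
      }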
17405
17406 .. _int_convert_to_fp16:
17407
17408 '``llvm.convert.to.fp16``' Intrinsic
17409 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17410
17411 Syntax:
17412 """""""
17413
17414 ::
17415
17416       declare i16 @llvm.convert.to.fp16.f32(float %a)
17417       declare i16 @llvm.convert.to.fp16.f64(double %a)
17418
17419 Overview:
17420 """""""""
17421
17422 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17423 conventional floating-point type to half precision floating-point format.
17424
17425 Arguments:
17426 """"""""""
17427
The intrinsic function takes a single argument: the value to be
converted.
17430
17431 Semantics:
17432 """"""""""
17433
17434 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17435 conventional floating-point format to half precision floating-point format. The
17436 return value is an ``i16`` which contains the converted number.
17437
17438 Examples:
17439 """""""""
17440
17441 .. code-block:: llvm
17442
17443       %res = call i16 @llvm.convert.to.fp16.f32(float %a)
17444       store i16 %res, i16* @x, align 2
17445
17446 .. _int_convert_from_fp16:
17447
17448 '``llvm.convert.from.fp16``' Intrinsic
17449 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17450
17451 Syntax:
17452 """""""
17453
17454 ::
17455
17456       declare float @llvm.convert.from.fp16.f32(i16 %a)
17457       declare double @llvm.convert.from.fp16.f64(i16 %a)
17458
17459 Overview:
17460 """""""""
17461
17462 The '``llvm.convert.from.fp16``' intrinsic function performs a
17463 conversion from half precision floating-point format to single precision
17464 floating-point format.
17465
17466 Arguments:
17467 """"""""""
17468
The intrinsic function takes a single argument: the value to be
converted.
17471
17472 Semantics:
17473 """"""""""
17474
17475 The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
17477 precision floating-point format. The input half-float value is
17478 represented by an ``i16`` value.
17479
17480 Examples:
17481 """""""""
17482
17483 .. code-block:: llvm
17484
17485       %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
17487
17488 Saturating floating-point to integer conversions
17489 ------------------------------------------------
17490
17491 The ``fptoui`` and ``fptosi`` instructions return a
17492 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
17493 representable by the result type. These intrinsics provide an alternative
17494 conversion, which will saturate towards the smallest and largest representable
17495 integer values instead.
17496
17497 '``llvm.fptoui.sat.*``' Intrinsic
17498 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17499
17500 Syntax:
17501 """""""
17502
17503 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
17504 floating-point argument type and any integer result type, or vectors thereof.
17505 Not all targets may support all types, however.
17506
17507 ::
17508
17509       declare i32 @llvm.fptoui.sat.i32.f32(float %f)
17510       declare i19 @llvm.fptoui.sat.i19.f64(double %f)
17511       declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
17512
17513 Overview:
17514 """""""""
17515
17516 This intrinsic converts the argument into an unsigned integer using saturating
17517 semantics.
17518
17519 Arguments:
17520 """"""""""
17521
The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The argument and the
return value must have the same number of vector elements.
17525
17526 Semantics:
17527 """"""""""
17528
17529 The conversion to integer is performed subject to the following rules:
17530
17531 - If the argument is any NaN, zero is returned.
17532 - If the argument is smaller than zero (this includes negative infinity),
17533   zero is returned.
17534 - If the argument is larger than the largest representable unsigned integer of
17535   the result type (this includes positive infinity), the largest representable
17536   unsigned integer is returned.
17537 - Otherwise, the result of rounding the argument towards zero is returned.
17538
17539 Example:
17540 """"""""
17541
17542 .. code-block:: text
17543
17544       %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
17545       %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
17546       %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
17547       %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
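
The rules above can also be read as a clamp followed by an ordinary ``fptoui``.
The following sketch for the ``i8``/``float`` case is one possible expansion,
not necessarily the code any target emits. The function name
``@fptoui_sat_i8_f32`` is illustrative. The sketch assumes that ``llvm.maxnum``
returns the non-NaN operand when exactly one operand is a quiet NaN, and it
relies on both bounds (0.0 and 255.0) being exactly representable as ``float``:

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.minnum.f32(float, float)

      ; @fptoui_sat_i8_f32 is an illustrative name.
      define i8 @fptoui_sat_i8_f32(float %f) {
        ; NaN and every value below zero become 0.0.
        %lo = call float @llvm.maxnum.f32(float %f, float 0.0)
        ; Every value above 255.0, including +infinity, becomes 255.0.
        %clamped = call float @llvm.minnum.f32(float %lo, float 255.0)
        ; The clamped value is in range, so fptoui does not yield poison here.
        %r = fptoui float %clamped to i8
        ret i8 %r
      }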
17548
17549 '``llvm.fptosi.sat.*``' Intrinsic
17550 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17551
17552 Syntax:
17553 """""""
17554
17555 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
17556 floating-point argument type and any integer result type, or vectors thereof.
17557 Not all targets may support all types, however.
17558
17559 ::
17560
17561       declare i32 @llvm.fptosi.sat.i32.f32(float %f)
17562       declare i19 @llvm.fptosi.sat.i19.f64(double %f)
17563       declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
17564
17565 Overview:
17566 """""""""
17567
17568 This intrinsic converts the argument into a signed integer using saturating
17569 semantics.
17570
17571 Arguments:
17572 """"""""""
17573
The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The argument and the
return value must have the same number of vector elements.
17577
17578 Semantics:
17579 """"""""""
17580
17581 The conversion to integer is performed subject to the following rules:
17582
17583 - If the argument is any NaN, zero is returned.
17584 - If the argument is smaller than the smallest representable signed integer of
17585   the result type (this includes negative infinity), the smallest
17586   representable signed integer is returned.
17587 - If the argument is larger than the largest representable signed integer of
17588   the result type (this includes positive infinity), the largest representable
17589   signed integer is returned.
17590 - Otherwise, the result of rounding the argument towards zero is returned.
17591
17592 Example:
17593 """"""""
17594
17595 .. code-block:: text
17596
17597       %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
17598       %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
17599       %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
17600       %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
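
For the signed case, a clamp alone is not enough, because clamping a NaN to the
range would produce the lower bound rather than zero. The following sketch for
the ``i8``/``float`` case therefore adds an explicit NaN check. It is one
possible expansion under the assumption that ``llvm.maxnum`` returns the
non-NaN operand for a quiet NaN input, and the function name
``@fptosi_sat_i8_f32`` is illustrative:

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.minnum.f32(float, float)

      ; @fptosi_sat_i8_f32 is an illustrative name.
      define i8 @fptosi_sat_i8_f32(float %f) {
        ; Clamp to [-128.0, 127.0]; both bounds are exact in float.
        %lo = call float @llvm.maxnum.f32(float %f, float -128.0)
        %clamped = call float @llvm.minnum.f32(float %lo, float 127.0)
        %conv = fptosi float %clamped to i8
        ; A NaN input was clamped to -128.0 above, so map it to zero explicitly.
        %isnan = fcmp uno float %f, %f
        %r = select i1 %isnan, i8 0, i8 %conv
        ret i8 %r
      }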
17601
17602 .. _dbg_intrinsics:
17603
17604 Debugger Intrinsics
17605 -------------------
17606
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
17611
17612 Exception Handling Intrinsics
17613 -----------------------------
17614
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
17618
17619 Pointer Authentication Intrinsics
17620 ---------------------------------
17621
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
<PointerAuth.html#intrinsics>`_ document.
17625
17626 .. _int_trampoline:
17627
17628 Trampoline Intrinsics
17629 ---------------------
17630
17631 These intrinsics make it possible to excise one parameter, marked with
17632 the :ref:`nest <nest>` attribute, from a function. The result is a
17633 callable function pointer lacking the nest parameter - the caller does
17634 not need to provide a value for it. Instead, the value to use is stored
17635 in advance in a "trampoline", a block of memory usually allocated on the
17636 stack, which also contains code to splice the nest value into the
17637 argument list. This is used to implement the GCC nested function address
17638 extension.
17639
17640 For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
17641 then the resulting function pointer has signature ``i32 (i32, i32)*``.
17642 It can be created as follows:
17643
17644 .. code-block:: llvm
17645
17646       %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
17647       %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
17649       %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
17650       %fp = bitcast i8* %p to i32 (i32, i32)*
17651
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
17654
17655 .. _int_it:
17656
17657 '``llvm.init.trampoline``' Intrinsic
17658 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17659
17660 Syntax:
17661 """""""
17662
17663 ::
17664
17665       declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
17666
17667 Overview:
17668 """""""""
17669
17670 This fills the memory pointed to by ``tramp`` with executable code,
17671 turning it into a trampoline.
17672
17673 Arguments:
17674 """"""""""
17675
17676 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
17677 pointers. The ``tramp`` argument must point to a sufficiently large and
17678 sufficiently aligned block of memory; this memory is written to by the
17679 intrinsic. Note that the size and the alignment are target-specific -
17680 LLVM currently provides no portable way of determining them, so a
17681 front-end that generates this intrinsic needs to have some
17682 target-specific knowledge. The ``func`` argument must hold a function
17683 bitcast to an ``i8*``.
17684
17685 Semantics:
17686 """"""""""
17687
17688 The block of memory pointed to by ``tramp`` is filled with target
17689 dependent code, turning it into a function. Then ``tramp`` needs to be
17690 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
17691 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
17692 function's signature is the same as that of ``func`` with any arguments
17693 marked with the ``nest`` attribute removed. At most one such ``nest``
17694 argument is allowed, and it must be of pointer type. Calling the new
17695 function is equivalent to calling ``func`` with the same argument list,
17696 but with ``nval`` used for the missing ``nest`` argument. If, after
17697 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
17698 modified, then the effect of any later call to the returned function
17699 pointer is undefined.
17700
17701 .. _int_at:
17702
17703 '``llvm.adjust.trampoline``' Intrinsic
17704 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17705
17706 Syntax:
17707 """""""
17708
17709 ::
17710
17711       declare i8* @llvm.adjust.trampoline(i8* <tramp>)
17712
17713 Overview:
17714 """""""""
17715
17716 This performs any required machine-specific adjustment to the address of
17717 a trampoline (passed as ``tramp``).
17718
17719 Arguments:
17720 """"""""""
17721
17722 ``tramp`` must point to a block of memory which already has trampoline
17723 code filled in by a previous call to
17724 :ref:`llvm.init.trampoline <int_it>`.
17725
17726 Semantics:
17727 """"""""""
17728
On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.
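
Putting both intrinsics together, a self-contained module might look like the
following sketch. The names ``@f`` and ``@make_and_call`` and the constant
arguments are illustrative assumptions, and the 10-byte, 4-byte-aligned buffer
follows the X86 figures used in the example at the start of this section:

.. code-block:: llvm

      declare void @llvm.init.trampoline(i8*, i8*, i8*)
      declare i8* @llvm.adjust.trampoline(i8*)

      ; The function whose 'nest' parameter is supplied by the trampoline.
      define i32 @f(i8* nest %c, i32 %x, i32 %y) {
        %sum = add i32 %x, %y
        ret i32 %sum
      }

      ; @make_and_call is an illustrative name.
      define i32 @make_and_call(i8* %nval) {
        ; Size and alignment are target-specific; these values assume X86.
        %tramp = alloca [10 x i8], align 4
        %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
        call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
        %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
        %fp = bitcast i8* %p to i32 (i32, i32)*
        ; Equivalent to: call i32 @f(i8* %nval, i32 1, i32 2)
        %val = call i32 %fp(i32 1, i32 2)
        ret i32 %val
      }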
17734
17735
17736 .. _int_vp:
17737
17738 Vector Predication Intrinsics
17739 -----------------------------
17740 VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
17741 operation takes a vector mask and an explicit vector length parameter as in:
17742
17743 ::
17744
17745       <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
17746
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
example ``<32 x i1>``.  The explicit vector length parameter always has the type
``i32`` and is an unsigned integer value.  The explicit vector length parameter
(``%evl``) is in the range:
17751
17752 ::
17753
17754       0 <= %evl <= W,  where W is the number of vector elements
17755
17756 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
17757 length of the vector.
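
For a scalable type such as ``<vscale x 4 x i32>``, ``W`` is ``vscale * 4``. As
a sketch, the full vector length can be computed with the ``llvm.vscale``
intrinsic and passed as the explicit vector length, so that only the mask
disables lanes. The function name ``@add_all_lanes`` is illustrative:

.. code-block:: llvm

      declare i32 @llvm.vscale.i32()
      declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i1>, i32)

      ; @add_all_lanes is an illustrative name.
      define <vscale x 4 x i32> @add_all_lanes(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask) {
        %vscale = call i32 @llvm.vscale.i32()
        ; W = vscale * 4 for <vscale x 4 x i32>.
        %w = shl i32 %vscale, 2
        %r = call <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask, i32 %w)
        ret <vscale x 4 x i32> %r
      }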
17758
The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
length (``%evl``) creates a mask, ``%EVLmask``, with all lanes ``0 <= i < %evl``
set to True, and all other lanes ``%evl <= i < W`` set to False.  A new mask
``M`` is calculated with an element-wise AND from ``%mask`` and ``%EVLmask``:
17763
17764 ::
17765
17766       M = %mask AND %EVLmask
17767
17768 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
17769
17770 ::
17771
       A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                       {  undef                otherwise
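
For a fixed-width vector, the combined mask can be written out in plain IR. The
sketch below does so for ``W = 4``. The function name ``@effective_mask`` is
illustrative, and the code is only meant to spell out the definition, not to
describe how any target lowers VP intrinsics:

.. code-block:: llvm

      ; @effective_mask is an illustrative name.
      define <4 x i1> @effective_mask(<4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> undef, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> undef, <4 x i32> zeroinitializer
        ; %EVLmask: lane i is true if and only if i < %evl (unsigned comparison).
        %evlmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; M = %mask AND %EVLmask
        %m = and <4 x i1> %mask, %evlmask
        ret <4 x i1> %m
      }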
17774
17775 Optimization Hint
17776 ^^^^^^^^^^^^^^^^^
17777
Some targets, such as AVX512, do not support the ``%evl`` parameter in
hardware.  The use of an effective ``%evl`` is discouraged for those targets.
The function ``TargetTransformInfo::hasActiveVectorLength()`` returns true when
the target has native support for ``%evl``.
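
As an illustration, reading "effective" as "actually disabling lanes", the
sketch below passes the full vector length of a ``<4 x i32>`` operation as the
explicit vector length. The mask is then the only source of disabled lanes. The
function name ``@masked_add`` is illustrative:

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.add.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; @masked_add is an illustrative name.
      define <4 x i32> @masked_add(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        ; The vector length operand is 4 = W, so %EVLmask is all-true and M = %mask.
        %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 4)
        ret <4 x i32> %r
      }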
17782
17783 .. _int_vp_select:
17784
17785 '``llvm.vp.select.*``' Intrinsics
17786 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17787
17788 Syntax:
17789 """""""
17790 This is an overloaded intrinsic.
17791
17792 ::
17793
17794       declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
      declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
17796
17797 Overview:
17798 """""""""
17799
17800 The '``llvm.vp.select``' intrinsic is used to choose one value based on a
17801 condition vector, without IR-level branching.
17802
17803 Arguments:
17804 """"""""""
17805
17806 The first operand is a vector of ``i1`` and indicates the condition.  The
17807 second operand is the value that is selected where the condition vector is
17808 true.  The third operand is the value that is selected where the condition
17809 vector is false.  The vectors must be of the same size.  The fourth operand is
17810 the explicit vector length.
17811
The optional ``fast-math flags`` marker indicates that the select has one or
more :ref:`fast-math flags <fastmath>`. These are optimization hints to
enable otherwise unsafe floating-point optimizations. Fast-math flags are
only valid for selects that return a floating-point scalar or vector type,
or an array (nested to any depth) of floating-point scalar or vector types.
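
As a sketch, fast-math flags are written between ``call`` and the return type,
as for any call that returns a floating-point value. The ``v4f32`` overload and
the function name ``@select_nnan`` are chosen for illustration only:

.. code-block:: llvm

      declare <4 x float> @llvm.vp.select.v4f32(<4 x i1>, <4 x float>, <4 x float>, i32)

      ; @select_nnan is an illustrative name.
      define <4 x float> @select_nnan(<4 x i1> %cond, <4 x float> %on_true, <4 x float> %on_false, i32 %evl) {
        ; 'nnan' lets optimizations assume that the operands and the result are not NaN.
        %r = call nnan <4 x float> @llvm.vp.select.v4f32(<4 x i1> %cond, <4 x float> %on_true, <4 x float> %on_false, i32 %evl)
        ret <4 x float> %r
      }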
17817
17818 Semantics:
17819 """"""""""
17820
17821 The intrinsic selects lanes from the second and third operand depending on a
17822 condition vector.
17823
All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true, the lane is
taken from the second operand.  Otherwise, the lane is taken from the third
operand.
17828
17829 Example:
17830 """"""""
17831
17832 .. code-block:: llvm
17833
17834       %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
17835
17836       ;;; Expansion.
17837       ;; Any result is legal on lanes at and above %evl.
17838       %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
17839
17840
17841
17842 .. _int_vp_add:
17843
17844 '``llvm.vp.add.*``' Intrinsics
17845 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17846
17847 Syntax:
17848 """""""
17849 This is an overloaded intrinsic.
17850
17851 ::
17852
17853       declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17854       declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17855       declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17856
17857 Overview:
17858 """""""""
17859
17860 Predicated integer addition of two vectors of integers.
17861
17862
17863 Arguments:
17864 """"""""""
17865
17866 The first two operands and the result have the same vector of integer type. The
17867 third operand is the vector mask and has the same number of elements as the
17868 result vector type. The fourth operand is the explicit vector length of the
17869 operation.
17870
17871 Semantics:
17872 """"""""""
17873
17874 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
17875 of the first and second vector operand on each enabled lane.  The result on
17876 disabled lanes is undefined.
17877
17878 Examples:
17879 """""""""
17880
17881 .. code-block:: llvm
17882
17883       %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17884       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17885
17886       %t = add <4 x i32> %a, %b
17887       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17888
17889 .. _int_vp_sub:
17890
17891 '``llvm.vp.sub.*``' Intrinsics
17892 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17893
17894 Syntax:
17895 """""""
17896 This is an overloaded intrinsic.
17897
17898 ::
17899
17900       declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17901       declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17902       declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17903
17904 Overview:
17905 """""""""
17906
17907 Predicated integer subtraction of two vectors of integers.
17908
17909
17910 Arguments:
17911 """"""""""
17912
17913 The first two operands and the result have the same vector of integer type. The
17914 third operand is the vector mask and has the same number of elements as the
17915 result vector type. The fourth operand is the explicit vector length of the
17916 operation.
17917
17918 Semantics:
17919 """"""""""
17920
17921 The '``llvm.vp.sub``' intrinsic performs integer subtraction
17922 (:ref:`sub <i_sub>`)  of the first and second vector operand on each enabled
17923 lane. The result on disabled lanes is undefined.
17924
17925 Examples:
17926 """""""""
17927
17928 .. code-block:: llvm
17929
17930       %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17931       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17932
17933       %t = sub <4 x i32> %a, %b
17934       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17935
17936
17937
17938 .. _int_vp_mul:
17939
17940 '``llvm.vp.mul.*``' Intrinsics
17941 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17942
17943 Syntax:
17944 """""""
17945 This is an overloaded intrinsic.
17946
17947 ::
17948
17949       declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17951       declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17952
17953 Overview:
17954 """""""""
17955
17956 Predicated integer multiplication of two vectors of integers.
17957
17958
17959 Arguments:
17960 """"""""""
17961
17962 The first two operands and the result have the same vector of integer type. The
17963 third operand is the vector mask and has the same number of elements as the
17964 result vector type. The fourth operand is the explicit vector length of the
17965 operation.
17966
17967 Semantics:
17968 """"""""""
17969 The '``llvm.vp.mul``' intrinsic performs integer multiplication
17970 (:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
17971 lane. The result on disabled lanes is undefined.
17972
17973 Examples:
17974 """""""""
17975
17976 .. code-block:: llvm
17977
17978       %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17979       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17980
17981       %t = mul <4 x i32> %a, %b
17982       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17983
17984
17985 .. _int_vp_sdiv:
17986
17987 '``llvm.vp.sdiv.*``' Intrinsics
17988 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17989
17990 Syntax:
17991 """""""
17992 This is an overloaded intrinsic.
17993
17994 ::
17995
17996       declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17997       declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17998       declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17999
18000 Overview:
18001 """""""""
18002
18003 Predicated, signed division of two vectors of integers.
18004
18005
18006 Arguments:
18007 """"""""""
18008
18009 The first two operands and the result have the same vector of integer type. The
18010 third operand is the vector mask and has the same number of elements as the
18011 result vector type. The fourth operand is the explicit vector length of the
18012 operation.
18013
18014 Semantics:
18015 """"""""""
18016
18017 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
18018 of the first and second vector operand on each enabled lane.  The result on
18019 disabled lanes is undefined.
18020
18021 Examples:
18022 """""""""
18023
18024 .. code-block:: llvm
18025
18026       %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18027       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18028
18029       %t = sdiv <4 x i32> %a, %b
18030       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
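
Note that a plain ``sdiv`` on vectors has undefined behavior if any lane
divides by zero or overflows, including lanes that ``%mask`` disables. When
rewriting the intrinsic into unpredicated IR, one way to avoid this is to
substitute a harmless divisor on disabled lanes first. The following is a
sketch under that assumption. It handles only ``%mask``, not ``%evl``, and the
function name ``@sdiv_expanded`` is illustrative:

.. code-block:: llvm

      ; @sdiv_expanded is an illustrative name.
      define <4 x i32> @sdiv_expanded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        ; Divide by 1 on disabled lanes, so those lanes cannot divide by zero or overflow.
        %safe.b = select <4 x i1> %mask, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
        %t = sdiv <4 x i32> %a, %safe.b
        ; Disabled lanes of the result are undefined.
        %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %also.r
      }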
18031
18032
18033 .. _int_vp_udiv:
18034
18035 '``llvm.vp.udiv.*``' Intrinsics
18036 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18037
18038 Syntax:
18039 """""""
18040 This is an overloaded intrinsic.
18041
18042 ::
18043
18044       declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18045       declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18046       declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18047
18048 Overview:
18049 """""""""
18050
18051 Predicated, unsigned division of two vectors of integers.
18052
18053
18054 Arguments:
18055 """"""""""
18056
The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18058
18059 Semantics:
18060 """"""""""
18061
18062 The '``llvm.vp.udiv``' intrinsic performs unsigned division
18063 (:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
18064 lane. The result on disabled lanes is undefined.
18065
18066 Examples:
18067 """""""""
18068
18069 .. code-block:: llvm
18070
18071       %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18072       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18073
18074       %t = udiv <4 x i32> %a, %b
18075       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18076
18077
18078
18079 .. _int_vp_srem:
18080
18081 '``llvm.vp.srem.*``' Intrinsics
18082 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18083
18084 Syntax:
18085 """""""
18086 This is an overloaded intrinsic.
18087
18088 ::
18089
18090       declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18091       declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18092       declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18093
18094 Overview:
18095 """""""""
18096
Predicated computation of the signed remainder of two integer vectors.
18098
18099
18100 Arguments:
18101 """"""""""
18102
18103 The first two operands and the result have the same vector of integer type. The
18104 third operand is the vector mask and has the same number of elements as the
18105 result vector type. The fourth operand is the explicit vector length of the
18106 operation.
18107
18108 Semantics:
18109 """"""""""
18110
18111 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18112 (:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18113 lane.  The result on disabled lanes is undefined.
18114
18115 Examples:
18116 """""""""
18117
18118 .. code-block:: llvm
18119
18120       %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18121       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18122
18123       %t = srem <4 x i32> %a, %b
18124       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18125
18126
18127
18128 .. _int_vp_urem:
18129
18130 '``llvm.vp.urem.*``' Intrinsics
18131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18132
18133 Syntax:
18134 """""""
18135 This is an overloaded intrinsic.
18136
18137 ::
18138
18139       declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18140       declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18141       declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18142
18143 Overview:
18144 """""""""
18145
18146 Predicated computation of the unsigned remainder of two integer vectors.
18147
18148
18149 Arguments:
18150 """"""""""
18151
18152 The first two operands and the result have the same vector of integer type. The
18153 third operand is the vector mask and has the same number of elements as the
18154 result vector type. The fourth operand is the explicit vector length of the
18155 operation.
18156
18157 Semantics:
18158 """"""""""
18159
18160 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
18161 (:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
18162 lane.  The result on disabled lanes is undefined.
18163
18164 Examples:
18165 """""""""
18166
18167 .. code-block:: llvm
18168
18169       %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18170       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18171
18172       %t = urem <4 x i32> %a, %b
18173       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18174
18175
18176 .. _int_vp_ashr:
18177
18178 '``llvm.vp.ashr.*``' Intrinsics
18179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18180
18181 Syntax:
18182 """""""
18183 This is an overloaded intrinsic.
18184
18185 ::
18186
18187       declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18188       declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18189       declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18190
18191 Overview:
18192 """""""""
18193
18194 Vector-predicated arithmetic right-shift.
18195
18196
18197 Arguments:
18198 """"""""""
18199
18200 The first two operands and the result have the same vector of integer type. The
18201 third operand is the vector mask and has the same number of elements as the
18202 result vector type. The fourth operand is the explicit vector length of the
18203 operation.
18204
18205 Semantics:
18206 """"""""""
18207
18208 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
18209 (:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
18210 enabled lane. The result on disabled lanes is undefined.
18211
18212 Examples:
18213 """""""""
18214
18215 .. code-block:: llvm
18216
18217       %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18218       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18219
18220       %t = ashr <4 x i32> %a, %b
18221       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18222
18223
18224 .. _int_vp_lshr:
18225
18226
18227 '``llvm.vp.lshr.*``' Intrinsics
18228 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18229
18230 Syntax:
18231 """""""
18232 This is an overloaded intrinsic.
18233
18234 ::
18235
18236       declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18237       declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18238       declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18239
18240 Overview:
18241 """""""""
18242
18243 Vector-predicated logical right-shift.
18244
18245
18246 Arguments:
18247 """"""""""
18248
18249 The first two operands and the result have the same vector of integer type. The
18250 third operand is the vector mask and has the same number of elements as the
18251 result vector type. The fourth operand is the explicit vector length of the
18252 operation.
18253
18254 Semantics:
18255 """"""""""
18256
18257 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
18258 (:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
18259 enabled lane. The result on disabled lanes is undefined.
18260
18261 Examples:
18262 """""""""
18263
18264 .. code-block:: llvm
18265
18266       %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18267       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18268
18269       %t = lshr <4 x i32> %a, %b
18270       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18271
18272
18273 .. _int_vp_shl:
18274
18275 '``llvm.vp.shl.*``' Intrinsics
18276 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18277
18278 Syntax:
18279 """""""
18280 This is an overloaded intrinsic.
18281
18282 ::
18283
18284       declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18285       declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18286       declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18287
18288 Overview:
18289 """""""""
18290
18291 Vector-predicated left shift.
18292
18293
18294 Arguments:
18295 """"""""""
18296
18297 The first two operands and the result have the same vector of integer type. The
18298 third operand is the vector mask and has the same number of elements as the
18299 result vector type. The fourth operand is the explicit vector length of the
18300 operation.
18301
18302 Semantics:
18303 """"""""""
18304
18305 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
18306 the first operand by the second operand on each enabled lane.  The result on
18307 disabled lanes is undefined.
18308
18309 Examples:
18310 """""""""
18311
18312 .. code-block:: llvm
18313
18314       %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18315       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18316
18317       %t = shl <4 x i32> %a, %b
18318       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18319
18320
18321 .. _int_vp_or:
18322
18323 '``llvm.vp.or.*``' Intrinsics
18324 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18325
18326 Syntax:
18327 """""""
18328 This is an overloaded intrinsic.
18329
18330 ::
18331
18332       declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18333       declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18334       declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18335
18336 Overview:
18337 """""""""
18338
Vector-predicated bitwise or.
18340
18341
18342 Arguments:
18343 """"""""""
18344
18345 The first two operands and the result have the same vector of integer type. The
18346 third operand is the vector mask and has the same number of elements as the
18347 result vector type. The fourth operand is the explicit vector length of the
18348 operation.
18349
18350 Semantics:
18351 """"""""""
18352
18353 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
18354 first two operands on each enabled lane.  The result on disabled lanes is
18355 undefined.
18356
18357 Examples:
18358 """""""""
18359
18360 .. code-block:: llvm
18361
18362       %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18363       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18364
18365       %t = or <4 x i32> %a, %b
18366       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18367
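The comment in the example above leaves the role of ``%evl`` implicit. As an
illustrative sketch only, the explicit vector length can be folded into the
mask so that the whole result vector is described with ordinary instructions.
The function name ``@vp_or_expanded`` and the value names ``%in.range`` and
``%enabled`` are hypothetical, and ``%evl`` is assumed not to exceed the vector
width of 4.

.. code-block:: llvm

      define <4 x i32> @vp_or_expanded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        ; A lane is enabled only if its index is below %evl and its mask bit is set.
        %in.range = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        %enabled = and <4 x i1> %mask, %in.range
        %t = or <4 x i32> %a, %b
        ; Disabled lanes are left undefined, matching the semantics above.
        %also.r = select <4 x i1> %enabled, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %also.r
      }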
18368
18369 .. _int_vp_and:
18370
18371 '``llvm.vp.and.*``' Intrinsics
18372 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18373
18374 Syntax:
18375 """""""
18376 This is an overloaded intrinsic.
18377
18378 ::
18379
18380       declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18381       declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18382       declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18383
18384 Overview:
18385 """""""""
18386
18387 Vector-predicated bitwise and.
18388
18389
18390 Arguments:
18391 """"""""""
18392
18393 The first two operands and the result have the same vector of integer type. The
18394 third operand is the vector mask and has the same number of elements as the
18395 result vector type. The fourth operand is the explicit vector length of the
18396 operation.
18397
18398 Semantics:
18399 """"""""""
18400
18401 The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
18402 the first two operands on each enabled lane.  The result on disabled lanes is
18403 undefined.
18404
18405 Examples:
18406 """""""""
18407
18408 .. code-block:: llvm
18409
18410       %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18411       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18412
18413       %t = and <4 x i32> %a, %b
18414       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18415
18416
18417 .. _int_vp_xor:
18418
18419 '``llvm.vp.xor.*``' Intrinsics
18420 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18421
18422 Syntax:
18423 """""""
18424 This is an overloaded intrinsic.
18425
18426 ::
18427
18428       declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18429       declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18430       declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18431
18432 Overview:
18433 """""""""
18434
18435 Vector-predicated bitwise xor.
18436
18437
18438 Arguments:
18439 """"""""""
18440
18441 The first two operands and the result have the same vector of integer type. The
18442 third operand is the vector mask and has the same number of elements as the
18443 result vector type. The fourth operand is the explicit vector length of the
18444 operation.
18445
18446 Semantics:
18447 """"""""""
18448
18449 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
18450 the first two operands on each enabled lane.
18451 The result on disabled lanes is undefined.
18452
18453 Examples:
18454 """""""""
18455
18456 .. code-block:: llvm
18457
18458       %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18459       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18460
18461       %t = xor <4 x i32> %a, %b
18462       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18463
18464
18465 .. _int_vp_fadd:
18466
18467 '``llvm.vp.fadd.*``' Intrinsics
18468 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18469
18470 Syntax:
18471 """""""
18472 This is an overloaded intrinsic.
18473
18474 ::
18475
18476       declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18477       declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18478       declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18479
18480 Overview:
18481 """""""""
18482
18483 Predicated floating-point addition of two vectors of floating-point values.
18484
18485
18486 Arguments:
18487 """"""""""
18488
18489 The first two operands and the result have the same vector of floating-point type. The
18490 third operand is the vector mask and has the same number of elements as the
18491 result vector type. The fourth operand is the explicit vector length of the
18492 operation.
18493
18494 Semantics:
18495 """"""""""
18496
18497 The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
18498 of the first and second vector operands on each enabled lane.  The result on
18499 disabled lanes is undefined.  The operation is performed in the default
18500 floating-point environment.
18501
18502 Examples:
18503 """""""""
18504
18505 .. code-block:: llvm
18506
18507       %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18508       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18509
18510       %t = fadd <4 x float> %a, %b
18511       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18512
18513
18514 .. _int_vp_fsub:
18515
18516 '``llvm.vp.fsub.*``' Intrinsics
18517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18518
18519 Syntax:
18520 """""""
18521 This is an overloaded intrinsic.
18522
18523 ::
18524
18525       declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18526       declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18527       declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18528
18529 Overview:
18530 """""""""
18531
18532 Predicated floating-point subtraction of two vectors of floating-point values.
18533
18534
18535 Arguments:
18536 """"""""""
18537
18538 The first two operands and the result have the same vector of floating-point type. The
18539 third operand is the vector mask and has the same number of elements as the
18540 result vector type. The fourth operand is the explicit vector length of the
18541 operation.
18542
18543 Semantics:
18544 """"""""""
18545
18546 The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
18547 of the first and second vector operands on each enabled lane.  The result on
18548 disabled lanes is undefined.  The operation is performed in the default
18549 floating-point environment.
18550
18551 Examples:
18552 """""""""
18553
18554 .. code-block:: llvm
18555
18556       %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18557       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18558
18559       %t = fsub <4 x float> %a, %b
18560       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18561
18562
18563 .. _int_vp_fmul:
18564
18565 '``llvm.vp.fmul.*``' Intrinsics
18566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18567
18568 Syntax:
18569 """""""
18570 This is an overloaded intrinsic.
18571
18572 ::
18573
18574       declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18575       declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18576       declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18577
18578 Overview:
18579 """""""""
18580
18581 Predicated floating-point multiplication of two vectors of floating-point values.
18582
18583
18584 Arguments:
18585 """"""""""
18586
18587 The first two operands and the result have the same vector of floating-point type. The
18588 third operand is the vector mask and has the same number of elements as the
18589 result vector type. The fourth operand is the explicit vector length of the
18590 operation.
18591
18592 Semantics:
18593 """"""""""
18594
18595 The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
18596 of the first and second vector operands on each enabled lane.  The result on
18597 disabled lanes is undefined.  The operation is performed in the default
18598 floating-point environment.
18599
18600 Examples:
18601 """""""""
18602
18603 .. code-block:: llvm
18604
18605       %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18606       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18607
18608       %t = fmul <4 x float> %a, %b
18609       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18610
18611
18612 .. _int_vp_fdiv:
18613
18614 '``llvm.vp.fdiv.*``' Intrinsics
18615 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18616
18617 Syntax:
18618 """""""
18619 This is an overloaded intrinsic.
18620
18621 ::
18622
18623       declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18624       declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18625       declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18626
18627 Overview:
18628 """""""""
18629
18630 Predicated floating-point division of two vectors of floating-point values.
18631
18632
18633 Arguments:
18634 """"""""""
18635
18636 The first two operands and the result have the same vector of floating-point type. The
18637 third operand is the vector mask and has the same number of elements as the
18638 result vector type. The fourth operand is the explicit vector length of the
18639 operation.
18640
18641 Semantics:
18642 """"""""""
18643
18644 The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
18645 of the first and second vector operands on each enabled lane.  The result on
18646 disabled lanes is undefined.  The operation is performed in the default
18647 floating-point environment.
18648
18649 Examples:
18650 """""""""
18651
18652 .. code-block:: llvm
18653
18654       %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18655       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18656
18657       %t = fdiv <4 x float> %a, %b
18658       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18659
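Because the result on disabled lanes is undefined, code that needs a defined
value in every lane has to blend the result with a fallback vector. The sketch
below is one possible way to do so. It assumes that the ``llvm.vp.merge``
intrinsic is available in the LLVM version at hand; the function name
``@fdiv_with_passthru`` and the ``%passthru`` operand are hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fdiv.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.merge.v4f32(<4 x i1>, <4 x float>, <4 x float>, i32)

      define <4 x float> @fdiv_with_passthru(<4 x float> %a, <4 x float> %b, <4 x float> %passthru, <4 x i1> %mask, i32 %evl) {
        %q = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
        ; Take %q where %mask is set and the lane is below %evl, otherwise %passthru.
        %r = call <4 x float> @llvm.vp.merge.v4f32(<4 x i1> %mask, <4 x float> %q, <4 x float> %passthru, i32 %evl)
        ret <4 x float> %r
      }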
18660
18661 .. _int_vp_frem:
18662
18663 '``llvm.vp.frem.*``' Intrinsics
18664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18665
18666 Syntax:
18667 """""""
18668 This is an overloaded intrinsic.
18669
18670 ::
18671
18672       declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18673       declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18674       declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18675
18676 Overview:
18677 """""""""
18678
18679 Predicated floating-point remainder of two vectors of floating-point values.
18680
18681
18682 Arguments:
18683 """"""""""
18684
18685 The first two operands and the result have the same vector of floating-point type. The
18686 third operand is the vector mask and has the same number of elements as the
18687 result vector type. The fourth operand is the explicit vector length of the
18688 operation.
18689
18690 Semantics:
18691 """"""""""
18692
18693 The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
18694 of the first and second vector operands on each enabled lane.  The result on
18695 disabled lanes is undefined.  The operation is performed in the default
18696 floating-point environment.
18697
18698 Examples:
18699 """""""""
18700
18701 .. code-block:: llvm
18702
18703       %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18704       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18705
18706       %t = frem <4 x float> %a, %b
18707       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18708
18709
18710
18711 .. _int_vp_reduce_add:
18712
18713 '``llvm.vp.reduce.add.*``' Intrinsics
18714 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18715
18716 Syntax:
18717 """""""
18718 This is an overloaded intrinsic.
18719
18720 ::
18721
18722       declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18723       declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18724
18725 Overview:
18726 """""""""
18727
18728 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
18729 returning the result as a scalar.
18730
18731 Arguments:
18732 """"""""""
18733
18734 The first operand is the start value of the reduction, which must be a scalar
18735 integer type equal to the result type. The second operand is the vector on
18736 which the reduction is performed and must be a vector of integer values whose
18737 element type is the result/start type. The third operand is the vector mask and
18738 is a vector of boolean values with the same number of elements as the vector
18739 operand. The fourth operand is the explicit vector length of the operation.
18740
18741 Semantics:
18742 """"""""""
18743
18744 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
18745 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
18746 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
18747 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
18748 on the reduction operation). If the vector length is zero, the result is equal
18749 to ``start_value``.
18750
18751 To ignore the start value, pass the neutral value as ``start_value``.
18752
18753 Examples:
18754 """""""""
18755
18756 .. code-block:: llvm
18757
18758       %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18759       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18760       ; are treated as though %mask were false for those lanes.
18761
18762       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
18763       %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
18764       %also.r = add i32 %reduction, %start
18765
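As a worked illustration with made-up constant operands, the following call
shows how the mask and the explicit vector length jointly decide which lanes
contribute to the sum:

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 10, <4 x i32> <i32 1, i32 2, i32 3, i32 4>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
      ; Lane 1 is masked off and lane 3 is not below the vector length of 3,
      ; so only lanes 0 and 2 contribute: %r = 10 + 1 + 3 = 14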
18766
18767 .. _int_vp_reduce_fadd:
18768
18769 '``llvm.vp.reduce.fadd.*``' Intrinsics
18770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18771
18772 Syntax:
18773 """""""
18774 This is an overloaded intrinsic.
18775
18776 ::
18777
18778       declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18779       declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18780
18781 Overview:
18782 """""""""
18783
18784 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
18785 value, returning the result as a scalar.
18786
18787 Arguments:
18788 """"""""""
18789
18790 The first operand is the start value of the reduction, which must be a scalar
18791 floating-point type equal to the result type. The second operand is the vector
18792 on which the reduction is performed and must be a vector of floating-point
18793 values whose element type is the result/start type. The third operand is the
18794 vector mask and is a vector of boolean values with the same number of elements
18795 as the vector operand. The fourth operand is the explicit vector length of the
18796 operation.
18797
18798 Semantics:
18799 """"""""""
18800
18801 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
18802 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
18803 vector operand ``val`` on each enabled lane, adding it to the scalar
18804 ``start_value``. Disabled lanes are treated as containing the neutral value
18805 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
18806 enabled, the resulting value will be equal to ``start_value``.
18807
18808 To ignore the start value, pass the neutral value as ``start_value``.
18809
18810 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
18811 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
18812
18813 Examples:
18814 """""""""
18815
18816 .. code-block:: llvm
18817
18818       %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18819       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18820       ; are treated as though %mask were false for those lanes.
18821
18822       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
18823       %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
18824
18825
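A minimal sketch of a call that ignores the start value, written as a fragment
in the style of the example above (the name ``%sum`` is hypothetical):

.. code-block:: llvm

      ; -0.0 is the neutral value: x + (-0.0) yields x for every x, including +0.0 and -0.0.
      %sum = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)

Using ``+0.0`` instead would not be neutral in the default rounding mode: if
every enabled lane held ``-0.0``, the result would become ``+0.0``.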
18826 .. _int_vp_reduce_mul:
18827
18828 '``llvm.vp.reduce.mul.*``' Intrinsics
18829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18830
18831 Syntax:
18832 """""""
18833 This is an overloaded intrinsic.
18834
18835 ::
18836
18837       declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18838       declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18839
18840 Overview:
18841 """""""""
18842
18843 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
18844 returning the result as a scalar.
18845
18846
18847 Arguments:
18848 """"""""""
18849
18850 The first operand is the start value of the reduction, which must be a scalar
18851 integer type equal to the result type. The second operand is the vector on
18852 which the reduction is performed and must be a vector of integer values whose
18853 element type is the result/start type. The third operand is the vector mask and
18854 is a vector of boolean values with the same number of elements as the vector
18855 operand. The fourth operand is the explicit vector length of the operation.
18856
18857 Semantics:
18858 """"""""""
18859
18860 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
18861 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
18862 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
18863 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
18864 on the reduction operation). If the vector length is zero, the result is the
18865 start value.
18866
18867 To ignore the start value, pass the neutral value as ``start_value``.
18868
18869 Examples:
18870 """""""""
18871
18872 .. code-block:: llvm
18873
18874       %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18875       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18876       ; are treated as though %mask were false for those lanes.
18877
18878       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
18879       %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
18880       %also.r = mul i32 %reduction, %start
18881
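The start value also allows reductions to be chained, for example to
accumulate a running product over several vectors. The following is a sketch
under that assumption; the function name ``@product_of_two_vectors`` and the
value names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.mul.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @product_of_two_vectors(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        ; Start from the neutral value 1 so that only the enabled lanes of %a contribute.
        %p.a = call i32 @llvm.vp.reduce.mul.v4i32(i32 1, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ; Feed the partial product back in as the start value of the next reduction.
        %p.ab = call i32 @llvm.vp.reduce.mul.v4i32(i32 %p.a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
        ret i32 %p.ab
      }
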
18882 .. _int_vp_reduce_fmul:
18883
18884 '``llvm.vp.reduce.fmul.*``' Intrinsics
18885 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18886
18887 Syntax:
18888 """""""
18889 This is an overloaded intrinsic.
18890
18891 ::
18892
18893       declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18894       declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18895
18896 Overview:
18897 """""""""
18898
18899 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
18900 value, returning the result as a scalar.
18901
18902
18903 Arguments:
18904 """"""""""
18905
18906 The first operand is the start value of the reduction, which must be a scalar
18907 floating-point type equal to the result type. The second operand is the vector
18908 on which the reduction is performed and must be a vector of floating-point
18909 values whose element type is the result/start type. The third operand is the
18910 vector mask and is a vector of boolean values with the same number of elements
18911 as the vector operand. The fourth operand is the explicit vector length of the
18912 operation.
18913
18914 Semantics:
18915 """"""""""
18916
18917 The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
18918 reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
18919 vector operand ``val`` on each enabled lane, multiplying it by the scalar
18920 ``start_value``. Disabled lanes are treated as containing the neutral value
18921 ``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
18922 enabled, the resulting value will be equal to the starting value.
18923
18924 To ignore the start value, pass the neutral value as ``start_value``.
18925
18926 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
18927 <int_vector_reduce_fmul>`) for more detail on the semantics.
18928
18929 Examples:
18930 """""""""
18931
18932 .. code-block:: llvm
18933
18934       %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18935       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18936       ; are treated as though %mask were false for those lanes.
18937
18938       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
18939       %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
18940
18941
18942 .. _int_vp_reduce_and:
18943
18944 '``llvm.vp.reduce.and.*``' Intrinsics
18945 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18946
18947 Syntax:
18948 """""""
18949 This is an overloaded intrinsic.
18950
18951 ::
18952
18953       declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18954       declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18955
18956 Overview:
18957 """""""""
18958
18959 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
18960 returning the result as a scalar.
18961
18962
18963 Arguments:
18964 """"""""""
18965
18966 The first operand is the start value of the reduction, which must be a scalar
18967 integer type equal to the result type. The second operand is the vector on
18968 which the reduction is performed and must be a vector of integer values whose
18969 element type is the result/start type. The third operand is the vector mask and
18970 is a vector of boolean values with the same number of elements as the vector
18971 operand. The fourth operand is the explicit vector length of the operation.
18972
18973 Semantics:
18974 """"""""""
18975
18976 The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
18977 (:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
18978 ``val`` on each enabled lane, performing an '``and``' of that with the
18979 scalar ``start_value``. Disabled lanes are treated as containing the neutral
18980 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
18981 operation). If the vector length is zero, the result is the start value.
18982
18983 To ignore the start value, pass the neutral value as ``start_value``.
18984
18985 Examples:
18986 """""""""
18987
18988 .. code-block:: llvm
18989
18990       %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18991       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18992       ; are treated as though %mask were false for those lanes.
18993
18994       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
18995       %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
18996       %also.r = and i32 %reduction, %start
18997
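One possible use is an "all enabled lanes satisfy a condition" test on a vector
of ``i1`` values. The sketch below assumes that the ``i1`` overload
``@llvm.vp.reduce.and.v4i1`` is acceptable for the target in question; the
function name ``@all_enabled_lanes_nonzero`` is hypothetical.

.. code-block:: llvm

      declare i1 @llvm.vp.reduce.and.v4i1(i1, <4 x i1>, <4 x i1>, i32)

      define i1 @all_enabled_lanes_nonzero(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        %nonzero = icmp ne <4 x i32> %a, zeroinitializer
        ; For i1, the neutral value of 'and' (all bits set) is true.
        %all = call i1 @llvm.vp.reduce.and.v4i1(i1 true, <4 x i1> %nonzero, <4 x i1> %mask, i32 %evl)
        ret i1 %all
      }

If no lane is enabled, the result is the start value ``true``.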
18998
18999 .. _int_vp_reduce_or:
19000
19001 '``llvm.vp.reduce.or.*``' Intrinsics
19002 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19003
19004 Syntax:
19005 """""""
19006 This is an overloaded intrinsic.
19007
19008 ::
19009
19010       declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19011       declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19012
19013 Overview:
19014 """""""""
19015
19016 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
19017 returning the result as a scalar.
19018
19019
19020 Arguments:
19021 """"""""""
19022
19023 The first operand is the start value of the reduction, which must be a scalar
19024 integer type equal to the result type. The second operand is the vector on
19025 which the reduction is performed and must be a vector of integer values whose
19026 element type is the result/start type. The third operand is the vector mask and
19027 is a vector of boolean values with the same number of elements as the vector
19028 operand. The fourth operand is the explicit vector length of the operation.
19029
19030 Semantics:
19031 """"""""""
19032
19033 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
19034 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
19035 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
19036 ``start_value``. Disabled lanes are treated as containing the neutral value
19037 ``0`` (i.e. having no effect on the reduction operation). If the vector length
19038 is zero, the result is the start value.
19039
19040 To ignore the start value, pass the neutral value as ``start_value``.
19041
19042 Examples:
19043 """""""""
19044
19045 .. code-block:: llvm
19046
19047       %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19048       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19049       ; are treated as though %mask were false for those lanes.
19050
19051       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19052       %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
19053       %also.r = or i32 %reduction, %start
19054
19055 .. _int_vp_reduce_xor:
19056
19057 '``llvm.vp.reduce.xor.*``' Intrinsics
19058 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19059
19060 Syntax:
19061 """""""
19062 This is an overloaded intrinsic.
19063
19064 ::
19065
19066       declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19067       declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19068
19069 Overview:
19070 """""""""
19071
19072 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
19073 returning the result as a scalar.
19074
19075
19076 Arguments:
19077 """"""""""
19078
19079 The first operand is the start value of the reduction, which must be a scalar
19080 integer type equal to the result type. The second operand is the vector on
19081 which the reduction is performed and must be a vector of integer values whose
19082 element type is the result/start type. The third operand is the vector mask and
19083 is a vector of boolean values with the same number of elements as the vector
19084 operand. The fourth operand is the explicit vector length of the operation.
19085
19086 Semantics:
19087 """"""""""
19088
19089 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
19090 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
19091 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
19092 ``start_value``. Disabled lanes are treated as containing the neutral value
19093 ``0`` (i.e. having no effect on the reduction operation). If the vector length
19094 is zero, the result is the start value.
19095
19096 To ignore the start value, pass the neutral value as ``start_value``.
19097
19098 Examples:
19099 """""""""
19100
19101 .. code-block:: llvm
19102
19103       %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19104       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19105       ; are treated as though %mask were false for those lanes.
19106
19107       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19108       %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
19109       %also.r = xor i32 %reduction, %start
19110
19111
19112 .. _int_vp_reduce_smax:
19113
19114 '``llvm.vp.reduce.smax.*``' Intrinsics
19115 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19116
19117 Syntax:
19118 """""""
19119 This is an overloaded intrinsic.
19120
19121 ::
19122
19123       declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19124       declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19125
19126 Overview:
19127 """""""""
19128
19129 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
19130 value, returning the result as a scalar.
19131
19132
19133 Arguments:
19134 """"""""""
19135
19136 The first operand is the start value of the reduction, which must be a scalar
19137 integer type equal to the result type. The second operand is the vector on
19138 which the reduction is performed and must be a vector of integer values whose
19139 element type is the result/start type. The third operand is the vector mask and
19140 is a vector of boolean values with the same number of elements as the vector
19141 operand. The fourth operand is the explicit vector length of the operation.
19142
19143 Semantics:
19144 """"""""""
19145
19146 The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
19147 reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
19148 vector operand ``val`` on each enabled lane, taking the maximum of that and
19149 the scalar ``start_value``. Disabled lanes are treated as containing the
19150 neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
19151 If the vector length is zero, the result is the start value.
19152
19153 To ignore the start value, pass the neutral value as ``start_value``.
19154
19155 Examples:
19156 """""""""
19157
19158 .. code-block:: llvm
19159
19160       %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19161       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19162       ; are treated as though %mask were false for those lanes.
19163
19164       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
19165       %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
19166       %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
19167
19168
19169 .. _int_vp_reduce_smin:
19170
19171 '``llvm.vp.reduce.smin.*``' Intrinsics
19172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19173
19174 Syntax:
19175 """""""
19176 This is an overloaded intrinsic.
19177
19178 ::
19179
19180       declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19181       declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19182
19183 Overview:
19184 """""""""
19185
19186 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
19187 value, returning the result as a scalar.
19188
19189
19190 Arguments:
19191 """"""""""
19192
19193 The first operand is the start value of the reduction, which must be a scalar
19194 integer type equal to the result type. The second operand is the vector on
19195 which the reduction is performed and must be a vector of integer values whose
19196 element type is the result/start type. The third operand is the vector mask and
19197 is a vector of boolean values with the same number of elements as the vector
19198 operand. The fourth operand is the explicit vector length of the operation.
19199
19200 Semantics:
19201 """"""""""
19202
19203 The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
19204 reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
19207 neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
19208 If the vector length is zero, the result is the start value.
19209
19210 To ignore the start value, the neutral value can be used.
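
The zero-length case can be sketched as follows. With an explicit vector
length of ``0`` every lane is disabled, so the call returns ``%start``
whatever ``%a`` and ``%mask`` contain. The function name
``@smin_zero_length`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.smin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define i32 @smin_zero_length(i32 %start, <4 x i32> %a, <4 x i1> %mask) {
        ; An explicit vector length of 0 disables every lane, so %r is %start.
        %r = call i32 @llvm.vp.reduce.smin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 0)
        ret i32 %r
      }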
19211
19212 Examples:
19213 """""""""
19214
19215 .. code-block:: llvm
19216
19217       %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19218       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19219       ; are treated as though %mask were false for those lanes.
19220
19221       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
19222       %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
19223       %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
19224
19225
19226 .. _int_vp_reduce_umax:
19227
19228 '``llvm.vp.reduce.umax.*``' Intrinsics
19229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19230
19231 Syntax:
19232 """""""
19233 This is an overloaded intrinsic.
19234
19235 ::
19236
19237       declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19238       declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19239
19240 Overview:
19241 """""""""
19242
19243 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
19244 value, returning the result as a scalar.
19245
19246
19247 Arguments:
19248 """"""""""
19249
19250 The first operand is the start value of the reduction, which must be a scalar
19251 integer type equal to the result type. The second operand is the vector on
19252 which the reduction is performed and must be a vector of integer values whose
19253 element type is the result/start type. The third operand is the vector mask and
19254 is a vector of boolean values with the same number of elements as the vector
19255 operand. The fourth operand is the explicit vector length of the operation.
19256
19257 Semantics:
19258 """"""""""
19259
19260 The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
19261 reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
the scalar ``start_value``. Disabled lanes are treated as containing the
19264 neutral value ``0`` (i.e. having no effect on the reduction operation). If the
19265 vector length is zero, the result is the start value.
19266
19267 To ignore the start value, the neutral value can be used.
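
The effect of the mask can be sketched with a constant mask operand. In this
illustrative function (the name ``@umax_even_lanes`` is hypothetical), lanes 1
and 3 are disabled and therefore contribute the neutral value ``0``.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define i32 @umax_even_lanes(i32 %start, <4 x i32> %a) {
        ; Lanes 1 and 3 are masked off, so %r is the unsigned maximum of
        ; %start, element 0 of %a and element 2 of %a.
        %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> <i1 true, i1 false, i1 true, i1 false>, i32 4)
        ret i32 %r
      }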
19268
19269 Examples:
19270 """""""""
19271
19272 .. code-block:: llvm
19273
19274       %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19275       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19276       ; are treated as though %mask were false for those lanes.
19277
19278       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19279       %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
19280       %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
19281
19282
19283 .. _int_vp_reduce_umin:
19284
19285 '``llvm.vp.reduce.umin.*``' Intrinsics
19286 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19287
19288 Syntax:
19289 """""""
19290 This is an overloaded intrinsic.
19291
19292 ::
19293
19294       declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19295       declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19296
19297 Overview:
19298 """""""""
19299
19300 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
19301 value, returning the result as a scalar.
19302
19303
19304 Arguments:
19305 """"""""""
19306
19307 The first operand is the start value of the reduction, which must be a scalar
19308 integer type equal to the result type. The second operand is the vector on
19309 which the reduction is performed and must be a vector of integer values whose
19310 element type is the result/start type. The third operand is the vector mask and
19311 is a vector of boolean values with the same number of elements as the vector
19312 operand. The fourth operand is the explicit vector length of the operation.
19313
19314 Semantics:
19315 """"""""""
19316
19317 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
19318 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
19319 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19320 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19321 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19322 operation). If the vector length is zero, the result is the start value.
19323
19324 To ignore the start value, the neutral value can be used.
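
A minimal sketch of ignoring the start value on a scalable vector: the
constant ``i16 -1`` has all bits set and is therefore ``UINT_MAX`` for
``i16``. The function name ``@umin_of_enabled_lanes`` is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.vp.reduce.umin.nxv8i16(i16, <vscale x 8 x i16>, <vscale x 8 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define i16 @umin_of_enabled_lanes(<vscale x 8 x i16> %a, <vscale x 8 x i1> %mask, i32 %evl) {
        ; i16 -1 is UINT_MAX for i16, the neutral value of unsigned MIN.
        %r = call i16 @llvm.vp.reduce.umin.nxv8i16(i16 -1, <vscale x 8 x i16> %a, <vscale x 8 x i1> %mask, i32 %evl)
        ret i16 %r
      }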
19325
19326 Examples:
19327 """""""""
19328
19329 .. code-block:: llvm
19330
19331       %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19332       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19333       ; are treated as though %mask were false for those lanes.
19334
19335       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19336       %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
19337       %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
19338
19339
19340 .. _int_vp_reduce_fmax:
19341
19342 '``llvm.vp.reduce.fmax.*``' Intrinsics
19343 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19344
19345 Syntax:
19346 """""""
19347 This is an overloaded intrinsic.
19348
19349 ::
19350
      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19352       declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19353
19354 Overview:
19355 """""""""
19356
19357 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
19358 value, returning the result as a scalar.
19359
19360
19361 Arguments:
19362 """"""""""
19363
19364 The first operand is the start value of the reduction, which must be a scalar
19365 floating-point type equal to the result type. The second operand is the vector
19366 on which the reduction is performed and must be a vector of floating-point
19367 values whose element type is the result/start type. The third operand is the
19368 vector mask and is a vector of boolean values with the same number of elements
19369 as the vector operand. The fourth operand is the explicit vector length of the
19370 operation.
19371
19372 Semantics:
19373 """"""""""
19374
19375 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
19376 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
19377 vector operand ``val`` on each enabled lane, taking the maximum of that and the
19378 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19379 value (i.e. having no effect on the reduction operation). If the vector length
19380 is zero, the result is the start value.
19381
19382 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19383 flags are set, the neutral value is ``-QNAN``. If ``nnan``  and ``ninf`` are
19384 both set, then the neutral value is the smallest floating-point value for the
19385 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
19386
19387 This instruction has the same comparison semantics as the
19388 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
19389 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
19390 unless all elements of the vector and the starting value are ``NaN``. For a
19391 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19392 ``-0.0`` elements, the sign of the result is unspecified.
19393
19394 To ignore the start value, the neutral value can be used.
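
Because the neutral value depends on the fast-math flags, the start value used
to ignore it must match the flags on the call. The sketch below assumes that
only ``nnan`` is set, so the neutral value is ``-Infinity``. The function name
``@fmax_of_enabled_lanes`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define float @fmax_of_enabled_lanes(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; 0xFFF0000000000000 is -Infinity, the neutral value when only nnan is set.
        %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float 0xFFF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }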
19395
19396 Examples:
19397 """""""""
19398
19399 .. code-block:: llvm
19400
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19402       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19403       ; are treated as though %mask were false for those lanes.
19404
19405       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19406       %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
19407       %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
19408
19409
19410 .. _int_vp_reduce_fmin:
19411
19412 '``llvm.vp.reduce.fmin.*``' Intrinsics
19413 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19414
19415 Syntax:
19416 """""""
19417 This is an overloaded intrinsic.
19418
19419 ::
19420
      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19422       declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19423
19424 Overview:
19425 """""""""
19426
19427 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
19428 value, returning the result as a scalar.
19429
19430
19431 Arguments:
19432 """"""""""
19433
19434 The first operand is the start value of the reduction, which must be a scalar
19435 floating-point type equal to the result type. The second operand is the vector
19436 on which the reduction is performed and must be a vector of floating-point
19437 values whose element type is the result/start type. The third operand is the
19438 vector mask and is a vector of boolean values with the same number of elements
19439 as the vector operand. The fourth operand is the explicit vector length of the
19440 operation.
19441
19442 Semantics:
19443 """"""""""
19444
19445 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
19446 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
19447 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19448 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19449 value (i.e. having no effect on the reduction operation). If the vector length
19450 is zero, the result is the start value.
19451
19452 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19453 flags are set, the neutral value is ``+QNAN``. If ``nnan``  and ``ninf`` are
19454 both set, then the neutral value is the largest floating-point value for the
19455 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
19456
19457 This instruction has the same comparison semantics as the
19458 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
19459 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
19460 unless all elements of the vector and the starting value are ``NaN``. For a
19461 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19462 ``-0.0`` elements, the sign of the result is unspecified.
19463
19464 To ignore the start value, the neutral value can be used.
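
A minimal sketch of ignoring the start value when no fast-math flags are set:
the start value is then a positive quiet NaN, written here as the hexadecimal
constant ``0x7FF8000000000000``. The function name ``@fmin_of_enabled_lanes``
is hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define float @fmin_of_enabled_lanes(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; 0x7FF8000000000000 is +QNAN, the neutral value when no flags are set.
        %r = call float @llvm.vp.reduce.fmin.v4f32(float 0x7FF8000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }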
19465
19466 Examples:
19467 """""""""
19468
19469 .. code-block:: llvm
19470
19471       %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19472       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19473       ; are treated as though %mask were false for those lanes.
19474
19475       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19476       %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
19477       %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
19478
19479
19480 .. _int_get_active_lane_mask:
19481
19482 '``llvm.get.active.lane.mask.*``' Intrinsics
19483 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19484
19485 Syntax:
19486 """""""
19487 This is an overloaded intrinsic.
19488
19489 ::
19490
19491       declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
19492       declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
19493       declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
19494       declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
19495
19496
19497 Overview:
19498 """""""""
19499
19500 Create a mask representing active and inactive vector lanes.
19501
19502
19503 Arguments:
19504 """"""""""
19505
19506 Both operands have the same scalar integer type. The result is a vector with
19507 the i1 element type.
19508
19509 Semantics:
19510 """"""""""
19511
19512 The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
19513 to:
19514
19515 ::
19516
19517       %m[i] = icmp ult (%base + i), %n
19518
where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, ``%base`` and ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare, and ``ult``
is the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` or in its comparison against ``%n``, because both are performed
on mathematical integers rather than on fixed-width machine integers. If ``%n``
is ``0``, then the result is a poison value. The above is equivalent to:
19526
19527 ::
19528
19529       %m = @llvm.get.active.lane.mask(%base, %n)
19530
19531 This can, for example, be emitted by the loop vectorizer in which case
19532 ``%base`` is the first element of the vector induction variable (VIV) and
19533 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
19534 less than comparison of VIV with the loop tripcount, producing a mask of
19535 true/false values representing active/inactive vector lanes, except if the VIV
19536 overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector would need in order to enumerate its
lanes without overflow.
19540
19541 This mask ``%m`` can e.g. be used in masked load/store instructions. These
19542 intrinsics provide a hint to the backend. I.e., for a vector loop, the
19543 back-edge taken count of the original scalar loop is explicit as the second
19544 argument.
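
The comparison can be worked through on constants. In this sketch (the
function name ``@tail_mask`` is hypothetical), a four-lane mask is requested
for ``%base = 6`` and ``%n = 8``. Lanes 0 and 1 correspond to indices 6 and 7,
which are below 8; lanes 2 and 3 correspond to indices 8 and 9, which are not.

.. code-block:: llvm

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64, i64)

      ; Hypothetical helper, not part of LLVM.
      define <4 x i1> @tail_mask() {
        ; icmp ult (6 + i), 8 for i = 0..3 gives
        ; <i1 true, i1 true, i1 false, i1 false>.
        %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 6, i64 8)
        ret <4 x i1> %m
      }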
19545
19546
19547 Examples:
19548 """""""""
19549
19550 .. code-block:: llvm
19551
19552       %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
19553       %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
19554
19555
19556 .. _int_experimental_vp_splice:
19557
19558 '``llvm.experimental.vp.splice``' Intrinsic
19559 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19560
19561 Syntax:
19562 """""""
19563 This is an overloaded intrinsic.
19564
19565 ::
19566
19567       declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
19568       declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
19569
19570 Overview:
19571 """""""""
19572
19573 The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
19574 predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
19575
19576 Arguments:
19577 """"""""""
19578
19579 The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
19580 the same type.  The third argument ``imm`` is an immediate signed integer that
19581 indicates the offset index.  The fourth argument ``mask`` is a vector mask and
19582 has the same number of elements as the result.  The last two arguments ``evl1``
19583 and ``evl2`` are unsigned integers indicating the explicit vector lengths of
19584 ``vec1`` and ``vec2`` respectively.  ``imm``, ``evl1`` and ``evl2`` should
19585 respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
19586 and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
19587 constraints are not satisfied the intrinsic has undefined behaviour.
19588
19589 Semantics:
19590 """"""""""
19591
19592 Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
19593 ``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
19594 window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
19595 the concatenated vector. Elements in the result vector beyond ``evl2`` are
19596 ``undef``.  If ``imm`` is negative the starting index is ``evl1 + imm``.  The result
19597 vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
19598 negative ``imm``) elements from indices ``[imm..evl1 - 1]``
19599 (``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
19600 first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
19601 ``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
19602 elements are considered and the remaining are ``undef``.  The lanes in the result
19603 vector disabled by ``mask`` are ``undef``.
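
A minimal sketch of a complete call, including the ``mask`` operand, assuming
a ``v4i32`` overload and an all-true mask. With ``imm = 1``, ``evl1 = 2`` and
``evl2 = 3``, the concatenation holds two elements of ``%vec1`` followed by
three elements of ``%vec2``, and the window of size 3 starts at index 1. The
function name ``@splice_example`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      ; Hypothetical helper, not part of LLVM.
      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; Result lanes 0..2 are element 1 of %vec1, then elements 0 and 1 of %vec2.
        ; Lane 3 lies beyond evl2 and is undef.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ret <4 x i32> %r
      }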
19604
19605 Examples:
19606 """""""""
19607
19608 .. code-block:: text
19609
19610  llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3)  ==> <B, E, F, undef> ; index
19611  llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, undef, undef> ; trailing elements
19612
19613
19614 .. _int_vp_load:
19615
19616 '``llvm.vp.load``' Intrinsic
19617 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19618
19619 Syntax:
19620 """""""
19621 This is an overloaded intrinsic.
19622
19623 ::
19624
19625     declare <4 x float> @llvm.vp.load.v4f32.p0v4f32(<4 x float>* %ptr, <4 x i1> %mask, i32 %evl)
19626     declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0nxv2i16(<vscale x 2 x i16>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
19627     declare <8 x float> @llvm.vp.load.v8f32.p1v8f32(<8 x float> addrspace(1)* %ptr, <8 x i1> %mask, i32 %evl)
19628     declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6nxv1i64(<vscale x 1 x i64> addrspace(6)* %ptr, <vscale x 1 x i1> %mask, i32 %evl)
19629
19630 Overview:
19631 """""""""
19632
19633 The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
19634 the :ref:`llvm.masked.load <int_mload>` intrinsic.
19635
19636 Arguments:
19637 """"""""""
19638
19639 The first operand is the base pointer for the load. The second operand is a
19640 vector of boolean values with the same number of elements as the return type.
19641 The third is the explicit vector length of the operation. The return type and
19642 underlying type of the base pointer are the same vector types.
19643
19644 Semantics:
19645 """"""""""
19646
19647 The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
19648 the '``llvm.masked.load``' intrinsic, where the mask is taken from the
19649 combination of the '``mask``' and '``evl``' operands in the usual VP way. Of
19650 the '``llvm.masked.load``' operands not set by '``llvm.vp.load``': the
19651 '``passthru``' operand is implicitly ``undef``; the '``alignment``' operand is
19652 taken as the ABI alignment of the return type as specified by the
19653 :ref:`datalayout string<langref_datalayout>`.
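
How the '``mask``' and '``evl``' operands combine can be sketched with
constants. The example uses the same typed-pointer syntax as the declarations
above; the function name ``@load_first_three`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>*, <4 x i1>, i32)

      ; Hypothetical helper, not part of LLVM.
      define <4 x i32> @load_first_three(<4 x i32>* %ptr) {
        ; The mask enables all four lanes, but an explicit vector length of 3
        ; disables lane 3. Only lanes 0, 1 and 2 are read from memory, and
        ; lane 3 of %r is undef.
        %r = call <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>* %ptr, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ret <4 x i32> %r
      }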
19654
19655 Examples:
19656 """""""""
19657
19658 .. code-block:: text
19659
19660      %r = call <8 x i8> @llvm.vp.load.v8i8.p0v8i8(<8 x i8>* %ptr, <8 x i1> %mask, i32 %evl)
19661      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19662      ;; Note that since the alignment is ultimately up to the data layout
19663      ;; string, 8 (the default) is used as an example.
19664
19665      %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0v8i8(<8 x i8>* %ptr, i32 8, <8 x i1> %mask, <8 x i8> undef)
19666
19667
19668 .. _int_vp_store:
19669
19670 '``llvm.vp.store``' Intrinsic
19671 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19672
19673 Syntax:
19674 """""""
19675 This is an overloaded intrinsic.
19676
19677 ::
19678
19679     declare void @llvm.vp.store.v4f32.p0v4f32(<4 x float> %val, <4 x float>* %ptr, <4 x i1> %mask, i32 %evl)
19680     declare void @llvm.vp.store.nxv2i16.p0nxv2i16(<vscale x 2 x i16> %val, <vscale x 2 x i16>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
19681     declare void @llvm.vp.store.v8f32.p1v8f32(<8 x float> %val, <8 x float> addrspace(1)* %ptr, <8 x i1> %mask, i32 %evl)
19682     declare void @llvm.vp.store.nxv1i64.p6nxv1i64(<vscale x 1 x i64> %val, <vscale x 1 x i64> addrspace(6)* %ptr, <vscale x 1 x i1> %mask, i32 %evl)
19683
19684 Overview:
19685 """""""""
19686
19687 The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
19688 the :ref:`llvm.masked.store <int_mstore>` intrinsic.
19689
19690 Arguments:
19691 """"""""""
19692
19693 The first operand is the vector value to be written to memory. The second
19694 operand is the base pointer for the store. It has the same underlying type as
the value operand. The third operand is a vector of boolean values with the
same number of elements as the value operand. The fourth is the explicit vector
length of the operation.
19698
19699 Semantics:
19700 """"""""""
19701
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
19703 the '``llvm.masked.store``' intrinsic, where the mask is taken from the
19704 combination of the '``mask``' and '``evl``' operands in the usual VP way. The
19705 '``alignment``' operand of the '``llvm.masked.store``' intrinsic is not set by
19706 '``llvm.vp.store``': it is taken as the ABI alignment of the type of the
19707 '``value``' operand as specified by the :ref:`datalayout
19708 string<langref_datalayout>`.
19709
19710 Examples:
19711 """""""""
19712
19713 .. code-block:: text
19714
19715      call void @llvm.vp.store.v8i8.p0v8i8(<8 x i8> %val, <8 x i8>* %ptr, <8 x i1> %mask, i32 %evl)
19716      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
19717      ;; Note that since the alignment is ultimately up to the data layout
19718      ;; string, 8 (the default) is used as an example.
19719
19720      call void @llvm.masked.store.v8i8.p0v8i8(<8 x i8> %val, <8 x i8>* %ptr, i32 8, <8 x i1> %mask)
19721
19722
19723 .. _int_vp_gather:
19724
19725 '``llvm.vp.gather``' Intrinsic
19726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19727
19728 Syntax:
19729 """""""
19730 This is an overloaded intrinsic.
19731
19732 ::
19733
19734     declare <4 x double> @llvm.vp.gather.v4f64.v4p0f64(<4 x double*> %ptrs, <4 x i1> %mask, i32 %evl)
19735     declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0i8(<vscale x 2 x i8*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
19736     declare <2 x float> @llvm.vp.gather.v2f32.v2p2f32(<2 x float addrspace(2)*> %ptrs, <2 x i1> %mask, i32 %evl)
19737     declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4i32(<vscale x 4 x i32 addrspace(4)*> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
19738
19739 Overview:
19740 """""""""
19741
19742 The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
19743 the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
19744
19745 Arguments:
19746 """"""""""
19747
19748 The first operand is a vector of pointers which holds all memory addresses to
19749 read. The second operand is a vector of boolean values with the same number of
19750 elements as the return type. The third is the explicit vector length of the
19751 operation. The return type and underlying type of the vector of pointers are
19752 the same vector types.
19753
19754 Semantics:
19755 """"""""""
19756
19757 The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
19758 the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
19759 from the combination of the '``mask``' and '``evl``' operands in the usual VP
19760 way. Of the '``llvm.masked.gather``' operands not set by '``llvm.vp.gather``':
19761 the '``passthru``' operand is implicitly ``undef``; the '``alignment``' operand
19762 is taken as the ABI alignment of the source addresses as specified by the
19763 :ref:`datalayout string<langref_datalayout>`.
19764
19765 Examples:
19766 """""""""
19767
19768 .. code-block:: text
19769
     %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0i8(<8 x i8*> %ptrs, <8 x i1> %mask, i32 %evl)
     ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
     ;; Note that since the alignment is ultimately up to the data layout
     ;; string, 8 is used as an example.

     %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0i8(<8 x i8*> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> undef)
19776
19777
19778 .. _int_vp_scatter:
19779
19780 '``llvm.vp.scatter``' Intrinsic
19781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19782
19783 Syntax:
19784 """""""
19785 This is an overloaded intrinsic.
19786
19787 ::
19788
19789     declare void @llvm.vp.scatter.v4f64.v4p0f64(<4 x double> %val, <4 x double*> %ptrs, <4 x i1> %mask, i32 %evl)
19790     declare void @llvm.vp.scatter.nxv2i8.nxv2p0i8(<vscale x 2 x i8> %val, <vscale x 2 x i8*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
19791     declare void @llvm.vp.scatter.v2f32.v2p2f32(<2 x float> %val, <2 x float addrspace(2)*> %ptrs, <2 x i1> %mask, i32 %evl)
19792     declare void @llvm.vp.scatter.nxv4i32.nxv4p4i32(<vscale x 4 x i32> %val, <vscale x 4 x i32 addrspace(4)*> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
19793
19794 Overview:
19795 """""""""
19796
19797 The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
19798 the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
19799
19800 Arguments:
19801 """"""""""
19802
19803 The first operand is a vector value to be written to memory. The second operand
19804 is a vector of pointers, pointing to where the value elements should be stored.
The third operand is a vector of boolean values with the same number of
elements as the value operand. The fourth is the explicit vector length of the
operation.
19808
19809 Semantics:
19810 """"""""""
19811
19812 The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
19813 the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
19814 taken from the combination of the '``mask``' and '``evl``' operands in the
19815 usual VP way. The '``alignment``' operand of the '``llvm.masked.scatter``'
19816 intrinsic is not set by '``llvm.vp.scatter``': it is taken as the ABI alignment
19817 of the destination addresses as specified by the :ref:`datalayout
19818 string<langref_datalayout>`.
19819
19820 Examples:
19821 """""""""
19822
19823 .. code-block:: text
19824
19825      call void @llvm.vp.scatter.v8i8.v8p0i8(<8 x i8> %val, <8 x i8*> %ptrs, <8 x i1> %mask, i32 %evl)
19826      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
19827      ;; Note that since the alignment is ultimately up to the data layout
19828      ;; string, 8 is used as an example.
19829
19830      call void @llvm.masked.scatter.v8i8.v8p0i8(<8 x i8> %val, <8 x i8*> %ptrs, i32 8, <8 x i1> %mask)
19831
19832
19833 .. _int_mload_mstore:
19834
19835 Masked Vector Load and Store Intrinsics
19836 ---------------------------------------
19837
19838 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
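
The all-lanes-on case can be sketched as follows, using the same typed-pointer
syntax as the declarations in this section. The function name
``@all_lanes_on`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>*, i32, <4 x i1>, <4 x i32>)

      ; Hypothetical helper, not part of LLVM.
      define <4 x i32> @all_lanes_on(<4 x i32>* %ptr) {
        ; With an all-true mask every lane is read, which matches
        ; "load <4 x i32>, <4 x i32>* %ptr, align 4". The passthru operand is
        ; never selected, so undef is passed here.
        %v = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %ptr, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i32> undef)
        ret <4 x i32> %v
      }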
19839
19840 .. _int_mload:
19841
19842 '``llvm.masked.load.*``' Intrinsics
19843 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19844
19845 Syntax:
19846 """""""
19847 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
19848
19849 ::
19850
19851       declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19852       declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19853       ;; The data is a vector of pointers to double
19854       declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
19855       ;; The data is a vector of function pointers
19856       declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)
19857
19858 Overview:
19859 """""""""
19860
19861 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19862
19863
19864 Arguments:
19865 """"""""""
19866
19867 The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
19868
19869 Semantics:
19870 """"""""""
19871
19872 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
19873 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
19874
19875
19876 ::
19877
       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exceptions
       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
19883
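As a hedged illustration of the semantics described above, the per-lane behavior can be modeled in scalar C. This is only a sketch: the function name ``masked_load_f32`` and its parameter layout (one ``bool`` per lane and an explicit lane count ``n``) are assumptions made for this example and are not part of LLVM. The alignment operand is left out because it does not change which lanes are read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of a masked load over n float lanes.
 * All names here are illustrative; none of them is an LLVM API. */
void masked_load_f32(float *res, const float *ptr, const bool *mask,
                     const float *passthru, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            res[i] = ptr[i];      /* lane on: read the element from memory */
        else
            res[i] = passthru[i]; /* lane off: memory is not accessed */
    }
}
```

Because a masked-off lane never dereferences ``ptr``, the model stays well defined even when that lane's address is not readable, which is the property that distinguishes the intrinsic from a plain load followed by a select.
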
19884 .. _int_mstore:
19885
19886 '``llvm.masked.store.*``' Intrinsics
19887 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19888
19889 Syntax:
19890 """""""
19891 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
19892
19893 ::
19894
19895        declare void @llvm.masked.store.v8i32.p0v8i32  (<8  x i32>   <value>, <8  x i32>*   <ptr>, i32 <alignment>,  <8  x i1> <mask>)
19896        declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>,  <16 x i1> <mask>)
19897        ;; The data is a vector of pointers to double
19898        declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
19899        ;; The data is a vector of function pointers
19900        declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)
19901
19902 Overview:
19903 """""""""
19904
19905 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19906
19907 Arguments:
19908 """"""""""
19909
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a constant integer value that is a power of two. The fourth operand, mask, is a vector of boolean values. The mask and the value operand must have the same number of vector elements.
19911
19912
19913 Semantics:
19914 """"""""""
19915
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
19918
19919 ::
19920
19921        call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4,  <16 x i1> %mask)
19922
19923        ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
19924        %oldval = load <16 x float>, <16 x float>* %ptr, align 4
19925        %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
19926        store <16 x float> %res, <16 x float>* %ptr, align 4
19927
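The store semantics can be sketched as a scalar C model. This is a hedged illustration rather than a definitive implementation: the name ``masked_store_f32``, the ``bool`` array used for the mask, and the lane count ``n`` are all assumptions introduced for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of a masked store over n float lanes.
 * The function name and signature are hypothetical. */
void masked_store_f32(float *ptr, const float *value, const bool *mask,
                      size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            ptr[i] = value[i]; /* lane on: write the element */
        /* lane off: the location is neither read nor written */
    }
}
```

Unlike the load, select, store sequence shown above, this model never touches the memory of masked-off lanes, which is why the intrinsic cannot introduce a data race on those locations.
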
19928
19929 Masked Vector Gather and Scatter Intrinsics
19930 -------------------------------------------
19931
19932 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
19933
19934 .. _int_mgather:
19935
19936 '``llvm.masked.gather.*``' Intrinsics
19937 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19938
19939 Syntax:
19940 """""""
19941 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
19942
19943 ::
19944
19945       declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19946       declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19947       declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x float*> <passthru>)
19948
19949 Overview:
19950 """""""""
19951
19952 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19953
19954
19955 Arguments:
19956 """"""""""
19957
The first operand is a vector of pointers which holds all memory addresses to read. The second operand is the alignment of the source addresses. It must be a constant integer value that is either 0 or a power of two. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth operand is a pass-through value that is used to fill the masked-off lanes of the result. The return type, the underlying type of the vector of pointers, and the type of the '``passthru``' operand are the same vector type.
19959
19960 Semantics:
19961 """"""""""
19962
The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
19965
19966
19967 ::
19968
       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
       %ptr3 = extractelement <4 x double*> %ptrs, i32 3

       %val0 = load double, double* %ptr0, align 8
       %val1 = load double, double* %ptr1, align 8
       %val2 = load double, double* %ptr2, align 8
       %val3 = load double, double* %ptr3, align 8

       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
19986
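The instruction sequence above covers only the all-true mask. As a hedged sketch of the general case, the following scalar C model shows how masked-off lanes are handled. The name ``masked_gather_f64`` and its signature are assumptions made for this example; they do not correspond to any LLVM API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of a masked gather over n double lanes.
 * ptrs[i] is the address read by lane i. Names are hypothetical. */
void masked_gather_f64(double *res, double *const *ptrs, const bool *mask,
                       const double *passthru, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            res[i] = *ptrs[i];    /* lane on: load from this lane's address */
        else
            res[i] = passthru[i]; /* lane off: ptrs[i] is never dereferenced */
    }
}
```

A masked-off lane may therefore carry any pointer value, including a null pointer, without the model performing an invalid access.
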
19987 .. _int_mscatter:
19988
19989 '``llvm.masked.scatter.*``' Intrinsics
19990 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19991
19992 Syntax:
19993 """""""
19994 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
19995
19996 ::
19997
19998        declare void @llvm.masked.scatter.v8i32.v8p0i32     (<8 x i32>     <value>, <8 x i32*>     <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
19999        declare void @llvm.masked.scatter.v16f32.v16p1f32   (<16 x float>  <value>, <16 x float addrspace(1)*>  <ptrs>, i32 <alignment>, <16 x i1> <mask>)
20000        declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
20001
20002 Overview:
20003 """""""""
20004
20005 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
20006
20007 Arguments:
20008 """"""""""
20009
The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is the alignment of the destination addresses. It must be a constant integer value that is either 0 or a power of two. The fourth operand, mask, is a vector of boolean values. The mask and the value operand must have the same number of vector elements.
20011
20012 Semantics:
20013 """"""""""
20014
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation is conditional when not all bits in the mask are switched on. It is useful for targets that support vector masked scatters and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
20016
20017 ::
20018
       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)

       ;; It is equivalent to a list of scalar stores
       %val0 = extractelement <8 x i32> %value, i32 0
       %val1 = extractelement <8 x i32> %value, i32 1
       ..
       %val7 = extractelement <8 x i32> %value, i32 7
       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
       ..
       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
       ;; Note: the order of the following stores is important when they overlap:
       store i32 %val0, i32* %ptr0, align 4
       store i32 %val1, i32* %ptr1, align 4
       ..
       store i32 %val7, i32* %ptr7, align 4
20036
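The ordering guarantee for overlapping addresses can be made concrete with a scalar C model. This is a hedged sketch: ``masked_scatter_i32`` and its signature are names invented for this example, not an LLVM interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference model of a masked scatter over n i32 lanes.
 * Lanes are processed from lane 0 upwards, so when two active lanes
 * target the same address the higher-numbered lane's value remains. */
void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                        const bool *mask, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            *ptrs[i] = value[i]; /* lane on: store to this lane's address */
        /* lane off: ptrs[i] is never dereferenced */
    }
}
```

In this model, if lanes 0 and 2 are both active and point to the same location, the location ends up holding the value from lane 2.
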
20037
20038 Masked Vector Expanding Load and Compressing Store Intrinsics
20039 -------------------------------------------------------------
20040
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressing store), and vice versa (expanding load). These operations effectively map to the "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
20042
20043 .. _int_expandload:
20044
20045 '``llvm.masked.expandload.*``' Intrinsics
20046 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20047
20048 Syntax:
20049 """""""
20050 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
20051
20052 ::
20053
20054       declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
20055       declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
20056
20057 Overview:
20058 """""""""
20059
Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. For example, if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7, respectively. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
20061
20062
20063 Arguments:
20064 """"""""""
20065
20066 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
20067
20068 Semantics:
20069 """"""""""
20070
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
20072
20073 .. code-block:: c
20074
    // In this loop we load from B and spread the elements into array A.
    // A, B, C and size are assumed to be set up by the surrounding code.
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }
20081
20082
20083 .. code-block:: llvm
20084
20085     ; Load several elements from array B and expand them in a vector.
20086     ; The number of loaded elements is equal to the number of '1' elements in the Mask.
20087     %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
20088     ; Store the result in A
20089     call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)
20090
20091     ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
20092     %MaskI = bitcast <8 x i1> %Mask to i8
20093     %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
20094     %MaskI64 = zext i8 %MaskIPopcnt to i64
20095     %BNextInd = add i64 %BInd, %MaskI64
20096
20097
20098 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
20099 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
20100
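As a hedged illustration, the lane-by-lane behavior of the expanding load can be written as a scalar C model. The name ``expandload_f64``, the ``bool`` mask array, and the returned element count are assumptions made for this sketch; they are not part of LLVM.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of an expanding load over n double lanes.
 * Consecutive elements ptr[0], ptr[1], ... are handed out to the active
 * lanes in lane order; masked-off lanes take their value from passthru.
 * Returns how many elements were read (the number of active lanes). */
size_t expandload_f64(double *res, const double *ptr, const bool *mask,
                      const double *passthru, size_t n) {
    size_t j = 0; /* index of the next unread element in memory */
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            res[i] = ptr[j++];
        else
            res[i] = passthru[i];
    }
    return j;
}
```

With the mask '10010001', this model reads three elements and places them in lanes 0, 3 and 7. The returned count corresponds to the ``llvm.ctpop`` result used in the IR above to advance the position in ``B``.
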
20101 .. _int_compressstore:
20102
20103 '``llvm.masked.compressstore.*``' Intrinsics
20104 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20105
20106 Syntax:
20107 """""""
20108 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
20109
20110 ::
20111
20112       declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
20113       declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)
20114
20115 Overview:
20116 """""""""
20117
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
20119
20120 Arguments:
20121 """"""""""
20122
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the elements of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
20124
20125
20126 Semantics:
20127 """"""""""
20128
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies, as in the following example:
20130
20131 .. code-block:: c
20132
    // In this loop we load elements from A and store them consecutively in B
    // A, B, C and size are assumed to be set up by the surrounding code.
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }
20139
20140
20141 .. code-block:: llvm
20142
    ; Load elements from A.
    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64
20153
20154
20155 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
20156
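As a hedged illustration, the compressing store can be modeled in scalar C. The name ``compressstore_f64``, the ``bool`` mask array, and the returned element count are assumptions introduced for this sketch and do not correspond to an LLVM API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference model of a compressing store over n double lanes.
 * Elements of active lanes are written to ptr[0], ptr[1], ... in lane
 * order; masked-off lanes are skipped and consume no memory slot.
 * Returns how many elements were written (the number of active lanes). */
size_t compressstore_f64(double *ptr, const double *value, const bool *mask,
                         size_t n) {
    size_t j = 0; /* index of the next free slot in memory */
    for (size_t i = 0; i < n; ++i) {
        if (mask[i])
            ptr[j++] = value[i];
    }
    return j;
}
```

The returned count plays the role of the ``llvm.ctpop`` result in the IR above: the caller advances its position in ``B`` by that amount before processing the next group of lanes.
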
20157
20158 Memory Use Markers
20159 ------------------
20160
20161 This class of intrinsics provides information about the
20162 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
20163 are immutable.
20164
20165 .. _int_lifestart:
20166
20167 '``llvm.lifetime.start``' Intrinsic
20168 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20169
20170 Syntax:
20171 """""""
20172
20173 ::
20174
20175       declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
20176
20177 Overview:
20178 """""""""
20179
20180 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
20181 object's lifetime.
20182
20183 Arguments:
20184 """"""""""
20185
20186 The first argument is a constant integer representing the size of the
20187 object, or -1 if it is variable sized. The second argument is a pointer
20188 to the object.
20189
20190 Semantics:
20191 """"""""""
20192
20193 If ``ptr`` is a stack-allocated object and it points to the first byte of
20194 the object, the object is initially marked as dead.
20195 ``ptr`` is conservatively considered as a non-stack-allocated object if
20196 the stack coloring algorithm that is used in the optimization pipeline cannot
20197 conclude that ``ptr`` is a stack-allocated object.
20198
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
as alive and has an uninitialized value.
The stack object is marked as dead when either
:ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
function returns.
20204
20205 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
20206 '``llvm.lifetime.start``' on the stack object can be called again.
20207 The second '``llvm.lifetime.start``' call marks the object as alive, but it
20208 does not change the address of the object.
20209
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
20213
20214
20215 .. _int_lifeend:
20216
20217 '``llvm.lifetime.end``' Intrinsic
20218 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20219
20220 Syntax:
20221 """""""
20222
20223 ::
20224
20225       declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
20226
20227 Overview:
20228 """""""""
20229
20230 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
20231 lifetime.
20232
20233 Arguments:
20234 """"""""""
20235
20236 The first argument is a constant integer representing the size of the
20237 object, or -1 if it is variable sized. The second argument is a pointer
20238 to the object.
20239
20240 Semantics:
20241 """"""""""
20242
20243 If ``ptr`` is a stack-allocated object and it points to the first byte of the
20244 object, the object is dead.
20245 ``ptr`` is conservatively considered as a non-stack-allocated object if
20246 the stack coloring algorithm that is used in the optimization pipeline cannot
20247 conclude that ``ptr`` is a stack-allocated object.
20248
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
20250
20251 If ``ptr`` is a non-stack-allocated object or it does not point to the first
20252 byte of the object, it is equivalent to simply filling all bytes of the object
20253 with ``poison``.
20254
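As a hedged source-level illustration, block-scoped local variables are a typical origin of these markers. In the C sketch below, a front end could emit ``llvm.lifetime.start`` when the scope of each array is entered and ``llvm.lifetime.end`` when it is left. This is an assumption about front-end behavior, not something this document specifies. Since the two lifetimes never overlap, stack coloring would then be free to place ``a`` and ``b`` in the same stack slot. The function name ``sum_two_phases`` is made up for the example.

```c
/* Two block-scoped arrays whose lifetimes do not overlap. */
int sum_two_phases(void) {
    int total = 0;
    {
        int a[4] = {1, 2, 3, 4};     /* a becomes alive on scope entry */
        for (int i = 0; i < 4; ++i)
            total += a[i];
    }                                /* a is dead from here on */
    {
        int b[4] = {10, 20, 30, 40}; /* b becomes alive on scope entry */
        for (int i = 0; i < 4; ++i)
            total += b[i];
    }                                /* b is dead from here on */
    return total;
}
```
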
20255
20256 '``llvm.invariant.start``' Intrinsic
20257 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20258
20259 Syntax:
20260 """""""
20261 This is an overloaded intrinsic. The memory object can belong to any address space.
20262
20263 ::
20264
20265       declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)
20266
20267 Overview:
20268 """""""""
20269
20270 The '``llvm.invariant.start``' intrinsic specifies that the contents of
20271 a memory object will not change.
20272
20273 Arguments:
20274 """"""""""
20275
20276 The first argument is a constant integer representing the size of the
20277 object, or -1 if it is variable sized. The second argument is a pointer
20278 to the object.
20279
20280 Semantics:
20281 """"""""""
20282
20283 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
20284 the return value, the referenced memory location is constant and
20285 unchanging.
20286
20287 '``llvm.invariant.end``' Intrinsic
20288 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20289
20290 Syntax:
20291 """""""
20292 This is an overloaded intrinsic. The memory object can belong to any address space.
20293
20294 ::
20295
20296       declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)
20297
20298 Overview:
20299 """""""""
20300
20301 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
20302 memory object are mutable.
20303
20304 Arguments:
20305 """"""""""
20306
The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a
pointer to the object.
20311
20312 Semantics:
20313 """"""""""
20314
20315 This intrinsic indicates that the memory is mutable again.
20316
20317 '``llvm.launder.invariant.group``' Intrinsic
20318 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20319
20320 Syntax:
20321 """""""
20322 This is an overloaded intrinsic. The memory object can belong to any address
20323 space. The returned pointer must belong to the same address space as the
20324 argument.
20325
20326 ::
20327
20328       declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)
20329
20330 Overview:
20331 """""""""
20332
20333 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
20334 established by ``invariant.group`` metadata no longer holds, to obtain a new
20335 pointer value that carries fresh invariant group information. It is an
20336 experimental intrinsic, which means that its semantics might change in the
20337 future.
20338
20339
20340 Arguments:
20341 """"""""""
20342
20343 The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
20344 to the memory.
20345
20346 Semantics:
20347 """"""""""
20348
20349 Returns another pointer that aliases its argument but which is considered different
20350 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
20351 It does not read any accessible memory and the execution can be speculated.
20352
20353 '``llvm.strip.invariant.group``' Intrinsic
20354 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20355
20356 Syntax:
20357 """""""
20358 This is an overloaded intrinsic. The memory object can belong to any address
20359 space. The returned pointer must belong to the same address space as the
20360 argument.
20361
20362 ::
20363
20364       declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)
20365
20366 Overview:
20367 """""""""
20368
20369 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
20370 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
20371 value that does not carry the invariant information. It is an experimental
20372 intrinsic, which means that its semantics might change in the future.
20373
20374
20375 Arguments:
20376 """"""""""
20377
20378 The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
20379 to the memory.
20380
20381 Semantics:
20382 """"""""""
20383
20384 Returns another pointer that aliases its argument but which has no associated
20385 ``invariant.group`` metadata.
20386 It does not read any memory and can be speculated.
20387
20388
20389
20390 .. _constrainedfp:
20391
20392 Constrained Floating-Point Intrinsics
20393 -------------------------------------
20394
20395 These intrinsics are used to provide special handling of floating-point
20396 operations when specific rounding mode or floating-point exception behavior is
20397 required.  By default, LLVM optimization passes assume that the rounding mode is
20398 round-to-nearest and that floating-point exceptions will not be monitored.
20399 Constrained FP intrinsics are used to support non-default rounding modes and
20400 accurately preserve exception behavior without compromising LLVM's ability to
20401 optimize FP code when the default behavior is used.
20402
If any FP operation in a function is constrained, then all of them must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can cause miscompiles if constrained and normal
operations are mixed. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.
20409
20410 Each of these intrinsics corresponds to a normal floating-point operation. The
20411 data arguments and the return value are the same as the corresponding FP
20412 operation.
20413
20414 The rounding mode argument is a metadata string specifying what
20415 assumptions, if any, the optimizer can make when transforming constant
20416 values. Some constrained FP intrinsics omit this argument. If required
20417 by the intrinsic, this argument must be one of the following strings:
20418
20419 ::
20420
20421       "round.dynamic"
20422       "round.tonearest"
20423       "round.downward"
20424       "round.upward"
20425       "round.towardzero"
20426       "round.tonearestaway"
20427
20428 If this argument is "round.dynamic" optimization passes must assume that the
20429 rounding mode is unknown and may change at runtime.  No transformations that
20430 depend on rounding mode may be performed in this case.
20431
20432 The other possible values for the rounding mode argument correspond to the
20433 similarly named IEEE rounding modes.  If the argument is any of these values
20434 optimization passes may perform transformations as long as they are consistent
20435 with the specified rounding mode.
20436
20437 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
20438 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
20439 'x-0' should evaluate to '-0' when rounding downward.  However, this
20440 transformation is legal for all other rounding modes.
20441
20442 For values other than "round.dynamic" optimization passes may assume that the
20443 actual runtime rounding mode (as defined in a target-specific manner) matches
20444 the specified rounding mode, but this is not guaranteed.  Using a specific
20445 non-dynamic rounding mode which does not match the actual rounding mode at
20446 runtime results in undefined behavior.
20447
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
20451
20452 ::
20453
20454       "fpexcept.ignore"
20455       "fpexcept.maytrap"
20456       "fpexcept.strict"
20457
20458 If this argument is "fpexcept.ignore" optimization passes may assume that the
20459 exception status flags will not be read and that floating-point exceptions will
20460 be masked.  This allows transformations to be performed that may change the
20461 exception semantics of the original code.  For example, FP operations may be
20462 speculatively executed in this case whereas they must not be for either of the
20463 other possible values of this argument.
20464
20465 If the exception behavior argument is "fpexcept.maytrap" optimization passes
20466 must avoid transformations that may raise exceptions that would not have been
20467 raised by the original code (such as speculatively executing FP operations), but
20468 passes are not required to preserve all exceptions that are implied by the
20469 original code.  For example, exceptions may be potentially hidden by constant
20470 folding.
20471
20472 If the exception behavior argument is "fpexcept.strict" all transformations must
20473 strictly preserve the floating-point exception semantics of the original code.
20474 Any FP exception that would have been raised by the original code must be raised
20475 by the transformed code, and the transformed code must not raise any FP
20476 exceptions that would not have been raised by the original code.  This is the
20477 exception behavior argument that will be used if the code being compiled reads
20478 the FP exception status flags, but this mode can also be used with code that
20479 unmasks FP exceptions.
20480
20481 The number and order of floating-point exceptions is NOT guaranteed.  For
20482 example, a series of FP operations that each may raise exceptions may be
20483 vectorized into a single instruction that raises each unique exception a single
20484 time.
20485
20486 Proper :ref:`function attributes <fnattrs>` usage is required for the
20487 constrained intrinsics to function correctly.
20488
20489 All function *calls* done in a function that uses constrained floating
20490 point intrinsics must have the ``strictfp`` attribute.
20491
20492 All function *definitions* that use constrained floating point intrinsics
20493 must have the ``strictfp`` attribute.
20494
20495 '``llvm.experimental.constrained.fadd``' Intrinsic
20496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20497
20498 Syntax:
20499 """""""
20500
20501 ::
20502
20503       declare <type>
20504       @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
20505                                           metadata <rounding mode>,
20506                                           metadata <exception behavior>)
20507
20508 Overview:
20509 """""""""
20510
20511 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
20512 two operands.
20513
20514
20515 Arguments:
20516 """"""""""
20517
20518 The first two arguments to the '``llvm.experimental.constrained.fadd``'
20519 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20520 of floating-point values. Both arguments must have identical types.
20521
20522 The third and fourth arguments specify the rounding mode and exception
20523 behavior as described above.
20524
20525 Semantics:
20526 """"""""""
20527
20528 The value produced is the floating-point sum of the two value operands and has
20529 the same type as the operands.
20530
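As an illustration, the sketch below applies the intrinsic to a vector type,
assuming the ``v4f32`` overload and a caller that only needs the weaker
"fpexcept.maytrap" guarantee. The function name ``@vadd`` is hypothetical.

.. code-block:: llvm

   declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(
               <4 x float>, <4 x float>, metadata, metadata)

   define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) #0 {
     ; Both value operands and the result share the type <4 x float>.
     %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                <4 x float> %a, <4 x float> %b,
                metadata !"round.tonearest",
                metadata !"fpexcept.maytrap") #0
     ret <4 x float> %sum
   }

   attributes #0 = { strictfp }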
20531
20532 '``llvm.experimental.constrained.fsub``' Intrinsic
20533 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20534
20535 Syntax:
20536 """""""
20537
20538 ::
20539
20540       declare <type>
20541       @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
20542                                           metadata <rounding mode>,
20543                                           metadata <exception behavior>)
20544
20545 Overview:
20546 """""""""
20547
20548 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
20549 of its two operands.
20550
20551
20552 Arguments:
20553 """"""""""
20554
20555 The first two arguments to the '``llvm.experimental.constrained.fsub``'
20556 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20557 of floating-point values. Both arguments must have identical types.
20558
20559 The third and fourth arguments specify the rounding mode and exception
20560 behavior as described above.
20561
20562 Semantics:
20563 """"""""""
20564
20565 The value produced is the floating-point difference of the two value operands
20566 and has the same type as the operands.
20567
20568
20569 '``llvm.experimental.constrained.fmul``' Intrinsic
20570 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20571
20572 Syntax:
20573 """""""
20574
20575 ::
20576
20577       declare <type>
20578       @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
20579                                           metadata <rounding mode>,
20580                                           metadata <exception behavior>)
20581
20582 Overview:
20583 """""""""
20584
20585 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
20586 its two operands.
20587
20588
20589 Arguments:
20590 """"""""""
20591
20592 The first two arguments to the '``llvm.experimental.constrained.fmul``'
20593 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20594 of floating-point values. Both arguments must have identical types.
20595
20596 The third and fourth arguments specify the rounding mode and exception
20597 behavior as described above.
20598
20599 Semantics:
20600 """"""""""
20601
20602 The value produced is the floating-point product of the two value operands and
20603 has the same type as the operands.
20604
20605
20606 '``llvm.experimental.constrained.fdiv``' Intrinsic
20607 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20608
20609 Syntax:
20610 """""""
20611
20612 ::
20613
20614       declare <type>
20615       @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
20616                                           metadata <rounding mode>,
20617                                           metadata <exception behavior>)
20618
20619 Overview:
20620 """""""""
20621
20622 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
20623 its two operands.
20624
20625
20626 Arguments:
20627 """"""""""
20628
20629 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
20630 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20631 of floating-point values. Both arguments must have identical types.
20632
20633 The third and fourth arguments specify the rounding mode and exception
20634 behavior as described above.
20635
20636 Semantics:
20637 """"""""""
20638
20639 The value produced is the floating-point quotient of the two value operands and
20640 has the same type as the operands.
20641
20642
20643 '``llvm.experimental.constrained.frem``' Intrinsic
20644 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20645
20646 Syntax:
20647 """""""
20648
20649 ::
20650
20651       declare <type>
20652       @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
20653                                           metadata <rounding mode>,
20654                                           metadata <exception behavior>)
20655
20656 Overview:
20657 """""""""
20658
20659 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
20660 from the division of its two operands.
20661
20662
20663 Arguments:
20664 """"""""""
20665
20666 The first two arguments to the '``llvm.experimental.constrained.frem``'
20667 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20668 of floating-point values. Both arguments must have identical types.
20669
20670 The third and fourth arguments specify the rounding mode and exception
20671 behavior as described above.  The rounding mode argument has no effect, since
20672 the result of frem is never rounded, but the argument is included for
20673 consistency with the other constrained floating-point intrinsics.
20674
20675 Semantics:
20676 """"""""""
20677
20678 The value produced is the floating-point remainder from the division of the two
20679 value operands and has the same type as the operands.  The remainder has the
20680 same sign as the dividend.
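
The sign rule can be seen in the following sketch, where the function name
``@rem_example`` is hypothetical. The rounding mode argument is still required
syntactically even though it has no effect on the result.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.frem.f64(double, double,
                                                          metadata, metadata)

   define double @rem_example(double %x, double %y) #0 {
     ; For %x = -7.0 and %y = 2.0 the result is -1.0: the sign follows the
     ; dividend %x, not the divisor %y.
     %r = call double @llvm.experimental.constrained.frem.f64(
              double %x, double %y,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret double %r
   }

   attributes #0 = { strictfp }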
20681
20682 '``llvm.experimental.constrained.fma``' Intrinsic
20683 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20684
20685 Syntax:
20686 """""""
20687
20688 ::
20689
20690       declare <type>
20691       @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
20692                                          metadata <rounding mode>,
20693                                          metadata <exception behavior>)
20694
20695 Overview:
20696 """""""""
20697
20698 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
20699 fused-multiply-add operation on its operands.
20700
20701 Arguments:
20702 """"""""""
20703
20704 The first three arguments to the '``llvm.experimental.constrained.fma``'
20705 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
20706 <t_vector>` of floating-point values. All arguments must have identical types.
20707
20708 The fourth and fifth arguments specify the rounding mode and exception behavior
20709 as described above.
20710
20711 Semantics:
20712 """"""""""
20713
20714 The result produced is the product of the first two operands added to the third
20715 operand computed with infinite precision, and then rounded to the target
20716 precision.
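
A minimal sketch of a call, assuming the ``f64`` overload; the function name
``@fma_example`` is hypothetical.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.fma.f64(double, double, double,
                                                         metadata, metadata)

   define double @fma_example(double %a, double %b, double %c) #0 {
     ; Computes (%a * %b) + %c with a single rounding at the end; the
     ; intermediate product is not rounded.
     %r = call double @llvm.experimental.constrained.fma.f64(
              double %a, double %b, double %c,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret double %r
   }

   attributes #0 = { strictfp }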
20717
20718 '``llvm.experimental.constrained.fptoui``' Intrinsic
20719 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20720
20721 Syntax:
20722 """""""
20723
20724 ::
20725
20726       declare <ty2>
20727       @llvm.experimental.constrained.fptoui(<type> <value>,
20728                                           metadata <exception behavior>)
20729
20730 Overview:
20731 """""""""
20732
20733 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
20734 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
20735
20736 Arguments:
20737 """"""""""
20738
20739 The first argument to the '``llvm.experimental.constrained.fptoui``'
20740 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20741 <t_vector>` of floating point values.
20742
20743 The second argument specifies the exception behavior as described above.
20744
20745 Semantics:
20746 """"""""""
20747
20748 The result produced is an unsigned integer converted from the floating
20749 point operand. The value is truncated, so it is rounded towards zero.
20750
20751 '``llvm.experimental.constrained.fptosi``' Intrinsic
20752 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20753
20754 Syntax:
20755 """""""
20756
20757 ::
20758
20759       declare <ty2>
20760       @llvm.experimental.constrained.fptosi(<type> <value>,
20761                                           metadata <exception behavior>)
20762
20763 Overview:
20764 """""""""
20765
20766 The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
20767 floating-point ``value`` to its signed integer equivalent of type ``ty2``.
20768
20769 Arguments:
20770 """"""""""
20771
20772 The first argument to the '``llvm.experimental.constrained.fptosi``'
20773 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20774 <t_vector>` of floating point values.
20775
20776 The second argument specifies the exception behavior as described above.
20777
20778 Semantics:
20779 """"""""""
20780
20781 The result produced is a signed integer converted from the floating
20782 point operand. The value is truncated, so it is rounded towards zero.
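
The sketch below shows a conversion from ``double`` to ``i32``. The function
name ``@to_int`` is hypothetical. Unlike the arithmetic intrinsics, the call
takes no rounding mode argument, because the conversion always truncates.

.. code-block:: llvm

   ; The overload suffix names the result type first, then the operand type.
   declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)

   define i32 @to_int(double %x) #0 {
     ; For %x = -2.75 the result is -2 (rounded towards zero).
     %i = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
              double %x,
              metadata !"fpexcept.strict") #0
     ret i32 %i
   }

   attributes #0 = { strictfp }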
20783
20784 '``llvm.experimental.constrained.uitofp``' Intrinsic
20785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20786
20787 Syntax:
20788 """""""
20789
20790 ::
20791
20792       declare <ty2>
20793       @llvm.experimental.constrained.uitofp(<type> <value>,
20794                                           metadata <rounding mode>,
20795                                           metadata <exception behavior>)
20796
20797 Overview:
20798 """""""""
20799
20800 The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
20801 unsigned integer ``value`` to a floating-point value of type ``ty2``.
20802
20803 Arguments:
20804 """"""""""
20805
20806 The first argument to the '``llvm.experimental.constrained.uitofp``'
20807 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20808 <t_vector>` of integer values.
20809
20810 The second and third arguments specify the rounding mode and exception
20811 behavior as described above.
20812
20813 Semantics:
20814 """"""""""
20815
20816 An inexact floating-point exception will be raised if rounding is required.
20817 Any result produced is a floating point value converted from the input
20818 integer operand.
20819
20820 '``llvm.experimental.constrained.sitofp``' Intrinsic
20821 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20822
20823 Syntax:
20824 """""""
20825
20826 ::
20827
20828       declare <ty2>
20829       @llvm.experimental.constrained.sitofp(<type> <value>,
20830                                           metadata <rounding mode>,
20831                                           metadata <exception behavior>)
20832
20833 Overview:
20834 """""""""
20835
20836 The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
20837 signed integer ``value`` to a floating-point value of type ``ty2``.
20838
20839 Arguments:
20840 """"""""""
20841
20842 The first argument to the '``llvm.experimental.constrained.sitofp``'
20843 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20844 <t_vector>` of integer values.
20845
20846 The second and third arguments specify the rounding mode and exception
20847 behavior as described above.
20848
20849 Semantics:
20850 """"""""""
20851
20852 An inexact floating-point exception will be raised if rounding is required.
20853 Any result produced is a floating point value converted from the input
20854 integer operand.
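
As a sketch of a case where rounding is required, the following converts an
``i32`` constant that ``float`` cannot represent exactly. The function name
``@inexact_example`` is hypothetical.

.. code-block:: llvm

   declare float @llvm.experimental.constrained.sitofp.f32.i32(i32, metadata,
                                                               metadata)

   define float @inexact_example() #0 {
     ; 16777217 is 2^24 + 1, which does not fit in the 24-bit significand of
     ; float, so the conversion must round and raises the inexact exception.
     ; If the current rounding mode is round-to-nearest, the result is
     ; 16777216.0.
     %f = call float @llvm.experimental.constrained.sitofp.f32.i32(
              i32 16777217,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret float %f
   }

   attributes #0 = { strictfp }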
20855
20856 '``llvm.experimental.constrained.fptrunc``' Intrinsic
20857 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20858
20859 Syntax:
20860 """""""
20861
20862 ::
20863
20864       declare <ty2>
20865       @llvm.experimental.constrained.fptrunc(<type> <value>,
20866                                           metadata <rounding mode>,
20867                                           metadata <exception behavior>)
20868
20869 Overview:
20870 """""""""
20871
20872 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
20873 to type ``ty2``.
20874
20875 Arguments:
20876 """"""""""
20877
20878 The first argument to the '``llvm.experimental.constrained.fptrunc``'
20879 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20880 <t_vector>` of floating point values. This argument must be larger in size
20881 than the result.
20882
20883 The second and third arguments specify the rounding mode and exception
20884 behavior as described above.
20885
20886 Semantics:
20887 """"""""""
20888
20889 The result produced is a floating point value truncated to be smaller in size
20890 than the operand.
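
A minimal sketch of a ``double`` to ``float`` truncation follows; the function
name ``@narrow`` is hypothetical.

.. code-block:: llvm

   ; The overload suffix names the result type first, then the operand type.
   declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata,
                                                                metadata)

   define float @narrow(double %x) #0 {
     ; The operand type (double) is larger than the result type (float).
     %f = call float @llvm.experimental.constrained.fptrunc.f32.f64(
              double %x,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret float %f
   }

   attributes #0 = { strictfp }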
20891
20892 '``llvm.experimental.constrained.fpext``' Intrinsic
20893 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20894
20895 Syntax:
20896 """""""
20897
20898 ::
20899
20900       declare <ty2>
20901       @llvm.experimental.constrained.fpext(<type> <value>,
20902                                           metadata <exception behavior>)
20903
20904 Overview:
20905 """""""""
20906
20907 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
20908 floating-point ``value`` to a larger floating-point value.
20909
20910 Arguments:
20911 """"""""""
20912
20913 The first argument to the '``llvm.experimental.constrained.fpext``'
20914 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20915 <t_vector>` of floating point values. This argument must be smaller in size
20916 than the result.
20917
20918 The second argument specifies the exception behavior as described above.
20919
20920 Semantics:
20921 """"""""""
20922
20923 The result produced is a floating point value extended to be larger in size
20924 than the operand. All restrictions that apply to the fpext instruction also
20925 apply to this intrinsic.
20926
20927 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
20928 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20929
20930 Syntax:
20931 """""""
20932
20933 ::
20934
20935       declare <ty2>
20936       @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
20937                                           metadata <condition code>,
20938                                           metadata <exception behavior>)
20939       declare <ty2>
20940       @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
20941                                            metadata <condition code>,
20942                                            metadata <exception behavior>)
20943
20944 Overview:
20945 """""""""
20946
20947 The '``llvm.experimental.constrained.fcmp``' and
20948 '``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
20949 value or vector of boolean values based on a comparison of their operands.
20950
20951 If the operands are floating-point scalars, then the result type is a
20952 boolean (:ref:`i1 <t_integer>`).
20953
20954 If the operands are floating-point vectors, then the result type is a
20955 vector of boolean values with the same number of elements as the operands
20956 being compared.
20957
20958 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
20959 comparison operation while the '``llvm.experimental.constrained.fcmps``'
20960 intrinsic performs a signaling comparison operation.
20961
20962 Arguments:
20963 """"""""""
20964
20965 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
20966 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
20967 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20968 of floating-point values. Both arguments must have identical types.
20969
20970 The third argument is the condition code indicating the kind of comparison
20971 to perform. It must be a metadata string with one of the following values:
20972
20973 - "``oeq``": ordered and equal
20974 - "``ogt``": ordered and greater than
20975 - "``oge``": ordered and greater than or equal
20976 - "``olt``": ordered and less than
20977 - "``ole``": ordered and less than or equal
20978 - "``one``": ordered and not equal
20979 - "``ord``": ordered (no NANs)
20980 - "``ueq``": unordered or equal
20981 - "``ugt``": unordered or greater than
20982 - "``uge``": unordered or greater than or equal
20983 - "``ult``": unordered or less than
20984 - "``ule``": unordered or less than or equal
20985 - "``une``": unordered or not equal
20986 - "``uno``": unordered (either operand is a NAN)
20987
20988 *Ordered* means that neither operand is a NAN while *unordered* means
20989 that either operand may be a NAN.
20990
20991 The fourth argument specifies the exception behavior as described above.
20992
20993 Semantics:
20994 """"""""""
20995
20996 ``op1`` and ``op2`` are compared according to the condition code given
20997 as the third argument. If the operands are vectors, then the
20998 vectors are compared element by element. Each comparison performed
20999 always yields an :ref:`i1 <t_integer>` result, as follows:
21000
21001 - "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
21002   is equal to ``op2``.
21003 - "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
21004   is greater than ``op2``.
21005 - "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
21006   is greater than or equal to ``op2``.
21007 - "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
21008   is less than ``op2``.
21009 - "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
21010   is less than or equal to ``op2``.
21011 - "``one``": yields ``true`` if both operands are not a NAN and ``op1``
21012   is not equal to ``op2``.
21013 - "``ord``": yields ``true`` if both operands are not a NAN.
21014 - "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
21015   equal to ``op2``.
21016 - "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
21017   greater than ``op2``.
21018 - "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
21019   greater than or equal to ``op2``.
21020 - "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
21021   less than ``op2``.
21022 - "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
21023   less than or equal to ``op2``.
21024 - "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
21025   not equal to ``op2``.
21026 - "``uno``": yields ``true`` if either operand is a NAN.
21027
21028 The quiet comparison operation performed by
21029 '``llvm.experimental.constrained.fcmp``' will only raise an exception
21030 if either operand is a SNAN.  The signaling comparison operation
21031 performed by '``llvm.experimental.constrained.fcmps``' will raise an
21032 exception if either operand is a NAN (QNAN or SNAN). Such an exception
21033 does not preclude a result being produced (e.g. exception might only
21034 set a flag), therefore the distinction between ordered and unordered
21035 comparisons is also relevant for the
21036 '``llvm.experimental.constrained.fcmps``' intrinsic.
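
The following sketch contrasts the two intrinsics on the same "``olt``"
condition code. The function names ``@lt_quiet`` and ``@lt_signaling`` are
hypothetical.

.. code-block:: llvm

   declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                      metadata, metadata)
   declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                       metadata, metadata)

   ; Quiet comparison: raises an exception only if an operand is a SNAN.
   define i1 @lt_quiet(double %a, double %b) #0 {
     %c = call i1 @llvm.experimental.constrained.fcmp.f64(
              double %a, double %b,
              metadata !"olt",
              metadata !"fpexcept.strict") #0
     ret i1 %c
   }

   ; Signaling comparison: raises an exception if an operand is any NAN.
   define i1 @lt_signaling(double %a, double %b) #0 {
     %c = call i1 @llvm.experimental.constrained.fcmps.f64(
              double %a, double %b,
              metadata !"olt",
              metadata !"fpexcept.strict") #0
     ret i1 %c
   }

   attributes #0 = { strictfp }

In both functions the result is ``false`` when either operand is a NAN,
because "``olt``" is an ordered condition code.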
21037
21038 '``llvm.experimental.constrained.fmuladd``' Intrinsic
21039 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21040
21041 Syntax:
21042 """""""
21043
21044 ::
21045
21046       declare <type>
21047       @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
21048                                              <type> <op3>,
21049                                              metadata <rounding mode>,
21050                                              metadata <exception behavior>)
21051
21052 Overview:
21053 """""""""
21054
21055 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
21056 multiply-add expressions that can be fused if the code generator determines
21057 that (a) the target instruction set has support for a fused operation,
21058 and (b) that the fused operation is more efficient than the equivalent,
21059 separate pair of mul and add instructions.
21060
21061 Arguments:
21062 """"""""""
21063
21064 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
21065 intrinsic must be floating-point or vector of floating-point values.
21066 All three arguments must have identical types.
21067
21068 The fourth and fifth arguments specify the rounding mode and exception behavior
21069 as described above.
21070
21071 Semantics:
21072 """"""""""
21073
21074 The expression:
21075
21076 ::
21077
21078       %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
21079                                                                  metadata <rounding mode>,
21080                                                                  metadata <exception behavior>)
21081
21082 is equivalent to the expression:
21083
21084 ::
21085
21086       %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
21087                                                               metadata <rounding mode>,
21088                                                               metadata <exception behavior>)
21089       %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
21090                                                               metadata <rounding mode>,
21091                                                               metadata <exception behavior>)
21092
21093 except that it is unspecified whether rounding will be performed between the
21094 multiplication and addition steps. Fusion is not guaranteed, even if the target
21095 platform supports it.
21096 If a fused multiply-add is required, the corresponding
21097 :ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
21098 used instead.
21099 Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets ``errno``.
21100
21101 Constrained libm-equivalent Intrinsics
21102 --------------------------------------
21103
21104 In addition to the basic floating-point operations for which constrained
21105 intrinsics are described above, there are constrained versions of various
21106 operations which provide equivalent behavior to a corresponding libm function.
21107 These intrinsics allow the precise behavior of these operations with respect to
21108 rounding mode and exception behavior to be controlled.
21109
21110 As with the basic constrained floating-point intrinsics, the rounding mode
21111 and exception behavior arguments only control the behavior of the optimizer.
21112 They do not change the runtime floating-point environment.
21113
21114
21115 '``llvm.experimental.constrained.sqrt``' Intrinsic
21116 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21117
21118 Syntax:
21119 """""""
21120
21121 ::
21122
21123       declare <type>
21124       @llvm.experimental.constrained.sqrt(<type> <op1>,
21125                                           metadata <rounding mode>,
21126                                           metadata <exception behavior>)
21127
21128 Overview:
21129 """""""""
21130
21131 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
21132 of the specified value, returning the same value as the libm '``sqrt``'
21133 functions would, but without setting ``errno``.
21134
21135 Arguments:
21136 """"""""""
21137
21138 The first argument and the return value are floating-point numbers of the same
21139 type.
21140
21141 The second and third arguments specify the rounding mode and exception
21142 behavior as described above.
21143
21144 Semantics:
21145 """"""""""
21146
21147 This function returns the nonnegative square root of the specified value.
21148 If the value is less than negative zero, a floating-point exception occurs
21149 and the return value is architecture specific.
21150
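A minimal sketch of a call, assuming the ``f64`` overload; the function name
``@root`` is hypothetical.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.sqrt.f64(double, metadata,
                                                          metadata)

   define double @root(double %x) #0 {
     ; With "fpexcept.strict", an exception raised for a negative %x must be
     ; preserved by every transformation applied to this call.
     %r = call double @llvm.experimental.constrained.sqrt.f64(
              double %x,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret double %r
   }

   attributes #0 = { strictfp }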
21151
21152 '``llvm.experimental.constrained.pow``' Intrinsic
21153 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21154
21155 Syntax:
21156 """""""
21157
21158 ::
21159
21160       declare <type>
21161       @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
21162                                          metadata <rounding mode>,
21163                                          metadata <exception behavior>)
21164
21165 Overview:
21166 """""""""
21167
21168 The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
21169 raised to the (positive or negative) power specified by the second operand.
21170
21171 Arguments:
21172 """"""""""
21173
21174 The first two arguments and the return value are floating-point numbers of the
21175 same type.  The second argument specifies the power to which the first argument
21176 should be raised.
21177
21178 The third and fourth arguments specify the rounding mode and exception
21179 behavior as described above.
21180
21181 Semantics:
21182 """"""""""
21183
21184 This function returns the first operand raised to the power given by the
21185 second operand, returning the same values as the libm ``pow`` functions
21186 would, and handles error conditions in the same way.
21187
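A minimal sketch of a call, assuming the ``f64`` overload; the function name
``@power`` is hypothetical.

.. code-block:: llvm

   declare double @llvm.experimental.constrained.pow.f64(double, double,
                                                         metadata, metadata)

   define double @power(double %base, double %exponent) #0 {
     ; Both value operands and the result have the same floating-point type.
     %r = call double @llvm.experimental.constrained.pow.f64(
              double %base, double %exponent,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret double %r
   }

   attributes #0 = { strictfp }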
21188
21189 '``llvm.experimental.constrained.powi``' Intrinsic
21190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21191
21192 Syntax:
21193 """""""
21194
21195 ::
21196
21197       declare <type>
21198       @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
21199                                           metadata <rounding mode>,
21200                                           metadata <exception behavior>)
21201
21202 Overview:
21203 """""""""
21204
21205 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
21206 raised to the (positive or negative) power specified by the second operand. The
21207 order of evaluation of multiplications is not defined. When a vector of
21208 floating-point type is used, the second argument remains a scalar integer value.
21209
21210
21211 Arguments:
21212 """"""""""
21213
21214 The first argument and the return value are floating-point numbers of the same
21215 type.  The second argument is a 32-bit signed integer specifying the power to
21216 which the first argument should be raised.
21217
21218 The third and fourth arguments specify the rounding mode and exception
21219 behavior as described above.
21220
21221 Semantics:
21222 """"""""""
21223
21224 This function returns the first operand raised to the power given by the
21225 second operand, with an unspecified sequence of rounding operations.
21226
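The sketch below uses a vector first operand, assuming the ``v4f32`` overload,
to show that the exponent stays a scalar ``i32``. The function name ``@cube``
is hypothetical.

.. code-block:: llvm

   declare <4 x float> @llvm.experimental.constrained.powi.v4f32(
               <4 x float>, i32, metadata, metadata)

   define <4 x float> @cube(<4 x float> %v) #0 {
     ; Each element of %v is raised to the power 3; the exponent is a single
     ; scalar i32 shared by all elements.
     %r = call <4 x float> @llvm.experimental.constrained.powi.v4f32(
              <4 x float> %v, i32 3,
              metadata !"round.dynamic",
              metadata !"fpexcept.strict") #0
     ret <4 x float> %r
   }

   attributes #0 = { strictfp }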
21227
21228 '``llvm.experimental.constrained.sin``' Intrinsic
21229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21230
21231 Syntax:
21232 """""""
21233
21234 ::
21235
21236       declare <type>
21237       @llvm.experimental.constrained.sin(<type> <op1>,
21238                                          metadata <rounding mode>,
21239                                          metadata <exception behavior>)
21240
21241 Overview:
21242 """""""""
21243
21244 The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
21245 first operand.
21246
21247 Arguments:
21248 """"""""""
21249
21250 The first argument and the return value are floating-point numbers of the same
21251 type.
21252
21253 The second and third arguments specify the rounding mode and exception
21254 behavior as described above.
21255
21256 Semantics:
21257 """"""""""
21258
21259 This function returns the sine of the specified operand, returning the
21260 same values as the libm ``sin`` functions would, and handles error
21261 conditions in the same way.
21262
21263
21264 '``llvm.experimental.constrained.cos``' Intrinsic
21265 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21266
21267 Syntax:
21268 """""""
21269
21270 ::
21271
21272       declare <type>
21273       @llvm.experimental.constrained.cos(<type> <op1>,
21274                                          metadata <rounding mode>,
21275                                          metadata <exception behavior>)
21276
21277 Overview:
21278 """""""""
21279
21280 The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
21281 first operand.
21282
21283 Arguments:
21284 """"""""""
21285
21286 The first argument and the return value are floating-point numbers of the same
21287 type.
21288
21289 The second and third arguments specify the rounding mode and exception
21290 behavior as described above.
21291
21292 Semantics:
21293 """"""""""
21294
21295 This function returns the cosine of the specified operand, returning the
21296 same values as the libm ``cos`` functions would, and handles error
21297 conditions in the same way.
21298
21299
21300 '``llvm.experimental.constrained.exp``' Intrinsic
21301 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21302
21303 Syntax:
21304 """""""
21305
21306 ::
21307
21308       declare <type>
21309       @llvm.experimental.constrained.exp(<type> <op1>,
21310                                          metadata <rounding mode>,
21311                                          metadata <exception behavior>)
21312
21313 Overview:
21314 """""""""
21315
21316 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
21317 exponential of the specified value.
21318
21319 Arguments:
21320 """"""""""
21321
21322 The first argument and the return value are floating-point numbers of the same
21323 type.
21324
21325 The second and third arguments specify the rounding mode and exception
21326 behavior as described above.
21327
21328 Semantics:
21329 """"""""""
21330
21331 This function returns the same values as the libm ``exp`` functions
21332 would, and handles error conditions in the same way.
21333
21334
21335 '``llvm.experimental.constrained.exp2``' Intrinsic
21336 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21337
21338 Syntax:
21339 """""""
21340
21341 ::
21342
21343       declare <type>
21344       @llvm.experimental.constrained.exp2(<type> <op1>,
21345                                           metadata <rounding mode>,
21346                                           metadata <exception behavior>)
21347
21348 Overview:
21349 """""""""
21350
21351 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
21352 exponential of the specified value.
21353
21354
21355 Arguments:
21356 """"""""""
21357
21358 The first argument and the return value are floating-point numbers of the same
21359 type.
21360
21361 The second and third arguments specify the rounding mode and exception
21362 behavior as described above.
21363
21364 Semantics:
21365 """"""""""
21366
21367 This function returns the same values as the libm ``exp2`` functions
21368 would, and handles error conditions in the same way.
21369
21370
21371 '``llvm.experimental.constrained.log``' Intrinsic
21372 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21373
21374 Syntax:
21375 """""""
21376
21377 ::
21378
21379       declare <type>
21380       @llvm.experimental.constrained.log(<type> <op1>,
21381                                          metadata <rounding mode>,
21382                                          metadata <exception behavior>)
21383
21384 Overview:
21385 """""""""
21386
21387 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
21388 logarithm of the specified value.
21389
21390 Arguments:
21391 """"""""""
21392
21393 The first argument and the return value are floating-point numbers of the same
21394 type.
21395
21396 The second and third arguments specify the rounding mode and exception
21397 behavior as described above.
21398
21399
21400 Semantics:
21401 """"""""""
21402
21403 This function returns the same values as the libm ``log`` functions
21404 would, and handles error conditions in the same way.
21405
21406
21407 '``llvm.experimental.constrained.log10``' Intrinsic
21408 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21409
21410 Syntax:
21411 """""""
21412
21413 ::
21414
21415       declare <type>
21416       @llvm.experimental.constrained.log10(<type> <op1>,
21417                                            metadata <rounding mode>,
21418                                            metadata <exception behavior>)
21419
21420 Overview:
21421 """""""""
21422
21423 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
21424 logarithm of the specified value.
21425
21426 Arguments:
21427 """"""""""
21428
21429 The first argument and the return value are floating-point numbers of the same
21430 type.
21431
21432 The second and third arguments specify the rounding mode and exception
21433 behavior as described above.
21434
21435 Semantics:
21436 """"""""""
21437
21438 This function returns the same values as the libm ``log10`` functions
21439 would, and handles error conditions in the same way.
21440
21441
21442 '``llvm.experimental.constrained.log2``' Intrinsic
21443 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21444
21445 Syntax:
21446 """""""
21447
21448 ::
21449
21450       declare <type>
21451       @llvm.experimental.constrained.log2(<type> <op1>,
21452                                           metadata <rounding mode>,
21453                                           metadata <exception behavior>)
21454
21455 Overview:
21456 """""""""
21457
21458 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
21459 logarithm of the specified value.
21460
21461 Arguments:
21462 """"""""""
21463
21464 The first argument and the return value are floating-point numbers of the same
21465 type.
21466
21467 The second and third arguments specify the rounding mode and exception
21468 behavior as described above.
21469
21470 Semantics:
21471 """"""""""
21472
21473 This function returns the same values as the libm ``log2`` functions
21474 would, and handles error conditions in the same way.
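
The logarithm intrinsics compose with the other constrained operations. The
following sketch computes a logarithm to an arbitrary base as
``log2(x) / log2(b)``; the function name ``@log_base`` is hypothetical, and the
choice of ``round.dynamic`` and ``fpexcept.strict`` is only one possibility:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
  declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)

  ; Hypothetical helper: logarithm of %x to base %b.
  define double @log_base(double %x, double %b) strictfp {
    %lx = call double @llvm.experimental.constrained.log2.f64(
                         double %x,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
    %lb = call double @llvm.experimental.constrained.log2.f64(
                         double %b,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
    %r = call double @llvm.experimental.constrained.fdiv.f64(
                        double %lx, double %lb,
                        metadata !"round.dynamic",
                        metadata !"fpexcept.strict") strictfp
    ret double %r
  }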
21475
21476
21477 '``llvm.experimental.constrained.rint``' Intrinsic
21478 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21479
21480 Syntax:
21481 """""""
21482
21483 ::
21484
21485       declare <type>
21486       @llvm.experimental.constrained.rint(<type> <op1>,
21487                                           metadata <rounding mode>,
21488                                           metadata <exception behavior>)
21489
21490 Overview:
21491 """""""""
21492
21493 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
21494 operand rounded to the nearest integer. It may raise an inexact floating-point
21495 exception if the operand is not an integer.
21496
21497 Arguments:
21498 """"""""""
21499
21500 The first argument and the return value are floating-point numbers of the same
21501 type.
21502
21503 The second and third arguments specify the rounding mode and exception
21504 behavior as described above.
21505
21506 Semantics:
21507 """"""""""
21508
21509 This function returns the same values as the libm ``rint`` functions
21510 would, and handles error conditions in the same way.  The rounding mode is
21511 described, not determined, by the rounding mode argument.  The actual rounding
21512 mode is determined by the runtime floating-point environment.  The rounding
21513 mode argument is only intended as information to the compiler.
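
Because the rounding mode argument only describes the environment, a caller
that cannot prove which mode is active would typically pass ``round.dynamic``.
A minimal sketch, with the hypothetical function name ``@rint_current_mode``:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)

  ; Rounds %x to an integral value using whatever rounding mode the
  ; runtime floating-point environment currently selects.
  define double @rint_current_mode(double %x) strictfp {
    %r = call double @llvm.experimental.constrained.rint.f64(
                        double %x,
                        metadata !"round.dynamic",
                        metadata !"fpexcept.strict") strictfp
    ret double %r
  }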
21514
21515
21516 '``llvm.experimental.constrained.lrint``' Intrinsic
21517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21518
21519 Syntax:
21520 """""""
21521
21522 ::
21523
21524       declare <inttype>
21525       @llvm.experimental.constrained.lrint(<fptype> <op1>,
21526                                            metadata <rounding mode>,
21527                                            metadata <exception behavior>)
21528
21529 Overview:
21530 """""""""
21531
21532 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
21533 operand rounded to the nearest integer. An inexact floating-point exception
21534 will be raised if the operand is not an integer. An invalid exception is
21535 raised if the result is too large to fit into a supported integer type,
21536 and in this case the result is undefined.
21537
21538 Arguments:
21539 """"""""""
21540
21541 The first argument is a floating-point number. The return value is an
21542 integer type. Not all types are supported on all targets. The supported
21543 types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
21544 libm functions.
21545
21546 The second and third arguments specify the rounding mode and exception
21547 behavior as described above.
21548
21549 Semantics:
21550 """"""""""
21551
21552 This function returns the same values as the libm ``lrint`` functions
21553 would, and handles error conditions in the same way.
21554
21555 The rounding mode is described, not determined, by the rounding mode
21556 argument.  The actual rounding mode is determined by the runtime floating-point
21557 environment.  The rounding mode argument is only intended as information
21558 to the compiler.
21559
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as those of the ``llvm.lrint`` intrinsic.
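
The following sketch converts a ``double`` to an ``i64``. The overload suffix
``.i64.f64`` names the result and operand types; the function name
``@to_i64`` is hypothetical, and ``i64`` is assumed to be a supported result
type on the target:

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

  ; Rounds %x to an integer according to the current rounding mode and
  ; returns it as i64. The result is undefined if it does not fit.
  define i64 @to_i64(double %x) strictfp {
    %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                     double %x,
                     metadata !"round.dynamic",
                     metadata !"fpexcept.strict") strictfp
    ret i64 %r
  }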
21562
21563
21564 '``llvm.experimental.constrained.llrint``' Intrinsic
21565 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21566
21567 Syntax:
21568 """""""
21569
21570 ::
21571
21572       declare <inttype>
21573       @llvm.experimental.constrained.llrint(<fptype> <op1>,
21574                                             metadata <rounding mode>,
21575                                             metadata <exception behavior>)
21576
21577 Overview:
21578 """""""""
21579
21580 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
21581 operand rounded to the nearest integer. An inexact floating-point exception
21582 will be raised if the operand is not an integer. An invalid exception is
21583 raised if the result is too large to fit into a supported integer type,
21584 and in this case the result is undefined.
21585
21586 Arguments:
21587 """"""""""
21588
21589 The first argument is a floating-point number. The return value is an
21590 integer type. Not all types are supported on all targets. The supported
21591 types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
21592 libm functions.
21593
21594 The second and third arguments specify the rounding mode and exception
21595 behavior as described above.
21596
21597 Semantics:
21598 """"""""""
21599
21600 This function returns the same values as the libm ``llrint`` functions
21601 would, and handles error conditions in the same way.
21602
21603 The rounding mode is described, not determined, by the rounding mode
21604 argument.  The actual rounding mode is determined by the runtime floating-point
21605 environment.  The rounding mode argument is only intended as information
21606 to the compiler.
21607
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as those of the ``llvm.llrint`` intrinsic.
21610
21611
21612 '``llvm.experimental.constrained.nearbyint``' Intrinsic
21613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21614
21615 Syntax:
21616 """""""
21617
21618 ::
21619
21620       declare <type>
21621       @llvm.experimental.constrained.nearbyint(<type> <op1>,
21622                                                metadata <rounding mode>,
21623                                                metadata <exception behavior>)
21624
21625 Overview:
21626 """""""""
21627
21628 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
21629 operand rounded to the nearest integer. It will not raise an inexact
21630 floating-point exception if the operand is not an integer.
21631
21632
21633 Arguments:
21634 """"""""""
21635
21636 The first argument and the return value are floating-point numbers of the same
21637 type.
21638
21639 The second and third arguments specify the rounding mode and exception
21640 behavior as described above.
21641
21642 Semantics:
21643 """"""""""
21644
21645 This function returns the same values as the libm ``nearbyint`` functions
21646 would, and handles error conditions in the same way.  The rounding mode is
21647 described, not determined, by the rounding mode argument.  The actual rounding
21648 mode is determined by the runtime floating-point environment.  The rounding
21649 mode argument is only intended as information to the compiler.
21650
21651
21652 '``llvm.experimental.constrained.maxnum``' Intrinsic
21653 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21654
21655 Syntax:
21656 """""""
21657
21658 ::
21659
21660       declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
21662                                             metadata <exception behavior>)
21663
21664 Overview:
21665 """""""""
21666
21667 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
21668 of the two arguments.
21669
21670 Arguments:
21671 """"""""""
21672
21673 The first two arguments and the return value are floating-point numbers
21674 of the same type.
21675
21676 The third argument specifies the exception behavior as described above.
21677
21678 Semantics:
21679 """"""""""
21680
21681 This function follows the IEEE-754 semantics for maxNum.
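
Note that this intrinsic takes no rounding mode argument. A minimal sketch,
using the hypothetical function name ``@max_with_zero``:

.. code-block:: llvm

  declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

  ; Returns the maxNum of %x and 0.0; only the exception behavior is passed.
  define float @max_with_zero(float %x) strictfp {
    %r = call float @llvm.experimental.constrained.maxnum.f32(
                       float %x, float 0.0,
                       metadata !"fpexcept.strict") strictfp
    ret float %r
  }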
21682
21683
21684 '``llvm.experimental.constrained.minnum``' Intrinsic
21685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21686
21687 Syntax:
21688 """""""
21689
21690 ::
21691
21692       declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
21694                                             metadata <exception behavior>)
21695
21696 Overview:
21697 """""""""
21698
21699 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
21700 of the two arguments.
21701
21702 Arguments:
21703 """"""""""
21704
21705 The first two arguments and the return value are floating-point numbers
21706 of the same type.
21707
21708 The third argument specifies the exception behavior as described above.
21709
21710 Semantics:
21711 """"""""""
21712
21713 This function follows the IEEE-754 semantics for minNum.
21714
21715
21716 '``llvm.experimental.constrained.maximum``' Intrinsic
21717 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21718
21719 Syntax:
21720 """""""
21721
21722 ::
21723
21724       declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
21726                                              metadata <exception behavior>)
21727
21728 Overview:
21729 """""""""
21730
21731 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
21732 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21733
21734 Arguments:
21735 """"""""""
21736
21737 The first two arguments and the return value are floating-point numbers
21738 of the same type.
21739
21740 The third argument specifies the exception behavior as described above.
21741
21742 Semantics:
21743 """"""""""
21744
21745 This function follows semantics specified in the draft of IEEE 754-2018.
21746
21747
21748 '``llvm.experimental.constrained.minimum``' Intrinsic
21749 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21750
21751 Syntax:
21752 """""""
21753
21754 ::
21755
21756       declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
21758                                              metadata <exception behavior>)
21759
21760 Overview:
21761 """""""""
21762
21763 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
21764 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21765
21766 Arguments:
21767 """"""""""
21768
21769 The first two arguments and the return value are floating-point numbers
21770 of the same type.
21771
21772 The third argument specifies the exception behavior as described above.
21773
21774 Semantics:
21775 """"""""""
21776
21777 This function follows semantics specified in the draft of IEEE 754-2018.
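
The ``maximum`` and ``minimum`` intrinsics can be combined into a clamp that
propagates NaNs. This is only a sketch: the function name ``@clamp`` is
hypothetical, and it assumes the caller guarantees ``%lo <= %hi``:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.maximum.f64(double, double, metadata)
  declare double @llvm.experimental.constrained.minimum.f64(double, double, metadata)

  ; clamp(x, lo, hi) = minimum(maximum(x, lo), hi)
  define double @clamp(double %x, double %lo, double %hi) strictfp {
    %lower = call double @llvm.experimental.constrained.maximum.f64(
                            double %x, double %lo,
                            metadata !"fpexcept.strict") strictfp
    %r = call double @llvm.experimental.constrained.minimum.f64(
                        double %lower, double %hi,
                        metadata !"fpexcept.strict") strictfp
    ret double %r
  }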
21778
21779
21780 '``llvm.experimental.constrained.ceil``' Intrinsic
21781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21782
21783 Syntax:
21784 """""""
21785
21786 ::
21787
21788       declare <type>
21789       @llvm.experimental.constrained.ceil(<type> <op1>,
21790                                           metadata <exception behavior>)
21791
21792 Overview:
21793 """""""""
21794
21795 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
21796 first operand.
21797
21798 Arguments:
21799 """"""""""
21800
21801 The first argument and the return value are floating-point numbers of the same
21802 type.
21803
21804 The second argument specifies the exception behavior as described above.
21805
21806 Semantics:
21807 """"""""""
21808
21809 This function returns the same values as the libm ``ceil`` functions
21810 would and handles error conditions in the same way.
21811
21812
21813 '``llvm.experimental.constrained.floor``' Intrinsic
21814 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21815
21816 Syntax:
21817 """""""
21818
21819 ::
21820
21821       declare <type>
21822       @llvm.experimental.constrained.floor(<type> <op1>,
21823                                            metadata <exception behavior>)
21824
21825 Overview:
21826 """""""""
21827
21828 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
21829 first operand.
21830
21831 Arguments:
21832 """"""""""
21833
21834 The first argument and the return value are floating-point numbers of the same
21835 type.
21836
21837 The second argument specifies the exception behavior as described above.
21838
21839 Semantics:
21840 """"""""""
21841
21842 This function returns the same values as the libm ``floor`` functions
21843 would and handles error conditions in the same way.
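
Unlike the arithmetic intrinsics, ``floor`` takes only an exception behavior
argument. The sketch below computes ``x - floor(x)``; the function name
``@frac`` is hypothetical:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.floor.f64(double, metadata)
  declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)

  define double @frac(double %x) strictfp {
    ; floor has no rounding mode argument.
    %fl = call double @llvm.experimental.constrained.floor.f64(
                         double %x,
                         metadata !"fpexcept.strict") strictfp
    ; The subtraction does take a rounding mode argument.
    %fr = call double @llvm.experimental.constrained.fsub.f64(
                         double %x, double %fl,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
    ret double %fr
  }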
21844
21845
21846 '``llvm.experimental.constrained.round``' Intrinsic
21847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21848
21849 Syntax:
21850 """""""
21851
21852 ::
21853
21854       declare <type>
21855       @llvm.experimental.constrained.round(<type> <op1>,
21856                                            metadata <exception behavior>)
21857
21858 Overview:
21859 """""""""
21860
21861 The '``llvm.experimental.constrained.round``' intrinsic returns the first
21862 operand rounded to the nearest integer.
21863
21864 Arguments:
21865 """"""""""
21866
21867 The first argument and the return value are floating-point numbers of the same
21868 type.
21869
21870 The second argument specifies the exception behavior as described above.
21871
21872 Semantics:
21873 """"""""""
21874
21875 This function returns the same values as the libm ``round`` functions
21876 would and handles error conditions in the same way.
21877
21878
21879 '``llvm.experimental.constrained.roundeven``' Intrinsic
21880 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21881
21882 Syntax:
21883 """""""
21884
21885 ::
21886
21887       declare <type>
21888       @llvm.experimental.constrained.roundeven(<type> <op1>,
21889                                                metadata <exception behavior>)
21890
21891 Overview:
21892 """""""""
21893
21894 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
21895 operand rounded to the nearest integer in floating-point format, rounding
21896 halfway cases to even (that is, to the nearest value that is an even integer),
21897 regardless of the current rounding direction.
21898
21899 Arguments:
21900 """"""""""
21901
21902 The first argument and the return value are floating-point numbers of the same
21903 type.
21904
21905 The second argument specifies the exception behavior as described above.
21906
21907 Semantics:
21908 """"""""""
21909
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven`` and
can signal the invalid operation exception for an SNaN operand.
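
A minimal sketch of a call on ``float``, using the hypothetical function name
``@round_half_even``:

.. code-block:: llvm

  declare float @llvm.experimental.constrained.roundeven.f32(float, metadata)

  ; Halfway cases go to the even neighbor (for example 2.5 becomes 2.0),
  ; independent of the current rounding direction.
  define float @round_half_even(float %x) strictfp {
    %r = call float @llvm.experimental.constrained.roundeven.f32(
                       float %x,
                       metadata !"fpexcept.strict") strictfp
    ret float %r
  }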
21913
21914
21915 '``llvm.experimental.constrained.lround``' Intrinsic
21916 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21917
21918 Syntax:
21919 """""""
21920
21921 ::
21922
21923       declare <inttype>
21924       @llvm.experimental.constrained.lround(<fptype> <op1>,
21925                                             metadata <exception behavior>)
21926
21927 Overview:
21928 """""""""
21929
21930 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
21931 operand rounded to the nearest integer with ties away from zero.  It will
21932 raise an inexact floating-point exception if the operand is not an integer.
21933 An invalid exception is raised if the result is too large to fit into a
21934 supported integer type, and in this case the result is undefined.
21935
21936 Arguments:
21937 """"""""""
21938
21939 The first argument is a floating-point number. The return value is an
21940 integer type. Not all types are supported on all targets. The supported
21941 types are the same as the ``llvm.lround`` intrinsic and the ``lround``
21942 libm functions.
21943
21944 The second argument specifies the exception behavior as described above.
21945
21946 Semantics:
21947 """"""""""
21948
21949 This function returns the same values as the libm ``lround`` functions
21950 would and handles error conditions in the same way.
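
The sketch below rounds a ``float`` to an ``i32``. The function name
``@round_to_i32`` is hypothetical, and ``i32`` is assumed to be a supported
result type on the target:

.. code-block:: llvm

  declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

  ; Rounds to the nearest integer with ties away from zero. The result
  ; is undefined if it does not fit in i32.
  define i32 @round_to_i32(float %x) strictfp {
    %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
                     float %x,
                     metadata !"fpexcept.strict") strictfp
    ret i32 %r
  }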
21951
21952
21953 '``llvm.experimental.constrained.llround``' Intrinsic
21954 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21955
21956 Syntax:
21957 """""""
21958
21959 ::
21960
21961       declare <inttype>
21962       @llvm.experimental.constrained.llround(<fptype> <op1>,
21963                                              metadata <exception behavior>)
21964
21965 Overview:
21966 """""""""
21967
21968 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
21969 operand rounded to the nearest integer with ties away from zero. It will
21970 raise an inexact floating-point exception if the operand is not an integer.
21971 An invalid exception is raised if the result is too large to fit into a
21972 supported integer type, and in this case the result is undefined.
21973
21974 Arguments:
21975 """"""""""
21976
21977 The first argument is a floating-point number. The return value is an
21978 integer type. Not all types are supported on all targets. The supported
21979 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
21980 libm functions.
21981
21982 The second argument specifies the exception behavior as described above.
21983
21984 Semantics:
21985 """"""""""
21986
21987 This function returns the same values as the libm ``llround`` functions
21988 would and handles error conditions in the same way.
21989
21990
21991 '``llvm.experimental.constrained.trunc``' Intrinsic
21992 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21993
21994 Syntax:
21995 """""""
21996
21997 ::
21998
21999       declare <type>
22000       @llvm.experimental.constrained.trunc(<type> <op1>,
22001                                            metadata <exception behavior>)
22002
22003 Overview:
22004 """""""""
22005
22006 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
22007 operand rounded to the nearest integer not larger in magnitude than the
22008 operand.
22009
22010 Arguments:
22011 """"""""""
22012
22013 The first argument and the return value are floating-point numbers of the same
22014 type.
22015
22016 The second argument specifies the exception behavior as described above.
22017
22018 Semantics:
22019 """"""""""
22020
22021 This function returns the same values as the libm ``trunc`` functions
22022 would and handles error conditions in the same way.
22023
22024 .. _int_experimental_noalias_scope_decl:
22025
22026 '``llvm.experimental.noalias.scope.decl``' Intrinsic
22027 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22028
22029 Syntax:
22030 """""""
22031
22032
22033 ::
22034
22035       declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
22036
22037 Overview:
22038 """""""""
22039
22040 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
22041 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
22043 the scope might need to be duplicated as well.
22044
22045
22046 Arguments:
22047 """"""""""
22048
22049 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
22050 metadata references. The format is identical to that required for ``noalias``
22051 metadata. This list must have exactly one element.
22052
22053 Semantics:
22054 """"""""""
22055
22056 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
22057 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
22059 the scope might need to be duplicated as well.
22060
22061 For example, when the intrinsic is used inside a loop body, and that loop is
22062 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
22063 noalias property it signifies would spill across loop iterations, whereas it
22064 was only valid within a single iteration.
22065
22066 .. code-block:: llvm
22067
  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()

  define void @decl_in_loop(i8* %a.base, i8* %b.base) {
22072   entry:
22073     ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
22074     br label %loop
22075
22076   loop:
22077     %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
22078     %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
22079     ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
22080     %val = load i8, i8* %a, !alias.scope !2
22081     store i8 %val, i8* %b, !noalias !2
22082     %a.inc = getelementptr inbounds i8, i8* %a, i64 1
22083     %b.inc = getelementptr inbounds i8, i8* %b, i64 1
22084     %cond = call i1 @cond()
22085     br i1 %cond, label %loop, label %exit
22086
22087   exit:
22088     ret void
22089   }
22090
22091   !0 = !{!0} ; domain
22092   !1 = !{!1, !0} ; scope
22093   !2 = !{!1} ; scope list
22094
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.
22099
22100
22101 Floating Point Environment Manipulation intrinsics
22102 --------------------------------------------------
22103
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.
22107
22108 '``llvm.flt.rounds``' Intrinsic
22109 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22110
22111 Syntax:
22112 """""""
22113
22114 ::
22115
22116       declare i32 @llvm.flt.rounds()
22117
22118 Overview:
22119 """""""""
22120
22121 The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.
22122
22123 Semantics:
22124 """"""""""
22125
The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:
22129
22130 ::
22131
22132     0  - toward zero
22133     1  - to nearest, ties to even
22134     2  - toward positive infinity
22135     3  - toward negative infinity
22136     4  - to nearest, ties away from zero
22137
Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.
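
As a minimal sketch of decoding the result, the following hypothetical function
``@is_round_to_nearest_even`` tests for encoding ``1`` from the table above:

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()

  ; Returns true if the current mode is "to nearest, ties to even".
  define i1 @is_round_to_nearest_even() {
    %mode = call i32 @llvm.flt.rounds()
    %is_rne = icmp eq i32 %mode, 1
    ret i1 %is_rne
  }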
22140
22141
22142 '``llvm.set.rounding``' Intrinsic
22143 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22144
22145 Syntax:
22146 """""""
22147
22148 ::
22149
22150       declare void @llvm.set.rounding(i32 <val>)
22151
22152 Overview:
22153 """""""""
22154
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
22156
22157 Arguments:
22158 """"""""""
22159
The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.flt.rounds``'.
22162
22163 Semantics:
22164 """"""""""
22165
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return a value and uses a platform-independent representation of IEEE
rounding modes.
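
A common pattern is to save the current mode, switch modes for one operation,
and then restore the saved mode. The following is only a sketch: the function
name ``@add_toward_zero`` is hypothetical, and it assumes the value returned by
``llvm.flt.rounds`` is one that ``llvm.set.rounding`` accepts on the target:

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()
  declare void @llvm.set.rounding(i32)
  declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

  define double @add_toward_zero(double %a, double %b) strictfp {
    ; Save the current mode, then select "toward zero" (encoding 0).
    %saved = call i32 @llvm.flt.rounds() strictfp
    call void @llvm.set.rounding(i32 0) strictfp
    %sum = call double @llvm.experimental.constrained.fadd.f64(
                          double %a, double %b,
                          metadata !"round.towardzero",
                          metadata !"fpexcept.ignore") strictfp
    ; Restore the mode that was active on entry.
    call void @llvm.set.rounding(i32 %saved) strictfp
    ret double %sum
  }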
22170
22171
22172 General Intrinsics
22173 ------------------
22174
22175 This class of intrinsics is designed to be generic and has no specific
22176 purpose.
22177
22178 '``llvm.var.annotation``' Intrinsic
22179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22180
22181 Syntax:
22182 """""""
22183
22184 ::
22185
22186       declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
22187
22188 Overview:
22189 """""""""
22190
The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.
22192
22193 Arguments:
22194 """"""""""
22195
22196 The first argument is a pointer to a value, the second is a pointer to a
22197 global string, the third is a pointer to a global string which is the
22198 source file name, and the last argument is the line number.
22199
22200 Semantics:
22201 """"""""""
22202
22203 This intrinsic allows annotation of local variables with arbitrary
22204 strings. This can be useful for special purpose optimizations that want
22205 to look for these annotations. These have no other defined use; they are
22206 ignored by code generation and optimization.
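
The following sketch annotates a stack slot, using the four-argument signature
shown in the Syntax section above. The global names ``@.note`` and ``@.file``,
the string contents, the line number and the function name are all
hypothetical:

.. code-block:: llvm

  @.note = private unnamed_addr constant [5 x i8] c"note\00", section "llvm.metadata"
  @.file = private unnamed_addr constant [7 x i8] c"demo.c\00", section "llvm.metadata"

  declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

  define void @annotated_local() {
  entry:
    %x = alloca i32
    ; The intrinsic takes the variable's address as an i8*.
    %x.i8 = bitcast i32* %x to i8*
    call void @llvm.var.annotation(
        i8* %x.i8,
        i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.note, i32 0, i32 0),
        i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.file, i32 0, i32 0),
        i32 12)
    store i32 0, i32* %x
    ret void
  }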
22207
22208 '``llvm.ptr.annotation.*``' Intrinsic
22209 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22210
22211 Syntax:
22212 """""""
22213
22214 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
22215 pointer to an integer of any width. *NOTE* you must specify an address space for
22216 the pointer. The identifier for the default address space is the integer
22217 '``0``'.
22218
22219 ::
22220
22221       declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
22222       declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
22223       declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
22224       declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
22225       declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
22226
22227 Overview:
22228 """""""""
22229
The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
an arbitrary string.
22231
22232 Arguments:
22233 """"""""""
22234
22235 The first argument is a pointer to an integer value of arbitrary bitwidth
22236 (result of some expression), the second is a pointer to a global string, the
22237 third is a pointer to a global string which is the source file name, and the
22238 last argument is the line number. It returns the value of the first argument.
22239
22240 Semantics:
22241 """"""""""
22242
22243 This intrinsic allows annotation of a pointer to an integer with arbitrary
22244 strings. This can be useful for special purpose optimizations that want to look
22245 for these annotations. These have no other defined use; they are ignored by code
22246 generation and optimization.
22247
22248 '``llvm.annotation.*``' Intrinsic
22249 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22250
22251 Syntax:
22252 """""""
22253
22254 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
22255 any integer bit width.
22256
22257 ::
22258
22259       declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
22260       declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
22261       declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
22262       declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
22263       declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)
22264
22265 Overview:
22266 """""""""
22267
The '``llvm.annotation``' intrinsic annotates an integer expression with an
arbitrary string.
22269
22270 Arguments:
22271 """"""""""
22272
22273 The first argument is an integer value (result of some expression), the
22274 second is a pointer to a global string, the third is a pointer to a
22275 global string which is the source file name, and the last argument is
22276 the line number. It returns the value of the first argument.
22277
22278 Semantics:
22279 """"""""""
22280
22281 This intrinsic allows annotations to be put on arbitrary expressions
22282 with arbitrary strings. This can be useful for special purpose
22283 optimizations that want to look for these annotations. These have no
22284 other defined use; they are ignored by code generation and optimization.
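
Because the intrinsic returns its first argument, later code uses the returned
value in place of the original one. A minimal sketch based on the ``i32``
declaration listed above; the global names, string contents, line number and
function name are hypothetical:

.. code-block:: llvm

  @.tag = private unnamed_addr constant [4 x i8] c"sum\00", section "llvm.metadata"
  @.src = private unnamed_addr constant [7 x i8] c"demo.c\00", section "llvm.metadata"

  declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

  define i32 @annotated_sum(i32 %a, i32 %b) {
    %sum = add i32 %a, %b
    ; %tagged has the same value as %sum and carries the annotation.
    %tagged = call i32 @llvm.annotation.i32(
        i32 %sum,
        i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.tag, i32 0, i32 0),
        i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.src, i32 0, i32 0),
        i32 7)
    ret i32 %tagged
  }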
22285
22286 '``llvm.codeview.annotation``' Intrinsic
22287 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22288
22289 Syntax:
22290 """""""
22291
22292 This annotation emits a label at its program point and an associated
22293 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
22294 used to implement MSVC's ``__annotation`` intrinsic. It is marked
22295 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
22296 considered expensive.
22297
22298 ::
22299
22300       declare void @llvm.codeview.annotation(metadata)
22301
22302 Arguments:
22303 """"""""""
22304
22305 The argument should be an MDTuple containing any number of MDStrings.
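
A minimal sketch of a call site; the function name ``@annotated_point`` and the
string contents are hypothetical:

.. code-block:: llvm

  declare void @llvm.codeview.annotation(metadata)

  define void @annotated_point() {
    ; Pass an MDTuple of MDStrings as the metadata operand.
    call void @llvm.codeview.annotation(metadata !0)
    ret void
  }

  !0 = !{!"category", !"message text"}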
22306
22307 '``llvm.trap``' Intrinsic
22308 ^^^^^^^^^^^^^^^^^^^^^^^^^
22309
22310 Syntax:
22311 """""""
22312
22313 ::
22314
22315       declare void @llvm.trap() cold noreturn nounwind
22316
22317 Overview:
22318 """""""""
22319
The '``llvm.trap``' intrinsic causes an execution trap.
22321
22322 Arguments:
22323 """"""""""
22324
22325 None.
22326
22327 Semantics:
22328 """"""""""
22329
This intrinsic is lowered to the target-dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call to the ``abort()`` function.
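
Since the intrinsic is ``noreturn``, a call to it is normally followed by
``unreachable``. A minimal sketch, with the hypothetical function name
``@checked_udiv``:

.. code-block:: llvm

  declare void @llvm.trap() cold noreturn nounwind

  ; Traps instead of dividing by zero.
  define i32 @checked_udiv(i32 %n, i32 %d) {
  entry:
    %is_zero = icmp eq i32 %d, 0
    br i1 %is_zero, label %fail, label %ok

  fail:
    call void @llvm.trap()
    unreachable

  ok:
    %q = udiv i32 %n, %d
    ret i32 %q
  }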
22333
22334 '``llvm.debugtrap``' Intrinsic
22335 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22336
22337 Syntax:
22338 """""""
22339
22340 ::
22341
22342       declare void @llvm.debugtrap() nounwind
22343
22344 Overview:
22345 """""""""
22346
The '``llvm.debugtrap``' intrinsic causes an execution trap intended to request
the attention of a debugger.
22348
22349 Arguments:
22350 """"""""""
22351
22352 None.
22353
22354 Semantics:
22355 """"""""""
22356
22357 This intrinsic is lowered to code which is intended to cause an
22358 execution trap with the intention of requesting the attention of a
22359 debugger.
22360
22361 '``llvm.ubsantrap``' Intrinsic
22362 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22363
22364 Syntax:
22365 """""""
22366
22367 ::
22368
22369       declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
22370
22371 Overview:
22372 """""""""
22373
The '``llvm.ubsantrap``' intrinsic causes an execution trap that, where
possible, encodes the kind of failure detected.
22375
22376 Arguments:
22377 """"""""""
22378
22379 An integer describing the kind of failure detected.
22380
22381 Semantics:
22382 """"""""""
22383
This intrinsic is lowered to code which is intended to cause an execution trap.
Where possible, the argument is embedded into the encoding of that trap so that
different kinds of crashes can be told apart.

This intrinsic is equivalent to ``@llvm.trap`` for targets that do not support
this behaviour.
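
A minimal sketch of a guarded load. The function name ``@checked_load`` is
hypothetical, and the failure kind ``1`` is an arbitrary value chosen for
illustration; the argument must be an immediate constant:

.. code-block:: llvm

  declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind

  define i32 @checked_load(i32* %p) {
  entry:
    %is_null = icmp eq i32* %p, null
    br i1 %is_null, label %fail, label %ok

  fail:
    ; The constant identifies the kind of failure (arbitrary here).
    call void @llvm.ubsantrap(i8 1)
    unreachable

  ok:
    %v = load i32, i32* %p
    ret i32 %v
  }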
22389
22390 '``llvm.stackprotector``' Intrinsic
22391 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22392
22393 Syntax:
22394 """""""
22395
22396 ::
22397
22398       declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
22399
22400 Overview:
22401 """""""""
22402
22403 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
22404 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
22405 is placed on the stack before local variables.
22406
22407 Arguments:
22408 """"""""""
22409
22410 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
22411 The first argument is the value loaded from the stack guard
22412 ``@__stack_chk_guard``. The second argument is an ``alloca`` that has
22413 enough space to hold the value of the guard.
22414
22415 Semantics:
22416 """"""""""
22417
22418 This intrinsic causes the prologue/epilogue inserter to force the position of
22419 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
22420 to ensure that if a local variable on the stack is overwritten, it will destroy
22421 the value of the guard. When the function exits, the guard on the stack is
22422 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
22423 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
22424 calling the ``__stack_chk_fail()`` function.
22425
22426 '``llvm.stackguard``' Intrinsic
22427 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22428
22429 Syntax:
22430 """""""
22431
22432 ::
22433
22434       declare i8* @llvm.stackguard()
22435
22436 Overview:
22437 """""""""
22438
22439 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
22440
22441 It should not be generated by frontends, since it is only for internal usage.
22442 This intrinsic exists because the IR form of the stack protector is still
22443 supported in FastISel.
22444
22445 Arguments:
22446 """"""""""
22447
22448 None.
22449
22450 Semantics:
22451 """"""""""
22452
22453 On some platforms, the value returned by this intrinsic remains unchanged
22454 between loads in the same thread. On other platforms, it returns the same
22455 global variable value, if any, e.g. ``@__stack_chk_guard``.
22456
22457 Currently some platforms have IR-level customized stack guard loading (e.g.
22458 X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
22459 in the future.
22460
22461 '``llvm.objectsize``' Intrinsic
22462 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22463
22464 Syntax:
22465 """""""
22466
22467 ::
22468
22469       declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22470       declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22471
22472 Overview:
22473 """""""""
22474
22475 The ``llvm.objectsize`` intrinsic is designed to provide information to the
22476 optimizer to determine whether a) an operation (like memcpy) will overflow a
22477 buffer that corresponds to an object, or b) that a runtime check for overflow
22478 isn't necessary. An object in this context means an allocation of a specific
22479 class, structure, array, or other object.
22480
22481 Arguments:
22482 """"""""""
22483
22484 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
22485 pointer to or into the ``object``. The second argument determines whether
22486 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
22487 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
22488 in address space 0 is used as its pointer argument. If it's ``false``,
22489 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
22490 the ``null`` is in a non-zero address space or if ``true`` is given for the
22491 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
22492 argument to ``llvm.objectsize`` determines if the value should be evaluated at
22493 runtime.
22494
22495 The second, third, and fourth arguments only accept constants.
22496
22497 Semantics:
22498 """"""""""
22499
22500 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
22501 the object concerned. If the size cannot be determined, ``llvm.objectsize``
22502 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
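
As an illustrative sketch, the call below queries the bytes remaining after
an offset into a fixed-size ``alloca``. The function name ``@remaining_bytes``
is hypothetical, and the ``.p0i8`` suffix assumes the usual mangling of the
pointer argument type in typed-pointer IR. An optimizer that can see the
allocation would be expected to fold the call to 12 (16 bytes minus the
offset of 4).

.. code-block:: llvm

    declare i64 @llvm.objectsize.i64.p0i8(i8*, i1 immarg, i1 immarg, i1 immarg)

    define i64 @remaining_bytes() {
    entry:
      %buf = alloca [16 x i8]
      %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
      ; min = false (return -1 if unknown), nullunknown = true, dynamic = false
      %size = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 false)
      ret i64 %size
    }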
22503
22504 '``llvm.expect``' Intrinsic
22505 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
22506
22507 Syntax:
22508 """""""
22509
22510 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
22511 integer bit width.
22512
22513 ::
22514
22515       declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
22516       declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
22517       declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
22518
22519 Overview:
22520 """""""""
22521
22522 The ``llvm.expect`` intrinsic provides information about the expected (most
22523 probable) value of ``val``, which can be used by optimizers.
22524
22525 Arguments:
22526 """"""""""
22527
22528 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
22529 a value. The second argument is an expected value.
22530
22531 Semantics:
22532 """"""""""
22533
22534 This intrinsic is lowered to ``val``.
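
A minimal sketch of a typical use: the comparison result is passed through
``llvm.expect`` before it feeds a branch, hinting that the ``%rare`` block is
unlikely to be taken. The function and block names are hypothetical.

.. code-block:: llvm

    declare i1 @llvm.expect.i1(i1, i1)

    define i32 @f(i32 %x) {
    entry:
      %cmp = icmp eq i32 %x, 0
      ; Hint: %cmp is expected to be false.
      %hint = call i1 @llvm.expect.i1(i1 %cmp, i1 false)
      br i1 %hint, label %rare, label %common

    rare:
      ret i32 -1

    common:
      ret i32 %x
    }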
22535
22536 '``llvm.expect.with.probability``' Intrinsic
22537 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22538
22539 Syntax:
22540 """""""
22541
22542 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
22543 You can use ``llvm.expect.with.probability`` on any integer bit width.
22544
22545 ::
22546
22547       declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
22548       declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
22549       declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
22550
22551 Overview:
22552 """""""""
22553
22554 The ``llvm.expect.with.probability`` intrinsic provides information about the
22555 expected value of ``val`` with probability (or confidence) ``prob``, which can
22556 be used by optimizers.
22557
22558 Arguments:
22559 """"""""""
22560
22561 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
22562 argument is a value. The second argument is an expected value. The third
22563 argument is a probability.
22564
22565 Semantics:
22566 """"""""""
22567
22568 This intrinsic is lowered to ``val``.
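
The sketch below is a variant of a branch hint that also supplies a
probability. The function name ``@g`` and the value 0.75 are hypothetical
choices for illustration.

.. code-block:: llvm

    declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

    define i32 @g(i32 %x) {
    entry:
      %cmp = icmp sgt i32 %x, 0
      ; Hint: %cmp is expected to be true with probability 0.75.
      %hint = call i1 @llvm.expect.with.probability.i1(i1 %cmp, i1 true, double 7.500000e-01)
      br i1 %hint, label %positive, label %other

    positive:
      ret i32 1

    other:
      ret i32 0
    }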
22569
22570 .. _int_assume:
22571
22572 '``llvm.assume``' Intrinsic
22573 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22574
22575 Syntax:
22576 """""""
22577
22578 ::
22579
22580       declare void @llvm.assume(i1 %cond)
22581
22582 Overview:
22583 """""""""
22584
22585 The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
22586 condition is true. This information can then be used in simplifying other parts
22587 of the code.
22588
22589 More complex assumptions can be encoded as
22590 :ref:`assume operand bundles <assume_opbundles>`.
22591
22592 Arguments:
22593 """"""""""
22594
22595 The argument of the call is the condition which the optimizer may assume is
22596 always true.
22597
22598 Semantics:
22599 """"""""""
22600
22601 The intrinsic allows the optimizer to assume that the provided condition is
22602 always true whenever the control flow reaches the intrinsic call. No code is
22603 generated for this intrinsic, and instructions that contribute only to the
22604 provided condition are not used for code generation. If the condition is
22605 violated during execution, the behavior is undefined.
22606
22607 Note that the optimizer might limit the transformations performed on values
22608 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
22609 only used to form the intrinsic's input argument. This might prove undesirable
22610 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
22611 sufficient overall improvement in code quality. For this reason,
22612 ``llvm.assume`` should not be used to document basic mathematical invariants
22613 that the optimizer can otherwise deduce or facts that are of little use to the
22614 optimizer.
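
As a hedged illustration, the sketch below tells the optimizer that a pointer
is 16-byte aligned before loading through it. The function name
``@load_aligned`` is hypothetical. The instructions computing ``%aligned``
exist only to feed the assumption and are not expected to produce code.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @load_aligned(i32* %p) {
    entry:
      %addr = ptrtoint i32* %p to i64
      %low = and i64 %addr, 15
      %aligned = icmp eq i64 %low, 0
      ; Behavior is undefined if %aligned is false when control reaches here.
      call void @llvm.assume(i1 %aligned)
      %v = load i32, i32* %p
      ret i32 %v
    }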
22615
22616 .. _int_ssa_copy:
22617
22618 '``llvm.ssa.copy``' Intrinsic
22619 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22620
22621 Syntax:
22622 """""""
22623
22624 ::
22625
22626       declare type @llvm.ssa.copy(type %operand) returned(1) readnone
22627
22628 Arguments:
22629 """"""""""
22630
22631 The first argument is an operand which is used as the returned value.
22632
22633 Overview:
22634 """"""""""
22635
22636 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
22637 operations by copying them and giving them new names.  For example,
22638 the PredicateInfo utility uses it to build Extended SSA form, and
22639 attach various forms of information to operands that dominate specific
22640 uses.  It is not meant for general use, only for building temporary
22641 renaming forms that require value splits at certain points.
22642
22643 .. _type.test:
22644
22645 '``llvm.type.test``' Intrinsic
22646 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22647
22648 Syntax:
22649 """""""
22650
22651 ::
22652
22653       declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
22654
22655
22656 Arguments:
22657 """"""""""
22658
22659 The first argument is a pointer to be tested. The second argument is a
22660 metadata object representing a :doc:`type identifier <TypeMetadata>`.
22661
22662 Overview:
22663 """""""""
22664
22665 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
22666 with the given type identifier.
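
A minimal sketch of how a check might be built from this intrinsic: the
pointer is tested against a type identifier before an indirect call, and the
program traps on failure. The function name ``@call_checked`` and the
identifier string ``"example.typeid"`` are hypothetical.

.. code-block:: llvm

    declare i1 @llvm.type.test(i8*, metadata) nounwind readnone
    declare void @llvm.trap() cold noreturn nounwind

    define void @call_checked(void ()* %fp) {
    entry:
      %raw = bitcast void ()* %fp to i8*
      %ok = call i1 @llvm.type.test(i8* %raw, metadata !"example.typeid")
      br i1 %ok, label %cont, label %fail

    fail:
      call void @llvm.trap()
      unreachable

    cont:
      call void %fp()
      ret void
    }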
22667
22668 .. _type.checked.load:
22669
22670 '``llvm.type.checked.load``' Intrinsic
22671 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22672
22673 Syntax:
22674 """""""
22675
22676 ::
22677
22678       declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
22679
22680
22681 Arguments:
22682 """"""""""
22683
22684 The first argument is a pointer from which to load a function pointer. The
22685 second argument is the byte offset from which to load the function pointer. The
22686 third argument is a metadata object representing a :doc:`type identifier
22687 <TypeMetadata>`.
22688
22689 Overview:
22690 """""""""
22691
22692 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
22693 virtual table pointer using type metadata. This intrinsic is used to implement
22694 control flow integrity in conjunction with virtual call optimization. The
22695 virtual call optimization pass will optimize away ``llvm.type.checked.load``
22696 intrinsics associated with devirtualized calls, thereby removing the type
22697 check in cases where it is not needed to enforce the control flow integrity
22698 constraint.
22699
22700 If the given pointer is associated with a type metadata identifier, this
22701 function returns true as the second element of its return value. (Note that
22702 the function may also return true if the given pointer is not associated
22703 with a type metadata identifier.) If the function's return value's second
22704 element is true, the following rules apply to the first element:
22705
22706 - If the given pointer is associated with the given type metadata identifier,
22707   it is the function pointer loaded from the given byte offset from the given
22708   pointer.
22709
22710 - If the given pointer is not associated with the given type metadata
22711   identifier, it is one of the following (the choice of which is unspecified):
22712
22713   1. The function pointer that would have been loaded from an arbitrarily chosen
22714      (through an unspecified mechanism) pointer associated with the type
22715      metadata.
22716
22717   2. If the function has a non-void return type, a pointer to a function that
22718      returns an unspecified value without causing side effects.
22719
22720 If the function's return value's second element is false, the value of the
22721 first element is undefined.
22722
22723
22724 '``llvm.arithmetic.fence``' Intrinsic
22725 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22726
22727 Syntax:
22728 """""""
22729
22730 ::
22731
22732       declare <type>
22733       @llvm.arithmetic.fence(<type> <op>)
22734
22735 Overview:
22736 """""""""
22737
22738 The purpose of the ``llvm.arithmetic.fence`` intrinsic
22739 is to prevent the optimizer from performing fast-math optimizations,
22740 particularly reassociation,
22741 between the argument and the expression that contains the argument.
22742 It can be used to preserve the parentheses in the source language.
22743
22744 Arguments:
22745 """"""""""
22746
22747 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
22748 The argument and the return value are floating-point numbers,
22749 or vector floating-point numbers, of the same type.
22750
22751 Semantics:
22752 """"""""""
22753
22754 This intrinsic returns the value of its operand. The optimizer can optimize
22755 the argument, but the optimizer cannot hoist any component of the operand
22756 to the containing context, and the optimizer cannot move the calculation of
22757 any expression in the containing context into the operand.
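
The sketch below illustrates preserving the parentheses of ``(a + b) + c``
under the ``reassoc`` fast-math flag. The function name ``@sum`` is
hypothetical, and the ``.f32`` suffix assumes the usual mangling for a
``float`` operand.

.. code-block:: llvm

    declare float @llvm.arithmetic.fence.f32(float)

    define float @sum(float %a, float %b, float %c) {
    entry:
      %ab = fadd reassoc float %a, %b
      ; The fenced value may not be reassociated with the outer addition.
      %fenced = call float @llvm.arithmetic.fence.f32(float %ab)
      %r = fadd reassoc float %fenced, %c
      ret float %r
    }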
22758
22759
22760 '``llvm.donothing``' Intrinsic
22761 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22762
22763 Syntax:
22764 """""""
22765
22766 ::
22767
22768       declare void @llvm.donothing() nounwind readnone
22769
22770 Overview:
22771 """""""""
22772
22773 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
22774 three intrinsics (besides ``llvm.experimental.patchpoint`` and
22775 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
22776 instruction.
22777
22778 Arguments:
22779 """"""""""
22780
22781 None.
22782
22783 Semantics:
22784 """"""""""
22785
22786 This intrinsic does nothing, and it's removed by optimizers and ignored
22787 by codegen.
22788
22789 '``llvm.experimental.deoptimize``' Intrinsic
22790 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22791
22792 Syntax:
22793 """""""
22794
22795 ::
22796
22797       declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
22798
22799 Overview:
22800 """""""""
22801
22802 This intrinsic, together with :ref:`deoptimization operand bundles
22803 <deopt_opbundles>`, allows frontends to express transfer of control and
22804 frame-local state from the currently executing (typically more specialized,
22805 hence faster) version of a function into another (typically more generic, hence
22806 slower) version.
22807
22808 In languages with a fully integrated managed runtime like Java and JavaScript
22809 this intrinsic can be used to implement "uncommon trap" or "side exit" like
22810 functionality.  In unmanaged languages like C and C++, this intrinsic can be
22811 used to represent the slow paths of specialized functions.
22812
22813
22814 Arguments:
22815 """"""""""
22816
22817 The intrinsic takes an arbitrary number of arguments, whose meaning is
22818 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
22819
22820 Semantics:
22821 """"""""""
22822
22823 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
22824 deoptimization continuation (denoted using a :ref:`deoptimization
22825 operand bundle <deopt_opbundles>`) and returns the value returned by
22826 the deoptimization continuation.  Defining the semantic properties of
22827 the continuation itself is out of scope of the language reference --
22828 as far as LLVM is concerned, the deoptimization continuation can
22829 invoke arbitrary side effects, including reading from and writing to
22830 the entire heap.
22831
22832 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
22833 continue execution to the end of the physical frame containing them, so all
22834 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
22835
22836    - ``@llvm.experimental.deoptimize`` cannot be invoked.
22837    - The call must immediately precede a :ref:`ret <i_ret>` instruction.
22838    - The ``ret`` instruction must return the value produced by the
22839      ``@llvm.experimental.deoptimize`` call if there is one, or void.
22840
22841 Note that the above restrictions imply that the return type for a call to
22842 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
22843 caller.
22844
22845 The inliner composes the ``"deopt"`` continuations of the caller into the
22846 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
22847 intrinsic to return directly from the frame of the function it inlined into.
22848
22849 All declarations of ``@llvm.experimental.deoptimize`` must share the
22850 same calling convention.
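
As a minimal sketch of the tail-position rules above, the call below
immediately precedes a ``ret`` that returns its result. The function name
``@slow_path`` and the contents of the ``"deopt"`` bundle are hypothetical;
the ``.i32`` suffix assumes the usual mangling of the return type.

.. code-block:: llvm

    declare i32 @llvm.experimental.deoptimize.i32(...)

    define i32 @slow_path(i32 %x) {
    entry:
      ; Called with call (not invoke), and the result is returned directly.
      %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
      ret i32 %v
    }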
22851
22852 .. _deoptimize_lowering:
22853
22854 Lowering:
22855 """""""""
22856
22857 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
22858 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
22859 ensure that this symbol is defined).  The call arguments to
22860 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
22861 arguments of the specified types, and not as varargs.
22862
22863
22864 '``llvm.experimental.guard``' Intrinsic
22865 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22866
22867 Syntax:
22868 """""""
22869
22870 ::
22871
22872       declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
22873
22874 Overview:
22875 """""""""
22876
22877 This intrinsic, together with :ref:`deoptimization operand bundles
22878 <deopt_opbundles>`, allows frontends to express guards or checks on
22879 optimistic assumptions made during compilation.  The semantics of
22880 ``@llvm.experimental.guard`` is defined in terms of
22881 ``@llvm.experimental.deoptimize`` -- its body is defined to be
22882 equivalent to:
22883
22884 .. code-block:: text
22885
22886   define void @llvm.experimental.guard(i1 %pred, <args...>) {
22887     %realPred = and i1 %pred, undef
22888     br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
22889
22890   leave:
22891     call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
22892     ret void
22893
22894   continue:
22895     ret void
22896   }
22897
22898
22899 with the optional ``[, !make.implicit !{}]`` present if and only if it
22900 is present on the call site.  For more details on ``!make.implicit``,
22901 see :doc:`FaultMaps`.
22902
22903 In words, ``@llvm.experimental.guard`` executes the attached
22904 ``"deopt"`` continuation if (but **not** only if) its first argument
22905 is ``false``.  Since the optimizer is allowed to replace the ``undef``
22906 with an arbitrary value, it can optimize the guard to fail "spuriously",
22907 i.e. without the original condition being false (hence the "not only
22908 if"); this allows for "check widening" type optimizations.
22909
22910 ``@llvm.experimental.guard`` cannot be invoked.
22911
22912 After ``@llvm.experimental.guard`` was first added, a more general
22913 formulation was found in ``@llvm.experimental.widenable.condition``.
22914 Support for ``@llvm.experimental.guard`` is slowly being rephrased in
22915 terms of this alternative.
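
For illustration, a guard call site might look like the sketch below, which
protects an array load with a bounds check. The function name ``@get`` and
the state recorded in the ``"deopt"`` bundle are hypothetical.

.. code-block:: llvm

    declare void @llvm.experimental.guard(i1, ...)

    define i32 @get(i32* %arr, i32 %len, i32 %i) {
    entry:
      %in_bounds = icmp ult i32 %i, %len
      ; Deoptimizes if %in_bounds is false (and possibly spuriously).
      call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %i) ]
      %idx = zext i32 %i to i64
      %slot = getelementptr inbounds i32, i32* %arr, i64 %idx
      %v = load i32, i32* %slot
      ret i32 %v
    }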
22916
22917 '``llvm.experimental.widenable.condition``' Intrinsic
22918 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22919
22920 Syntax:
22921 """""""
22922
22923 ::
22924
22925       declare i1 @llvm.experimental.widenable.condition()
22926
22927 Overview:
22928 """""""""
22929
22930 This intrinsic represents a "widenable condition", which is a
22931 boolean expression with the following property: whether this
22932 expression is `true` or `false`, the program is correct and
22933 well-defined.
22934
22935 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
22936 ``@llvm.experimental.widenable.condition`` allows frontends to
22937 express guards or checks on optimistic assumptions made during
22938 compilation and represent them as branch instructions on special
22939 conditions.
22940
22941 While this may appear similar in semantics to `undef`, it is very
22942 different in that an invocation produces a particular, singular
22943 value. It is also intended to be lowered late, and remain available
22944 for specific optimizations and transforms that can benefit from its
22945 special properties.
22946
22947 Arguments:
22948 """"""""""
22949
22950 None.
22951
22952 Semantics:
22953 """"""""""
22954
22955 The intrinsic ``@llvm.experimental.widenable.condition()``
22956 returns either `true` or `false`. For each evaluation of a call
22957 to this intrinsic, the program must be valid and correct both if
22958 it returns `true` and if it returns `false`. This allows
22959 transformation passes to replace evaluations of this intrinsic
22960 with either value whenever one is beneficial.
22961
22962 When used in a branch condition, it allows us to choose between
22963 two alternative correct solutions for the same problem, as
22964 in the example below:
22965
22966 .. code-block:: text
22967
22968     %cond = call i1 @llvm.experimental.widenable.condition()
22969     br i1 %cond, label %fast_path, label %slow_path
22970
22971   fast_path:
22972     ; Apply a memory-consuming but fast solution for a task.
22973
22974   slow_path:
22975     ; Cheap in memory but slow solution.
22976
22977 Whether the result of the intrinsic's call is `true` or `false`,
22978 it should be correct to pick either solution. We can switch
22979 between them by replacing the result of
22980 ``@llvm.experimental.widenable.condition`` with different
22981 `i1` expressions.
22982
22983 This is how it can be used to represent guards as widenable branches:
22984
22985 .. code-block:: text
22986
22987   block:
22988     ; Unguarded instructions
22989     call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
22990     ; Guarded instructions
22991
22992 The guard above can be expressed in an equivalent form as an explicit branch
22993 using ``@llvm.experimental.widenable.condition``:
22994
22995 .. code-block:: text
22996
22997   block:
22998     ; Unguarded instructions
22999     %widenable_condition = call i1 @llvm.experimental.widenable.condition()
23000     %guard_condition = and i1 %cond, %widenable_condition
23001     br i1 %guard_condition, label %guarded, label %deopt
23002
23003   guarded:
23004     ; Guarded instructions
23005
23006   deopt:
23007     call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
23008
23009 So the block `guarded` is only reachable when `%cond` is `true`,
23010 and it should be valid to go to the block `deopt` whenever `%cond`
23011 is `true` or `false`.
23012
23013 ``@llvm.experimental.widenable.condition`` will never throw, thus
23014 it cannot be invoked.
23015
23016 Guard widening:
23017 """""""""""""""
23018
23019 When ``@llvm.experimental.widenable.condition()`` is used in
23020 the condition of a guard represented as an explicit branch, it is
23021 legal to widen the guard's condition with any additional
23022 conditions.
23023
23024 Guard widening looks like the replacement of
23025
23026 .. code-block:: text
23027
23028   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
23029   %guard_cond = and i1 %cond, %widenable_cond
23030   br i1 %guard_cond, label %guarded, label %deopt
23031
23032 with
23033
23034 .. code-block:: text
23035
23036   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
23037   %new_cond = and i1 %any_other_cond, %widenable_cond
23038   %new_guard_cond = and i1 %cond, %new_cond
23039   br i1 %new_guard_cond, label %guarded, label %deopt
23040
23041 for this branch. Here `%any_other_cond` is an arbitrarily chosen
23042 well-defined `i1` value. By widening the guard, we may impose
23043 stricter conditions on the `guarded` block and bail out to the
23044 `deopt` block when the new condition is not met.
23045
23046 Lowering:
23047 """""""""
23048
23049 The default lowering strategy is to replace the result of a
23050 call to ``@llvm.experimental.widenable.condition`` with the
23051 constant `true`. However, it is always correct to replace
23052 it with any other `i1` value. Any pass can
23053 freely do so if it can benefit from non-default lowering.
23054
23055
23056 '``llvm.load.relative``' Intrinsic
23057 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23058
23059 Syntax:
23060 """""""
23061
23062 ::
23063
23064       declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly
23065
23066 Overview:
23067 """""""""
23068
23069 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
23070 adds ``%ptr`` to that value and returns it. The constant folder specifically
23071 recognizes the form of this intrinsic and the constant initializers it may
23072 load from; if a loaded constant initializer is known to have the form
23073 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
23074
23075 LLVM provides that the calculation of such a constant initializer will
23076 not overflow at link time under the medium code model if ``x`` is an
23077 ``unnamed_addr`` function. However, it does not provide this guarantee for
23078 a constant initializer folded into a function body. This intrinsic can be
23079 used to avoid the possibility of overflows when loading from such a constant.
23080
23081 .. _llvm_sideeffect:
23082
23083 '``llvm.sideeffect``' Intrinsic
23084 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23085
23086 Syntax:
23087 """""""
23088
23089 ::
23090
23091       declare void @llvm.sideeffect() inaccessiblememonly nounwind
23092
23093 Overview:
23094 """""""""
23095
23096 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
23097 treat it as having side effects, so it can be inserted into a loop to
23098 indicate that the loop shouldn't be assumed to terminate (which could
23099 potentially lead to the loop being optimized away entirely), even if it's
23100 an infinite loop with no other side effects.
23101
23102 Arguments:
23103 """"""""""
23104
23105 None.
23106
23107 Semantics:
23108 """"""""""
23109
23110 This intrinsic actually does nothing, but optimizers must assume that it
23111 has externally observable side effects.
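
A minimal sketch of the use described above: an otherwise empty infinite loop
that contains a call to ``llvm.sideeffect``. The function name ``@spin`` is
hypothetical.

.. code-block:: llvm

    declare void @llvm.sideeffect() inaccessiblememonly nounwind

    define void @spin() {
    entry:
      br label %loop

    loop:
      ; Marks the loop as having side effects; no code is generated for it.
      call void @llvm.sideeffect()
      br label %loop
    }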
23112
23113 '``llvm.is.constant.*``' Intrinsic
23114 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23115
23116 Syntax:
23117 """""""
23118
23119 This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
23120
23121 ::
23122
23123       declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
23124       declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
23125       declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone
23126
23127 Overview:
23128 """""""""
23129
23130 The '``llvm.is.constant``' intrinsic will return true if the argument
23131 is known to be a manifest compile-time constant. It is guaranteed to
23132 fold to either true or false before generating machine code.
23133
23134 Semantics:
23135 """"""""""
23136
23137 This intrinsic generates no code. If its argument is known to be a
23138 manifest compile-time constant value, then the intrinsic will be
23139 converted to a constant true value. Otherwise, it will be converted to
23140 a constant false value.
23141
23142 In particular, note that if the argument is a constant expression
23143 which refers to a global (the address of which *is* a constant, but
23144 not manifest during the compile), then the intrinsic evaluates to
23145 false.
23146
23147 The result also intentionally depends on the result of optimization
23148 passes -- e.g., the result can change depending on whether a
23149 function gets inlined or not. A function's parameters are
23150 obviously not constant. However, a call like
23151 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
23152 function is inlined, if the value passed to the function parameter was
23153 a constant.
23154
23155 On the other hand, if constant folding is not run, it will never
23156 evaluate to true, even in simple cases.
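
The sketch below shows the inlining scenario described above. On its own, the
call in the hypothetical function ``@scale`` tests a parameter and would fold
to false; if ``@scale`` were inlined into a caller that passes a constant, the
same call could fold to true instead.

.. code-block:: llvm

    declare i1 @llvm.is.constant.i32(i32) nounwind readnone

    define i32 @scale(i32 %n) {
    entry:
      %known = call i1 @llvm.is.constant.i32(i32 %n)
      br i1 %known, label %fast, label %slow

    fast:
      %a = shl i32 %n, 1
      ret i32 %a

    slow:
      %b = mul i32 %n, 2
      ret i32 %b
    }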
23157
23158 .. _int_ptrmask:
23159
23160 '``llvm.ptrmask``' Intrinsic
23161 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23162
23163 Syntax:
23164 """""""
23165
23166 ::
23167
23168       declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable
23169
23170 Arguments:
23171 """"""""""
23172
23173 The first argument is a pointer. The second argument is an integer.
23174
23175 Overview:
23176 """"""""""
23177
23178 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
23179 This allows stripping data from tagged pointers without converting them to an
23180 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
23181 to facilitate alias analysis and underlying-object detection.
23182
23183 Semantics:
23184 """"""""""
23185
23186 The result of ``ptrmask(ptr, mask)`` is equivalent to
23187 ``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
23188 pointer and the first argument are based on the same underlying object (for more
23189 information on the *based on* terminology see
23190 :ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
23191 mask argument does not match the pointer size of the target, the mask is
23192 zero-extended or truncated accordingly.
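
As an illustrative sketch, the call below clears the low three bits of a
tagged pointer without a round trip through an integer. The function name
``@strip_tag`` is hypothetical, and the ``.p0i8.i64`` suffix assumes the usual
mangling for an ``i8*`` pointer and an ``i64`` mask.

.. code-block:: llvm

    declare i8* @llvm.ptrmask.p0i8.i64(i8*, i64)

    define i8* @strip_tag(i8* %tagged) {
    entry:
      ; -8 has all bits set except the low three.
      %clean = call i8* @llvm.ptrmask.p0i8.i64(i8* %tagged, i64 -8)
      ret i8* %clean
    }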
23193
23194 .. _int_vscale:
23195
23196 '``llvm.vscale``' Intrinsic
23197 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
23198
23199 Syntax:
23200 """""""
23201
23202 ::
23203
23204       declare i32 @llvm.vscale.i32()
23205       declare i64 @llvm.vscale.i64()
23206
23207 Overview:
23208 """""""""
23209
23210 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
23211 vectors such as ``<vscale x 16 x i8>``.
23212
23213 Semantics:
23214 """"""""""
23215
23216 ``vscale`` is a positive value that is constant throughout program
23217 execution, but is unknown at compile time.
23218 If the result value does not fit in the result type, then the result is
23219 a :ref:`poison value <poisonvalues>`.
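
A minimal sketch, assuming the goal is to compute the size in bytes of a
``<vscale x 16 x i8>`` vector at run time. The function name
``@bytes_per_vector`` is hypothetical.

.. code-block:: llvm

    declare i64 @llvm.vscale.i64()

    define i64 @bytes_per_vector() {
    entry:
      %vs = call i64 @llvm.vscale.i64()
      ; <vscale x 16 x i8> holds vscale * 16 one-byte elements.
      %bytes = mul i64 %vs, 16
      ret i64 %bytes
    }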
23220
23221
23222 Stack Map Intrinsics
23223 --------------------
23224
23225 LLVM provides experimental intrinsics to support runtime patching
23226 mechanisms commonly desired in dynamic language JITs. These intrinsics
23227 are described in :doc:`StackMaps`.
23228
23229 Element Wise Atomic Memory Intrinsics
23230 -------------------------------------
23231
23232 These intrinsics are similar to the standard library memory intrinsics except
23233 that they perform memory transfer as a sequence of atomic memory accesses.
23234
23235 .. _int_memcpy_element_unordered_atomic:
23236
23237 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
23238 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23239
23240 Syntax:
23241 """""""
23242
23243 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
23244 any integer bit width and for different address spaces. Not all targets
23245 support all bit widths however.
23246
23247 ::
23248
23249       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23250                                                                        i8* <src>,
23251                                                                        i32 <len>,
23252                                                                        i32 <element_size>)
23253       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23254                                                                        i8* <src>,
23255                                                                        i64 <len>,
23256                                                                        i32 <element_size>)
23257
23258 Overview:
23259 """""""""
23260
23261 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
23262 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
23263 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
23264 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
23265 that are a positive integer multiple of the ``element_size`` in size.
23266
23267 Arguments:
23268 """"""""""
23269
23270 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
23271 intrinsic, with the added constraint that ``len`` is required to be a positive integer
23272 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23273 ``element_size``, then the behaviour of the intrinsic is undefined.
23274
23275 ``element_size`` must be a compile-time constant positive power of two no greater than
23276 a target-specific atomic access size limit.
23277
23278 For each of the input pointers the ``align`` parameter attribute must be specified. It
23279 must be a power of two no less than the ``element_size``. The caller guarantees that
23280 both the source and destination pointers are aligned to that boundary.
23281
23282 Semantics:
23283 """"""""""
23284
23285 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
23286 memory from the source location to the destination location. These locations are not
23287 allowed to overlap. The memory copy is performed as a sequence of load/store operations
23288 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
23289 aligned at an ``element_size`` boundary.
23290
23291 The order of the copy is unspecified. The same value may be read from the source
23292 buffer many times, but only one write is issued to the destination buffer per
23293 element. It is well defined to have concurrent reads and writes to both source and
23294 destination provided those reads and writes are unordered atomic when specified.
23295
23296 This intrinsic does not provide any additional ordering guarantees over those
23297 provided by a set of unordered loads from the source location and stores to the
23298 destination.
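
As a minimal sketch of the constraints above, the following copies 64 bytes as
4-byte elements. The function name ``@copy_words`` and the operand values are
illustrative assumptions; the sketch also assumes the caller passes
non-overlapping, 4-byte aligned buffers of at least 64 bytes each.

.. code-block:: llvm

    declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

    ; Hypothetical helper, for illustration only.
    define void @copy_words(i8* %dst, i8* %src) {
      ; len = 64 is a positive integer multiple of element_size = 4, and both
      ; pointers carry an 'align' attribute no less than element_size.
      call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(
          i8* align 4 %dst, i8* align 4 %src, i32 64, i32 4)
      ret void
    }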
23299
23300 Lowering:
23301 """""""""
23302
23303 In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
23304 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
23305 is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
23306 lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
23307 lowering.
23308
23309 The optimizer is allowed to inline the memory copy when it's profitable to do so.
23310
23311 '``llvm.memmove.element.unordered.atomic``' Intrinsic
23312 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23313
23314 Syntax:
23315 """""""
23316
23317 This is an overloaded intrinsic. You can use
23318 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
23319 different address spaces. Not all targets support all bit widths, however.
23320
23321 ::
23322
23323       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23324                                                                         i8* <src>,
23325                                                                         i32 <len>,
23326                                                                         i32 <element_size>)
23327       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23328                                                                         i8* <src>,
23329                                                                         i64 <len>,
23330                                                                         i32 <element_size>)
23331
23332 Overview:
23333 """""""""
23334
23335 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
23336 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
23337 ``src`` are treated as arrays with elements that are exactly ``element_size``
23338 bytes, and the copy between buffers uses a sequence of
23339 :ref:`unordered atomic <ordering>` load/store operations that are a positive
23340 integer multiple of the ``element_size`` in size.
23341
23342 Arguments:
23343 """"""""""
23344
23345 The first three arguments are the same as they are in the
23346 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
23347 ``len`` is required to be a positive integer multiple of the ``element_size``.
23348 If ``len`` is not a positive integer multiple of ``element_size``, then the
23349 behaviour of the intrinsic is undefined.
23350
23351 ``element_size`` must be a compile-time constant positive power of two no
23352 greater than a target-specific atomic access size limit.
23353
23354 For each of the input pointers the ``align`` parameter attribute must be
23355 specified. It must be a power of two no less than the ``element_size``. The
23356 caller guarantees that both the source and destination pointers are aligned to
23357 that boundary.
23358
23359 Semantics:
23360 """"""""""
23361
23362 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
23363 of memory from the source location to the destination location. These locations
23364 are allowed to overlap. The memory copy is performed as a sequence of load/store
23365 operations where each access is guaranteed to be a multiple of ``element_size``
23366 bytes wide and aligned at an ``element_size`` boundary.
23367
23368 The order of the copy is unspecified. The same value may be read from the source
23369 buffer many times, but only one write is issued to the destination buffer per
23370 element. It is well defined to have concurrent reads and writes to both source
23371 and destination provided those reads and writes are unordered atomic when
23372 specified.
23373
23374 This intrinsic does not provide any additional ordering guarantees over those
23375 provided by a set of unordered loads from the source location and stores to the
23376 destination.
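
A minimal sketch of an overlapping move follows. The function name
``@shift_up`` and the sizes are illustrative assumptions; the sketch also
assumes ``%buf`` is 8-byte aligned and points to at least 40 bytes.

.. code-block:: llvm

    declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

    ; Hypothetical helper, for illustration only.
    define void @shift_up(i8* %buf) {
      ; The destination starts 8 bytes into the buffer, so it overlaps the
      ; source range [%buf, %buf + 32). Overlap is allowed for memmove.
      %dst = getelementptr inbounds i8, i8* %buf, i64 8
      ; len = 32 is a positive integer multiple of element_size = 8.
      call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(
          i8* align 8 %dst, i8* align 8 %buf, i64 32, i32 8)
      ret void
    }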
23377
23378 Lowering:
23379 """""""""
23380
23381 In the most general case, a call to
23382 '``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
23383 ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
23384 actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
23385 <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
23386 lowering.
23387
23388 The optimizer is allowed to inline the memory copy when it's profitable to do so.
23389
23390 .. _int_memset_element_unordered_atomic:
23391
23392 '``llvm.memset.element.unordered.atomic``' Intrinsic
23393 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23394
23395 Syntax:
23396 """""""
23397
23398 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
23399 any integer bit width and for different address spaces. Not all targets
23400 support all bit widths, however.
23401
23402 ::
23403
23404       declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
23405                                                                   i8 <value>,
23406                                                                   i32 <len>,
23407                                                                   i32 <element_size>)
23408       declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
23409                                                                   i8 <value>,
23410                                                                   i64 <len>,
23411                                                                   i32 <element_size>)
23412
23413 Overview:
23414 """""""""
23415
23416 The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
23417 '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
23418 with elements that are exactly ``element_size`` bytes, and the assignment to that array
23419 uses a sequence of :ref:`unordered atomic <ordering>` store operations
23420 that are a positive integer multiple of the ``element_size`` in size.
23421
23422 Arguments:
23423 """"""""""
23424
23425 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
23426 intrinsic, with the added constraint that ``len`` is required to be a positive integer
23427 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23428 ``element_size``, then the behaviour of the intrinsic is undefined.
23429
23430 ``element_size`` must be a compile-time constant positive power of two no greater than
23431 a target-specific atomic access size limit.
23432
23433 The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
23434 must be a power of two no less than the ``element_size``. The caller guarantees that
23435 the destination pointer is aligned to that boundary.
23436
23437 Semantics:
23438 """"""""""
23439
23440 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
23441 memory starting at the destination location to the given ``value``. The memory is
23442 set with a sequence of store operations where each access is guaranteed to be a
23443 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
23444
23445 The order of the assignment is unspecified. Only one write is issued to the
23446 destination buffer per element. It is well defined to have concurrent reads and
23447 writes to the destination provided those reads and writes are unordered atomic
23448 when specified.
23449
23450 This intrinsic does not provide any additional ordering guarantees over those
23451 provided by a set of unordered stores to the destination.
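
A minimal sketch follows, zeroing 16 bytes as 2-byte elements. The function
name ``@zero_halfwords`` and the operand values are illustrative assumptions;
the sketch also assumes ``%dst`` is 2-byte aligned and points to at least 16
bytes.

.. code-block:: llvm

    declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8*, i8, i32, i32)

    ; Hypothetical helper, for illustration only.
    define void @zero_halfwords(i8* %dst) {
      ; len = 16 is a positive integer multiple of element_size = 2, and the
      ; destination carries an 'align' attribute no less than element_size.
      call void @llvm.memset.element.unordered.atomic.p0i8.i32(
          i8* align 2 %dst, i8 0, i32 16, i32 2)
      ret void
    }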
23452
23453 Lowering:
23454 """""""""
23455
23456 In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
23457 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
23458 is replaced with the actual element size.
23459
23460 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
23461
23462 Objective-C ARC Runtime Intrinsics
23463 ----------------------------------
23464
23465 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
23466 LLVM is aware of the semantics of these functions, and optimizes based on that
23467 knowledge. You can read more about the details of Objective-C ARC `here
23468 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
23469
23470 '``llvm.objc.autorelease``' Intrinsic
23471 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23472
23473 Syntax:
23474 """""""
23475 ::
23476
23477       declare i8* @llvm.objc.autorelease(i8*)
23478
23479 Lowering:
23480 """""""""
23481
23482 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
23483
23484 '``llvm.objc.autoreleasePoolPop``' Intrinsic
23485 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23486
23487 Syntax:
23488 """""""
23489 ::
23490
23491       declare void @llvm.objc.autoreleasePoolPop(i8*)
23492
23493 Lowering:
23494 """""""""
23495
23496 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
23497
23498 '``llvm.objc.autoreleasePoolPush``' Intrinsic
23499 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23500
23501 Syntax:
23502 """""""
23503 ::
23504
23505       declare i8* @llvm.objc.autoreleasePoolPush()
23506
23507 Lowering:
23508 """""""""
23509
23510 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
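
As a minimal sketch, the pool intrinsics are typically used as a push/pop pair
around code that autoreleases objects. The function name ``@with_pool`` is an
illustrative assumption, and the sketch assumes the caller owns a reference to
``%obj`` that it is handing over to the pool.

.. code-block:: llvm

    declare i8* @llvm.objc.autoreleasePoolPush()
    declare void @llvm.objc.autoreleasePoolPop(i8*)
    declare i8* @llvm.objc.autorelease(i8*)

    ; Hypothetical helper, for illustration only.
    define void @with_pool(i8* %obj) {
      ; Open a new pool; the returned token identifies it.
      %pool = call i8* @llvm.objc.autoreleasePoolPush()
      ; Hand %obj to the innermost pool.
      %tmp = call i8* @llvm.objc.autorelease(i8* %obj)
      ; Pop the pool using the token returned by the matching push.
      call void @llvm.objc.autoreleasePoolPop(i8* %pool)
      ret void
    }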
23511
23512 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
23513 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23514
23515 Syntax:
23516 """""""
23517 ::
23518
23519       declare i8* @llvm.objc.autoreleaseReturnValue(i8*)
23520
23521 Lowering:
23522 """""""""
23523
23524 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
23525
23526 '``llvm.objc.copyWeak``' Intrinsic
23527 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23528
23529 Syntax:
23530 """""""
23531 ::
23532
23533       declare void @llvm.objc.copyWeak(i8**, i8**)
23534
23535 Lowering:
23536 """""""""
23537
23538 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
23539
23540 '``llvm.objc.destroyWeak``' Intrinsic
23541 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23542
23543 Syntax:
23544 """""""
23545 ::
23546
23547       declare void @llvm.objc.destroyWeak(i8**)
23548
23549 Lowering:
23550 """""""""
23551
23552 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
23553
23554 '``llvm.objc.initWeak``' Intrinsic
23555 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23556
23557 Syntax:
23558 """""""
23559 ::
23560
23561       declare i8* @llvm.objc.initWeak(i8**, i8*)
23562
23563 Lowering:
23564 """""""""
23565
23566 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
23567
23568 '``llvm.objc.loadWeak``' Intrinsic
23569 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23570
23571 Syntax:
23572 """""""
23573 ::
23574
23575       declare i8* @llvm.objc.loadWeak(i8**)
23576
23577 Lowering:
23578 """""""""
23579
23580 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
23581
23582 '``llvm.objc.loadWeakRetained``' Intrinsic
23583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23584
23585 Syntax:
23586 """""""
23587 ::
23588
23589       declare i8* @llvm.objc.loadWeakRetained(i8**)
23590
23591 Lowering:
23592 """""""""
23593
23594 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
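
The weak-reference intrinsics operate on a memory slot of type ``i8**``. The
following minimal sketch initializes a weak slot, loads a retained reference
from it, and then tears the slot down. The function name ``@weak_roundtrip``
is an illustrative assumption.

.. code-block:: llvm

    declare i8* @llvm.objc.initWeak(i8**, i8*)
    declare i8* @llvm.objc.loadWeakRetained(i8**)
    declare void @llvm.objc.destroyWeak(i8**)
    declare void @llvm.objc.release(i8*)

    ; Hypothetical helper, for illustration only.
    define void @weak_roundtrip(i8* %obj) {
      %slot = alloca i8*
      ; Register %slot as a weak reference to %obj.
      %init = call i8* @llvm.objc.initWeak(i8** %slot, i8* %obj)
      ; Obtain a retained reference (or null) from the weak slot.
      %strong = call i8* @llvm.objc.loadWeakRetained(i8** %slot)
      ; Balance the retain performed by loadWeakRetained.
      call void @llvm.objc.release(i8* %strong)
      ; Unregister the slot before it goes out of scope.
      call void @llvm.objc.destroyWeak(i8** %slot)
      ret void
    }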
23595
23596 '``llvm.objc.moveWeak``' Intrinsic
23597 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23598
23599 Syntax:
23600 """""""
23601 ::
23602
23603       declare void @llvm.objc.moveWeak(i8**, i8**)
23604
23605 Lowering:
23606 """""""""
23607
23608 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
23609
23610 '``llvm.objc.release``' Intrinsic
23611 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23612
23613 Syntax:
23614 """""""
23615 ::
23616
23617       declare void @llvm.objc.release(i8*)
23618
23619 Lowering:
23620 """""""""
23621
23622 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
23623
23624 '``llvm.objc.retain``' Intrinsic
23625 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23626
23627 Syntax:
23628 """""""
23629 ::
23630
23631       declare i8* @llvm.objc.retain(i8*)
23632
23633 Lowering:
23634 """""""""
23635
23636 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
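
A minimal sketch of a balanced retain/release pair follows. The function
``@use`` and the name ``@retain_use_release`` are hypothetical and appear only
for illustration.

.. code-block:: llvm

    declare i8* @llvm.objc.retain(i8*)
    declare void @llvm.objc.release(i8*)
    declare void @use(i8*) ; hypothetical consumer, not an LLVM intrinsic

    ; Hypothetical helper, for illustration only.
    define void @retain_use_release(i8* %obj) {
      ; Retain the object for the duration of the use.
      %r = call i8* @llvm.objc.retain(i8* %obj)
      call void @use(i8* %r)
      ; Balance the retain above.
      call void @llvm.objc.release(i8* %r)
      ret void
    }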
23637
23638 '``llvm.objc.retainAutorelease``' Intrinsic
23639 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23640
23641 Syntax:
23642 """""""
23643 ::
23644
23645       declare i8* @llvm.objc.retainAutorelease(i8*)
23646
23647 Lowering:
23648 """""""""
23649
23650 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
23651
23652 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
23653 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23654
23655 Syntax:
23656 """""""
23657 ::
23658
23659       declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)
23660
23661 Lowering:
23662 """""""""
23663
23664 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
23665
23666 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
23667 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23668
23669 Syntax:
23670 """""""
23671 ::
23672
23673       declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)
23674
23675 Lowering:
23676 """""""""
23677
23678 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
23679
23680 '``llvm.objc.retainBlock``' Intrinsic
23681 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23682
23683 Syntax:
23684 """""""
23685 ::
23686
23687       declare i8* @llvm.objc.retainBlock(i8*)
23688
23689 Lowering:
23690 """""""""
23691
23692 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
23693
23694 '``llvm.objc.storeStrong``' Intrinsic
23695 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23696
23697 Syntax:
23698 """""""
23699 ::
23700
23701       declare void @llvm.objc.storeStrong(i8**, i8*)
23702
23703 Lowering:
23704 """""""""
23705
23706 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
23707
23708 '``llvm.objc.storeWeak``' Intrinsic
23709 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23710
23711 Syntax:
23712 """""""
23713 ::
23714
23715       declare i8* @llvm.objc.storeWeak(i8**, i8*)
23716
23717 Lowering:
23718 """""""""
23719
23720 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
23721
23722 Preserving Debug Information Intrinsics
23723 ---------------------------------------
23724
23725 These intrinsics are used to carry certain debuginfo together with
23726 IR-level operations. For example, it may be desirable to
23727 know the structure/union name and the original user-level field
23728 indices. Such information is lost in the IR ``getelementptr`` instruction,
23729 since the IR types are different from debuginfo types and unions
23730 are converted to structs in IR.
23731
23732 '``llvm.preserve.array.access.index``' Intrinsic
23733 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23734
23735 Syntax:
23736 """""""
23737 ::
23738
23739       declare <ret_type>
23740       @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
23741                                                                            i32 dim,
23742                                                                            i32 index)
23743
23744 Overview:
23745 """""""""
23746
23747 The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
23748 based on array base ``base``, array dimension ``dim`` and the last access index ``index``
23749 into the array. The return type ``ret_type`` is a pointer type to the array element.
23750 The array ``dim`` and ``index`` are preserved, which is more robust than a
23751 getelementptr instruction, which may be subject to compiler transformation.
23752 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23753 to provide the array or pointer debuginfo type.
23754 The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
23755 debuginfo version of ``type``.
23756
23757 Arguments:
23758 """"""""""
23759
23760 The ``base`` is the array base address.  The ``dim`` is the array dimension.
23761 The ``base`` is a pointer if ``dim`` equals 0.
23762 The ``index`` is the last access index into the array or pointer.
23763
23764 The ``base`` argument must be annotated with an :ref:`elementtype
23765 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23766 getelementptr element type.
23767
23768 Semantics:
23769 """"""""""
23770
23771 The '``llvm.preserve.array.access.index``' intrinsic produces the same result as a
23772 getelementptr with base ``base`` whose indices are ``dim`` zeros followed by ``index``.
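
As a minimal sketch, the call below addresses element 3 of a ``[10 x i32]``
array with ``dim`` equal to 1. The function name ``@fourth_element`` is an
illustrative assumption, and the ``!llvm.preserve.access.index`` metadata
attachment is omitted for brevity.

.. code-block:: llvm

    declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32)

    ; Hypothetical helper, for illustration only.
    define i32* @fourth_element([10 x i32]* %arr) {
      ; dim = 1, index = 3
      %p = call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32(
          [10 x i32]* elementtype([10 x i32]) %arr, i32 1, i32 3)
      ; Computes the same address as:
      ;   getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 3
      ret i32* %p
    }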
23773
23774 '``llvm.preserve.union.access.index``' Intrinsic
23775 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23776
23777 Syntax:
23778 """""""
23779 ::
23780
23781       declare <type>
23782       @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
23783                                                                         i32 di_index)
23784
23785 Overview:
23786 """""""""
23787
23788 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
23789 ``di_index`` and returns the ``base`` address.
23790 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23791 to provide union debuginfo type.
23792 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23793 The return type ``type`` is the same as the ``base`` type.
23794
23795 Arguments:
23796 """"""""""
23797
23798 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
23799
23800 Semantics:
23801 """"""""""
23802
23803 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
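
A minimal sketch follows. The type ``%union.U``, the function name
``@union_member``, and the field index are illustrative assumptions, and the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity.

.. code-block:: llvm

    %union.U = type { i64 }

    declare %union.U* @llvm.preserve.union.access.index.p0s_union.Us.p0s_union.Us(%union.U*, i32)

    ; Hypothetical helper, for illustration only.
    define %union.U* @union_member(%union.U* %u) {
      ; di_index = 1 records which debuginfo field is accessed; the returned
      ; pointer is the base address itself.
      %p = call %union.U* @llvm.preserve.union.access.index.p0s_union.Us.p0s_union.Us(
          %union.U* %u, i32 1)
      ret %union.U* %p
    }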
23804
23805 '``llvm.preserve.struct.access.index``' Intrinsic
23806 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23807
23808 Syntax:
23809 """""""
23810 ::
23811
23812       declare <ret_type>
23813       @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
23814                                                                  i32 gep_index,
23815                                                                  i32 di_index)
23816
23817 Overview:
23818 """""""""
23819
23820 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
23821 based on struct base ``base`` and IR struct member index ``gep_index``.
23822 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23823 to provide struct debuginfo type.
23824 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23825 The return type ``ret_type`` is a pointer type to the structure member.
23826
23827 Arguments:
23828 """"""""""
23829
23830 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
23831 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
23832
23833 The ``base`` argument must be annotated with an :ref:`elementtype
23834 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23835 getelementptr element type.
23836
23837 Semantics:
23838 """"""""""
23839
23840 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
23841 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
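
As a minimal sketch, the call below addresses the second member of a
two-field struct. The type ``%struct.S`` and the function name
``@second_field`` are illustrative assumptions, and the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity.

.. code-block:: llvm

    %struct.S = type { i32, i64 }

    declare i64* @llvm.preserve.struct.access.index.p0i64.p0s_struct.Ss(%struct.S*, i32, i32)

    ; Hypothetical helper, for illustration only.
    define i64* @second_field(%struct.S* %s) {
      ; gep_index = 1 (IR member index), di_index = 1 (debuginfo member index)
      %p = call i64* @llvm.preserve.struct.access.index.p0i64.p0s_struct.Ss(
          %struct.S* elementtype(%struct.S) %s, i32 1, i32 1)
      ; Computes the same address as:
      ;   getelementptr %struct.S, %struct.S* %s, i32 0, i32 1
      ret i64* %p
    }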