From e0c01e6cb07133f0bb155a168d967cf854f03ffa Mon Sep 17 00:00:00 2001 From: "Paul C. Anagnostopoulos" Date: Fri, 21 Aug 2020 23:07:30 +0200 Subject: [PATCH] New TableGen Programmer's Reference document MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit This new TableGen Programmer's Reference document replaces the current Language Introduction and Language Reference documents. It brings all the TableGen reference information into one document. As an experiment, I numbered the sections in the document. See what you think about that. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D85838 (changes by Nicolai Hähnle : - fixed build error due to toctree in docs/LangRef/index.rst - fixed reference to ProgRef) Change-Id: Ifbdfa39768b8a460aae2873103d31c7b347aff00 --- llvm/docs/TableGen/LangIntro.rst | 737 ---------------- llvm/docs/TableGen/LangRef.rst | 556 ------------- llvm/docs/TableGen/ProgRef.rst | 1709 ++++++++++++++++++++++++++++++++++++++ llvm/docs/TableGen/index.rst | 27 +- 4 files changed, 1721 insertions(+), 1308 deletions(-) delete mode 100644 llvm/docs/TableGen/LangIntro.rst delete mode 100644 llvm/docs/TableGen/LangRef.rst create mode 100644 llvm/docs/TableGen/ProgRef.rst diff --git a/llvm/docs/TableGen/LangIntro.rst b/llvm/docs/TableGen/LangIntro.rst deleted file mode 100644 index 7990d1f..0000000 --- a/llvm/docs/TableGen/LangIntro.rst +++ /dev/null @@ -1,737 +0,0 @@ -============================== -TableGen Language Introduction -============================== - -.. contents:: - :local: - -.. warning:: - This document is extremely rough. If you find something lacking, please - fix it, file a documentation bug, or ask about it on llvm-dev. - -Introduction -============ - -This document is not meant to be a normative spec about the TableGen language -in and of itself (i.e. how to understand a given construct in terms of how -it affects the final set of records represented by the TableGen file). 
For -the formal language specification, see :doc:`LangRef`. - -TableGen syntax -=============== - -TableGen doesn't care about the meaning of data (that is up to the backend to -define), but it does care about syntax, and it enforces a simple type system. -This section describes the syntax and the constructs allowed in a TableGen file. - -TableGen primitives -------------------- - -TableGen comments -^^^^^^^^^^^^^^^^^ - -TableGen supports C++ style "``//``" comments, which run to the end of the -line, and it also supports **nestable** "``/* */``" comments. - -.. _TableGen type: - -The TableGen type system -^^^^^^^^^^^^^^^^^^^^^^^^ - -TableGen files are strongly typed, in a simple (but complete) type-system. -These types are used to perform automatic conversions, check for errors, and to -help interface designers constrain the input that they allow. Every `value -definition`_ is required to have an associated type. - -TableGen supports a mixture of very low-level types (such as ``bit``) and very -high-level types (such as ``dag``). This flexibility is what allows it to -describe a wide range of information conveniently and compactly. The TableGen -types are: - -``bit`` - A 'bit' is a boolean value that can hold either 0 or 1. - -``int`` - The 'int' type represents a simple 32-bit integer value, such as 5. - -``string`` - The 'string' type represents an ordered sequence of characters of arbitrary - length. - -``code`` - The `code` type represents a code fragment, which can be single/multi-line - string literal. - -``bits<n>`` - A 'bits' type is an arbitrary, but fixed, size integer that is broken up - into individual bits. This type is useful because it can handle some bits - being defined while others are undefined. - -``list<ty>`` - This type represents a list whose elements are some other type. The - contained type is arbitrary: it can even be another list type. 
- -Class type - Specifying a class name in a type context means that the defined value must - be a subclass of the specified class. This is useful in conjunction with - the ``list`` type, for example, to constrain the elements of the list to a - common base class (e.g., a ``list<Register>`` can only contain definitions - derived from the "``Register``" class). - -``dag`` - This type represents a nestable directed graph of elements. - -To date, these types have been sufficient for describing things that TableGen -has been used for, but it is straight-forward to extend this list if needed. - -.. _TableGen expressions: - -TableGen values and expressions -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -TableGen allows for a pretty reasonable number of different expression forms -when building up values. These forms allow the TableGen file to be written in a -natural syntax and flavor for the application. The current expression forms -supported include: - -``?`` - uninitialized field - -``0b1001011`` - binary integer value. - Note that this is sized by the number of bits given and will not be - silently extended/truncated. - -``7`` - decimal integer value - -``0x7F`` - hexadecimal integer value - -``"foo"`` - a single-line string value, can be assigned to ``string`` or ``code`` variable. - -``[{ ... }]`` - usually called a "code fragment", but is just a multiline string literal - -``[ X, Y, Z ]<type>`` - list value. <type> is the type of the list element and is usually optional. - In rare cases, TableGen is unable to deduce the element type in which case - the user must specify it explicitly. - -``{ a, b, 0b10 }`` - initializer for a "bits<4>" value. - 1-bit from "a", 1-bit from "b", 2-bits from 0b10. - -``value`` - value reference - -``value{17}`` - access to one bit of a value - -``value{15-17}`` - access to an ordered sequence of bits of a value, in particular ``value{15-17}`` - produces an order that is the reverse of ``value{17-15}``. 
- -``DEF`` - reference to a record definition - -``CLASS<val list>`` - reference to a new anonymous definition of CLASS with the specified template - arguments. - -``X.Y`` - reference to the subfield of a value - -``list[4-7,17,2-3]`` - A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it. - Elements may be included multiple times. - -``foreach <var> = [ <list> ] in { <body> }`` - -``foreach <var> = [ <list> ] in <def>`` - Replicate <body> or <def>, replacing instances of <var> with each value - in <list>. <var> is scoped at the level of the ``foreach`` loop and must - not conflict with any other object introduced in <body> or <def>. Only - ``def``\s and ``defm``\s are expanded within <body>. - -``foreach <var> = 0-15 in ...`` - -``foreach <var> = {0-15,32-47} in ...`` - Loop over ranges of integers. The braces are required for multiple ranges. - -``(DEF a, b)`` - a dag value. The first element is required to be a record definition, the - remaining elements in the list may be arbitrary other values, including - nested ```dag``' values. - -``!con(a, b, ...)`` - Concatenate two or more DAG nodes. Their operations must equal. - - Example: !con((op a1:$name1, a2:$name2), (op b1:$name3)) results in - the DAG node (op a1:$name1, a2:$name2, b1:$name3). - -``!dag(op, children, names)`` - Generate a DAG node programmatically. 'children' and 'names' must be lists - of equal length or unset ('?'). 'names' must be a 'list<string>'. - - Due to limitations of the type system, 'children' must be a list of items - of a common type. In practice, this means that they should either have the - same type or be records with a common superclass. Mixing dag and non-dag - items is not possible. However, '?' can be used. - - Example: !dag(op, [a1, a2, ?], ["name1", "name2", "name3"]) results in - (op a1:$name1, a2:$name2, ?:$name3). - -``!setop(dag, op)`` - Return a DAG node with the same arguments as ``dag``, but with its - operator replaced with ``op``. - - Example: ``!setop((foo 1, 2), bar)`` results in ``(bar 1, 2)``. 
- -``!getop(dag)`` - -``!getop<type>(dag)`` - Return the operator of the given DAG node. - Example: ``!getop((foo 1, 2))`` results in ``foo``. - - The result of ``!getop`` can be used directly in a context where - any record value at all is acceptable (typically placing it into - another dag value). But in other contexts, it must be explicitly - cast to a particular class type. The ``!getop<type>`` syntax is - provided to make this easy. - - For example, to assign the result to a class-typed value, you - could write either of these: - ``BaseClass b = !getop<BaseClass>(someDag);`` - - ``BaseClass b = !cast<BaseClass>(!getop(someDag));`` - - But to build a new dag node reusing the operator from another, no - cast is necessary: - ``dag d = !dag(!getop(someDag), args, names);`` - -``!listconcat(a, b, ...)`` - A list value that is the result of concatenating the 'a' and 'b' lists. - The lists must have the same element type. - More than two arguments are accepted with the result being the concatenation - of all the lists given. - -``!listsplat(a, size)`` - A list value that contains the value ``a`` ``size`` times. - Example: ``!listsplat(0, 2)`` results in ``[0, 0]``. - -``!strconcat(a, b, ...)`` - A string value that is the result of concatenating the 'a' and 'b' strings. - More than two arguments are accepted with the result being the concatenation - of all the strings given. - -``str1#str2`` - "#" (paste) is a shorthand for !strconcat. It may concatenate things that - are not quoted strings, in which case an implicit !cast<string> is done on - the operand of the paste. - -``!cast<type>(a)`` - If 'a' is a string, a record of type *type* obtained by looking up the - string 'a' in the list of all records defined by the time that all template - arguments in 'a' are fully resolved. 
- - For example, if !cast<type>(a) appears in a multiclass definition, or in a - class instantiated inside of a multiclass definition, and 'a' does not - reference any template arguments of the multiclass, then a record of name - 'a' must be instantiated earlier in the source file. If 'a' does reference - a template argument, then the lookup is delayed until defm statements - instantiating the multiclass (or later, if the defm occurs in another - multiclass and template arguments of the inner multiclass that are - referenced by 'a' are substituted by values that themselves contain - references to template arguments of the outer multiclass). - - If the type of 'a' does not match *type*, TableGen aborts with an error. - - Otherwise, perform a normal type cast e.g. between an int and a bit, or - between record types. This allows casting a record to a subclass, though if - the types do not match, constant folding will be inhibited. !cast<string> - is a special case in that the argument can be an int or a record. In the - latter case, the record's name is returned. - -``!isa<type>(a)`` - Returns an integer: 1 if 'a' is dynamically of the given type, 0 otherwise. - -``!subst(a, b, c)`` - If 'a' and 'b' are of string type or are symbol references, substitute 'b' - for 'a' in 'c.' This operation is analogous to $(subst) in GNU make. - -``!foreach(a, b, c)`` - For each member of dag or list 'b' apply operator 'c'. 'a' is the name - of a variable that will be substituted by members of 'b' in 'c'. - This operation is analogous to $(foreach) in GNU make. - -``!foldl(start, lst, a, b, expr)`` - Perform a left-fold over 'lst' with the given starting value. 'a' and 'b' - are variable names which will be substituted in 'expr'. If you think of - expr as a function f(a,b), the fold will compute - 'f(...f(f(start, lst[0]), lst[1]), ...), lst[n-1])' for a list of length n. - As usual, 'a' will be of the type of 'start', and 'b' will be of the type - of elements of 'lst'. 
These types need not be the same, but 'expr' must be - of the same type as 'start'. - -``!head(a)`` - The first element of list 'a.' - -``!tail(a)`` - The 2nd-N elements of list 'a.' - -``!empty(a)`` - An integer {0,1} indicating whether list 'a' is empty. - -``!size(a)`` - An integer indicating the number of elements in list 'a'. - -``!if(a,b,c)`` - 'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise. - -``!cond(condition_1 : val1, condition_2 : val2, ..., condition_n : valn)`` - Instead of embedding !if inside !if which can get cumbersome, - one can use !cond. !cond returns 'val1' if the result of 'int' or 'bit' - operator 'condition1' is nonzero. Otherwise, it checks 'condition2'. - If 'condition2' is nonzero, returns 'val2', and so on. - If all conditions are zero, it reports an error. - - For example, to convert an integer 'x' into a string: - !cond(!lt(x,0) : "negative", !eq(x,0) : "zero", 1 : "positive") - -``!eq(a,b)`` - 'bit 1' if string a is equal to string b, 0 otherwise. This only operates - on string, int and bit objects. Use !cast<string> to compare other types of - objects. - -``!ne(a,b)`` - The negation of ``!eq(a,b)``. - -``!le(a,b), !lt(a,b), !ge(a,b), !gt(a,b)`` - (Signed) comparison of integer values that returns bit 1 or 0 depending on - the result of the comparison. - -``!shl(a,b)`` ``!srl(a,b)`` ``!sra(a,b)`` - The usual shift operators. Operations are on 64-bit integers, the result - is undefined for shift counts outside [0, 63]. - -``!add(a,b,...)`` ``!mul(a,b,...)`` ``!and(a,b,...)`` ``!or(a,b,...)`` - The usual arithmetic and binary operators. - -Note that all of the values have rules specifying how they convert to values -for different types. These rules allow you to assign a value like "``7``" -to a "``bits<4>``" value, for example. 
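
As a minimal sketch of how a few of the operators listed above combine, the fragment here folds a list, counts it, and classifies an integer. The names ``Stats``, ``xs``, ``x``, ``acc``, ``elt``, and ``S`` are hypothetical and chosen only for illustration; they do not come from any LLVM target description.

.. code-block:: text

  class Stats<list<int> xs, int x> {
    // Left fold: ((0 + xs[0]) + xs[1]) + ... ; 'acc' and 'elt' are the
    // two variable names substituted into the expression.
    int Sum = !foldl(0, xs, acc, elt, !add(acc, elt));

    // Number of elements in the list.
    int Count = !size(xs);

    // The first condition that is nonzero selects the result; the final
    // '1' acts as a catch-all so that some condition always holds.
    string Sign = !cond(!lt(x, 0) : "negative",
                        !eq(x, 0) : "zero",
                        1         : "positive");
  }

  def S : Stats<[1, 2, 3], 0>;

Under these assumptions, ``S.Sum`` is 6, ``S.Count`` is 3, and ``S.Sign`` is ``"zero"``.
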
- -Classes and definitions ------------------------ - -As mentioned in the :doc:`introduction <index>`, classes and definitions (collectively known as -'records') in TableGen are the main high-level unit of information that TableGen -collects. Records are defined with a ``def`` or ``class`` keyword, the record -name, and an optional list of "`template arguments`_". If the record has -superclasses, they are specified as a comma separated list that starts with a -colon character ("``:``"). If `value definitions`_ or `let expressions`_ are -needed for the class, they are enclosed in curly braces ("``{}``"); otherwise, -the record ends with a semicolon. - -Here is a simple TableGen file: - -.. code-block:: text - - class C { bit V = 1; } - def X : C; - def Y : C { - string Greeting = "hello"; - } - -This example defines two definitions, ``X`` and ``Y``, both of which derive from -the ``C`` class. Because of this, they both get the ``V`` bit value. The ``Y`` -definition also gets the Greeting member as well. - -In general, classes are useful for collecting together the commonality between a -group of records and isolating it in a single place. Also, classes permit the -specification of default values for their subclasses, allowing the subclasses to -override them as they wish. - -.. _value definition: -.. _value definitions: - -Value definitions -^^^^^^^^^^^^^^^^^ - -Value definitions define named entries in records. A value must be defined -before it can be referred to as the operand for another value definition or -before the value is reset with a `let expression`_. A value is defined by -specifying a `TableGen type`_ and a name. If an initial value is available, it -may be specified after the type with an equal sign. Value definitions require -terminating semicolons. - -.. _let expression: -.. _let expressions: -.. 
_"let" expressions within a record: - -'let' expressions -^^^^^^^^^^^^^^^^^ - -A record-level let expression is used to change the value of a value definition -in a record. This is primarily useful when a superclass defines a value that a -derived class or definition wants to override. Let expressions consist of the -'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new -value. For example, a new class could be added to the example above, redefining -the ``V`` field for all of its subclasses: - -.. code-block:: text - - class D : C { let V = 0; } - def Z : D; - -In this case, the ``Z`` definition will have a zero value for its ``V`` value, -despite the fact that it derives (indirectly) from the ``C`` class, because the -``D`` class overrode its value. - -References between variables in a record are substituted late, which gives -``let`` expressions unusual power. Consider this admittedly silly example: - -.. code-block:: text - - class A<int x> { - int Y = x; - int Yplus1 = !add(Y, 1); - int xplus1 = !add(x, 1); - } - def Z : A<5> { - let Y = 10; - } - -The value of ``Z.xplus1`` will be 6, but the value of ``Z.Yplus1`` is 11. Use -this power wisely. - -.. _template arguments: - -Class template arguments -^^^^^^^^^^^^^^^^^^^^^^^^ - -TableGen permits the definition of parameterized classes as well as normal -concrete classes. Parameterized TableGen classes specify a list of variable -bindings (which may optionally have defaults) that are bound when used. Here is -a simple example: - -.. 
code-block:: text - - class FPFormat <bits<3> val> { - bits<3> Value = val; - } - def NotFP : FPFormat<0>; - def ZeroArgFP : FPFormat<1>; - def OneArgFP : FPFormat<2>; - def OneArgFPRW : FPFormat<3>; - def TwoArgFP : FPFormat<4>; - def CompareFP : FPFormat<5>; - def CondMovFP : FPFormat<6>; - def SpecialFP : FPFormat<7>; - -In this case, template arguments are used as a space efficient way to specify a -list of "enumeration values", each with a "``Value``" field set to the specified -integer. - -The more esoteric forms of `TableGen expressions`_ are useful in conjunction -with template arguments. As an example: - -.. code-block:: text - - class ModRefVal <bits<2> val> { - bits<2> Value = val; - } - - def None : ModRefVal<0>; - def Mod : ModRefVal<1>; - def Ref : ModRefVal<2>; - def ModRef : ModRefVal<3>; - - class Value <ModRefVal MR> { - // Decode some information into a more convenient format, while providing - // a nice interface to the user of the "Value" class. - bit isMod = MR.Value{0}; - bit isRef = MR.Value{1}; - - // other stuff... - } - - // Example uses - def bork : Value<Mod>; - def zork : Value<Ref>; - def hork : Value<ModRef>; - -This is obviously a contrived example, but it shows how template arguments can -be used to decouple the interface provided to the user of the class from the -actual internal data representation expected by the class. In this case, -running ``llvm-tblgen`` on the example prints the following definitions: - -.. code-block:: text - - def bork { // Value - bit isMod = 1; - bit isRef = 0; - } - def hork { // Value - bit isMod = 1; - bit isRef = 1; - } - def zork { // Value - bit isMod = 0; - bit isRef = 1; - } - -This shows that TableGen was able to dig into the argument and extract a piece -of information that was requested by the designer of the "Value" class. For -more realistic examples, please see existing users of TableGen, such as the X86 -backend. 
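
The section above notes that template arguments may optionally have defaults, but none of its examples uses one. A small sketch of that case follows; the names ``Reg``, ``n``, ``enc``, ``R0``, and ``R1`` are hypothetical and serve only to illustrate the syntax.

.. code-block:: text

  // 'enc' has a default, so it may be omitted when the class is used.
  class Reg<string n, int enc = 0> {
    string Name = n;
    int Encoding = enc;
  }

  def R0 : Reg<"r0">;      // Encoding takes the default value 0
  def R1 : Reg<"r1", 1>;   // Encoding is given explicitly as 1

Arguments with defaults are placed last here so that the leading arguments can still be supplied positionally.
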
- -Multiclass definitions and instances -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -While classes with template arguments are a good way to factor commonality -between two instances of a definition, multiclasses allow a convenient notation -for defining multiple definitions at once (instances of implicitly constructed -classes). For example, consider an 3-address instruction set whose instructions -come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``" -(e.g. SPARC). In this case, you'd like to specify in one place that this -commonality exists, then in a separate place indicate what all the ops are. - -Here is an example TableGen fragment that shows this idea: - -.. code-block:: text - - def ops; - def GPR; - def Imm; - class inst; - - multiclass ri_inst { - def _rr : inst; - def _ri : inst; - } - - // Instantiations of the ri_inst multiclass. - defm ADD : ri_inst<0b111, "add">; - defm SUB : ri_inst<0b101, "sub">; - defm MUL : ri_inst<0b100, "mul">; - ... - -The name of the resultant definitions has the multidef fragment names appended -to them, so this defines ``ADD_rr``, ``ADD_ri``, ``SUB_rr``, etc. A defm may -inherit from multiple multiclasses, instantiating definitions from each -multiclass. Using a multiclass this way is exactly equivalent to instantiating -the classes multiple times yourself, e.g. by writing: - -.. code-block:: text - - def ops; - def GPR; - def Imm; - class inst; - - class rrinst - : inst; - - class riinst - : inst; - - // Instantiations of the ri_inst multiclass. - def ADD_rr : rrinst<0b111, "add">; - def ADD_ri : riinst<0b111, "add">; - def SUB_rr : rrinst<0b101, "sub">; - def SUB_ri : riinst<0b101, "sub">; - def MUL_rr : rrinst<0b100, "mul">; - def MUL_ri : riinst<0b100, "mul">; - ... - -A ``defm`` can also be used inside a multiclass providing several levels of -multiclass instantiations. - -.. 
code-block:: text - - class Instruction opc, string Name> { - bits<4> opcode = opc; - string name = Name; - } - - multiclass basic_r opc> { - def rr : Instruction; - def rm : Instruction; - } - - multiclass basic_s opc> { - defm SS : basic_r; - defm SD : basic_r; - def X : Instruction; - } - - multiclass basic_p opc> { - defm PS : basic_r; - defm PD : basic_r; - def Y : Instruction; - } - - defm ADD : basic_s<0xf>, basic_p<0xf>; - ... - - // Results - def ADDPDrm { ... - def ADDPDrr { ... - def ADDPSrm { ... - def ADDPSrr { ... - def ADDSDrm { ... - def ADDSDrr { ... - def ADDY { ... - def ADDX { ... - -``defm`` declarations can inherit from classes too, the rule to follow is that -the class list must start after the last multiclass, and there must be at least -one multiclass before them. - -.. code-block:: text - - class XD { bits<4> Prefix = 11; } - class XS { bits<4> Prefix = 12; } - - class I op> { - bits<4> opcode = op; - } - - multiclass R { - def rr : I<4>; - def rm : I<2>; - } - - multiclass Y { - defm SS : R, XD; - defm SD : R, XS; - } - - defm Instr : Y; - - // Results - def InstrSDrm { - bits<4> opcode = { 0, 0, 1, 0 }; - bits<4> Prefix = { 1, 1, 0, 0 }; - } - ... - def InstrSSrr { - bits<4> opcode = { 0, 1, 0, 0 }; - bits<4> Prefix = { 1, 0, 1, 1 }; - } - -File scope entities -------------------- - -File inclusion -^^^^^^^^^^^^^^ - -TableGen supports the '``include``' token, which textually substitutes the -specified file in place of the include directive. The filename should be -specified as a double quoted string immediately after the '``include``' keyword. -Example: - -.. code-block:: text - - include "foo.td" - -'let' expressions -^^^^^^^^^^^^^^^^^ - -"Let" expressions at file scope are similar to `"let" expressions within a -record`_, except they can specify a value binding for multiple records at a -time, and may be useful in certain other cases. 
File-scope let expressions are -really just another way that TableGen allows the end-user to factor out -commonality from the records. - -File-scope "let" expressions take a comma-separated list of bindings to apply, -and one or more records to bind the values in. Here are some examples: - -.. code-block:: text - - let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in - def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; - - let isCall = 1 in - // All calls clobber the non-callee saved registers... - let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, - MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, - XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in { - def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops), - "call\t${dst:call}", []>; - def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), - "call\t{*}$dst", [(X86call GR32:$dst)]>; - def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), - "call\t{*}$dst", []>; - } - -File-scope "let" expressions are often useful when a couple of definitions need -to be added to several records, and the records do not otherwise need to be -opened, as in the case with the ``CALL*`` instructions above. - -It's also possible to use "let" expressions inside multiclasses, providing more -ways to factor out commonality from the records, specially if using several -levels of multiclass instantiations. This also avoids the need of using "let" -expressions within subsequent records inside a multiclass. - -.. 
code-block:: text - - multiclass basic_r opc> { - let Predicates = [HasSSE2] in { - def rr : Instruction; - def rm : Instruction; - } - let Predicates = [HasSSE3] in - def rx : Instruction; - } - - multiclass basic_ss opc> { - let IsDouble = 0 in - defm SS : basic_r; - - let IsDouble = 1 in - defm SD : basic_r; - } - - defm ADD : basic_ss<0xf>; - -Looping -^^^^^^^ - -TableGen supports the '``foreach``' block, which textually replicates the loop -body, substituting iterator values for iterator references in the body. -Example: - -.. code-block:: text - - foreach i = [0, 1, 2, 3] in { - def R#i : Register<...>; - def F#i : Register<...>; - } - -This will create objects ``R0``, ``R1``, ``R2`` and ``R3``. ``foreach`` blocks -may be nested. If there is only one item in the body the braces may be -elided: - -.. code-block:: text - - foreach i = [0, 1, 2, 3] in - def R#i : Register<...>; - -Code Generator backend info -=========================== - -Expressions used by code generator to describe instructions and isel patterns: - -``(implicit a)`` - an implicitly defined physical register. This tells the dag instruction - selection emitter the input pattern's extra definitions matches implicit - physical register definitions. - diff --git a/llvm/docs/TableGen/LangRef.rst b/llvm/docs/TableGen/LangRef.rst deleted file mode 100644 index bb10b43..0000000 --- a/llvm/docs/TableGen/LangRef.rst +++ /dev/null @@ -1,556 +0,0 @@ -=========================== -TableGen Language Reference -=========================== - -.. contents:: - :local: - -.. warning:: - This document is extremely rough. If you find something lacking, please - fix it, file a documentation bug, or ask about it on llvm-dev. - -Introduction -============ - -This document is meant to be a normative spec about the TableGen language -in and of itself (i.e. how to understand a given construct in terms of how -it affects the final set of records represented by the TableGen file). 
If -you are unsure if this document is really what you are looking for, please -read the :doc:`introduction to TableGen <index>` first. - -Notation -======== - -The lexical and syntax notation used here is intended to imitate -`Python's`_. In particular, for lexical definitions, the productions -operate at the character level and there is no implied whitespace between -elements. The syntax definitions operate at the token level, so there is -implied whitespace between tokens. - -.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation - -Lexical Analysis -================ - -TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``) -comments. TableGen also provides simple `Preprocessing Support`_. - -The following is a listing of the basic punctuation tokens:: - - - + [ ] { } ( ) < > : ; . = ? # - -Numeric literals take one of the following forms: - -.. TableGen actually will lex some pretty strange sequences an interpret - them as numbers. What is shown here is an attempt to approximate what it - "should" accept. - -.. productionlist:: - TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` - DecimalInteger: ["+" | "-"] ("0"..."9")+ - HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ - BinInteger: "0b" ("0" | "1")+ - -One aspect to note is that the :token:`DecimalInteger` token *includes* the -``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as -most languages do. - -Also note that :token:`BinInteger` creates a value of type ``bits<n>`` -(where ``n`` is the number of bits). This will implicitly convert to -integers when needed. - -TableGen has identifier-like tokens: - -.. productionlist:: - ualpha: "a"..."z" | "A"..."Z" | "_" - TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* - TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")* - -Note that unlike most languages, TableGen allows :token:`TokIdentifier` to -begin with a number. 
In case of ambiguity, a token will be interpreted as a -numeric literal rather than an identifier. - -TableGen also has two string-like literals: - -.. productionlist:: - TokString: '"' '"' - TokCodeFragment: "[{" "}]" - -:token:`TokCodeFragment` is essentially a multiline string literal -delimited by ``[{`` and ``}]``. - -.. note:: - The current implementation accepts the following C-like escapes:: - - \\ \' \" \t \n - -TableGen also has the following keywords:: - - bit bits class code dag - def foreach defm field in - int let list multiclass string - if then else - -TableGen also has "bang operators" which have a -wide variety of meanings: - -.. productionlist:: - BangOperator: one of - :!eq !if !head !tail !con - :!add !shl !sra !srl !and - :!or !empty !subst !foreach !strconcat - :!cast !listconcat !size !foldl - :!isa !dag !le !lt !ge - :!gt !ne !mul !listsplat !setop - :!getop - -TableGen also has !cond operator that needs a slightly different -syntax compared to other "bang operators": - -.. productionlist:: - CondOperator: !cond - - -Syntax -====== - -TableGen has an ``include`` mechanism. It does not play a role in the -syntax per se, since it is lexically replaced with the contents of the -included file. - -.. productionlist:: - IncludeDirective: "include" `TokString` - -TableGen's top-level production consists of "objects". - -.. productionlist:: - TableGenFile: `Object`* - Object: `Class` | `Def` | `Defm` | `Defset` | `Defvar` | `Let` | - `MultiClass` | `Foreach` | `If` - -``class``\es ------------- - -.. productionlist:: - Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody` - TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">" - -A ``class`` declaration creates a record which other records can inherit -from. A class can be parameterized by a list of "template arguments", whose -values can be used in the class body. - -A given class can only be defined once. 
A ``class`` declaration is -considered to define the class if any of the following is true: - -.. break ObjectBody into its constituents so that they are present here? - -#. The :token:`TemplateArgList` is present. -#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty. -#. The :token:`BaseClassList` in the :token:`ObjectBody` is present. - -You can declare an empty class by giving an empty :token:`TemplateArgList` -and an empty :token:`ObjectBody`. This can serve as a restricted form of -forward declaration: note that records deriving from the forward-declared -class will inherit no fields from it since the record expansion is done -when the record is parsed. - -Every class has an implicit template argument called ``NAME``, which is set -to the name of the instantiating ``def`` or ``defm``. The result is undefined -if the class is instantiated by an anonymous record. - -Declarations ------------- - -.. Omitting mention of arcane "field" prefix to discourage its use. - -The declaration syntax is pretty much what you would expect as a C++ -programmer. - -.. productionlist:: - Declaration: `Type` `TokIdentifier` ["=" `Value`] - -It assigns the value to the identifier. - -Types ------ - -.. productionlist:: - Type: "string" | "code" | "bit" | "int" | "dag" - :| "bits" "<" `TokInteger` ">" - :| "list" "<" `Type` ">" - :| `ClassID` - ClassID: `TokIdentifier` - -Both ``string`` and ``code`` correspond to the string type; the difference -is purely to indicate programmer intention. - -The :token:`ClassID` must identify a class that has been previously -declared or defined. - -Values ------- - -.. productionlist:: - Value: `SimpleValue` `ValueSuffix`* - ValueSuffix: "{" `RangeList` "}" - :| "[" `RangeList` "]" - :| "." 
`TokIdentifier` - RangeList: `RangePiece` ("," `RangePiece`)* - RangePiece: `TokInteger` - :| `TokInteger` "-" `TokInteger` - :| `TokInteger` `TokInteger` - -The peculiar last form of :token:`RangePiece` is due to the fact that the -"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as -two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``, -instead of "1", "-", and "5". -The :token:`RangeList` can be thought of as specifying "list slice" in some -contexts. - - -:token:`SimpleValue` has a number of forms: - - -.. productionlist:: - SimpleValue: `TokIdentifier` - -The value will be the variable referenced by the identifier. It can be one -of: - -.. The code for this is exceptionally abstruse. These examples are a - best-effort attempt. - -* name of a ``def``, such as the use of ``Bar`` in:: - - def Bar : SomeClass { - int X = 5; - } - - def Foo { - SomeClass Baz = Bar; - } - -* value local to a ``def``, such as the use of ``Bar`` in:: - - def Foo { - int Bar = 5; - int Baz = Bar; - } - - Values defined in superclasses can be accessed the same way. - -* a template arg of a ``class``, such as the use of ``Bar`` in:: - - class Foo<int Bar> { - int Baz = Bar; - } - -* value local to a ``class``, such as the use of ``Bar`` in:: - - class Foo { - int Bar = 5; - int Baz = Bar; - } - -* a template arg to a ``multiclass``, such as the use of ``Bar`` in:: - - multiclass Foo<int Bar> { - def : SomeClass<Bar>; - } - -* the iteration variable of a ``foreach``, such as the use of ``i`` in:: - - foreach i = 0-5 in - def Foo#i; - -* a variable defined by ``defset`` or ``defvar`` - -* the implicit template argument ``NAME`` in a ``class`` or ``multiclass`` - -.. productionlist:: - SimpleValue: `TokInteger` - -This represents the numeric value of the integer. - -.. productionlist:: - SimpleValue: `TokString`+ - -Multiple adjacent string literals are concatenated like in C/C++. The value -is the concatenation of the strings. - -.. 
productionlist:: - SimpleValue: `TokCodeFragment` - -The value is the string value of the code fragment. - -.. productionlist:: - SimpleValue: "?" - -``?`` represents an "unset" initializer. - -.. productionlist:: - SimpleValue: "{" `ValueList` "}" - ValueList: [`ValueListNE`] - ValueListNE: `Value` ("," `Value`)* - -This represents a sequence of bits, as would be used to initialize a -``bits<n>`` field (where ``n`` is the number of bits). - -.. productionlist:: - SimpleValue: `ClassID` "<" `ValueListNE` ">" - -This generates a new anonymous record definition (as would be created by an -unnamed ``def`` inheriting from the given class with the given template -arguments) and the value is the value of that record definition. - -.. productionlist:: - SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"] - -A list initializer. The optional :token:`Type` can be used to indicate a -specific element type, otherwise the element type will be deduced from the -given values. - -.. The initial `DagArg` of the dag must start with an identifier or - !cast, but this is more of an implementation detail and so for now just - leave it out. - -.. productionlist:: - SimpleValue: "(" `DagArg` [`DagArgList`] ")" - DagArgList: `DagArg` ("," `DagArg`)* - DagArg: `Value` [":" `TokVarName`] | `TokVarName` - -The initial :token:`DagArg` is called the "operator" of the dag. - -.. productionlist:: - SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" - :| `CondOperator` "(" `CondVal` ("," `CondVal`)* ")" - CondVal: `Value` ":" `Value` - -Bodies ------- - -.. productionlist:: - ObjectBody: `BaseClassList` `Body` - BaseClassList: [":" `BaseClassListNE`] - BaseClassListNE: `SubClassRef` ("," `SubClassRef`)* - SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"] - DefmID: `TokIdentifier` - -The version with the :token:`MultiClassID` is only valid in the -:token:`BaseClassList` of a ``defm``. -The :token:`MultiClassID` should be the name of a ``multiclass``. - -..
put this somewhere else - -It is after parsing the base class list that the "let stack" is applied. - -.. productionlist:: - Body: ";" | "{" BodyList "}" - BodyList: BodyItem* - BodyItem: `Declaration` ";" - :| "let" `TokIdentifier` [ "{" `RangeList` "}" ] "=" `Value` ";" - :| `Defvar` - -The ``let`` form allows overriding the value of an inherited field. - -``def`` -------- - -.. productionlist:: - Def: "def" [`Value`] `ObjectBody` - -Defines a record whose name is given by the optional :token:`Value`. The value -is parsed in a special mode where global identifiers (records and variables -defined by ``defset``, and variables defined at global scope by ``defvar``) are -not recognized, and all unrecognized identifiers are interpreted as strings. - -If no name is given, the record is anonymous. The final name of anonymous -records is undefined, but globally unique. - -Special handling occurs if this ``def`` appears inside a ``multiclass`` or -a ``foreach``. - -When a non-anonymous record is defined in a multiclass and the given name -does not contain a reference to the implicit template argument ``NAME``, such -a reference will automatically be prepended. That is, the following are -equivalent inside a multiclass:: - - def Foo; - def NAME#Foo; - -``defm`` --------- - -.. productionlist:: - Defm: "defm" [`Value`] ":" `BaseClassListNE` ";" - -The :token:`BaseClassList` is a list of at least one ``multiclass`` and any -number of ``class``'s. The ``multiclass``'s must occur before any ``class``'s. - -Instantiates all records defined in all given ``multiclass``'s and adds the -given ``class``'s as superclasses. - -The name is parsed in the same special mode used by ``def``. 
If the name is -missing, a globally unique string is used instead (but instantiated records -are not considered to be anonymous, unless they were originally defined by an -anonymous ``def``). That is, the following have different semantics:: - - defm : SomeMultiClass<...>; // some globally unique name - defm "" : SomeMultiClass<...>; // empty name string - -When it occurs inside a multiclass, the second variant is equivalent to -``defm NAME : ...``. More generally, when ``defm`` occurs in a multiclass and -its name does not contain a reference to the implicit template argument -``NAME``, such a reference will automatically be prepended. That is, the -following are equivalent inside a multiclass:: - - defm Foo : SomeMultiClass<...>; - defm NAME#Foo : SomeMultiClass<...>; - -``defset`` ----------- -.. productionlist:: - Defset: "defset" `Type` `TokIdentifier` "=" "{" `Object`* "}" - -All records defined inside the braces via ``def`` and ``defm`` are collected -in a globally accessible list of the given name (in addition to being added -to the global collection of records as usual). Anonymous records created inside -initializer expressions using the ``Class<args...>`` syntax are never collected -in a defset. - -The given type must be ``list<A>``, where ``A`` is some class. It is an error -to define a record (via ``def`` or ``defm``) inside the braces which doesn't -derive from ``A``. - -``defvar`` ----------- -.. productionlist:: - Defvar: "defvar" `TokIdentifier` "=" `Value` ";" - -The identifier on the left of the ``=`` is defined to be a global or local -variable, whose value is given by the expression on the right of the ``=``. The -type of the variable is automatically inferred. - -A ``defvar`` statement at the top level of the file defines a global variable, -in the same scope used by ``defset``.
If a ``defvar`` statement appears inside -any other construction, including classes, multiclasses and ``foreach`` -statements, then the variable is scoped to the inside of that construction -only. - -In contexts where the ``defvar`` statement will be encountered multiple times, -the definition is re-evaluated for each instance. For example, a ``defvar`` -inside a ``foreach`` can construct a value based on the iteration variable, -which will be different every time round the loop; a ``defvar`` inside a -templated class or multiclass can have a definition depending on the template -parameters. - -Variables local to a ``foreach`` go out of scope at the end of each loop -iteration, so their previous value is not accessible in the next iteration. (It -won't work to ``defvar i=!add(i,1)`` each time you go round the loop.) - -In general, ``defvar`` variables are immutable once they are defined. It is an -error to define the same variable name twice in the same scope (but legal to -shadow the first definition temporarily in an inner scope). - -``foreach`` ------------ - -.. productionlist:: - Foreach: "foreach" `ForeachDeclaration` "in" "{" `Object`* "}" - :| "foreach" `ForeachDeclaration` "in" `Object` - ForeachDeclaration: ID "=" ( "{" `RangeList` "}" | `RangePiece` | `Value` ) - -The value assigned to the variable in the declaration is iterated over and -the object or object list is reevaluated with the variable set at each -iterated value. - -Note that the productions involving RangeList and RangePiece have precedence -over the more generic value parsing based on the first token. - -``if`` ------- - -.. productionlist:: - If: "if" `Value` "then" `IfBody` - :| "if" `Value` "then" `IfBody` "else" `IfBody` - IfBody: "{" `Object`* "}" | `Object` - -The value expression after the ``if`` keyword is evaluated, and if it evaluates -to true (in the same sense used by the ``!if`` operator), then the object -definition(s) after the ``then`` keyword are executed. 
Otherwise, if there is -an ``else`` keyword, the definition(s) after the ``else`` are executed instead. - -Because the braces around the ``then`` clause are optional, this grammar rule -has the usual ambiguity about dangling ``else`` clauses, and it is resolved in -the usual way: in a case like ``if v1 then if v2 then {...} else {...}``, the -``else`` binds to the inner ``if`` rather than the outer one. - -Top-Level ``let`` ------------------ - -.. productionlist:: - Let: "let" `LetList` "in" "{" `Object`* "}" - :| "let" `LetList` "in" `Object` - LetList: `LetItem` ("," `LetItem`)* - LetItem: `TokIdentifier` [`RangeList`] "=" `Value` - -This is effectively equivalent to ``let`` inside the body of a record -except that it applies to multiple records at a time. The bindings are -applied at the end of parsing the base classes of a record. - -``multiclass`` --------------- - -.. productionlist:: - MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`] - : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}" - BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)* - MultiClassID: `TokIdentifier` - MultiClassObject: `Def` | `Defm` | `Let` | `Foreach` - -Preprocessing Support -===================== - -TableGen's embedded preprocessor is only intended for conditional compilation. -It supports the following directives: - -.. 
productionlist:: - LineBegin: ^ - LineEnd: "\n" | "\r" | EOF - WhiteSpace: " " | "\t" - CStyleComment: "/*" (.* - "*/") "*/" - BCPLComment: "//" (.* - `LineEnd`) `LineEnd` - WhiteSpaceOrCStyleComment: `WhiteSpace` | `CStyleComment` - WhiteSpaceOrAnyComment: `WhiteSpace` | `CStyleComment` | `BCPLComment` - MacroName: `ualpha` (`ualpha` | "0"..."9")* - PrepDefine: `LineBegin` (`WhiteSpaceOrCStyleComment`)* - : "#define" (`WhiteSpace`)+ `MacroName` - : (`WhiteSpaceOrAnyComment`)* `LineEnd` - PrepIfdef: `LineBegin` (`WhiteSpaceOrCStyleComment`)* - : "#ifdef" (`WhiteSpace`)+ `MacroName` - : (`WhiteSpaceOrAnyComment`)* `LineEnd` - PrepElse: `LineBegin` (`WhiteSpaceOrCStyleComment`)* - : "#else" (`WhiteSpaceOrAnyComment`)* `LineEnd` - PrepEndif: `LineBegin` (`WhiteSpaceOrCStyleComment`)* - : "#endif" (`WhiteSpaceOrAnyComment`)* `LineEnd` - PrepRegContentException: `PrepIfdef` | `PrepElse` | `PrepEndif` | EOF - PrepRegion: .* - `PrepRegContentException` - :| `PrepIfdef` - : (`PrepRegion`)* - : [`PrepElse`] - : (`PrepRegion`)* - : `PrepEndif` - -:token:`PrepRegion` may occur anywhere in a TD file, as long as it matches -the grammar specification. - -:token:`PrepDefine` allows defining a :token:`MacroName` so that any following -:token:`PrepIfdef` - :token:`PrepElse` preprocessing region part and -:token:`PrepIfdef` - :token:`PrepEndif` preprocessing region -are enabled for TableGen tokens parsing. - -A preprocessing region, starting (i.e. having its :token:`PrepIfdef`) in a file, -must end (i.e. have its :token:`PrepEndif`) in the same file. - -A :token:`MacroName` may be defined externally by using ``{ -D }`` -option of TableGen. diff --git a/llvm/docs/TableGen/ProgRef.rst b/llvm/docs/TableGen/ProgRef.rst new file mode 100644 index 0000000..e331585 --- /dev/null +++ b/llvm/docs/TableGen/ProgRef.rst @@ -0,0 +1,1709 @@ +=============================== +TableGen Programmer's Reference +=============================== + +.. sectnum:: + +.. 
contents:: + :local: + +Introduction +============ + +The purpose of TableGen is to generate complex output files based on +information from source files that are significantly easier to code than the +output files would be, and also easier to maintain and modify over time. The +information is coded in a declarative style involving classes and records, +which are then processed by TableGen. The internalized records are passed on +to various backends, which extract information from a subset of the records +and generate one or more output files. These output files are typically +``.inc`` files for C++, but may be any type of file that the backend +developer needs. + +This document describes the LLVM TableGen facility in detail. It is intended +for the programmer who is using TableGen to produce tables for a project. If +you are looking for a simple overview, check out :doc:`TableGen Overview <./index>`. + +An example of a backend is ``RegisterInfo``, which generates the register +file information for a particular target machine, for use by the LLVM +target-independent code generator. See :doc:`TableGen Backends <./BackEnds>` for +a description of the LLVM TableGen backends. Here are a few of the things +backends can do. + +* Generate the register file information for a particular target machine. + +* Generate the instruction definitions for a target. + +* Generate the patterns that the code generator uses to match instructions + to intermediate representation (IR) nodes. + +* Generate semantic attribute identifiers for Clang. + +* Generate abstract syntax tree (AST) declaration node definitions for Clang. + +* Generate AST statement node definitions for Clang. + + +Concepts +-------- + +TableGen source files contain two primary items: *abstract records* and +*concrete records*. In this and other TableGen documents, abstract records +are called *classes.* (These classes are different from C++ classes and do +not map onto them.) 
In addition, concrete records are usually just called +records, although sometimes the term *record* refers to both classes and +concrete records. The distinction should be clear in context. + +Classes and concrete records have a unique *name*, either chosen by +the programmer or generated by TableGen. Associated with that name +is a list of *fields* with values and an optional list of *superclasses* +(sometimes called base or parent classes). The fields are the primary data that +backends will process. Note that TableGen assigns no meanings to fields; the +meanings are entirely up to the backends and the programs that incorporate +the output of those backends. + +A backend processes some subset of the concrete records built by the +TableGen parser and emits the output files. These files are usually C++ +``.inc`` files that are included by the programs that require the data in +those records. However, a backend can produce any type of output files. For +example, it could produce a data file containing messages tagged with +identifiers and substitution parameters. In a complex use case such as the +LLVM code generator, there can be many concrete records and some of them can +have an unexpectedly large number of fields, resulting in large output files. + +In order to reduce the complexity of TableGen files, classes are used to +abstract out groups of record fields. For example, a few classes may +abstract the concept of a machine register file, while other classes may +abstract the instruction formats, and still others may abstract the +individual instructions. TableGen allows an arbitrary hierarchy of classes, +so that the abstract classes for two concepts can share a third superclass that +abstracts common "sub-concepts" from the two original concepts. + +In order to make classes more useful, a concrete record (or another class) +can request a class as a superclass and pass *template arguments* to it. 
+These template arguments can be used in the fields of the superclass to +initialize them in a custom manner. That is, record or class ``A`` can +request superclass ``S`` with one set of template arguments, while record or class +``B`` can request ``S`` with a different set of arguments. Without template +arguments, many more classes would be required, one for each combination of +the template arguments. + +Both classes and concrete records can include fields that are uninitialized. +The uninitialized "value" is represented by a question mark (``?``). Classes +often have uninitialized fields that are expected to be filled in when those +classes are inherited by concrete records. Even so, some fields of concrete +records may remain uninitialized. + +TableGen provides *multiclasses* to collect a group of record definitions in +one place. A multiclass is a sort of macro that can be "invoked" to define +multiple concrete records all at once. A multiclass can inherit from other +multiclasses, which means that the multiclass inherits all the definitions +from its parent multiclasses. + +`Appendix B: Sample Record`_ illustrates a complex record in the Intel X86 +target and the simple way in which it is defined. + +Source Files +============ + +TableGen source files are plain ASCII text files. The files can contain +statements, comments, and blank lines (see `Lexical Analysis`_). The standard file +extension for TableGen files is ``.td``. + +TableGen files can grow quite large, so there is an include mechanism that +allows one file to include the content of another file (see `Include +Files`_). This allows large files to be broken up into smaller ones, and +also provides a simple library mechanism where multiple source files can +include the same library file. + +TableGen supports a simple preprocessor that can be used to conditionalize +portions of ``.td`` files. See `Preprocessing Facilities`_ for more +information. 
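+
+As a minimal sketch of how these facilities combine, assuming a hypothetical
+library file ``Common.td`` and a hypothetical macro name ``ENABLE_EXTRAS``
+(neither is part of LLVM), a source file might look like this:
+
+.. code-block:: text
+
+  // Pull in shared definitions from a (hypothetical) library file.
+  include "Common.td"
+
+  // Define a macro so that the conditional region below is enabled.
+  #define ENABLE_EXTRAS
+
+  #ifdef ENABLE_EXTRAS
+  // This record is parsed only when ENABLE_EXTRAS is defined.
+  def ExtraRecord;
+  #endif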
+ +Lexical Analysis +================ + +The lexical and syntax notation used here is intended to imitate +`Python's`_ notation. In particular, for lexical definitions, the productions +operate at the character level and there is no implied whitespace between +elements. The syntax definitions operate at the token level, so there is +implied whitespace between tokens. + +.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation + +TableGen supports BCPL-style comments (``// ...``) and nestable C-style +comments (``/* ... */``). +TableGen also provides simple `Preprocessing Facilities`_. + +Formfeed characters may be used freely in files to produce page breaks when +the file is printed for review. + +The following are the basic punctuation tokens:: + + - + [ ] { } ( ) < > : ; . = ? # + +Literals +-------- + +Numeric literals take one of the following forms: + +.. productionlist:: + TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` + DecimalInteger: ["+" | "-"] ("0"..."9")+ + HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ + BinInteger: "0b" ("0" | "1")+ + +Observe that the :token:`DecimalInteger` token includes the optional ``+`` +or ``-`` sign, unlike most languages where the sign would be treated as a +unary operator. + +TableGen has two kinds of string literals: + +.. productionlist:: + TokString: '"' (non-'"' characters and escapes) '"' + TokCodeFragment: "[{" (shortest text not containing "}]") "}]" + +A :token:`TokCodeFragment` is nothing more than a multi-line string literal +delimited by ``[{`` and ``}]``. It can break across lines. + +The current implementation accepts the following escape sequences:: + + \\ \' \" \t \n + +Identifiers +----------- + +TableGen has name- and identifier-like tokens, which are case-sensitive. + +.. 
productionlist:: + ualpha: "a"..."z" | "A"..."Z" | "_" + TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* + TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")* + +Note that, unlike most languages, TableGen allows :token:`TokIdentifier` to +begin with an integer. In case of ambiguity, a token is interpreted as a +numeric literal rather than an identifier. + +TableGen has the following reserved words, which cannot be used as +identifiers:: + + bit bits class code dag + def else foreach defm defset + defvar field if in include + int let list multiclass string + then + +.. warning:: + The ``field`` reserved word is deprecated. + +Bang operators +-------------- + +TableGen provides "bang operators" that have a wide variety of uses: + +.. productionlist:: + BangOperator: one of + : !add !and !cast !con !dag + : !empty !eq !foldl !foreach !ge + : !getop !gt !head !if !isa + : !le !listconcat !listsplat !lt !mul + : !ne !or !setop !shl !size + : !sra !srl !strconcat !subst !tail + +The ``!cond`` operator has a slightly different +syntax compared to other bang operators, so it is defined separately: + +.. productionlist:: + CondOperator: !cond + +See `Appendix A: Bang Operators`_ for a description of each bang operator. + +Include files +------------- + +TableGen has an include mechanism. The content of the included file +lexically replaces the ``include`` directive and is then parsed as if it was +originally in the main file. + +.. productionlist:: + IncludeDirective: "include" `TokString` + +Portions of the main file and included files can be conditionalized using +preprocessor directives. + +.. productionlist:: + PreprocessorDirective: "#define" | "#ifdef" | "#ifndef" + +Types +===== + +The TableGen language is statically typed, using a simple but complete type +system. Types are used to check for errors, to perform implicit conversions, +and to help interface designers constrain the allowed input. Every value is +required to have an associated type. 
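+
+As a small illustration (the class and field names are made up for this
+sketch), each field in a class body is declared with an explicit type:
+
+.. code-block:: text
+
+  class TypedFields {
+    bit Flag = 1;                  // single boolean bit
+    int Count = 42;                // integer
+    string Label = "example";      // character string
+    bits<4> Nibble = 7;            // integer 7 converted to four bits
+    list<int> Values = [1, 2, 3];  // list of int elements
+  }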
+ +TableGen supports a mixture of low-level types (e.g., ``bit``) and +high-level types (e.g., ``dag``). This flexibility allows you to describe a +wide range of records conveniently and compactly. + +.. productionlist:: + Type: "bit" | "int" | "string" | "code" | "dag" + :| "bits" "<" `TokInteger` ">" + :| "list" "<" `Type` ">" + :| `ClassID` + ClassID: `TokIdentifier` + +``bit`` + A ``bit`` is a boolean value that can be 0 or 1. + +``int`` + The ``int`` type represents a simple 64-bit integer value, such as 5 or + -42. + +``string`` + The ``string`` type represents an ordered sequence of characters of arbitrary + length. + +``code`` + The ``code`` type represents a code fragment. The values are the same as + those for the ``string`` type; the ``code`` type is provided just to indicate + the programmer's intention. + +``bits<``\ *n*\ ``>`` + The ``bits`` type is a fixed-size integer of arbitrary length *n* that + is treated as separate bits. These bits can be accessed individually. + A field of this type is useful for representing an instruction operation + code, register number, or address mode/register/displacement. The bits of + the field can be set individually or as subfields. For example, in an + instruction address, the addressing mode, base register number, and + displacement can be set separately. + +``list<``\ *type*\ ``>`` + This type represents a list whose elements are of the *type* specified in + angle brackets. The element type is arbitrary; it can even be another + list type. List elements are indexed from 0. + +``dag`` + This type represents a nestable directed acyclic graph (DAG) of nodes. + Each node has an operator and one or more operands. An operand can be + another ``dag`` object, allowing an arbitrary tree of nodes and edges. + As an example, DAGs are used to represent code and patterns for use by + the code generator instruction selection algorithms.
+ +:token:`ClassID` + Specifying a class name in a type context indicates + that the type of the defined value must + be a subclass of the specified class. This is useful in conjunction with + the ``list`` type; for example, to constrain the elements of the list to a + common base class (e.g., a ``list<Register>`` can only contain definitions + derived from the ``Register`` class). + The :token:`ClassID` must name a class that has been previously + declared or defined. + + +Values and Expressions +====================== + +There are many contexts in TableGen statements where a value is required. A +common example is in the definition of a record, where each field is +specified by a name and an optional value. TableGen allows for a reasonable +number of different forms when building up values. These forms allow the +TableGen file to be written in a syntax that is natural for the application. + +Note that all of the values have rules for converting them from one type to +another. For example, these rules allow you to assign a value like ``7`` +to an entity of type ``bits<4>``. + +.. productionlist:: + Value: `SimpleValue` `ValueSuffix`* + ValueSuffix: "{" `RangeList` "}" + :| "[" `RangeList` "]" + :| "." `TokIdentifier` + RangeList: `RangePiece` ("," `RangePiece`)* + RangePiece: `TokInteger` + :| `TokInteger` ".." `TokInteger` + :| `TokInteger` "-" `TokInteger` + :| `TokInteger` `TokInteger` + +.. warning:: + The peculiar last form of :token:`RangePiece` is due to the fact that the + "``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as + two consecutive tokens, with values ``1`` and ``-5``, + instead of "1", "-", and "5". + +Simple values +------------- + +The :token:`SimpleValue` has a number of forms. + +.. productionlist:: + SimpleValue: `TokInteger` | `TokString`+ | `TokCodeFragment` + +A value can be an integer literal, a string literal, or a code fragment +literal.
Multiple adjacent string literals are concatenated as in C/C++; the +simple value is the concatenation of the strings. Code fragments become +strings and then are indistinguishable from them. + +.. productionlist:: + SimpleValue2: "?" + +A question mark represents an uninitialized value. + +.. productionlist:: + SimpleValue3: "{" [`ValueList`] "}" + ValueList: `ValueListNE` + ValueListNE: `Value` ("," `Value`)* + +This value represents a sequence of bits, which can be used to initialize a +``bits<``\ *n*\ ``>`` field (note the braces). When doing so, the values +must represent a total of *n* bits. + +.. productionlist:: + SimpleValue4: "[" `ValueList` "]" ["<" `Type` ">"] + +This value is a list initializer (note the brackets). The values in brackets +are the elements of the list. The optional :token:`Type` can be used to +indicate a specific element type; otherwise the element type is inferred +from the given values. TableGen can usually infer the type, although +sometimes not when the value is the empty list (``[]``). + +.. productionlist:: + SimpleValue5: "(" `DagArg` [`DagArgList`] ")" + DagArgList: `DagArg` ("," `DagArg`)* + DagArg: `Value` [":" `TokVarName`] | `TokVarName` + +This represents a DAG initializer (note the parentheses). The first +:token:`DagArg` is called the "operator" of the DAG and must be a record. + +.. productionlist:: + SimpleValue6: `TokIdentifier` + +The resulting value is the value of the entity named by the identifier. The +possible identifiers are described here, but the descriptions will make more +sense after reading the remainder of this guide. + +.. The code for this is exceptionally abstruse. These examples are a + best-effort attempt. + +* A template argument of a ``class``, such as the use of ``Bar`` in:: + + class Foo <int Bar> { + int Baz = Bar; + } + +* The implicit template argument ``NAME`` in a ``class`` or ``multiclass`` + definition (see `NAME`_).
+ +* A field local to a ``class``, such as the use of ``Bar`` in:: + + class Foo { + int Bar = 5; + int Baz = Bar; + } + +* The name of a record definition, such as the use of ``Bar`` in the + definition of ``Foo``:: + + def Bar : SomeClass { + int X = 5; + } + + def Foo { + SomeClass Baz = Bar; + } + +* A field local to a record definition, such as the use of ``Bar`` in:: + + def Foo { + int Bar = 5; + int Baz = Bar; + } + + Fields inherited from the record's parent classes can be accessed the same way. + +* A template argument of a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo <int Bar> { + def : SomeClass<Bar>; + } + +* A variable defined with the ``defvar`` or ``defset`` statements. + +* The iteration variable of a ``foreach``, such as the use of ``i`` in:: + + foreach i = 0..5 in + def Foo#i; + +.. productionlist:: + SimpleValue7: `ClassID` "<" `ValueListNE` ">" + +This form creates a new anonymous record definition (as would be created by an +unnamed ``def`` inheriting from the given class with the given template +arguments; see `def`_) and the value is that record. (A field of the record can be +obtained using a suffix; see `Suffixed Values`_.) + +.. productionlist:: + SimpleValue8: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" + :| `CondOperator` "(" `CondClause` ("," `CondClause`)* ")" + CondClause: `Value` ":" `Value` + +The bang operators provide functions that are not available with the other +simple values. Except in the case of ``!cond``, a bang +operator takes a list of arguments enclosed in parentheses and performs some +function on those arguments, producing a value for that +bang operator. The ``!cond`` operator takes a list of pairs of arguments +separated by colons. See `Appendix A: Bang Operators`_ for a description of +each bang operator. + + +Suffixed values +--------------- + +The :token:`SimpleValue` values described above can be specified with +certain suffixes.
The purpose of a suffix is to obtain a subvalue of the +primary value. Here are the possible suffixes for some primary *value*. + +*value*\ ``{17}`` + The final value is bit 17 of the integer *value* (note the braces). + +*value*\ ``{8..15}`` + The final value is bits 8--15 of the integer *value*. The order of the + bits can be reversed by specifying ``{15..8}``. + +*value*\ ``[4..7,17,2..3,4]`` + The final value is a new list that is a slice of the list *value* (note + the brackets). The + new list contains elements 4, 5, 6, 7, 17, 2, 3, and 4. Elements may be + included multiple times and in any order. + +*value*\ ``.`` *field* + The final value is the value of the specified *field* in the specified + record *value*. + +Statements +========== + +The following statements may appear at the top level of TableGen source +files. + +.. productionlist:: + TableGenFile: `Statement`* + Statement: `Class` | `Def` | `Defm` | `Defset` | `Defvar` | `Foreach` + :| `If` | `Let` | `MultiClass` + +The following sections describe each of these top-level statements. + + +``class`` --- define an abstract record class +--------------------------------------------- + +A ``class`` statement defines an abstract record class from which other +classes and records can inherit. + +.. productionlist:: + Class: "class" `ClassID` [`TemplateArgList`] `RecordBody` + TemplateArgList: "<" `TemplateArgDecl` ("," `TemplateArgDecl`)* ">" + TemplateArgDecl: `Type` `TokIdentifier` ["=" `Value`] + +A class can be parameterized by a list of "template arguments," whose values +can be used in the class's record body. These template arguments are +specified each time the class is inherited by another class or record. + +If a template argument is not assigned a default value with ``=``, it is +uninitialized (has the "value" ``?``) and must be specified in the template +argument list when the class is inherited. If an argument is assigned a +default value, then it need not be specified in the argument list. 
The +template argument default values are evaluated from left to right. + +The :token:`RecordBody` is defined below. It can include a list of +superclasses from which the current class inherits, along with field definitions +and other statements. When a class ``C`` inherits from another class ``D``, +the fields of ``D`` are effectively merged into the fields of ``C``. + +A given class can only be defined once. A ``class`` statement is +considered to define the class if *any* of the following are true (the +:token:`RecordBody` elements are described below). + +* The :token:`TemplateArgList` is present, or +* The :token:`ParentClassList` in the :token:`RecordBody` is present, or +* The :token:`Body` in the :token:`RecordBody` is present and not empty. + +You can declare an empty class by specifying an empty :token:`TemplateArgList` +and an empty :token:`RecordBody`. This can serve as a restricted form of +forward declaration. Note that records derived from a forward-declared +class will inherit no fields from it, because those records are built when +their declarations are parsed, and thus before the class is finally defined. + +.. _NAME: + +Every class has an implicit template argument named ``NAME`` (uppercase), +which is bound to the name of the :token:`Def` or :token:`Defm` inheriting +the class. The value of ``NAME`` is undefined if the class is inherited by +an anonymous record. + +See `Examples: classes and records`_ for examples. + +Record Bodies +````````````` + +Record bodies appear in both class and record definitions. A record body can +include a parent class list, which specifies the classes from which the +current class or record inherits fields. Such classes are called the +superclasses or parent classes of the class or record. The record body also +includes the main body of the definition, which contains the specification +of the fields of the class or record. + +..
productionlist:: + RecordBody: `ParentClassList` `Body` + ParentClassList: [":" `ParentClassListNE`] + ParentClassListNE: `ClassRef` ("," `ClassRef`)* + ClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"] + +A :token:`ParentClassList` containing a :token:`MultiClassID` is valid only +in the class list of a ``defm`` statement. In that case, the ID must be the +name of a multiclass. + +.. productionlist:: + Body: ";" | "{" `BodyItem`* "}" + BodyItem: `Type` `TokIdentifier` ["=" `Value`] ";" + :| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";" + :| "defvar" `TokIdentifier` "=" `Value` ";" + +A field definition in the body specifies a field to be included in the class +or record. If no initial value is specified, then the field's value is +uninitialized. The type must be specified; TableGen will not infer it from +the value. + +The ``let`` form is used to reset a field to a new value. This can be done +for fields defined directly in the body or fields inherited from +superclasses. A :token:`RangeList` can be specified to reset certain bits +in a ``bit`` field. + +The ``defvar`` form defines a variable whose value can be used in other +value expressions within the body. The variable is not a field: it does not +become a field of the class or record being defined. Variables are provided +to hold temporary values while processing the body. See `Defvar in Record +Body`_ for more details. + +When class ``C2`` inherits from class ``C1``, it acquires all the field +definitions of ``C1``. As those definitions are merged into class ``C2``, any +template arguments passed to ``C1`` by ``C2`` are substituted into the +definitions. In other words, the abstract record fields defined by ``C1`` are +expanded with the template arguments before being merged into ``C2``. + + +.. _def: + +``def`` --- define a concrete record +------------------------------------ + +A ``def`` statement defines a new concrete record. + +.. 
productionlist:: + Def: "def" [`NameValue`] `RecordBody` + NameValue: `Value` + +The name value is optional. If specified, it is parsed in a special mode +where undefined (unrecognized) identifiers are interpreted as literal +strings. In particular, global identifiers are considered unrecognized. +These include global variables defined by ``defvar`` and ``defset``. + +If no name value is given, the record is *anonymous*. The final name of an +anonymous record is unspecified but globally unique. + +Special handling occurs if a ``def`` appears inside a ``multiclass`` +statement. See the ``multiclass`` section below for details. + +A record can inherit from one or more classes by specifying the +:token:`ParentClassList` clause at the beginning of its record body. All of +the fields in the parent classes are added to the record. If two or more +parent classes provide the same field, the record ends up with the field value +of the last parent class. + +As a special case, the name of a record can be passed in a template argument +to that record's superclasses. For example: + +.. code-block:: text + + class A { + dag the_dag = d; + } + + def rec1 : A<(ops rec1)> + +The DAG ``(ops rec1)`` is passed as a template argument to class ``A``. Notice +that the DAG includes ``rec1``, the record being defined. + +The steps taken to create a new record are somewhat complex. See `How +records are built`_. + +See `Examples: classes and records`_ for examples. + + +Examples: classes and records +----------------------------- + +Here is a simple TableGen file with one class and two record definitions. + +.. code-block:: text + + class C { + bit V = 1; + } + + def X : C; + def Y : C { + let V = 0; + string Greeting = "Hello!"; + } + +First, the abstract class ``C`` is defined. It has one field named ``V`` +that is a bit initialized to 1. + +Next, two records are defined, derived from class ``C``; that is, with ``C`` +as their superclass. Thus they both inherit the ``V`` field. 
+Record ``Y`` also defines another string field, ``Greeting``, which is
+initialized to ``"Hello!"``. In addition, ``Y`` overrides the inherited
+``V`` field, setting it to 0.
+
+A class is useful for isolating the common features of multiple records in
+one place. A class can initialize common fields to default values, but
+records inheriting from that class can override the defaults.
+
+TableGen supports the definition of parameterized classes as well as
+nonparameterized ones. Parameterized classes specify a list of variable
+declarations, which may optionally have defaults, that are bound when the
+class is specified as a superclass of another class or record.
+
+.. code-block:: text
+
+  class FPFormat <bits<3> val> {
+    bits<3> Value = val;
+  }
+
+  def NotFP      : FPFormat<0>;
+  def ZeroArgFP  : FPFormat<1>;
+  def OneArgFP   : FPFormat<2>;
+  def OneArgFPRW : FPFormat<3>;
+  def TwoArgFP   : FPFormat<4>;
+  def CompareFP  : FPFormat<5>;
+  def CondMovFP  : FPFormat<6>;
+  def SpecialFP  : FPFormat<7>;
+
+The purpose of the ``FPFormat`` class is to act as a sort of enumerated
+type. It provides a single field, ``Value``, which holds a 3-bit number. Its
+template argument, ``val``, is used to set the ``Value`` field.
+Each of the eight records is defined with ``FPFormat`` as its superclass. The
+enumeration value is passed in angle brackets as the template argument. Each
+record will inherit the ``Value`` field with the appropriate enumeration
+value.
+
+Here is a more complex example of classes with template arguments. First, we
+define a class similar to the ``FPFormat`` class above. It takes a template
+argument and uses it to initialize a field named ``Value``. Then we define
+four records that inherit the ``Value`` field with its four different
+integer values.
+
+.. code-block:: text
+
+  class ModRefVal <bits<2> val> {
+    bits<2> Value = val;
+  }
+
+  def None   : ModRefVal<0>;
+  def Mod    : ModRefVal<1>;
+  def Ref    : ModRefVal<2>;
+  def ModRef : ModRefVal<3>;
+
+This is somewhat contrived, but let's say we would like to examine the two
+bits of the ``Value`` field independently. We can define a class that
+accepts a ``ModRefVal`` record as a template argument and splits up its
+value into two fields, one bit each. Then we can define records that inherit from
+``ModRefBits`` and so acquire two fields from it, one for each bit in the
+``ModRefVal`` record passed as the template argument.
+
+.. code-block:: text
+
+  class ModRefBits <ModRefVal mrv> {
+    // Break the value up into its bits, which can provide a nice
+    // interface to the ModRefVal values.
+    bit isMod = mrv.Value{0};
+    bit isRef = mrv.Value{1};
+  }
+
+  // Example uses.
+  def foo   : ModRefBits<Mod>;
+  def bar   : ModRefBits<Ref>;
+  def snork : ModRefBits<ModRef>;
+
+This illustrates how one class can be defined to reorganize the
+fields in another class, thus hiding the internal representation of that
+other class.
+
+Running ``llvm-tblgen`` on the example prints the following definitions:
+
+.. code-block:: text
+
+  def bar {      // Value
+    bit isMod = 0;
+    bit isRef = 1;
+  }
+  def foo {      // Value
+    bit isMod = 1;
+    bit isRef = 0;
+  }
+  def snork {      // Value
+    bit isMod = 1;
+    bit isRef = 1;
+  }
+
+``let`` --- override fields in classes or records
+-------------------------------------------------
+
+A ``let`` statement collects a set of field values (sometimes called
+*bindings*) and applies them to all the classes and records defined by
+statements within the scope of the ``let``.
+
+.. productionlist::
+   Let: "let" `LetList` "in" "{" `Statement`* "}"
+      :| "let" `LetList` "in" `Statement`
+   LetList: `LetItem` ("," `LetItem`)*
+   LetItem: `TokIdentifier` ["<" `RangeList` ">"] "=" `Value`
+
+The ``let`` statement establishes a scope, which is a sequence of statements
+in braces or a single statement with no braces.
+The bindings in the :token:`LetList` apply to the statements in that scope.
+
+The field names in the :token:`LetList` must name fields in classes inherited by
+the classes and records defined in the statements. The field values are
+applied to the classes and records *after* the records inherit all the fields from
+their superclasses. So the ``let`` acts to override inherited field
+values. A ``let`` cannot override the value of a template argument.
+
+Top-level ``let`` statements are often useful when a few fields need to be
+overridden in several records. Here are two examples. Note that ``let``
+statements can be nested.
+
+.. code-block:: text
+
+  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
+    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;
+
+  let isCall = 1 in
+    // All calls clobber the non-callee saved registers...
+    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
+                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2,
+                XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
+      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst, variable_ops),
+                             "call\t${dst:call}", []>;
+      def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
+                          "call\t{*}$dst", [(X86call GR32:$dst)]>;
+      def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
+                          "call\t{*}$dst", []>;
+    }
+
+Note that a top-level ``let`` will not override fields defined in the classes or records
+themselves.
+
+
+``multiclass`` --- define multiple records
+------------------------------------------
+
+While classes with template arguments are a good way to factor out commonality
+between multiple records, multiclasses provide a convenient method for
+defining multiple records at once. For example, consider a 3-address
+instruction architecture whose instructions come in two formats: ``reg = reg
+op reg`` and ``reg = reg op imm`` (e.g., SPARC).
+We would like to specify in
+one place that these two common formats exist, then in a separate place
+specify what all the operations are. The ``multiclass`` and ``defm``
+statements accomplish this goal. You can think of a multiclass as a macro or
+template that expands into multiple records.
+
+.. productionlist::
+   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
+             : [":" `ParentMultiClassList`]
+             : "{" `Statement`+ "}"
+   ParentMultiClassList: `MultiClassID` ("," `MultiClassID`)*
+   MultiClassID: `TokIdentifier`
+
+As with regular classes, the multiclass has a name and can accept template
+arguments. The body of the multiclass contains a series of statements that
+define records, using :token:`Def` and :token:`Defm`. In addition,
+:token:`Defvar`, :token:`Foreach`, and :token:`Let`
+statements can be used to factor out even more common elements.
+
+Also as with regular classes, the multiclass has the implicit template
+argument ``NAME`` (see NAME_). When a named (non-anonymous) record is
+defined in a multiclass and the record's name does not contain a use of the
+template argument ``NAME``, such a use is automatically prepended
+to the name. That is, the following are equivalent inside a multiclass::
+
+  def Foo ...
+  def NAME#Foo ...
+
+The records defined in a multiclass are instantiated when the multiclass is
+"invoked" by a ``defm`` statement outside the multiclass definition. Each
+``def`` statement produces a record. As with top-level ``def`` statements,
+these definitions can inherit from multiple superclasses.
+
+See `Examples: multiclasses and defms`_ for examples.
+
+
+``defm`` --- invoke multiclasses to define multiple records
+-----------------------------------------------------------
+
+Once multiclasses have been defined, you use the ``defm`` statement to
+"invoke" multiclasses and process the multiple record definitions in those
+multiclasses.
+Those record definitions are specified by ``def``
+statements in the multiclasses, and indirectly by ``defm`` statements.
+
+.. productionlist::
+   Defm: "defm" [`NameValue`] `ParentClassList` ";"
+
+The optional :token:`NameValue` is formed in the same way as the name of a
+``def``. The :token:`ParentClassList` is a colon followed by a list of at least one
+multiclass and any number of regular classes. The multiclasses must
+precede the regular classes. Note that the ``defm`` does not have a body.
+
+This statement instantiates all the records defined in all the specified
+multiclasses, either directly by ``def`` statements or indirectly by
+``defm`` statements. These records also receive the fields defined in any
+regular classes included in the parent class list. This is useful for adding
+a common set of fields to all the records created by the ``defm``.
+
+The name is parsed in the same special mode used by ``def``. If the name is
+not included, a globally unique name is provided. That is, the following
+examples end up with different names::
+
+  defm    : SomeMultiClass<...>;   // A globally unique name.
+  defm "" : SomeMultiClass<...>;   // An empty name.
+
+The ``defm`` statement can be used in a multiclass body. When this occurs,
+the second variant is equivalent to::
+
+  defm NAME : SomeMultiClass<...>;
+
+More generally, when ``defm`` occurs in a multiclass and its name does not
+include a use of the implicit template argument ``NAME``, then ``NAME`` will
+be prepended automatically. That is, the following are equivalent inside a
+multiclass::
+
+  defm Foo        : SomeMultiClass<...>;
+  defm NAME#Foo   : SomeMultiClass<...>;
+
+See `Examples: multiclasses and defms`_ for examples.
+
+Examples: multiclasses and defms
+--------------------------------
+
+Here is a simple example using ``multiclass`` and ``defm``. Consider a
+3-address instruction architecture whose instructions come in two formats:
+``reg = reg op reg`` and ``reg = reg op imm`` (immediate).
+The SPARC is an example of such an architecture.
+
+.. code-block:: text
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst <int opcode, string asmstr, dag operandlist>;
+
+  multiclass ri_inst <int op, string asmstr> {
+    def _rr : inst<op, !strconcat(asmstr, " $dst, $src1, $src2"),
+                     (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+    def _ri : inst<op, !strconcat(asmstr, " $dst, $src1, $src2"),
+                     (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+  }
+
+  // Define records for each instruction in the RR and RI formats.
+  defm ADD : ri_inst<0b111, "add">;
+  defm SUB : ri_inst<0b101, "sub">;
+  defm MUL : ri_inst<0b100, "mul">;
+
+Each use of the ``ri_inst`` multiclass defines two records, one with the
+``_rr`` suffix and one with ``_ri``. Recall that the name of the ``defm``
+that uses a multiclass is prepended to the names of the records defined in
+that multiclass. So the resulting definitions are named::
+
+  ADD_rr, ADD_ri
+  SUB_rr, SUB_ri
+  MUL_rr, MUL_ri
+
+Without the ``multiclass`` feature, the instructions would have to be
+defined as follows.
+
+.. code-block:: text
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst <int opcode, string asmstr, dag operandlist>;
+
+  class rrinst <int opcode, string asmstr>
+    : inst<opcode, !strconcat(asmstr, " $dst, $src1, $src2"),
+             (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+
+  class riinst <int opcode, string asmstr>
+    : inst<opcode, !strconcat(asmstr, " $dst, $src1, $src2"),
+             (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+
+  // Define records for each instruction in the RR and RI formats.
+  def ADD_rr : rrinst<0b111, "add">;
+  def ADD_ri : riinst<0b111, "add">;
+  def SUB_rr : rrinst<0b101, "sub">;
+  def SUB_ri : riinst<0b101, "sub">;
+  def MUL_rr : rrinst<0b100, "mul">;
+  def MUL_ri : riinst<0b100, "mul">;
+
+A ``defm`` can be used in a multiclass to "invoke" other multiclasses and
+create the records defined in those multiclasses in addition to the records
+defined in the current multiclass. In the following example, the ``basic_s``
+and ``basic_p`` multiclasses contain ``defm`` statements that refer to the
+``basic_r`` multiclass. The ``basic_r`` multiclass contains only ``def``
+statements.
+
+.. code-block:: text
+
+  class Instruction <bits<4> opc, string Name> {
+    bits<4> opcode = opc;
+    string name = Name;
+  }
+
+  multiclass basic_r <bits<4> opc> {
+    def rr : Instruction<opc, "rr">;
+    def rm : Instruction<opc, "rm">;
+  }
+
+  multiclass basic_s <bits<4> opc> {
+    defm SS : basic_r<opc>;
+    defm SD : basic_r<opc>;
+    def X : Instruction<opc, "x">;
+  }
+
+  multiclass basic_p <bits<4> opc> {
+    defm PS : basic_r<opc>;
+    defm PD : basic_r<opc>;
+    def Y : Instruction<opc, "y">;
+  }
+
+  defm ADD : basic_s<0xf>, basic_p<0xf>;
+
+The final ``defm`` creates the following records, five from the ``basic_s``
+multiclass and five from the ``basic_p`` multiclass::
+
+  ADDSSrr, ADDSSrm
+  ADDSDrr, ADDSDrm
+  ADDX
+  ADDPSrr, ADDPSrm
+  ADDPDrr, ADDPDrm
+  ADDY
+
+A ``defm`` statement, both at top level and in a multiclass, can inherit
+from regular classes in addition to multiclasses. The rule is that the
+regular classes must be listed after the multiclasses, and there must be at least
+one multiclass.
+
+.. code-block:: text
+
+  class XD {
+    bits<4> Prefix = 11;
+  }
+  class XS {
+    bits<4> Prefix = 12;
+  }
+  class I <bits<4> op> {
+    bits<4> opcode = op;
+  }
+
+  multiclass R {
+    def rr : I<4>;
+    def rm : I<2>;
+  }
+
+  multiclass Y {
+    defm SS : R, XD;    // First multiclass R, then regular class XD.
+    defm SD : R, XS;
+  }
+
+  defm Instr : Y;
+
+This example will create four records, shown here in alphabetical order with
+their fields.
+
+.. code-block:: text
+
+  def InstrSDrm {
+    bits<4> opcode = { 0, 0, 1, 0 };
+    bits<4> Prefix = { 1, 1, 0, 0 };
+  }
+
+  def InstrSDrr {
+    bits<4> opcode = { 0, 1, 0, 0 };
+    bits<4> Prefix = { 1, 1, 0, 0 };
+  }
+
+  def InstrSSrm {
+    bits<4> opcode = { 0, 0, 1, 0 };
+    bits<4> Prefix = { 1, 0, 1, 1 };
+  }
+
+  def InstrSSrr {
+    bits<4> opcode = { 0, 1, 0, 0 };
+    bits<4> Prefix = { 1, 0, 1, 1 };
+  }
+
+It's also possible to use ``let`` statements inside multiclasses, providing
+another way to factor out commonality from the records, especially when
+using several levels of multiclass instantiations.
+
+.. code-block:: text
+
+  multiclass basic_r <bits<4> opc> {
+    let Predicates = [HasSSE2] in {
+      def rr : Instruction<opc, "rr">;
+      def rm : Instruction<opc, "rm">;
+    }
+    let Predicates = [HasSSE3] in
+      def rx : Instruction<opc, "rx">;
+  }
+
+  multiclass basic_ss <bits<4> opc> {
+    let IsDouble = 0 in
+      defm SS : basic_r<opc>;
+
+    let IsDouble = 1 in
+      defm SD : basic_r<opc>;
+  }
+
+  defm ADD : basic_ss<0xf>;
+
+
+``defset`` --- create a definition set
+--------------------------------------
+
+The ``defset`` statement is used to collect a set of records into a global
+list of records.
+
+.. productionlist::
+   Defset: "defset" `Type` `TokIdentifier` "=" "{" `Statement`* "}"
+
+All records defined inside the braces via ``def`` and ``defm`` are defined
+as usual, and they are also collected in a global list of the given name
+(:token:`TokIdentifier`).
+
+The specified type must be ``list<``\ *class*\ ``>``, where *class* is some
+record class. The ``defset`` statement establishes a scope for its
+statements. It is an error to define a record in the scope of the
+``defset`` that is not of type *class*.
+
+The ``defset`` statement can be nested. The inner ``defset`` adds the
+records to its own set, and all those records are also added to the outer
+set.
+
+Anonymous records created inside initialization expressions using the
+``ClassID<...>`` syntax are not collected in the set.
+
+
+``defvar`` --- define a variable
+--------------------------------
+
+A ``defvar`` statement defines a global variable. Its value can be used
+throughout the statements that follow the definition.
+
+.. productionlist::
+   Defvar: "defvar" `TokIdentifier` "=" `Value` ";"
+
+The identifier on the left of the ``=`` is defined to be a global variable
+whose value is given by the value expression on the right of the ``=``. The
+type of the variable is automatically inferred.
+
+Once a variable has been defined, it cannot be set to another value.
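+
+As a hedged illustration of a global ``defvar``, the following sketch defines
+a variable once and uses it in a later statement. The names ``RegWidth``,
+``Reg``, and ``R0`` are hypothetical and are not taken from any LLVM target.
+
+.. code-block:: text
+
+  // Hypothetical names, for illustration only.
+  defvar RegWidth = 32;       // The type (int) is inferred from the value.
+
+  class Reg <int width> {
+    int Width = width;
+  }
+
+  def R0 : Reg<RegWidth>;     // Uses the variable defined above.
+
+  // A second "defvar RegWidth = 64;" at this level is expected to be
+  // rejected, because a defined variable cannot be set to another value.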
+
+Variables defined in a top-level ``foreach`` go out of scope at the end of
+each loop iteration, so their value in one iteration is not available in
+the next iteration. The following ``defvar`` will not work::
+
+  defvar i = !add(i, 1)
+
+Variables can also be defined with ``defvar`` in a record body. See
+`Defvar in Record Body`_ for more details.
+
+``foreach`` --- iterate over a sequence
+---------------------------------------
+
+The ``foreach`` statement iterates over a series of statements, varying a
+variable over a sequence of values.
+
+.. productionlist::
+   Foreach: "foreach" `ForeachIterator` "in" "{" `Statement`* "}"
+          :| "foreach" `ForeachIterator` "in" `Statement`
+   ForeachIterator: `TokIdentifier` "=" ("{" `RangeList` "}" | `RangePiece` | `Value`)
+
+The body of the ``foreach`` is a series of statements in braces or a
+single statement with no braces. The statements are re-evaluated once for
+each value in the range list, range piece, or single value. On each
+iteration, the :token:`TokIdentifier` variable is set to the value and can
+be used in the statements.
+
+The statement list establishes an inner scope. Variables local to a
+``foreach`` go out of scope at the end of each loop iteration, so their
+values do not carry over from one iteration to the next. Foreach loops may
+be nested.
+
+The ``foreach`` statement can also be used in a record :token:`Body`.
+
+.. Note that the productions involving RangeList and RangePiece have precedence
+   over the more generic value parsing based on the first token.
+
+.. code-block:: text
+
+  foreach i = [0, 1, 2, 3] in {
+    def R#i : Register<...>;
+    def F#i : Register<...>;
+  }
+
+This loop defines records named ``R0``, ``R1``, ``R2``, and ``R3``, along
+with ``F0``, ``F1``, ``F2``, and ``F3``.
+
+
+``if`` --- select statements based on a test
+--------------------------------------------
+
+The ``if`` statement allows one of two statement groups to be selected based
+on the value of an expression.
+
+.. productionlist::
+   If: "if" `Value` "then" `IfBody`
+     :| "if" `Value` "then" `IfBody` "else" `IfBody`
+   IfBody: "{" `Statement`* "}" | `Statement`
+
+The value expression is evaluated. If it evaluates to true (in the same
+sense used by the bang operators), then the statements following the
+``then`` reserved word are processed. Otherwise, if there is an ``else``
+reserved word, the statements following the ``else`` are processed. If the
+value is false and there is no ``else`` arm, no statements are processed.
+
+Because the braces around the ``then`` statements are optional, this grammar rule
+has the usual ambiguity with "dangling else" clauses, and it is resolved in
+the usual way: in a case like ``if v1 then if v2 then {...} else {...}``, the
+``else`` associates with the inner ``if`` rather than the outer one.
+
+The :token:`IfBody` of the then and else arms of the ``if`` establish an
+inner scope. Any ``defvar`` variables defined in the bodies go out of scope
+when the bodies are finished (see `Defvar in Record Body`_ for more details).
+
+The ``if`` statement can also be used in a record :token:`Body`.
+
+
+Additional Details
+==================
+
+Defvar in record body
+---------------------
+
+In addition to defining global variables, the ``defvar`` statement can
+be used inside the :token:`Body` of a class or record definition to define
+local variables. The scope of the variable extends from the ``defvar``
+statement to the end of the body. It cannot be set to a different value
+within its scope. The ``defvar`` statement can also be used in the statement
+list of a ``foreach``, which establishes a scope.
+
+A variable named ``V`` in an inner scope shadows (hides) any variables ``V``
+in outer scopes. In particular, ``V`` in a record body shadows a global
+``V``, and ``V`` in a ``foreach`` statement list shadows any ``V`` in
+surrounding global or record scopes.
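+
+The shadowing rule might be sketched as follows. The names ``V``, ``C``,
+``Outer``, ``Inner``, and ``Seen`` are hypothetical, and the values in the
+comments are what the rule above implies rather than captured
+``llvm-tblgen`` output.
+
+.. code-block:: text
+
+  defvar V = 1;               // Global V.
+
+  class C {
+    defvar V = 2;             // Local V shadows the global V in this body.
+    int Inner = V;            // Refers to the local V (2).
+  }
+
+  def Outer {
+    int Seen = V;             // No local V here, so this is the global V (1).
+  }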
+
+Variables defined in a ``foreach`` go out of scope at the end of
+each loop iteration, so their value in one iteration is not available in
+the next iteration. The following ``defvar`` will not work::
+
+  defvar i = !add(i, 1)
+
+How records are built
+---------------------
+
+The following steps are taken by TableGen when a record is built. Classes are simply
+abstract records and so go through the same steps.
+
+1. Build the record name (:token:`NameValue`) and create an empty record.
+
+2. Parse the superclasses in the :token:`ParentClassList` from left to
+   right, visiting each superclass's ancestor classes from top to bottom.
+
+   a. Add the fields from the superclass to the record.
+   b. Substitute the template arguments into those fields.
+   c. Add the superclass to the record's list of inherited classes.
+
+3. Apply any top-level ``let`` bindings to the record. Recall that top-level
+   bindings only apply to inherited fields.
+
+4. Parse the body of the record.
+
+   * Add any fields to the record.
+   * Modify the values of fields according to local ``let`` statements.
+   * Define any ``defvar`` variables.
+
+5. Make a pass over all the fields to resolve any inter-field references.
+
+6. Add the record to the master record list.
+
+
+Because references between fields are resolved (step 5) after ``let`` bindings are
+applied (step 3), the ``let`` statement has unusual power. For example:
+
+.. code-block:: text
+
+  class C <int x> {
+    int Y = x;
+    int Yplus1 = !add(Y, 1);
+    int xplus1 = !add(x, 1);
+  }
+
+  let Y = 10 in {
+    def rec1 : C<5> {
+    }
+  }
+
+  def rec2 : C<5> {
+    let Y = 10;
+  }
+
+In both cases, one where a top-level ``let`` is used to bind ``Y`` and one
+where a local ``let`` does the same thing, the results are:
+
+.. code-block:: text
+
+  def rec1 {      // C
+    int Y = 10;
+    int Yplus1 = 11;
+    int xplus1 = 6;
+  }
+  def rec2 {      // C
+    int Y = 10;
+    int Yplus1 = 11;
+    int xplus1 = 6;
+  }
+
+``Yplus1`` is 11 because the ``let Y`` is performed before the ``!add(Y,
+1)`` is resolved. Use this power wisely.
+
+
+Preprocessing Facilities
+========================
+
+The preprocessor embedded in TableGen is intended only for simple
+conditional compilation. It supports the following directives, which are
+specified somewhat informally.
+
+.. productionlist::
+   LineBegin: beginning of line
+   LineEnd: newline | return | EOF
+   WhiteSpace: space | tab
+   CComment: "/*" ... "*/"
+   BCPLComment: "//" ... `LineEnd`
+   WhiteSpaceOrCComment: `WhiteSpace` | `CComment`
+   WhiteSpaceOrAnyComment: `WhiteSpace` | `CComment` | `BCPLComment`
+   MacroName: `ualpha` (`ualpha` | "0"..."9")*
+   PreDefine: `LineBegin` (`WhiteSpaceOrCComment`)*
+            : "#define" (`WhiteSpace`)+ `MacroName`
+            : (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PreIfdef: `LineBegin` (`WhiteSpaceOrCComment`)*
+           : ("#ifdef" | "#ifndef") (`WhiteSpace`)+ `MacroName`
+           : (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PreElse: `LineBegin` (`WhiteSpaceOrCComment`)*
+          : "#else" (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PreEndif: `LineBegin` (`WhiteSpaceOrCComment`)*
+           : "#endif" (`WhiteSpaceOrAnyComment`)* `LineEnd`
+
+..
+   PreRegContentException: `PreIfdef` | `PreElse` | `PreEndif` | EOF
+   PreRegion: .* - `PreRegContentException`
+             :| `PreIfdef`
+             :  (`PreRegion`)*
+             :  [`PreElse`]
+             :  (`PreRegion`)*
+             :  `PreEndif`
+
+A :token:`MacroName` can be defined anywhere in a TableGen file. The name has
+no value; it can only be tested to see whether it is defined.
+
+A macro test region begins with an ``#ifdef`` or ``#ifndef`` directive. If
+the macro name is defined (``#ifdef``) or undefined (``#ifndef``), then the
+source code between the directive and the corresponding ``#else`` or
+``#endif`` is processed.
+If the test fails but there is an ``#else``
+portion, the source code between the ``#else`` and the ``#endif`` is
+processed. If the test fails and there is no ``#else`` portion, then no
+source code in the test region is processed.
+
+Test regions may be nested, but they must be properly nested. A region
+started in a file must end in that file; that is, must have its
+``#endif`` in the same file.
+
+A :token:`MacroName` may be defined externally using the ``-D`` option on the
+``llvm-tblgen`` command line::
+
+  llvm-tblgen self-reference.td -Dmacro1 -Dmacro3
+
+Appendix A: Bang Operators
+==========================
+
+Bang operators act as functions in value expressions. A bang operator takes
+one or more arguments, operates on them, and produces a result. If the
+operator produces a boolean result, the result value will be 1 for true or 0
+for false. When an operator tests a boolean argument, it interprets 0 as false
+and non-0 as true.
+
+``!add(``\ *a*\ ``,`` *b*\ ``, ...)``
+  This operator adds *a*, *b*, etc., and produces the sum.
+
+``!and(``\ *a*\ ``,`` *b*\ ``, ...)``
+  This operator does a bitwise AND on *a*, *b*, etc., and produces the
+  result.
+
+``!cast<``\ *type*\ ``>(``\ *a*\ ``)``
+  This operator performs a cast on *a* and produces the result.
+  If *a* is not a string, then a straightforward cast is performed, say
+  between an ``int`` and a ``bit``, or between record types. This allows
+  casting a record to a class. If a record is cast to ``string``, the
+  record's name is produced.
+
+  If *a* is a string, then it is treated as a record name and looked up in
+  the list of all defined records. The resulting record is expected to be of
+  the specified *type*.
+
+  For example, if ``!cast<``\ *type*\ ``>(``\ *name*\ ``)``
+  appears in a multiclass definition, or in a
+  class instantiated inside a multiclass definition, and the *name* does not
+  reference any template arguments of the multiclass, then a record by
+  that name must have been instantiated earlier
+  in the source file. If *name* does reference
+  a template argument, then the lookup is delayed until ``defm`` statements
+  instantiating the multiclass (or later, if the defm occurs in another
+  multiclass and template arguments of the inner multiclass that are
+  referenced by *name* are substituted by values that themselves contain
+  references to template arguments of the outer multiclass).
+
+  If the type of *a* does not match *type*, TableGen raises an error.
+
+``!con(``\ *a*\ ``,`` *b*\ ``, ...)``
+  This operator concatenates the DAG nodes *a*, *b*, etc. Their operations
+  must be equal.
+
+  ``!con((op a1:$name1, a2:$name2), (op b1:$name3))``
+
+  results in the DAG node ``(op a1:$name1, a2:$name2, b1:$name3)``.
+
+``!cond(``\ *cond1* ``:`` *val1*\ ``,`` *cond2* ``:`` *val2*\ ``, ...,`` *condn* ``:`` *valn*\ ``)``
+  This operator tests *cond1* and returns *val1* if the result is true.
+  If false, the operator tests *cond2* and returns *val2* if the result is
+  true. And so forth. An error is reported if no conditions are true.
+
+  This example produces the sign word for an integer::
+
+    !cond(!lt(x, 0) : "negative", !eq(x, 0) : "zero", 1 : "positive")
+
+``!dag(``\ *op*\ ``,`` *children*\ ``,`` *names*\ ``)``
+  This operator creates a DAG node.
+  The *children* and *names* arguments must be lists
+  of equal length or uninitialized (``?``). The *names* argument
+  must be of type ``list<string>``.
+
+  Due to limitations of the type system, *children* must be a list of items
+  of a common type. In practice, this means that they should either have the
+  same type or be records with a common superclass. Mixing ``dag`` and
+  non-``dag`` items is not possible.
+  However, ``?`` can be used.
+
+  Example: ``!dag(op, [a1, a2, ?], ["name1", "name2", "name3"])`` results in
+  ``(op a1:$name1, a2:$name2, ?:$name3)``.
+
+``!empty(``\ *list*\ ``)``
+  This operator produces 1 if the *list* is empty; 0 otherwise.
+
+``!eq(``\ *a*\ ``,`` *b*\ ``)``
+  This operator produces 1 if *a* is equal to *b*; 0 otherwise.
+  The arguments must be ``bit``, ``int``, or ``string`` values.
+  Use ``!cast`` to compare other types of objects.
+
+``!foldl(``\ *start*\ ``,`` *list*\ ``,`` *a*\ ``,`` *b*\ ``,`` *expr*\ ``)``
+  This operator performs a left-fold over the items in *list*. The
+  variable *a* acts as the accumulator and is initialized to *start*.
+  The variable *b* is bound to each element in the *list*. The *expr*
+  expression is evaluated for each element and presumably uses *a* and *b*
+  to calculate the accumulated value, which ``!foldl`` stores in *a*. The
+  type of *a* is the same as *start*; the type of *b* is the same as the
+  elements of *list*; *expr* must have the same type as *start*.
+
+  The following example computes the total of the ``Number`` field in the
+  list of records in ``RecList``::
+
+    int x = !foldl(0, RecList, total, rec, !add(total, rec.Number));
+
+``!foreach(``\ *var*\ ``,`` *seq*\ ``,`` *form*\ ``)``
+  This operator creates a new ``list``/``dag`` in which each element is a
+  function of the corresponding element in the *seq* ``list``/``dag``. To
+  perform the function, TableGen binds the variable *var* to an element and
+  then evaluates the *form* expression. The form presumably refers to the
+  variable *var* and calculates the result value.
+
+``!ge(``\ *a*\ ``,`` *b*\ ``)``
+  This operator produces 1 if *a* is greater than or equal to *b*; 0 otherwise.
+  The arguments must be ``bit``, ``int``, or ``string`` values.
+  Use ``!cast`` to compare other types of objects.
+
+``!getop(``\ *dag*\ ``)`` --or-- ``!getop<``\ *type*\ ``>(``\ *dag*\ ``)``
+  This operator produces the operator of the given *dag* node.
+ Example: ``!getop((foo 1, 2))`` results in ``foo``. + + The result of ``!getop`` can be used directly in a context where + any record value at all is acceptable (typically placing it into + another dag value). But in other contexts, it must be explicitly + cast to a particular class type. The ``<``\ *type*\ ``>`` syntax is + provided to make this easy. + + For example, to assign the result to a value of type ``BaseClass``, you + could write either of these:: + + BaseClass b = !getop<BaseClass>(someDag); + BaseClass b = !cast<BaseClass>(!getop(someDag)); + + But to create a new DAG node that reuses the operator from another, no + cast is necessary:: + + dag d = !dag(!getop(someDag), args, names); + +``!gt(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is greater than *b*; 0 otherwise. + The arguments must be ``bit``, ``int``, or ``string`` values. + Use ``!cast<string>`` to compare other types of objects. + +``!head(``\ *a*\ ``)`` + This operator produces the zeroth element of the list *a*. + (See also ``!tail``.) + +``!if(``\ *test*\ ``,`` *then*\ ``,`` *else*\ ``)`` + This operator evaluates the *test*, which must produce a ``bit`` or + ``int``. If the result is not 0, the *then* expression is produced; otherwise + the *else* expression is produced. + +``!isa<``\ *type*\ ``>(``\ *a*\ ``)`` + This operator produces 1 if the type of *a* is a subtype of the given *type*; 0 + otherwise. + +``!le(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is less than or equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``int``, or ``string`` values. + Use ``!cast<string>`` to compare other types of objects. + +``!listconcat(``\ *list1*\ ``,`` *list2*\ ``, ...)`` + This operator concatenates the list arguments *list1*, *list2*, etc., and + produces the resulting list. The lists must have the same element type. + +``!listsplat(``\ *value*\ ``,`` *count*\ ``)`` + This operator produces a list of length *count* whose elements are all + equal to the *value*.
For example, ``!listsplat(42, 3)`` results in + ``[42, 42, 42]``. + +``!lt(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is less than *b*; 0 otherwise. + The arguments must be ``bit``, ``int``, or ``string`` values. + Use ``!cast<string>`` to compare other types of objects. + +``!mul(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator multiplies *a*, *b*, etc., and produces the product. + +``!ne(``\ *a*\ ``,`` *b*\ ``)`` + This operator produces 1 if *a* is not equal to *b*; 0 otherwise. + The arguments must be ``bit``, ``int``, or ``string`` values. + Use ``!cast<string>`` to compare other types of objects. + +``!or(``\ *a*\ ``,`` *b*\ ``, ...)`` + This operator does a bitwise OR on *a*, *b*, etc., and produces the + result. + +``!setop(``\ *dag*\ ``,`` *op*\ ``)`` + This operator produces a DAG node with the same arguments as *dag*, but with its + operator replaced with *op*. + + Example: ``!setop((foo 1, 2), bar)`` results in ``(bar 1, 2)``. + +``!shl(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* left logically by *count* bits and produces the resulting + value. The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0..63. + +``!size(``\ *a*\ ``)`` + This operator produces the number of elements in the list *a*. + +``!sra(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* right arithmetically by *count* bits and produces the resulting + value. The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0..63. + +``!srl(``\ *a*\ ``,`` *count*\ ``)`` + This operator shifts *a* right logically by *count* bits and produces the resulting + value. The operation is performed on a 64-bit integer; the result + is undefined for shift counts outside 0..63. + +``!strconcat(``\ *str1*\ ``,`` *str2*\ ``, ...)`` + This operator concatenates the string arguments *str1*, *str2*, etc., and + produces the resulting string.
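+
+  As a rough illustration only (the ``Demo`` class and the ``D3`` record are
+  hypothetical names invented for this sketch, not taken from LLVM), the
+  following fragment combines ``!strconcat`` with several of the operators
+  described above:
+
+  .. code-block:: text
+
+    // Hypothetical class used only to exercise the operators.
+    class Demo<int n, string base> {
+      list<int> Zeros = !listsplat(0, n);           // n copies of 0
+      int Count = !size(Zeros);                     // length of Zeros, i.e. n
+      int Mask = !or(!shl(1, n), 1);                // bit n and bit 0 set
+      int Product = !mul(n, 2, 3);                  // n * 2 * 3
+      string Name = !strconcat(base, "_", "demo");  // three strings joined
+    }
+
+    def D3 : Demo<3, "x">;
+
+  With ``n`` equal to 3 and ``base`` equal to ``"x"``, the ``D3`` record
+  should end up with ``Zeros = [0, 0, 0]``, ``Count = 3``, ``Mask = 9``,
+  ``Product = 18``, and ``Name = "x_demo"``.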
+ +*str1*\ ``#``\ *str2* + The paste operator (``#``) is a shorthand for + ``!strconcat`` with two arguments. It can be used to concatenate operands that + are not strings, in which + case an implicit ``!cast<string>`` is done on those operands. + +``!subst(``\ *target*\ ``,`` *repl*\ ``,`` *value*\ ``)`` + This operator replaces all occurrences of the *target* in the *value* with + the *repl* and produces the resulting value. For strings, this is straightforward. + + If the arguments are record names, the function produces the *repl* + record if the *target* record name equals the *value* record name; otherwise it + produces the *value*. + +``!tail(``\ *a*\ ``)`` + This operator produces a new list with all the elements + of the list *a* except for the zeroth one. (See also ``!head``.) + + +Appendix B: Sample Record +========================= + +One target machine supported by LLVM is the Intel x86. The following output +from TableGen shows the record that is created to represent the 32-bit +register-to-register ADD instruction. + +..
code-block:: text + + def ADD32rr { // InstructionEncoding Instruction X86Inst I ITy Sched BinOpRR BinOpRR_RF + int Size = 0; + string DecoderNamespace = ""; + list<Predicate> Predicates = []; + string DecoderMethod = ""; + bit hasCompleteDecoder = 1; + string Namespace = "X86"; + dag OutOperandList = (outs GR32:$dst); + dag InOperandList = (ins GR32:$src1, GR32:$src2); + string AsmString = "add{l} {$src2, $src1|$src1, $src2}"; + EncodingByHwMode EncodingInfos = ?; + list<dag> Pattern = [(set GR32:$dst, EFLAGS, (X86add_flag GR32:$src1, GR32:$src2))]; + list<Register> Uses = []; + list<Register> Defs = [EFLAGS]; + int CodeSize = 3; + int AddedComplexity = 0; + bit isPreISelOpcode = 0; + bit isReturn = 0; + bit isBranch = 0; + bit isEHScopeReturn = 0; + bit isIndirectBranch = 0; + bit isCompare = 0; + bit isMoveImm = 0; + bit isMoveReg = 0; + bit isBitcast = 0; + bit isSelect = 0; + bit isBarrier = 0; + bit isCall = 0; + bit isAdd = 0; + bit isTrap = 0; + bit canFoldAsLoad = 0; + bit mayLoad = ?; + bit mayStore = ?; + bit mayRaiseFPException = 0; + bit isConvertibleToThreeAddress = 1; + bit isCommutable = 1; + bit isTerminator = 0; + bit isReMaterializable = 0; + bit isPredicable = 0; + bit isUnpredicable = 0; + bit hasDelaySlot = 0; + bit usesCustomInserter = 0; + bit hasPostISelHook = 0; + bit hasCtrlDep = 0; + bit isNotDuplicable = 0; + bit isConvergent = 0; + bit isAuthenticated = 0; + bit isAsCheapAsAMove = 0; + bit hasExtraSrcRegAllocReq = 0; + bit hasExtraDefRegAllocReq = 0; + bit isRegSequence = 0; + bit isPseudo = 0; + bit isExtractSubreg = 0; + bit isInsertSubreg = 0; + bit variadicOpsAreDefs = 0; + bit hasSideEffects = ?; + bit isCodeGenOnly = 0; + bit isAsmParserOnly = 0; + bit hasNoSchedulingInfo = 0; + InstrItinClass Itinerary = NoItinerary; + list<SchedReadWrite> SchedRW = [WriteALU]; + string Constraints = "$src1 = $dst"; + string DisableEncoding = ""; + string PostEncoderMethod = ""; + bits<64> TSFlags = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0 }; + string AsmMatchConverter = ""; + string TwoOperandAliasConstraint = ""; + string AsmVariantName = ""; + bit UseNamedOperandTable = 0; + bit FastISelShouldIgnore = 0; + bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; + Format Form = MRMDestReg; + bits<7> FormBits = { 0, 1, 0, 1, 0, 0, 0 }; + ImmType ImmT = NoImm; + bit ForceDisassemble = 0; + OperandSize OpSize = OpSize32; + bits<2> OpSizeBits = { 1, 0 }; + AddressSize AdSize = AdSizeX; + bits<2> AdSizeBits = { 0, 0 }; + Prefix OpPrefix = NoPrfx; + bits<3> OpPrefixBits = { 0, 0, 0 }; + Map OpMap = OB; + bits<3> OpMapBits = { 0, 0, 0 }; + bit hasREX_WPrefix = 0; + FPFormat FPForm = NotFP; + bit hasLockPrefix = 0; + Domain ExeDomain = GenericDomain; + bit hasREPPrefix = 0; + Encoding OpEnc = EncNormal; + bits<2> OpEncBits = { 0, 0 }; + bit HasVEX_W = 0; + bit IgnoresVEX_W = 0; + bit EVEX_W1_VEX_W0 = 0; + bit hasVEX_4V = 0; + bit hasVEX_L = 0; + bit ignoresVEX_L = 0; + bit hasEVEX_K = 0; + bit hasEVEX_Z = 0; + bit hasEVEX_L2 = 0; + bit hasEVEX_B = 0; + bits<3> CD8_Form = { 0, 0, 0 }; + int CD8_EltSize = 0; + bit hasEVEX_RC = 0; + bit hasNoTrackPrefix = 0; + bits<7> VectSize = { 0, 0, 1, 0, 0, 0, 0 }; + bits<7> CD8_Scale = { 0, 0, 0, 0, 0, 0, 0 }; + string FoldGenRegForm = ?; + string EVEX2VEXOverride = ?; + bit isMemoryFoldable = 1; + bit notEVEX2VEXConvertible = 0; + } + +On the first line of the record, you can see that the ``ADD32rr`` record +inherited from eight classes. Although the inheritance hierarchy is complex, +using superclasses is much simpler than specifying the 109 individual fields for each +instruction. + +Here is the code fragment used to define ``ADD32rr`` and multiple other +``ADD`` instructions: + +.. 
code-block:: text + + defm ADD : ArithBinOp_RF<0x00, 0x02, 0x04, "add", MRM0r, MRM0m, + X86add_flag, add, 1, 1, 1>; + +The ``defm`` statement tells TableGen that ``ArithBinOp_RF`` is a +multiclass, which contains multiple concrete record definitions that inherit +from ``BinOpRR_RF``. That class, in turn, inherits from ``BinOpRR``, which +inherits from ``ITy`` and ``Sched``, and so forth. The fields are inherited +from all the parent classes; for example, ``isIndirectBranch`` is inherited +from the ``Instruction`` class. diff --git a/llvm/docs/TableGen/index.rst b/llvm/docs/TableGen/index.rst index 6100c13..6c9ba9e 100644 --- a/llvm/docs/TableGen/index.rst +++ b/llvm/docs/TableGen/index.rst @@ -1,6 +1,6 @@ -======== -TableGen -======== +================= +TableGen Overview +================= .. contents:: :local: @@ -9,8 +9,7 @@ TableGen :hidden: BackEnds - LangRef - LangIntro + ProgRef Deficiencies Introduction @@ -25,10 +24,12 @@ it easier to structure domain specific information. The core part of TableGen parses a file, instantiates the declarations, and hands the result off to a domain-specific `backend`_ for processing. +See the :doc:`TableGen Programmer's Reference <./ProgRef>` for an in-depth +description of TableGen. -The current major users of TableGen are :doc:`../CodeGenerator` -and the -`Clang diagnostics and attributes `_. +The current major users of TableGen are :doc:`The LLVM Target-Independent +Code Generator <../CodeGenerator>` and the `Clang diagnostics and attributes +`_. Note that if you work on TableGen much, and use emacs or vim, that you can find an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and @@ -249,13 +250,10 @@ in the current multiclass. !subst(SHIFT, imm_eq0, decls.pattern)), i8>; +See the :doc:`TableGen Programmer's Reference <./ProgRef>` for an in-depth +description of TableGen.
-See the :doc:`TableGen Language Introduction ` for more generic -information on the usage of the language, and the -:doc:`TableGen Language Reference ` for more in-depth description -of the formal language specification. - .. _backend: .. _backends: @@ -300,5 +298,4 @@ more powerful DSLs designed with specific purposes, or even re-using existing DSLs. Either way, this is a discussion that will likely span across several years, -if not decades. You can read more in the `TableGen Deficiencies `_ -document. +if not decades. You can read more in :doc:`TableGen Deficiencies <./Deficiencies>`. -- 2.7.4