Print all records to standard output (default).
+.. option:: -dump-json
+
+ Print a JSON representation of all records, suitable for further
+ automated processing.
+
.. option:: -print-enums
Print enumeration values for a class.
**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
used for documenting user-facing attributes.
+General BackEnds
+================
+
+JSON
+----
+
+**Purpose**: Output all the values in every ``def``, as a JSON data
+structure that can be easily parsed by a variety of languages. Useful
+for writing custom backends without having to modify TableGen itself,
+or for performing auxiliary analysis on the same TableGen data passed
+to a built-in backend.
+
+**Output**:
+
+The root of the output file is a JSON object (i.e. dictionary),
+containing the following fixed keys:
+
+* ``!tablegen_json_version``: a numeric version field that will
+  increase if an incompatible change is ever made to the structure of
+  this data. The format described here corresponds to version 1.
+
+* ``!instanceof``: a dictionary whose keys are the class names defined
+  in the TableGen input. For each key, the corresponding value is an
+  array of strings giving the names of ``def`` records that derive
+  from that class. So ``root["!instanceof"]["Instruction"]``, for
+  example, would list the names of all the records deriving from the
+  class ``Instruction``.
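As a rough illustration of the two root-level keys, the following Python sketch checks the version field and looks up the records of one class. The sample document is hand-written for illustration (the record names ``ADD`` and ``SUB`` and the helper ``records_of_class`` are hypothetical), not real ``llvm-tblgen`` output.

```python
import json

# Hand-written sample in the shape described above; names are hypothetical.
SAMPLE = """
{
  "!tablegen_json_version": 1,
  "!instanceof": {"Instruction": ["ADD", "SUB"], "Register": []},
  "ADD": {"!name": "ADD"},
  "SUB": {"!name": "SUB"}
}
"""

def records_of_class(root, class_name):
    """Return the names of all def records deriving from class_name."""
    version = root["!tablegen_json_version"]
    if version != 1:
        # The format described here is version 1; refuse anything else.
        raise ValueError("unsupported JSON format version: {}".format(version))
    # Unknown class names yield an empty list rather than a KeyError.
    return root["!instanceof"].get(class_name, [])

root = json.loads(SAMPLE)
print(records_of_class(root, "Instruction"))
```

Checking the version before touching anything else is a deliberate choice in this sketch: the field exists so that a consumer can fail early if the structure ever changes incompatibly.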
+
+For each ``def`` record, the root object also has a key for the record
+name. The corresponding value is a subsidiary object containing the
+following fixed keys:
+
+* ``!superclasses``: an array of strings giving the names of all the
+  classes that this record derives from.
+
+* ``!fields``: an array of strings giving the names of all the variables
+  in this record that were defined with the ``field`` keyword.
+
+* ``!name``: a string giving the name of the record. This is always
+  identical to the key in the JSON root object corresponding to this
+  record's dictionary. (If the record is anonymous, the name is
+  arbitrary.)
+
+* ``!anonymous``: a boolean indicating whether the record's name was
+  specified by the TableGen input (if it is ``false``), or invented by
+  TableGen itself (if ``true``).
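The per-record keys above could be consumed along the following lines. This is a minimal sketch: the sample data is hand-written, the name ``anonymous_0`` is only a placeholder for whatever arbitrary name TableGen invents, and the helpers ``iter_records`` and ``named_records`` are hypothetical.

```python
import json

# Hand-written sample; "anonymous_0" stands in for an arbitrary invented name.
SAMPLE = """
{
  "!tablegen_json_version": 1,
  "!instanceof": {"Base": ["D"]},
  "D": {"!superclasses": ["Base"], "!fields": [], "!name": "D",
        "!anonymous": false},
  "anonymous_0": {"!superclasses": [], "!fields": [],
                  "!name": "anonymous_0", "!anonymous": true}
}
"""

def iter_records(root):
    """Yield (name, record) pairs, skipping the '!'-prefixed root keys."""
    for key, value in root.items():
        if not key.startswith("!"):
            yield key, value

def named_records(root):
    """Return the names of records whose names came from the TableGen input."""
    return sorted(name for name, rec in iter_records(root)
                  if not rec["!anonymous"])

root = json.loads(SAMPLE)
print(named_records(root))
```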
+
+For each variable defined in a record, the ``def`` object for that
+record also has a key for the variable name. The corresponding value
+is a translation into JSON of the variable's value, using the
+conventions described below.
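To separate a record's own variables from the fixed keys, a consumer can filter on the ``!`` prefix. The sketch below assumes that variable names never begin with ``!``; the sample record mirrors a def with one plain and one ``field`` variable, both left uninitialized, and the helper ``split_variables`` is hypothetical.

```python
# Hand-written record: "a" is a plain variable, "b" was declared with `field`.
record = {
    "!name": "FieldKeywordTest",
    "!anonymous": False,
    "!superclasses": [],
    "!fields": ["b"],
    "a": None,
    "b": None,
}

def split_variables(record):
    """Return (plain, fields): two dicts mapping variable name to JSON value."""
    field_names = set(record["!fields"])
    plain, fields = {}, {}
    for key, value in record.items():
        if key.startswith("!"):
            continue  # fixed key, not a TableGen variable
        target = fields if key in field_names else plain
        target[key] = value
    return plain, fields

plain, fields = split_variables(record)
print(sorted(plain), sorted(fields))
```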
+
+Some TableGen data types are translated directly into the
+corresponding JSON type:
+
+* A completely undefined value (e.g. for a variable declared without
+  initializer in some superclass of this record, and never initialized
+  by the record itself or any other superclass) is emitted as the JSON
+  ``null`` value.
+
+* ``int`` and ``bit`` values are emitted as numbers. Note that
+  TableGen ``int`` values are capable of holding integers too large to
+  be exactly representable in IEEE double precision. The integer
+  literal in the JSON output will show the full exact integer value.
+  So if you need to retrieve large integers with full precision, you
+  should use a JSON reader capable of translating such literals back
+  into 64-bit integers without losing precision, such as Python's
+  standard ``json`` module.
+
+* ``string`` and ``code`` values are emitted as JSON strings.
+
+* ``list<T>`` values, for any element type ``T``, are emitted as JSON
+  arrays. Each element of the array is represented in turn using these
+  same conventions.
+
+* ``bits`` values are also emitted as arrays. A ``bits`` array is
+  ordered from least-significant bit to most-significant. So the
+  element with index ``i`` corresponds to the bit described as
+  ``x{i}`` in TableGen source. However, note that this means that
+  scripting languages are likely to *display* the array in the
+  opposite order from the way it appears in the TableGen source or in
+  the diagnostic ``-print-records`` output.
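The bit-ordering and integer-precision points can be seen in a short Python sketch. The helpers ``bits_to_int`` and ``bits_as_written`` are hypothetical, and the sample values are hand-written: they correspond to ``bits<8> bs = { 0,0,0,1,0,1,1,1 }`` and a large ``int`` in TableGen source.

```python
import json

def bits_to_int(bits):
    """Combine a fully-defined bits array (least-significant bit first)."""
    value = 0
    for i, bit in enumerate(bits):
        if bit not in (0, 1):
            # e.g. an unresolved bit emitted as a JSON object
            raise ValueError("bit {} is not a concrete 0/1 value".format(i))
        value |= bit << i
    return value

def bits_as_written(bits):
    """Return the bits most-significant first, as in TableGen source."""
    return list(reversed(bits))

record = json.loads(
    '{"bs": [1, 1, 1, 0, 1, 0, 0, 0], "enormous_pos": 9123456789123456789}'
)

print(bits_to_int(record["bs"]))       # element i is bit bs{i}
print(bits_as_written(record["bs"]))   # visual order of the .td initializer
# Python's json module keeps the integer exact; a double would not.
print(record["enormous_pos"] == int(float(record["enormous_pos"])))
```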
+
+All other TableGen value types are emitted as a JSON object,
+containing two standard fields: ``kind`` is a discriminator describing
+which kind of value the object represents, and ``printable`` is a
+string giving the same representation of the value that would appear
+in ``-print-records``.
+
+* A reference to a ``def`` object has ``kind=="def"``, and has an
+  extra field ``def`` giving the name of the object referred to.
+
+* A reference to another variable in the same record has
+  ``kind=="var"``, and has an extra field ``var`` giving the name of
+  the variable referred to.
+
+* A reference to a specific bit of a ``bits``-typed variable in the
+  same record has ``kind=="varbit"``, and has two extra fields:
+  ``var`` gives the name of the variable referred to, and ``index``
+  gives the index of the bit.
+
+* A value of type ``dag`` has ``kind=="dag"``, and has two extra
+  fields. ``operator`` gives the initial value after the opening
+  parenthesis of the dag initializer; ``args`` is an array giving the
+  following arguments. The elements of ``args`` are arrays of length
+  2, giving the value of each argument followed by its colon-suffixed
+  name (if any). For example, in the JSON representation of the dag
+  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
+  a record defined elsewhere with a ``def`` statement):
+
+  * ``operator`` will be an object in which ``kind=="def"`` and
+    ``def=="Op"``
+
+  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.
+
+* If any other kind of value or complicated expression appears in the
+  output, it will have ``kind=="complex"``, and no additional fields.
+  These values are not expected to be needed by backends. The standard
+  ``printable`` field can be used to extract a representation of them
+  in TableGen source syntax if necessary.
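One way a script might dispatch on the ``kind`` discriminator is sketched below: a recursive walk that collects every ``def`` name mentioned inside a value. The helper ``referenced_defs`` is hypothetical, and the sample is the JSON form of ``(Op 22, "hello":$foo)`` written out by hand from the description above.

```python
def referenced_defs(value):
    """Collect the names of all def records referenced inside a value."""
    if isinstance(value, list):
        names = []
        for element in value:
            names.extend(referenced_defs(element))
        return names
    if not isinstance(value, dict):
        return []  # null, number or string: nothing to follow
    kind = value["kind"]
    if kind == "def":
        return [value["def"]]
    if kind == "dag":
        names = referenced_defs(value["operator"])
        # Each element of args is a [value, name-or-null] pair.
        for arg, _name in value["args"]:
            names.extend(referenced_defs(arg))
        return names
    return []  # "var", "varbit" and "complex" name no def directly

dag = {
    "kind": "dag",
    "printable": '(Op 22, "hello":$foo)',
    "operator": {"kind": "def", "def": "Op", "printable": "Op"},
    "args": [[22, None], ["hello", "foo"]],
}
print(referenced_defs(dag))
```

Falling through to an empty result for unrecognized kinds keeps the walk tolerant of ``complex`` values, for which only ``printable`` is available.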
+
How to write a back-end
=======================
ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...
-The default backend prints out all of the records.
+The default backend prints out all of the records. There is also a general
+backend which outputs all the records as a JSON data structure, enabled using
+the ``-dump-json`` option.
If you plan to use TableGen, you will most likely have to write a `backend`_
that extracts the information specific to what you need and formats it in the
-appropriate way.
+appropriate way. You can do this by extending TableGen itself in C++, or by
+writing a script in any language that can consume the JSON output.
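A script-based backend can be very small. The sketch below counts the ``def`` records of each class from a JSON stream; the function ``count_instances`` is hypothetical, and an in-memory string with hand-written data stands in for the real output so that the example is self-contained.

```python
import io
import json

def count_instances(stream):
    """Map each class name to the number of def records deriving from it."""
    root = json.load(stream)
    return {cls: len(names) for cls, names in root["!instanceof"].items()}

# Stand-in for the JSON that would arrive on standard input; hand-written.
fake_stdin = io.StringIO(
    '{"!tablegen_json_version": 1,'
    ' "!instanceof": {"Base": ["D"], "Derived": []},'
    ' "D": {"!name": "D", "!anonymous": false,'
    ' "!superclasses": ["Base"], "!fields": []}}'
)
counts = count_instances(fake_stdin)
print(counts)
```

In real use the stream would be ``sys.stdin``, with the script placed at the end of a pipeline such as ``llvm-tblgen -dump-json input.td | python script.py`` (both file names are placeholders).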
Example
-------
Init *resolve(Init *VarName) override;
};
+void EmitJSON(RecordKeeper &RK, raw_ostream &OS);
+
} // end namespace llvm
#endif // LLVM_TABLEGEN_RECORD_H
add_llvm_library(LLVMTableGen
Error.cpp
+ JSONBackend.cpp
Main.cpp
Record.cpp
SetTheory.cpp
--- /dev/null
+//===- JSONBackend.cpp - Generate a JSON dump of all records. -*- C++ -*-=====//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This TableGen back end generates a machine-readable representation
+// of all the classes and records defined by the input, in JSON format.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ADT/BitVector.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/JSON.h"
+#include "llvm/TableGen/Error.h"
+#include "llvm/TableGen/Record.h"
+#include "llvm/TableGen/TableGenBackend.h"
+#include <map>
+#include <string>
+
+#define DEBUG_TYPE "json-emitter"
+
+using namespace llvm;
+
+namespace {
+
+class JSONEmitter {
+private:
+ RecordKeeper &Records;
+
+ json::Value translateInit(const Init &I);
+ json::Array listSuperclasses(const Record &R);
+
+public:
+ JSONEmitter(RecordKeeper &R);
+
+ void run(raw_ostream &OS);
+};
+
+} // end anonymous namespace
+
+JSONEmitter::JSONEmitter(RecordKeeper &R) : Records(R) {}
+
+json::Value JSONEmitter::translateInit(const Init &I) {
+
+ // Init subclasses that we return as JSON primitive values of one
+ // kind or another.
+
+ if (isa<UnsetInit>(&I)) {
+ return nullptr;
+ } else if (auto *Bit = dyn_cast<BitInit>(&I)) {
+ return Bit->getValue() ? 1 : 0;
+ } else if (auto *Bits = dyn_cast<BitsInit>(&I)) {
+ json::Array array;
+ for (unsigned i = 0, limit = Bits->getNumBits(); i < limit; i++)
+ array.push_back(translateInit(*Bits->getBit(i)));
+ return array;
+ } else if (auto *Int = dyn_cast<IntInit>(&I)) {
+ return Int->getValue();
+ } else if (auto *Str = dyn_cast<StringInit>(&I)) {
+ return Str->getValue();
+ } else if (auto *Code = dyn_cast<CodeInit>(&I)) {
+ return Code->getValue();
+ } else if (auto *List = dyn_cast<ListInit>(&I)) {
+ json::Array array;
+ for (auto val : *List)
+ array.push_back(translateInit(*val));
+ return array;
+ }
+
+ // Init subclasses that we return as JSON objects containing a
+ // 'kind' discriminator. For these, we also provide the same
+ // translation back into TableGen input syntax that -print-records
+ // would give.
+
+ json::Object obj;
+ obj["printable"] = I.getAsString();
+
+ if (auto *Def = dyn_cast<DefInit>(&I)) {
+ obj["kind"] = "def";
+ obj["def"] = Def->getDef()->getName();
+ return obj;
+ } else if (auto *Var = dyn_cast<VarInit>(&I)) {
+ obj["kind"] = "var";
+ obj["var"] = Var->getName();
+ return obj;
+ } else if (auto *VarBit = dyn_cast<VarBitInit>(&I)) {
+ if (auto *Var = dyn_cast<VarInit>(VarBit->getBitVar())) {
+ obj["kind"] = "varbit";
+ obj["var"] = Var->getName();
+ obj["index"] = VarBit->getBitNum();
+ return obj;
+ }
+ } else if (auto *Dag = dyn_cast<DagInit>(&I)) {
+ obj["kind"] = "dag";
+ obj["operator"] = translateInit(*Dag->getOperator());
+ if (auto name = Dag->getName())
+ obj["name"] = name->getAsUnquotedString();
+ json::Array args;
+ for (unsigned i = 0, limit = Dag->getNumArgs(); i < limit; ++i) {
+ json::Array arg;
+ arg.push_back(translateInit(*Dag->getArg(i)));
+ if (auto argname = Dag->getArgName(i))
+ arg.push_back(argname->getAsUnquotedString());
+ else
+ arg.push_back(nullptr);
+ args.push_back(std::move(arg));
+ }
+ obj["args"] = std::move(args);
+ return obj;
+ }
+
+ // Final fallback: anything that gets past here is simply given a
+ // kind field of 'complex', and the only other field is the standard
+ // 'printable' representation.
+
+ assert(!I.isConcrete());
+ obj["kind"] = "complex";
+ return obj;
+}
+
+void JSONEmitter::run(raw_ostream &OS) {
+ json::Object root;
+
+ root["!tablegen_json_version"] = 1;
+
+ // Prepare the arrays that will list the instances of every class.
+ // We mostly fill those in by iterating over the superclasses of
+ // each def, but we also want to ensure we store an empty list for a
+ // class with no instances at all, so we do a preliminary iteration
+ // over the classes, invoking std::map::operator[] to default-
+ // construct the array for each one.
+ std::map<std::string, json::Array> instance_lists;
+ for (const auto &C : Records.getClasses()) {
+ auto &Name = C.second->getNameInitAsString();
+ (void)instance_lists[Name];
+ }
+
+ // Main iteration over the defs.
+ for (const auto &D : Records.getDefs()) {
+ auto &Name = D.second->getNameInitAsString();
+ auto &Def = *D.second;
+
+ json::Object obj;
+ json::Array fields;
+
+ for (const RecordVal &RV : Def.getValues()) {
+ if (!Def.isTemplateArg(RV.getNameInit())) {
+ auto Name = RV.getNameInitAsString();
+ if (RV.getPrefix())
+ fields.push_back(Name);
+ obj[Name] = translateInit(*RV.getValue());
+ }
+ }
+
+ obj["!fields"] = std::move(fields);
+
+ json::Array superclasses;
+ for (const auto &SuperPair : Def.getSuperClasses())
+ superclasses.push_back(SuperPair.first->getNameInitAsString());
+ obj["!superclasses"] = std::move(superclasses);
+
+ obj["!name"] = Name;
+ obj["!anonymous"] = Def.isAnonymous();
+
+ root[Name] = std::move(obj);
+
+ // Add this def to the instance list for each of its superclasses.
+ for (const auto &SuperPair : Def.getSuperClasses()) {
+ auto SuperName = SuperPair.first->getNameInitAsString();
+ instance_lists[SuperName].push_back(Name);
+ }
+ }
+
+ // Make a JSON object from the std::map of instance lists.
+ json::Object instanceof;
+ for (auto &kv : instance_lists)
+ instanceof[kv.first] = std::move(kv.second);
+ root["!instanceof"] = std::move(instanceof);
+
+ // Done. Write the output.
+ OS << json::Value(std::move(root)) << "\n";
+}
+
+namespace llvm {
+
+void EmitJSON(RecordKeeper &RK, raw_ostream &OS) { JSONEmitter(RK).run(OS); }
+} // end namespace llvm
--- /dev/null
+#!/usr/bin/env python
+
+import json
+import sys
+import traceback
+
+data = json.load(sys.stdin)
+testfile = sys.argv[1]
+
+prefix = "CHECK: "
+
+fails = 0
+passes = 0
+with open(testfile) as testfh:
+    lineno = 0
+    for line in iter(testfh.readline, ""):
+        lineno += 1
+        line = line.rstrip("\r\n")
+        try:
+            prefix_pos = line.index(prefix)
+        except ValueError:
+            continue
+        check_expr = line[prefix_pos + len(prefix):]
+
+        try:
+            exception = None
+            result = eval(check_expr, {"data":data})
+        except Exception:
+            result = False
+            exception = traceback.format_exc().splitlines()[-1]
+
+        if exception is not None:
+            sys.stderr.write(
+                "{file}:{line:d}: check threw exception: {expr}\n"
+                "{file}:{line:d}: exception was: {exception}\n".format(
+                    file=testfile, line=lineno,
+                    expr=check_expr, exception=exception))
+            fails += 1
+        elif not result:
+            sys.stderr.write(
+                "{file}:{line:d}: check returned False: {expr}\n".format(
+                    file=testfile, line=lineno, expr=check_expr))
+            fails += 1
+        else:
+            passes += 1
+
+if fails != 0:
+    sys.exit("{} checks failed".format(fails))
+else:
+    sys.stdout.write("{} checks passed\n".format(passes))
--- /dev/null
+// RUN: llvm-tblgen -dump-json %s | %python %S/JSON-check.py %s
+
+// CHECK: data['!tablegen_json_version'] == 1
+
+// CHECK: all(data[s]['!name'] == s for s in data if not s.startswith("!"))
+
+class Base {}
+class Intermediate : Base {}
+class Derived : Intermediate {}
+
+def D : Intermediate {}
+// CHECK: 'D' in data['!instanceof']['Base']
+// CHECK: 'D' in data['!instanceof']['Intermediate']
+// CHECK: 'D' not in data['!instanceof']['Derived']
+// CHECK: 'Base' in data['D']['!superclasses']
+// CHECK: 'Intermediate' in data['D']['!superclasses']
+// CHECK: 'Derived' not in data['D']['!superclasses']
+
+def ExampleDagOp;
+
+def FieldKeywordTest {
+ int a;
+ field int b;
+ // CHECK: 'a' not in data['FieldKeywordTest']['!fields']
+ // CHECK: 'b' in data['FieldKeywordTest']['!fields']
+}
+
+class Variables {
+ int i;
+ string s;
+ bit b;
+ bits<8> bs;
+ code c;
+ list<int> li;
+ Base base;
+ dag d;
+}
+def VarNull : Variables {
+ // A variable not filled in at all has its value set to JSON
+ // 'null', which translates to Python None
+ // CHECK: data['VarNull']['i'] is None
+}
+def VarPrim : Variables {
+ // Test initializers that map to primitive JSON types
+
+ int i = 3;
+ // CHECK: data['VarPrim']['i'] == 3
+
+ // Integer literals should be emitted in the JSON at full 64-bit
+ // precision, for the benefit of JSON readers that preserve that
+ // much information. Python's is one such.
+ int enormous_pos = 9123456789123456789;
+ int enormous_neg = -9123456789123456789;
+ // CHECK: data['VarPrim']['enormous_pos'] == 9123456789123456789
+ // CHECK: data['VarPrim']['enormous_neg'] == -9123456789123456789
+
+ string s = "hello, world";
+ // CHECK: data['VarPrim']['s'] == 'hello, world'
+
+ bit b = 0;
+ // CHECK: data['VarPrim']['b'] == 0
+
+ // bits<> arrays are stored in logical order (array[i] is the same
+ // bit identified in .td files as bs{i}), which means the _visual_
+ // order of the list (in default rendering) is reversed.
+ bits<8> bs = { 0,0,0,1,0,1,1,1 };
+ // CHECK: data['VarPrim']['bs'] == [ 1,1,1,0,1,0,0,0 ]
+
+ code c = [{ \" }];
+ // CHECK: data['VarPrim']['c'] == r' \" '
+
+ list<int> li = [ 1, 2, 3, 4 ];
+ // CHECK: data['VarPrim']['li'] == [ 1, 2, 3, 4 ]
+}
+def VarObj : Variables {
+ // Test initializers that map to JSON objects containing a 'kind'
+ // discriminator
+
+ Base base = D;
+ // CHECK: data['VarObj']['base']['kind'] == 'def'
+ // CHECK: data['VarObj']['base']['def'] == 'D'
+ // CHECK: data['VarObj']['base']['printable'] == 'D'
+
+ dag d = (ExampleDagOp 22, "hello":$foo);
+ // CHECK: data['VarObj']['d']['kind'] == 'dag'
+ // CHECK: data['VarObj']['d']['operator']['kind'] == 'def'
+ // CHECK: data['VarObj']['d']['operator']['def'] == 'ExampleDagOp'
+ // CHECK: data['VarObj']['d']['operator']['printable'] == 'ExampleDagOp'
+ // CHECK: data['VarObj']['d']['args'] == [[22, None], ["hello", "foo"]]
+ // CHECK: data['VarObj']['d']['printable'] == '(ExampleDagOp 22, "hello":$foo)'
+
+ int undef_int;
+ field int ref_int = undef_int;
+ // CHECK: data['VarObj']['ref_int']['kind'] == 'var'
+ // CHECK: data['VarObj']['ref_int']['var'] == 'undef_int'
+ // CHECK: data['VarObj']['ref_int']['printable'] == 'undef_int'
+
+ bits<2> undef_bits;
+ bits<4> ref_bits;
+ let ref_bits{3-2} = 0b10;
+ let ref_bits{1-0} = undef_bits{1-0};
+ // CHECK: data['VarObj']['ref_bits'][3] == 1
+ // CHECK: data['VarObj']['ref_bits'][2] == 0
+ // CHECK: data['VarObj']['ref_bits'][1]['kind'] == 'varbit'
+ // CHECK: data['VarObj']['ref_bits'][1]['var'] == 'undef_bits'
+ // CHECK: data['VarObj']['ref_bits'][1]['index'] == 1
+ // CHECK: data['VarObj']['ref_bits'][1]['printable'] == 'undef_bits{1}'
+ // CHECK: data['VarObj']['ref_bits'][0]['kind'] == 'varbit'
+ // CHECK: data['VarObj']['ref_bits'][0]['var'] == 'undef_bits'
+ // CHECK: data['VarObj']['ref_bits'][0]['index'] == 0
+ // CHECK: data['VarObj']['ref_bits'][0]['printable'] == 'undef_bits{0}'
+
+ field int complex_ref_int = !add(undef_int, 2);
+ // CHECK: data['VarObj']['complex_ref_int']['kind'] == 'complex'
+ // CHECK: data['VarObj']['complex_ref_int']['printable'] == '!add(undef_int, 2)'
+}
+
+// Test the !anonymous member. This is tricky because when a def is
+// anonymous, almost by definition, the test can't reliably predict
+// the name it will be stored under! So we have to search all the defs
+// in the JSON output looking for the one that has the test integer
+// field set to the right value.
+
+def Named { int AnonTestField = 1; }
+// CHECK: data['Named']['AnonTestField'] == 1
+// CHECK: data['Named']['!anonymous'] is False
+
+def { int AnonTestField = 2; }
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 2)['!anonymous'] is True
+
+multiclass AnonTestMulticlass<int base> {
+ def _plus_one { int AnonTestField = !add(base,1); }
+ def { int AnonTestField = !add(base,2); }
+}
+
+defm NamedDefm : AnonTestMulticlass<10>;
+// CHECK: data['NamedDefm_plus_one']['!anonymous'] is False
+// CHECK: data['NamedDefm_plus_one']['AnonTestField'] == 11
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 12)['!anonymous'] is True
+
+// D47431 clarifies that a named def inside a multiclass gives a
+// *non*-anonymous output record, even if the defm that instantiates
+// that multiclass is anonymous.
+defm : AnonTestMulticlass<20>;
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 21)['!anonymous'] is False
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 22)['!anonymous'] is True
enum ActionType {
PrintRecords,
+ DumpJSON,
GenEmitter,
GenRegisterInfo,
GenInstrInfo,
Action(cl::desc("Action to perform:"),
cl::values(clEnumValN(PrintRecords, "print-records",
"Print all records to stdout (default)"),
+ clEnumValN(DumpJSON, "dump-json",
+ "Dump all records as machine-readable JSON"),
clEnumValN(GenEmitter, "gen-emitter",
"Generate machine code emitter"),
clEnumValN(GenRegisterInfo, "gen-register-info",
case PrintRecords:
OS << Records; // No argument, dump all contents
break;
+ case DumpJSON:
+ EmitJSON(Records, OS);
+ break;
case GenEmitter:
EmitCodeEmitter(Records, OS);
break;