<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
<!ENTITY % version-entities SYSTEM "version.entities">
<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
<refentry id="orc-concepts" revision="29 may 2009">
<refentrytitle>Orc Concepts</refentrytitle>
<manvolnum>3</manvolnum>
<refmiscinfo>Orc</refmiscinfo>
<refname>Orc Concepts</refname>
High-level view of what Orc does.
<title>Orc Concepts</title>
Orc is a compiler for a simple assembly-like language. Unlike
most compilers, Orc is primarily a library, which means that
all its features can be controlled from any application that
uses it. Also unlike most compilers, Orc creates code that
can be immediately executed by the application.
Orc is mainly useful for generating code that performs simple
mathematical operations on contiguous arrays. An example Orc
function, translated to C, might look like:
void function (int *dest, int *src1, int *src2, int n)
{
  for (int i = 0; i < n; i++) {
    dest[i] = (src1[i] + src2[i] + 1) >> 1;
  }
}
Orc is primarily targeted at generating code for vector
CPU extensions such as SSE, Altivec, and NEON.
Possible usage patterns:
The application generates Orc programs programmatically at
runtime, compiles them at runtime, and executes them. This is
what many of the Orc test programs do, and it is currently the
most flexible and well-developed method. It requires depending
on the Orc library at runtime.
The application developer uses Orc to produce assembly source
code that is then compiled into the application. This requires
the developer to have Orc installed at build time. The advantage
of this method is that there is no Orc dependency at runtime. The
disadvantages are a more complex build process, potential compiler
incompatibilities with the generated assembly source code, and the
need to recompile the application to pick up any Orc improvements.
The application developer writes Orc source files and compiles
them into Orc bytecode to be included in the application. At
runtime, Orc compiles the bytecode into executable code. The
advantage is that the Orc source files are easy to edit. This
method is still somewhat experimental.
A wide variety of additional workflows are possible, although
tools are not yet available to make them convenient.
<title>Concepts</title>
The OrcProgram is the primary object that applications use to
create code with Orc. It contains all the information related to
what is essentially a function definition in C. Orc programs can
be compiled into assembly source code, or directly into binary code
that can be executed as part of the running process. On CPUs that
are not supported, programs can also be executed via emulation. Orc
programs can also be compiled into C source code.
A program contains one or more instructions and operates on one or
more source and destination arrays, and may use scalar parameters.
When compiled and executed, or emulated, the instructions define
the operations performed on each source array member, and the results
are placed in the destination array. Another way of thinking about
it is that the compiler generates code that iterates over the
destination array, calculating the value of each member based on
the program instructions and the corresponding values in the source
arrays and scalar parameters.
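As a rough illustration of the iteration model described above, the following plain C sketch shows what a program with one destination array, two source arrays, and one scalar parameter computes. The function name, the 16-bit element type, and the operation itself are assumptions made for this example. It is hand-written C, not output produced by Orc.

```c
#include <stdint.h>

/* Hypothetical program: d[i] = s1[i] * p1 + s2[i], keeping the low 16 bits.
 * The same operation is applied to every element; nothing in the loop
 * body depends on any other iteration. */
void
example_program (uint16_t *d, const uint16_t *s1, const uint16_t *s2,
    uint16_t p1, int n)
{
  int i;

  for (i = 0; i < n; i++) {
    /* Compute in 32-bit unsigned arithmetic so the result is well defined. */
    uint32_t v = (uint32_t) s1[i] * p1 + s2[i];
    d[i] = (uint16_t) v;
  }
}
```

Because each destination element depends only on the elements at the same index and on the scalar parameter, a loop of this shape can be processed several elements at a time with vector instructions.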
The form of programs is strictly limited so that they may be compiled
into vector instructions effectively. It is anticipated that future
versions of Orc will allow more complex programs.
The arrays that Orc programs operate on must be contiguous.
Some example operations are "addw" which adds two 16-bit integers,
"convsbw" which converts a signed byte to a signed 16-bit integer,
and "minul" which selects the lesser of two 32-bit unsigned
integers. Orc only checks that the size of the operand matches
the size of the variable. Thus, the compiler will not warn against
using "minul" with signed 32-bit integers, because it does not know
whether the variables are signed or unsigned.
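The per-element behavior of these three opcodes can be sketched in plain C as follows. The function names are hypothetical, and the sketch assumes that "addw" wraps around on overflow rather than saturating. These are models for illustration, not Orc's own implementation.

```c
#include <stdint.h>

/* Model of "addw": add two 16-bit integers. Unsigned arithmetic is used
 * so that wraparound is well defined in C; the bit pattern of the result
 * is the same for signed and unsigned interpretations. */
uint16_t
model_addw (uint16_t a, uint16_t b)
{
  return (uint16_t) (a + b);
}

/* Model of "convsbw": sign-extend a signed byte to a signed 16-bit value. */
int16_t
model_convsbw (int8_t a)
{
  return (int16_t) a;
}

/* Model of "minul": lesser of two 32-bit values, compared as unsigned. */
uint32_t
model_minul (uint32_t a, uint32_t b)
{
  return a < b ? a : b;
}
```

The signedness pitfall is visible with this model: passing the bit patterns of the signed values -1 and 1 to model_minul() yields 1, because -1 is 0xffffffff when read as unsigned. A signed minimum would have returned -1.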
Orc has a main set of opcodes, that is, an OrcOpcodeSet, with the
name "sys". These opcodes are always available. They cover most
common arithmetic and conversion instructions for 8-, 16-, and 32-bit
integers. Two auxiliary libraries provide additional opcode sets:
the liborc-float library, which contains the "float"
opcode set for 32- and 64-bit floating point operations, and the
liborc-pixel library, which contains the "pixel" opcode set for
operations on 32-bit RGBA pixels.
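One way to picture an opcode set is as a table of named opcodes with operand sizes, looked up by name. The sketch below is a simplified model under that assumption. The ModelOpcode type, the table contents, and both functions are hypothetical and do not reflect the actual layout of an OrcOpcodeSet.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opcode descriptor: a name plus operand sizes in bytes. */
typedef struct {
  const char *name;
  int dest_size;
  int src_size[2];              /* 0 marks an unused source operand */
} ModelOpcode;

/* A tiny stand-in for an opcode set such as "sys". */
static const ModelOpcode model_sys_opcodes[] = {
  { "addw",    2, { 2, 2 } },
  { "convsbw", 2, { 1, 0 } },
  { "minul",   4, { 4, 4 } }
};

/* Look an opcode up by name; returns NULL for an unknown opcode. */
const ModelOpcode *
model_opcode_find (const char *name)
{
  size_t count = sizeof (model_sys_opcodes) / sizeof (model_sys_opcodes[0]);
  size_t i;

  for (i = 0; i < count; i++) {
    if (strcmp (model_sys_opcodes[i].name, name) == 0)
      return &model_sys_opcodes[i];
  }
  return NULL;
}

/* Check operand sizes only. Signedness is not part of the descriptor,
 * so it cannot be checked here. */
int
model_opcode_sizes_match (const ModelOpcode *op, int dest_size,
    int src1_size, int src2_size)
{
  return op->dest_size == dest_size &&
      op->src_size[0] == src1_size && op->src_size[1] == src2_size;
}
```

In this model, an additional opcode set would simply be another table of the same shape that the lookup also consults.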
Orc programs are compiled using the function orc_program_compile().
The compiled code will be targeted at the current processor, which
is useful for compiling code that will be immediately executed.
Compiling for other processor families or processor family variants,
in order to produce assembly source code, can be accomplished using
one of the orc_program_compile variants.
Once an Orc program is compiled, it can be executed by creating
an OrcExecutor structure, linking it to the program to be executed,
setting the arrays and parameters, and setting the iteration count.
Orc executors are the equivalent of stack frames in a called function
in normal C code. However, all Orc programs use the same OrcExecutor
structure, which makes code that manipulates executors simpler
than code that manipulates stack frames. Executors can be
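The role of an executor can be modeled in plain C as a single structure that carries everything a compiled program needs. The ModelExecutor type, its field names, and the fixed array limits below are assumptions for illustration. They are not the real OrcExecutor definition.

```c
#define MODEL_MAX_ARRAYS 4
#define MODEL_MAX_PARAMS 4

typedef struct ModelExecutor ModelExecutor;
typedef void (*ModelCode) (ModelExecutor *ex);

/* Hypothetical executor: one fixed layout shared by every program. */
struct ModelExecutor {
  ModelCode code;                   /* stands in for the compiled program */
  int n;                            /* iteration count */
  void *arrays[MODEL_MAX_ARRAYS];   /* destination and source arrays */
  int params[MODEL_MAX_PARAMS];     /* scalar parameters */
};

/* Stand-in for compiled code: d[i] = (s1[i] + s2[i] + 1) >> 1 */
void
model_average (ModelExecutor *ex)
{
  int *d = (int *) ex->arrays[0];
  const int *s1 = (const int *) ex->arrays[1];
  const int *s2 = (const int *) ex->arrays[2];
  int i;

  for (i = 0; i < ex->n; i++)
    d[i] = (s1[i] + s2[i] + 1) >> 1;
}

/* Running a program means handing the executor to its code. */
void
model_executor_run (ModelExecutor *ex)
{
  ex->code (ex);
}
```

Because every program receives its arguments through the same structure, generic code can set arrays, parameters, and the iteration count without knowing the signature of any particular program.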
An OrcTarget represents a particular instruction set or CPU family
for which code can be generated. Current targets include MMX, SSE,
Altivec, NEON, and ARM. There is also a special target that generates C
source code, but is not capable of producing executable code at
runtime. In most cases, the default target is the most appropriate
target for the current CPU.
Individual Orc targets may have various options that control code
generation for that target. For example, the various CPUs handled
by the SSE target have different subsets of SSE instructions that
are supported. The target flags for SSE enable generation of the
different subsets of SSE instructions.
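Target flags of this kind are commonly represented as a bit mask, with one bit per instruction subset. The following sketch assumes that representation. The flag names and values are invented for this example and are not Orc's actual target flag definitions.

```c
/* Hypothetical flag names, one bit per SSE instruction subset. */
enum {
  MODEL_FLAG_SSE2  = 1 << 0,
  MODEL_FLAG_SSE3  = 1 << 1,
  MODEL_FLAG_SSSE3 = 1 << 2,
  MODEL_FLAG_SSE41 = 1 << 3
};

/* Returns 1 when every subset named in "required" is enabled in
 * "target_flags", and 0 otherwise. */
int
model_subset_enabled (unsigned int target_flags, unsigned int required)
{
  return (target_flags & required) == required;
}
```

With flags set to MODEL_FLAG_SSE2 | MODEL_FLAG_SSE3, a code generator using this check would emit SSE2 and SSE3 instructions but avoid anything that needs MODEL_FLAG_SSE41.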
In order to produce target code, the Orc compiler finds an appropriate
OrcRule to translate the instruction to target code. An OrcRuleSet
is an array of rules that all have the required target flags, and
a target may have one or more rule sets that can be enabled or
disabled based on the target flags. In many cases, Orc instructions
can be translated into one or two target instructions, which produces
fast code. In other cases, the CPU indicated by the target and target
flags does not have a fast method of performing the Orc instruction,
and a slower method is chosen. This is indicated in the value returned
by the compiling function call. In yet other cases, there is no
implemented rule to translate an Orc instruction to target code, so
Compilation can fail for one of two main reasons. One reason is that
the compiler could not make sense of the program, for example because
of an unknown opcode, an undeclared variable, or a size mismatch. These
are uncorrectable errors, and the program cannot be executed or emulated.
The other reason for a compilation failure is that target code could
not be generated, for a variety of reasons including missing rules
or unimplemented features. In this case, the program can be emulated.
This process occurs automatically.
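The distinction between the two kinds of failure can be sketched as a small decision function that handles a single instruction. Everything in this sketch is a simplified model: the result names, the rule table, and the flag values are hypothetical, and Orc's real compiler considers far more than one opcode name.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
  MODEL_RESULT_OK,        /* a rule matched, target code can be generated */
  MODEL_RESULT_EMULATE,   /* valid program, but no usable rule */
  MODEL_RESULT_FATAL      /* invalid program, cannot run at all */
} ModelResult;

/* Hypothetical rule: translates one opcode when its flags are enabled. */
typedef struct {
  const char *opcode;
  unsigned int required_flags;
} ModelRule;

static const char *const model_opcodes[] = { "addw", "convsbw", "minul" };

static const ModelRule model_rules[] = {
  { "addw",    0x1 },
  { "convsbw", 0x1 },
  { "minul",   0x8 }     /* invented: needs an extra flag in this model */
};

ModelResult
model_compile_one (const char *opcode, unsigned int target_flags)
{
  size_t i;
  int known = 0;

  /* First kind of failure: the opcode does not exist at all. */
  for (i = 0; i < sizeof (model_opcodes) / sizeof (model_opcodes[0]); i++) {
    if (strcmp (model_opcodes[i], opcode) == 0)
      known = 1;
  }
  if (!known)
    return MODEL_RESULT_FATAL;

  /* Second kind: the opcode is valid, but no enabled rule handles it. */
  for (i = 0; i < sizeof (model_rules) / sizeof (model_rules[0]); i++) {
    const ModelRule *rule = &model_rules[i];
    if (strcmp (rule->opcode, opcode) == 0 &&
        (target_flags & rule->required_flags) == rule->required_flags)
      return MODEL_RESULT_OK;
  }
  return MODEL_RESULT_EMULATE;
}
```

A caller in this model would run generated code on MODEL_RESULT_OK, fall back to emulation on MODEL_RESULT_EMULATE, and refuse to run the program on MODEL_RESULT_FATAL.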
Emulation is generally slower than corresponding C code. Since the
Orc compiler can produce C source code, it is possible to generate
and compile backup C code for programs. This process is not yet
<title>Extending Orc</title>
Developers can extend Orc primarily by adding new opcode sets,
new targets, and new target rules.
Additional opcode sets can be created and registered in a manner
similar to how the liborc-float and liborc-pixel libraries do it. In
order to make full use of new opcode sets, one must also define rules
for translating these opcodes into target code. The example libraries
do this by registering rule sets for various targets (mainly SSE)
for their opcode sets. Orc provides a low-level API for generating
target code. Not all possible target instructions can be generated
with the target API, so developers may need to modify and add
functions to the main Orc library as necessary to generate target