Xiang, Haihao [Wed, 31 Oct 2012 08:10:20 +0000 (16:10 +0800)]
bump version to 1.3
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Homer Hsing [Tue, 23 Oct 2012 01:21:15 +0000 (09:21 +0800)]
Fix typo. "donesn't" -> "doesn't"
Zhao Yakui [Mon, 22 Oct 2012 20:13:51 +0000 (16:13 -0400)]
Add the CRE engine for HSW+
Like VME, it is used for media encoding, and it performs
the check & refinement operation.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Gwenole Beauchesne [Mon, 22 Oct 2012 20:13:51 +0000 (16:13 -0400)]
Fix JMPI encoding for Haswell.
It uses the byte-aligned jump instead of 64-bit units.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Gwenole Beauchesne [Mon, 22 Oct 2012 20:13:51 +0000 (16:13 -0400)]
Add initial support for Haswell.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Gwenole Beauchesne [Mon, 22 Oct 2012 20:13:51 +0000 (16:13 -0400)]
Allow Gen version decimals.
This is preparatory work for Haswell (Gen 7.5).
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Gwenole Beauchesne [Mon, 22 Oct 2012 20:13:51 +0000 (16:13 -0400)]
Bump gen_level to multiple of tens.
Add new helper macros to check versions:
- IS_GENp() meant to match Gen X and above
- IS_GENx() meant to match Gen X exactly.
Patch mechanically generated. No stale "gen_level" usage.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
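A minimal sketch of how the two helpers could behave once gen_level is stored as a multiple of ten (so Gen 7.5 becomes 75). The macro bodies and the gen_level variable shown here are assumptions for illustration, not the definitions from the tree.

```c
/* Hypothetical: generation stored as major * 10 + minor, e.g. 75 for Gen 7.5. */
static int gen_level = 75;

/* Matches Gen X and above. */
#define IS_GENp(x)  (gen_level >= (x) * 10)

/* Matches Gen X itself; in this sketch that includes decimal
 * versions of X, so IS_GENx(7) holds for both 70 and 75. */
#define IS_GENx(x)  (gen_level >= (x) * 10 && gen_level < ((x) + 1) * 10)
```

With gen_level set to 75, IS_GENp(6) and IS_GENx(7) hold while IS_GENp(8) and IS_GENx(6) do not.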
Homer Hsing [Fri, 19 Oct 2012 03:18:23 +0000 (11:18 +0800)]
Fix Gen7 JMPI compilation
Gen7 JMPI Restrictions in bspec:
The JIP data type must be Signed DWord
Homer Hsing [Thu, 18 Oct 2012 04:37:31 +0000 (12:37 +0800)]
Fix sub-register number of an address register encoding
The AddrSubRegNum field in the instruction binary code should be:
code    value(advanced_flag==0)  value(advanced_flag==1)
a0.0    0                        0
a0.1    invalid input            1
a0.2    1                        2
a0.3    invalid input            3
a0.4    2                        4
a0.5    invalid input            5
a0.6    3                        6
a0.7    invalid input            7
a0.8    4                        invalid input
a0.10   5                        invalid input
a0.12   6                        invalid input
a0.14   7                        invalid input
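The mapping in the table above can be sketched as a small function. The function name, its signature, and the -1 return value for invalid input are assumptions for illustration; only the value mapping itself comes from the table.

```c
/*
 * Hypothetical helper: map the subregister index N of "a0.N" to the
 * AddrSubRegNum field value. Returns -1 for combinations the table
 * marks as invalid input (or does not list at all).
 */
static int addr_subreg_field(int subreg, int advanced_flag)
{
    if (subreg < 0)
        return -1;
    if (advanced_flag) {
        /* a0.0 .. a0.7 are encoded as-is; higher indices are invalid. */
        return subreg <= 7 ? subreg : -1;
    }
    /* Only even indices up to a0.14 are valid; the field holds N / 2. */
    if (subreg > 14 || subreg % 2 != 0)
        return -1;
    return subreg / 2;
}
```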
Homer Hsing [Tue, 16 Oct 2012 06:14:25 +0000 (14:14 +0800)]
Fix symbol register subreg number calculation rule symbol_reg_p
When in normal mode, subreg_nr should not be divided by type_size.
This patch fixes that bug.
Homer Hsing [Fri, 28 Sep 2012 06:10:00 +0000 (14:10 +0800)]
Show warning when compiling the grammar parser
Homer Hsing [Fri, 28 Sep 2012 06:05:51 +0000 (14:05 +0800)]
Support Gen6 WHILE instruction
Homer Hsing [Fri, 28 Sep 2012 06:02:25 +0000 (14:02 +0800)]
Make sure Gen6 IF works
Homer Hsing [Fri, 28 Sep 2012 05:46:21 +0000 (13:46 +0800)]
Make sure Gen6 ENDIF works
Homer Hsing [Fri, 28 Sep 2012 05:43:44 +0000 (13:43 +0800)]
Fix JIP position for Gen6 JMPI
Homer Hsing [Thu, 27 Sep 2012 08:20:39 +0000 (16:20 +0800)]
Fix Gen6 ELSE instructions code logic according to bspec.
Homer Hsing [Thu, 27 Sep 2012 07:44:15 +0000 (15:44 +0800)]
Make sure BREAK/CONT/HALT work on Gen6.
Homer Hsing [Thu, 27 Sep 2012 07:39:28 +0000 (15:39 +0800)]
Support Gen6 RET instruction.
Homer Hsing [Thu, 27 Sep 2012 07:31:56 +0000 (15:31 +0800)]
Support Gen6 CALL instruction.
Homer Hsing [Thu, 27 Sep 2012 06:56:30 +0000 (14:56 +0800)]
Replace variable init code in WAIT by src_null_reg
Homer Hsing [Thu, 27 Sep 2012 06:48:14 +0000 (14:48 +0800)]
Make ip_dst and ip_src local const variables to reduce duplicated code.
Homer Hsing [Thu, 27 Sep 2012 06:20:32 +0000 (14:20 +0800)]
Support Gen6 three-source-operand instructions.
Add bits1.three_src.gen6.dest_reg_file according to Gen6 spec
Homer Hsing [Thu, 27 Sep 2012 05:51:33 +0000 (13:51 +0800)]
Compile ELSE and WHILE in Gen5 the same way as in Gen4
Homer Hsing [Mon, 24 Sep 2012 08:39:36 +0000 (16:39 +0800)]
Fix reloc_target_offset computing logic
Homer Hsing [Mon, 24 Sep 2012 02:12:26 +0000 (10:12 +0800)]
Fully support Gen7 branching instructions
Also fix integer argument parsing rule for JMPI, IF and WHILE
Fix shift/reduce conflicts in relativelocation
Homer Hsing [Mon, 24 Sep 2012 02:06:35 +0000 (10:06 +0800)]
Supporting multi-branch instructions BRD & BRC
brd: redirect channels to branches
brc: let channels converge
also rewrite code converting label to offset
Homer Hsing [Fri, 21 Sep 2012 04:35:35 +0000 (12:35 +0800)]
Use right recursion in parser rule inst_option_list
This recursion costs less memory. It is recommended by Bison.
Homer Hsing [Fri, 21 Sep 2012 04:33:13 +0000 (12:33 +0800)]
Support subroutine instructions, CALL & RET
Homer Hsing [Fri, 21 Sep 2012 02:14:31 +0000 (10:14 +0800)]
Merge duplicated code in gram.y
Homer Hsing [Fri, 21 Sep 2012 02:06:20 +0000 (10:06 +0800)]
Reduce duplicated code in gram.y with a reloc_target field in src_operand
The bspec says JIP and UIP should be source operands, so src_operand
gains a "reloc_target" field to match it.
The duplicated code in the JMPI and branchloop rules can then be merged into one.
Homer Hsing [Fri, 21 Sep 2012 01:51:55 +0000 (09:51 +0800)]
Restrict type of relativelocation2 to int
The original rule accepted EXP | NUMBER and then raised YYERROR for NUMBER.
This patch accepts only EXP, restricting its type to int.
Homer Hsing [Fri, 21 Sep 2012 01:37:06 +0000 (09:37 +0800)]
Rewrite label matching code. Collect labels in a linked list.
Label matching is faster because it searches only a small list of labels,
rather than scanning all instructions for a label.
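A minimal sketch of the linked-list idea, assuming each label records a name and an instruction address. The struct layout and the function names add_label and label_to_addr are hypothetical, not taken from the source tree.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical label record; one node per label seen by the parser. */
struct label_item {
    const char *name;            /* caller keeps the string alive */
    int addr;                    /* instruction index of the label */
    struct label_item *next;
};

static struct label_item *label_head;

/* Prepend a label to the list; returns 0 on success, -1 on allocation failure. */
static int add_label(const char *name, int addr)
{
    struct label_item *item = malloc(sizeof(*item));
    if (!item)
        return -1;
    item->name = name;
    item->addr = addr;
    item->next = label_head;
    label_head = item;
    return 0;
}

/* Walk only the label list, not the full instruction stream. */
static int label_to_addr(const char *name)
{
    for (struct label_item *p = label_head; p; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p->addr;
    return -1;                   /* label not found */
}
```

The lookup cost depends on the number of labels rather than the number of instructions, which is the speedup the commit describes.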
Homer Hsing [Fri, 21 Sep 2012 00:39:57 +0000 (08:39 +0800)]
Add second_reloc_target in the data structure.
Since Gen6+, some branching instructions have two relocation targets.
Homer Hsing [Thu, 20 Sep 2012 06:06:06 +0000 (14:06 +0800)]
Add test case for ".declare" overriding feature.
A later .declare pragma with the same name overrides the previously
defined one. This patch adds a test case for that feature.
Homer Hsing [Thu, 20 Sep 2012 06:04:20 +0000 (14:04 +0800)]
Fix memory leak in the parser
STRING is allocated by strdup in src/lex.l but was never
freed in src/gram.y.
Homer Hsing [Thu, 20 Sep 2012 05:09:15 +0000 (13:09 +0800)]
Fix field length of JIP for one-offset-branch in Gen6
This JIP field is 25 bits long in Gen6.
Homer Hsing [Wed, 19 Sep 2012 01:34:58 +0000 (09:34 +0800)]
Automatically run all test cases.
Previously, test/run-test.sh ran only one test case per call.
This patch lets it run all test cases automatically.
Homer Hsing [Tue, 18 Sep 2012 08:44:45 +0000 (16:44 +0800)]
Fix missing environment variables problem in test/run-test.sh
Currently test/run-test.sh cannot get the value of ${srcdir} and
${top_builddir}. Thus we cannot run any test case. This patch uses
$0 to get the absolute path of run-test.sh. Now test cases work.
Homer Hsing [Tue, 18 Sep 2012 08:32:39 +0000 (16:32 +0800)]
Add a generic hash table algorithm. Reuse for declared_reg_table and label_table in the future.
Rewrite find_register() and insert_register(). The hash table code
has been extracted, so it can be reused for the label table in the future.
Homer Hsing [Tue, 18 Sep 2012 08:28:27 +0000 (16:28 +0800)]
Add a test case for ".declare" pragma
Homer Hsing [Tue, 18 Sep 2012 05:57:20 +0000 (13:57 +0800)]
Rename brw_instruction.bits3.if_else to branch
Because that field will be used for all branch instructions
Homer Hsing [Tue, 18 Sep 2012 05:47:22 +0000 (13:47 +0800)]
According to BSPEC, put PLN & BFI1 to binaryop, put SUBB to binaryaccop
bspec: BFI1 should not access accumulator. PLN should not use accumulator
as source.
future work in gram.y: show warning if acc is used as dest for
ADDC/SUBB/CMP/CMPN/SHL/BFI1.
Homer Hsing [Tue, 18 Sep 2012 05:25:53 +0000 (13:25 +0800)]
Explain the difference between binaryinstruction and binaryaccinstruction
Developers may add new instructions in wrong place in the future
if they don't know the difference between binaryinstruction and
binaryaccinstruction.
Homer Hsing [Tue, 18 Sep 2012 05:12:50 +0000 (13:12 +0800)]
Renaming according to BSPEC: jump_count -> JIP; pop_count -> UIP.
Since the SNB+ bspec, jump_count and pop_count are renamed to JIP and UIP.
Homer Hsing [Mon, 17 Sep 2012 08:11:49 +0000 (16:11 +0800)]
Use bits3.if_else.jump_count instead of bits3.ud for readability
Homer Hsing [Mon, 17 Sep 2012 08:01:16 +0000 (16:01 +0800)]
Pad NOP instructions instead of the ILLEGAL instruction for entry
If a label is an entry, the assembler pads empty instructions
before the label until offset % 4 == 0. Previously, ILLEGAL
instructions were used for padding, which may raise exceptions.
We now use NOP instructions instead.
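The padding rule can be sketched as follows. The helper name is hypothetical, and the offset is assumed to be counted in instructions.

```c
/*
 * Hypothetical helper: number of NOP instructions to emit before an
 * entry label so that its offset becomes a multiple of 4.
 */
static unsigned int entry_pad_count(unsigned int offset)
{
    return (4 - offset % 4) % 4;
}
```

An entry label at offset 5 would get three NOPs and land at offset 8, while one already at offset 8 needs none.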
Homer Hsing [Mon, 17 Sep 2012 05:34:38 +0000 (13:34 +0800)]
Merge same if branches in declare_pragma section in gram.y
Homer Hsing [Fri, 14 Sep 2012 07:27:19 +0000 (15:27 +0800)]
Reduce memory cost in entry_table
The original code doubled the entry table space whenever it ran out
of room, which may waste up to 50% of the entry table's memory. Now we
use a linked list to store entry items.
Homer Hsing [Fri, 14 Sep 2012 05:40:08 +0000 (13:40 +0800)]
Make the entry point padding code logic look nicer
Homer Hsing [Fri, 14 Sep 2012 02:50:09 +0000 (10:50 +0800)]
Fix a typo in src/main.c: "in unit of type" -> "in unit of byte"
Homer Hsing [Fri, 14 Sep 2012 02:06:39 +0000 (10:06 +0800)]
Reduce hash value collision probability in src/main.c
The original code uses "hash_value = *name++", which may produce
hash value collisions for permutations such as "abc", "bac" and "cba".
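A small sketch of why permutations collide, assuming the original hash accumulated the character values by addition. Both function names and the multiplier 31 are illustrative choices, not necessarily what src/main.c uses.

```c
/* Additive hash: the result ignores character order, so permutations collide. */
static unsigned int hash_additive(const char *name)
{
    unsigned int h = 0;
    while (*name)
        h += (unsigned char)*name++;
    return h;
}

/* Order-sensitive alternative: scale the running value before adding. */
static unsigned int hash_mixed(const char *name)
{
    unsigned int h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h;
}
```

With the additive version, "abc", "bac" and "cba" all hash to the same value; the order-sensitive version gives three different values for them.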
Homer Hsing [Fri, 14 Sep 2012 02:02:53 +0000 (10:02 +0800)]
Move program_defaults init statement into variable declaration
In original code, the init value for "program_defaults.register_type"
is put inside main(), which may be hard to maintain.
Homer Hsing [Fri, 14 Sep 2012 01:42:30 +0000 (09:42 +0800)]
Better comment text. Change "c like" to "C style" in main.c
Homer Hsing [Fri, 14 Sep 2012 01:34:58 +0000 (09:34 +0800)]
Replace bzero by memset.
bzero has been removed from POSIX.1-2008; use memset instead.
Homer Hsing [Fri, 14 Sep 2012 01:02:01 +0000 (09:02 +0800)]
Supporting integer subtraction with borrow
subb: subtract unsigned integer src1 from src0. store the result
in dst and store the borrow (0 or 1) as a 32-bit value in acc.
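A per-channel sketch of the described behaviour in plain C, assuming 32-bit unsigned channels. The function name and the use of a pointer for acc are illustrative only.

```c
#include <stdint.h>

/*
 * Hypothetical model of subb on one channel: returns src0 - src1
 * (modulo 2^32) and writes the borrow (0 or 1) to *acc.
 */
static uint32_t subb_channel(uint32_t src0, uint32_t src1, uint32_t *acc)
{
    *acc = src0 < src1 ? 1u : 0u;
    return src0 - src1;
}
```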
Homer Hsing [Fri, 14 Sep 2012 00:56:36 +0000 (08:56 +0800)]
Supporting find first bit instructions
fbh: Find the first significant bit searching from the high bits
in src0 and store the result in dst.
fbl: Find the first 1 bit searching from the low bits in src0
and store the result in dst.
Homer Hsing [Fri, 14 Sep 2012 00:50:18 +0000 (08:50 +0800)]
Supporting half precision to single precision float conversion
The f16to32 instruction converts the half precision float
in src0 to single precision float and stores it in dst.
The f32to16 instruction converts the single precision float
in src0 to half precision float and stores it in the lower word
of each channel in dst.
Homer Hsing [Fri, 14 Sep 2012 00:41:16 +0000 (08:41 +0800)]
Supporting count bit set instruction
The cbit instruction counts component-wise the total bits set
in src0 and stores the resulting counts in dst.
Homer Hsing [Fri, 14 Sep 2012 00:32:12 +0000 (08:32 +0800)]
Supporting instruction "reverse bits"
The bfrev instruction component-wise reverses all the bits in src0
and stores the results in dst.
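A per-channel sketch of the bit reversal in plain C, assuming 32-bit channels. The function name is illustrative.

```c
#include <stdint.h>

/* Hypothetical model of bfrev on one 32-bit channel. */
static uint32_t bfrev_channel(uint32_t src0)
{
    uint32_t dst = 0;
    for (int i = 0; i < 32; i++) {
        dst = (dst << 1) | (src0 & 1u);  /* move the lowest bit of src0 into dst */
        src0 >>= 1;
    }
    return dst;
}
```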
Homer Hsing [Fri, 14 Sep 2012 00:27:41 +0000 (08:27 +0800)]
Supporting instruction Bit Field Insert 1
The bfi1 instruction component-wise generates mask with control
from src0 and src1 and stores the results in dst.
Homer Hsing [Fri, 14 Sep 2012 00:24:54 +0000 (08:24 +0800)]
Delete an extra space character in brw_defines.h
Now the column is aligned and the code is nicer.
Homer Hsing [Fri, 14 Sep 2012 00:20:13 +0000 (08:20 +0800)]
Supporting addc instruction
The addc instruction performs component-wise addition of
src0 and src1 and stores the results in dst;
it also stores the carry into acc.
Homer Hsing [Thu, 13 Sep 2012 03:05:50 +0000 (11:05 +0800)]
Supporting bit field extract and bit field insert 2
Supporting two new operators, bfe and bfi2
bfe: Component-wise extracts a bit field from src2 using the bit field width from src0 and the bit field offset from src1.
bfi2: component-wise performs the bitfield insert operation on src1 and src2 based on the mask in src0.
Homer Hsing [Wed, 12 Sep 2012 05:04:49 +0000 (13:04 +0800)]
Supporting LRP: dest = src0 * src1 + (1-src0) * src2
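The LRP formula above, written out for a single channel as a sketch. The function name is illustrative, and single precision float operands are assumed.

```c
/* Hypothetical model of LRP on one channel: blend src1 and src2 by src0. */
static float lrp_channel(float src0, float src1, float src2)
{
    return src0 * src1 + (1.0f - src0) * src2;
}
```

With src0 = 0 the result is src2, with src0 = 1 it is src1, and values in between interpolate linearly.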
Homer Hsing [Fri, 7 Sep 2012 06:38:13 +0000 (14:38 +0800)]
Support trinary source instruction "multiply add".
MAD (Multiply ADd) computes dst <- src1*src2 + src0.
Follows the existing variable naming conventions as closely as possible.
Also renamed "triinstruction" -> "trinaryinstruction" in grammar parser
for better readability.
Homer Hsing [Fri, 7 Sep 2012 01:53:17 +0000 (09:53 +0800)]
add data structure in src/brw_structs.h for supporting three-source-operand instructions
Homer Hsing [Fri, 7 Sep 2012 01:20:50 +0000 (09:20 +0800)]
Comment magic words "da1", "da16", "ia1", and "ia16"
Homer Hsing [Thu, 6 Sep 2012 08:12:08 +0000 (16:12 +0800)]
Close file yyin before calling yylex_destroy
This patch makes sure the file handle yyin is closed.
yylex_destroy() calls yy_init_globals(), which resets yyin to 0.
Therefore, if we do not close yyin before yylex_destroy(), yyin
can no longer be closed afterwards.
Homer Hsing [Thu, 6 Sep 2012 07:55:54 +0000 (15:55 +0800)]
Call yylex_destroy() to free memory after yyparse()
Homer Hsing [Thu, 6 Sep 2012 07:33:41 +0000 (15:33 +0800)]
Close input file handler yyin after yyparse
Homer Hsing [Thu, 6 Sep 2012 02:31:22 +0000 (10:31 +0800)]
Fix a typo ... lable -> label
Lu Guanqun [Wed, 22 Aug 2012 01:09:36 +0000 (09:09 +0800)]
fix the label checking logic
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Xiang, Haihao [Tue, 17 Jul 2012 08:16:11 +0000 (16:16 +0800)]
Warn if both predication and conditional modifier are enabled but use different flag registers
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 17 Jul 2012 07:05:31 +0000 (15:05 +0800)]
Add support for flag register f1 on Ivy bridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 17 Jul 2012 06:18:54 +0000 (14:18 +0800)]
s/flag_reg_nr/flag_subreg_nr for an instruction
s/flagreg/flag_subreg_nr for a condition
They are in fact flag subregister numbers
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 17 Jul 2012 06:01:54 +0000 (14:01 +0800)]
Remove flag_reg_nr from the DW3 of an instruction
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 17 Jul 2012 05:46:59 +0000 (13:46 +0800)]
Change the rule for flag register
The shift/reduce conflict mentioned in the comment has been fixed, so
flagreg can return the reg number in the lvalue now. In addition, it will
be easy to add support for flag register f1 on Ivy bridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Fri, 29 Jun 2012 08:47:10 +0000 (16:47 +0800)]
Accept symbol register as the leading register of the request
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Ben Widawsky [Sun, 24 Jun 2012 22:03:28 +0000 (15:03 -0700)]
disasm: decode SENDC like SEND
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Ben Widawsky [Sun, 24 Jun 2012 22:01:57 +0000 (15:01 -0700)]
disasm: add gen6 style send decoding
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Ben Widawsky [Sun, 24 Jun 2012 21:43:45 +0000 (14:43 -0700)]
disasm: add sendc
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Ben Widawsky [Sun, 24 Jun 2012 02:36:48 +0000 (19:36 -0700)]
disasm: add pln instruction
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Xiang, Haihao [Thu, 11 Aug 2011 07:35:14 +0000 (15:35 +0800)]
A new syntax of SEND instruction on Ivybridge
[(<pred>)] send (<exec_size>) reg greg imm6 reg32a
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 21 Jun 2011 03:12:13 +0000 (11:12 +0800)]
bump version to 1.2
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Tue, 31 May 2011 05:36:03 +0000 (13:36 +0800)]
Support VME on Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Fri, 10 Jun 2011 08:04:30 +0000 (16:04 +0800)]
Support DP for sampler/render/constant/data cache
Since Sandybridge, DP supports cache select for read/write. Some write messages such as
OWord Block Write don't support render cache any more on Ivybridge. So introduce a
generic data_port message for Sandybridge+.
data_port(
cache_type, /* sampler, render, constant or data(on Ivybridge+) cache */
message_type, /* read or write type */
message_control,
binding_table_index,
write_commit_or_category, /* write commit on Sandybridge, category on Ivybridge+ */
header_present)
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 30 May 2011 08:30:48 +0000 (16:30 +0800)]
sampler/render/constant cache unit since Sandybridge
Since Sandybridge, there is no single function unit for data port read/write.
Instead, sampler/render/constant cache units are introduced, and data port read/write
can be specified in a SEND instruction with a different cache unit. To keep compatibility,
data port read currently always uses the sampler cache unit, whereas data port write
uses the render cache unit.
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 30 May 2011 08:00:12 +0000 (16:00 +0800)]
fix an error in commit cf76278
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Wed, 25 May 2011 06:29:14 +0000 (14:29 +0800)]
SEND uses GRFs instead of MRFs on Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 23 May 2011 05:45:04 +0000 (13:45 +0800)]
Add support for sample (00000) on Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 23 May 2011 05:32:32 +0000 (13:32 +0800)]
Add support for data port read/write on Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 23 May 2011 04:43:25 +0000 (12:43 +0800)]
Add -g 7 for Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Feng, Boqun [Tue, 19 Apr 2011 00:43:22 +0000 (08:43 +0800)]
Send instruction on PRE-ILK
[(<pred>)] send (<exec_size>) <pdst> <cdst> <src0> <desc>
Zhou Chang [Thu, 14 Apr 2011 03:51:29 +0000 (11:51 +0800)]
Add VME support in SEND
Ben Widawsky [Thu, 24 Mar 2011 05:08:39 +0000 (22:08 -0700)]
intel-gen4asm: add byte array style disasm
I previously added a byte array style output for intel-gen4asm, but
there was no way to disassemble it. Well, here it is.
Ben Widawsky [Fri, 18 Mar 2011 01:57:59 +0000 (18:57 -0700)]
intel-gen4asm: have a C-like binary output
Have the assembler support a byte array output. This is useful for
writing blobs which can be linked directly into code that wishes to
upload them to the EU.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Xiang, Haihao [Tue, 1 Mar 2011 08:43:02 +0000 (16:43 +0800)]
fix the parameters of register region
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Thu, 17 Feb 2011 05:24:11 +0000 (13:24 +0800)]
send instruction on GEN6
[(<pred>)] send (<exec_size>) reg mreg imm6 imm32
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Wed, 16 Feb 2011 07:26:24 +0000 (15:26 +0800)]
fix notification count register
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Xiang, Haihao [Mon, 13 Dec 2010 08:07:16 +0000 (16:07 +0800)]
Support instructions which strictly follow the documents.
Previously some instructions parsed by this assembler didn't follow the
documents.
Signed-off-by: Chen, Yangyang <yangyang.chen@intel.com>
Signed-off-by: Han, Haofu <haofu.han@intel.com>
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>