# Unit masks for the Intel "Ivy Bridge" micro architecture
# See http://ark.intel.com/ for help in identifying Ivy Bridge based CPUs
include:i386/arch_perfmon
name:ld_blocks type:mandatory default:0x2
0x2 extra: store_forward loads blocked by overlapping with store buffer that cannot be forwarded
name:misalign_mem_ref type:bitmask default:0x1
0x1 extra: loads Speculative cache line split load uops dispatched to L1 cache
0x2 extra: stores Speculative cache line split STA uops dispatched to L1 cache
name:ld_blocks_partial type:mandatory default:0x1
0x1 extra: address_alias False dependencies in MOB due to partial compare on address
name:dtlb_load_misses type:exclusive default:0x81
0x81 extra: demand_ld_miss_causes_a_walk Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk of any page size.
0x82 extra: demand_ld_walk_completed Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.
0x84 extra: demand_ld_walk_duration Demand load cycles page miss handler (PMH) is busy with this walk.
name:int_misc type:exclusive default:recovery_cycles
0x3 extra:cmask=1 recovery_cycles Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)
0x3 extra:cmask=1,edge recovery_stalls_count Number of occurrences waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)
name:uops_issued type:exclusive default:any
0x1 extra: any Uops that Resource Allocation Table (RAT) issues to Reservation Station (RS)
0x1 extra:cmask=1,inv stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread
0x1 extra:cmask=1,inv,any core_stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads
0x10 extra: flags_merge Number of flags-merge uops being allocated.
0x20 extra: slow_lea Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate), regardless of whether it results from an LEA instruction.
0x40 extra: single_mul Number of Multiply packed/scalar single precision uops allocated
name:arith type:bitmask default:fpu_div_active
0x1 extra: fpu_div_active Cycles when divider is busy executing divide operations
0x4 extra:cmask=1,edge fpu_div Divide operations executed
name:l2_rqsts type:exclusive default:0x1
0x1 extra: demand_data_rd_hit Demand Data Read requests that hit L2 cache
0x3 extra: all_demand_data_rd Demand Data Read requests
0x4 extra: rfo_hit RFO requests that hit L2 cache
0x8 extra: rfo_miss RFO requests that miss L2 cache
0xc extra: all_rfo RFO requests to L2 cache
0x10 extra: code_rd_hit L2 cache hits when fetching instructions, code reads.
0x20 extra: code_rd_miss L2 cache misses when fetching instructions
0x30 extra: all_code_rd L2 code requests
0x40 extra: pf_hit Requests from the L2 hardware prefetchers that hit L2 cache
0x80 extra: pf_miss Requests from the L2 hardware prefetchers that miss L2 cache
0xc0 extra: all_pf Requests from L2 hardware prefetchers
name:l2_store_lock_rqsts type:exclusive default:0x1
0x1 extra: miss RFOs that miss cache lines
0x8 extra: hit_m RFOs that hit cache lines in M state
0xf extra: all RFOs that access cache lines in any state
name:l2_l1d_wb_rqsts type:exclusive default:0x1
0x1 extra: miss Count the number of modified Lines evicted from L1 and missed L2. (Non-rejected WBs from the DCU.)
0x4 extra: hit_e Not rejected writebacks from L1D to L2 cache lines in E state
0x8 extra: hit_m Not rejected writebacks from L1D to L2 cache lines in M state
0xf extra: all Not rejected writebacks from L1D to L2 cache lines in any state.
name:l1d_pend_miss type:exclusive default:pending_cycles
0x1 extra: pending L1D miss outstanding duration in cycles
0x1 extra:cmask=1 pending_cycles Cycles with L1D load Misses outstanding.
0x1 extra:cmask=1,edge occurences This event counts the number of L1D misses outstanding, using an edge detect to count transitions.
name:dtlb_store_misses type:bitmask default:0x1
0x1 extra: miss_causes_a_walk Store misses in all DTLB levels that cause page walks
0x2 extra: walk_completed Store misses in all DTLB levels that cause completed page walks
0x4 extra: walk_duration Cycles when PMH is busy with page walks
0x10 extra: stlb_hit Store operations that miss the first TLB level but hit the second and do not cause page walks
name:load_hit_pre type:bitmask default:0x1
0x1 extra: sw_pf Not software-prefetch load dispatches that hit forward buffer allocated for software prefetch
0x2 extra: hw_pf Not software-prefetch load dispatches that hit forward buffer allocated for hardware prefetch
name:l1d type:mandatory default:0x1
0x1 extra: replacement L1D data line replacements
name:move_elimination type:bitmask default:0x1
0x1 extra: int_not_eliminated Number of integer Move Elimination candidate uops that were not eliminated.
0x2 extra: simd_not_eliminated Number of SIMD Move Elimination candidate uops that were not eliminated.
0x4 extra: int_eliminated Number of integer Move Elimination candidate uops that were eliminated.
0x8 extra: simd_eliminated Number of SIMD Move Elimination candidate uops that were eliminated.
name:cpl_cycles type:exclusive default:ring0
0x1 extra: ring0 Unhalted core cycles when the thread is in ring 0
0x1 extra:cmask=1,edge ring0_trans Number of intervals between processor halts while thread is in ring 0
0x2 extra: ring123 Unhalted core cycles when thread is in rings 1, 2, or 3
name:rs_events type:mandatory default:0x1
0x1 extra: empty_cycles Cycles when Reservation Station (RS) is empty for the thread
name:tlb_access type:mandatory default:0x4
0x4 extra: load_stlb_hit Load operations that miss the first DTLB level but hit the second and do not cause page walks
name:offcore_requests_outstanding type:exclusive default:cycles_with_demand_data_rd
0x1 extra: demand_data_rd Offcore outstanding Demand Data Read transactions in uncore queue.
0x1 extra:cmask=1 cycles_with_demand_data_rd Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore
0x2 extra: demand_code_rd Offcore outstanding code read transactions in SuperQueue (SQ), queue to uncore, every cycle
0x4 extra: demand_rfo Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore
0x4 extra:cmask=1 cycles_with_demand_rfo Offcore outstanding demand RFO read transactions in SuperQueue (SQ), queue to uncore, every cycle
0x8 extra: all_data_rd Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore
0x8 extra:cmask=1 cycles_with_data_rd Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore
name:lock_cycles type:bitmask default:0x1
0x1 extra: split_lock_uc_lock_duration Cycles when L1 and L2 are locked due to UC or split lock
0x2 extra: cache_lock_duration Cycles when L1D is locked
name:idq type:exclusive default:empty
0x2 extra: empty Instruction Decode Queue (IDQ) empty cycles
0x4 extra: mite_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
0x4 extra:cmask=1 mite_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path
0x8 extra: dsb_uops Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path
0x8 extra:cmask=1 dsb_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path
0x10 extra: ms_dsb_uops Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x10 extra:cmask=1 ms_dsb_cycles Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x10 extra:cmask=1,edge ms_dsb_occur Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy
0x18 extra:cmask=1 all_dsb_cycles_any_uops Cycles Decode Stream Buffer (DSB) is delivering any Uop
0x18 extra:cmask=4 all_dsb_cycles_4_uops Cycles Decode Stream Buffer (DSB) is delivering 4 Uops
0x20 extra: ms_mite_uops Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x24 extra:cmask=1 all_mite_cycles_any_uops Cycles MITE is delivering any Uop
0x24 extra:cmask=4 all_mite_cycles_4_uops Cycles MITE is delivering 4 Uops
0x30 extra: ms_uops Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x30 extra:cmask=1 ms_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x3c extra: mite_all_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
name:icache type:mandatory default:0x2
0x2 extra: misses Instruction cache, streaming buffer and victim cache misses
name:itlb_misses type:bitmask default:0x1
0x1 extra: miss_causes_a_walk Misses at all ITLB levels that cause page walks
0x2 extra: walk_completed Misses in all ITLB levels that cause completed page walks
0x4 extra: walk_duration Cycles when PMH is busy with page walks
0x10 extra: stlb_hit Operations that miss the first ITLB level but hit the second and do not cause any page walks
name:ild_stall type:bitmask default:0x1
0x1 extra: lcp Stalls caused by changing prefix length of the instruction.
0x4 extra: iq_full Stall cycles because IQ is full
name:br_inst_exec type:exclusive default:0x41
0x41 extra: nontaken_conditional Not taken macro-conditional branches
0x81 extra: taken_conditional Taken speculative and retired macro-conditional branches
0x82 extra: taken_direct_jump Taken speculative and retired macro-unconditional branch instructions excluding calls and indirects
0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired indirect branches excluding calls and returns
0x88 extra: taken_indirect_near_return Taken speculative and retired indirect branches with return mnemonic
0x90 extra: taken_direct_near_call Taken speculative and retired direct near calls
0xa0 extra: taken_indirect_near_call Taken speculative and retired indirect calls
0xc1 extra: all_conditional Speculative and retired macro-conditional branches
0xc2 extra: all_direct_jmp Speculative and retired macro-unconditional branches excluding calls and indirects
0xc4 extra: all_indirect_jump_non_call_ret Speculative and retired indirect branches excluding calls and returns
0xc8 extra: all_indirect_near_return Speculative and retired indirect return branches.
0xd0 extra: all_direct_near_call Speculative and retired direct near calls
0xff extra: all_branches Speculative and retired branches
name:br_misp_exec type:exclusive default:0x41
0x41 extra: nontaken_conditional Not taken speculative and retired mispredicted macro conditional branches
0x81 extra: taken_conditional Taken speculative and retired mispredicted macro conditional branches
0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired mispredicted indirect branches excluding calls and returns
0x88 extra: taken_return_near Taken speculative and retired mispredicted indirect branches with return mnemonic
0xa0 extra: taken_indirect_near_call Taken speculative and retired mispredicted indirect calls
0xc1 extra: all_conditional Speculative and retired mispredicted macro conditional branches
0xc4 extra: all_indirect_jump_non_call_ret Mispredicted indirect branches excluding calls and returns
0xff extra: all_branches Speculative and retired mispredicted branches
name:idq_uops_not_delivered type:exclusive default:core
0x1 extra: core Uops not delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=1 cycles_le_3_uop_deliv.core Cycles with 3 or fewer uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=1,inv cycles_fe_was_ok Cycles with 4 uops delivered by the Frontend to the Backend of the machine, or the Backend was stalling
0x1 extra:cmask=2 cycles_le_2_uop_deliv.core Cycles with 2 or fewer uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=3 cycles_le_1_uop_deliv.core Cycles with 1 or fewer uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=4 cycles_0_uops_deliv.core Cycles with no uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
name:uops_dispatched_port type:exclusive default:port_0
0x1 extra: port_0 Cycles per thread when uops are dispatched to port 0
0x1 extra:any port_0_core Cycles per core when uops are dispatched to port 0
0x2 extra: port_1 Cycles per thread when uops are dispatched to port 1
0x2 extra:any port_1_core Cycles per core when uops are dispatched to port 1
0xc extra: port_2 Cycles per thread when load or STA uops are dispatched to port 2
0xc extra:any port_2_core Cycles per core when load or STA uops are dispatched to port 2
0x30 extra: port_3 Cycles per thread when load or STA uops are dispatched to port 3
0x30 extra:any port_3_core Cycles per core when load or STA uops are dispatched to port 3
0x40 extra: port_4 Cycles per thread when uops are dispatched to port 4
0x40 extra:any port_4_core Cycles per core when uops are dispatched to port 4
0x80 extra: port_5 Cycles per thread when uops are dispatched to port 5
0x80 extra:any port_5_core Cycles per core when uops are dispatched to port 5
name:resource_stalls type:bitmask default:0x1
0x1 extra: any Resource-related stall cycles
0x4 extra: rs Cycles stalled due to no eligible RS entry available.
0x8 extra: sb Cycles stalled due to no store buffers available (not including draining from sync).
0x10 extra: rob Cycles stalled due to re-order buffer full.
name:cycle_activity type:exclusive default:0x1
0x1 extra:cmask=1 cycles_l2_pending Cycles with pending L2 cache miss loads.
0x2 extra:cmask=2 cycles_ldm_pending Cycles with pending memory loads.
0x4 extra:cmask=4 cycles_no_execute Total execution stalls
0x5 extra:cmask=5 stalls_l2_pending Execution stalls due to L2 cache misses.
0x6 extra:cmask=6 stalls_ldm_pending Execution stalls due to memory subsystem.
0x8 extra:cmask=8 cycles_l1d_pending Cycles with pending L1 cache miss loads.
0xc extra:cmask=c stalls_l1d_pending Execution stalls due to L1 data cache misses
name:dsb2mite_switches type:mandatory default:0x1
0x1 extra: count Decode Stream Buffer (DSB)-to-MITE switches
name:dsb_fill type:mandatory default:0x8
0x8 extra: exceed_dsb_lines Cycles when Decode Stream Buffer (DSB) fill encounters more than 3 Decode Stream Buffer (DSB) lines
name:itlb type:mandatory default:0x1
0x1 extra: itlb_flush Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.
name:offcore_requests type:bitmask default:0x1
0x1 extra: demand_data_rd Demand Data Read requests sent to uncore
0x2 extra: demand_code_rd Cacheable and noncacheable code read requests
0x4 extra: demand_rfo Demand RFO requests including regular RFOs, locks, ItoM
0x8 extra: all_data_rd Demand and prefetch data reads
name:uops_executed type:exclusive default:thread
0x1 extra: thread Counts the number of uops to be executed per-thread each cycle.
0x1 extra:cmask=1 cycles_ge_1_uop_exec Cycles where at least 1 uop was executed per-thread
0x1 extra:cmask=1,inv stall_cycles Counts number of cycles no uops were dispatched to be executed on this thread.
0x1 extra:cmask=2 cycles_ge_2_uops_exec Cycles where at least 2 uops were executed per-thread
0x1 extra:cmask=3 cycles_ge_3_uops_exec Cycles where at least 3 uops were executed per-thread
0x1 extra:cmask=4 cycles_ge_4_uops_exec Cycles where at least 4 uops were executed per-thread
0x2 extra: core Number of uops executed on the core.
name:tlb_flush type:bitmask default:0x1
0x1 extra: dtlb_thread DTLB flush attempts of the thread-specific entries
0x20 extra: stlb_any STLB flush attempts
name:other_assists type:bitmask default:0x8
0x8 extra: avx_store Number of AVX memory assists for stores. An AVX microcode assist is invoked whenever the hardware is unable to properly handle AVX-256b operations.
0x10 extra: avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable.
0x20 extra: sse_to_avx Number of transitions from SSE to AVX-256 when penalty applicable.
name:uops_retired type:exclusive default:all
0x1 extra: all Actually retired uops.
0x1 extra:cmask=1,inv stall_cycles Cycles without actually retired uops.
0x1 extra:cmask=1,inv,any core_stall_cycles Cycles without actually retired uops.
0x1 extra:cmask=10,inv total_cycles Cycles with less than 10 actually retired uops.
0x2 extra: retire_slots Retirement slots used.
name:machine_clears type:bitmask default:0x2
0x2 extra: memory_ordering Counts the number of machine clears due to memory order conflicts.
0x4 extra: smc Self-modifying code (SMC) detected.
0x20 extra: maskmov This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.
name:br_inst_retired type:exclusive default:0x1
0x1 extra: conditional Conditional branch instructions retired.
0x2 extra: near_call_r3 Direct and indirect macro near call instructions retired (captured in ring 3).
0x2 extra: near_call Direct and indirect near call instructions retired.
0x8 extra: near_return Return instructions retired.
0x10 extra: not_taken Not taken branch instructions retired.
0x20 extra: near_taken Taken branch instructions retired.
0x40 extra: far_branch Far branch instructions retired.
name:br_misp_retired type:bitmask default:0x1
0x1 extra: conditional Mispredicted conditional branch instructions retired.
0x20 extra: near_taken Number of near branch instructions retired that were mispredicted and taken.
name:fp_assist type:exclusive default:0x1e
0x2 extra: x87_output Number of X87 assists due to output value.
0x4 extra: x87_input Number of X87 assists due to input value.
0x8 extra: simd_output Number of SIMD FP assists due to Output values
0x10 extra: simd_input Number of SIMD FP assists due to input values
0x1e extra:cmask=1 any Cycles with any input/output SSE or FP assist
name:rob_misc_events type:mandatory default:0x20
0x20 extra: lbr_inserts Count cases of saving new LBR
name:mem_uops_retired type:exclusive default:0x81
0x11 extra: stlb_miss_loads Load uops with true STLB miss retired to architected path.
0x12 extra: stlb_miss_stores Store uops with true STLB miss retired to architected path.
0x21 extra: lock_loads Load uops with locked access retired to architected path.
0x41 extra: split_loads Line-split load uops retired to architected path.
0x42 extra: split_stores Line-split store uops retired to architected path.
0x81 extra: all_loads Load uops retired to architected path with filter on bits 0 and 1 applied.
0x82 extra: all_stores Store uops retired to architected path with filter on bits 0 and 1 applied.
name:mem_load_uops_retired type:bitmask default:0x1
0x1 extra: l1_hit Retired load uops with L1 cache hits as data sources.
0x2 extra: l2_hit Retired load uops with L2 cache hits as data sources.
0x4 extra: llc_hit Retired load uops which data sources were data hits in LLC without snoops required.
0x40 extra: hit_lfb Retired load uops which data sources were load uops missed L1 but hit forward buffer due to preceding miss to the same cache line with data not ready.
name:mem_load_uops_llc_hit_retired type:bitmask default:0x1
0x1 extra: xsnp_miss Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache.
0x2 extra: xsnp_hit Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache.
0x4 extra: xsnp_hitm Retired load uops which data sources were HitM responses from shared LLC.
0x8 extra: xsnp_none Retired load uops which data sources were hits in LLC without snoops required.
name:mem_load_uops_llc_miss_retired type:mandatory default:0x1
0x1 extra: local_dram Data from local DRAM either Snoop not needed or Snoop Miss (RspI)
name:baclears type:mandatory default:0x1f
0x1f extra: any Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.
name:l2_trans type:bitmask default:0x80
0x1 extra: demand_data_rd Demand Data Read requests that access L2 cache
0x2 extra: rfo RFO requests that access L2 cache
0x4 extra: code_rd L2 cache accesses when fetching instructions
0x8 extra: all_pf L2 or LLC HW prefetches that access L2 cache
0x10 extra: l1d_wb L1D writebacks that access L2 cache
0x20 extra: l2_fill L2 fill requests that access L2 cache
0x40 extra: l2_wb L2 writebacks that access L2 cache
0x80 extra: all_requests Transactions accessing L2 pipe
name:l2_lines_in type:exclusive default:0x7
0x1 extra: i L2 cache lines in I state filling L2
0x2 extra: s L2 cache lines in S state filling L2
0x4 extra: e L2 cache lines in E state filling L2
0x7 extra: all L2 cache lines filling L2
name:l2_lines_out type:exclusive default:0x1
0x1 extra: demand_clean Clean L2 cache lines evicted by demand
0x2 extra: demand_dirty Dirty L2 cache lines evicted by demand
0x4 extra: pf_clean Clean L2 cache lines evicted by L2 prefetch
0x8 extra: pf_dirty Dirty L2 cache lines evicted by L2 prefetch
0xa extra: dirty_all Dirty L2 cache lines filling the L2