NHM_EP_EVENTS(3CPC) CPU Performance Counters Library Functions NHM_EP_EVENTS(3CPC)

nhm_ep_events - processor model specific performance counter events

This manual page describes events specific to the following Intel CPU models and is derived from Intel's perfmon data. For more information, please consult the Intel Software Developer's Manual or Intel's perfmon website.

CPU models described by this document:

The following events are supported:

Cycles the divider is busy
Divide Operations executed
Multiply operations executed
BACLEAR asserted with bad target address
BACLEAR asserted, regardless of cause
Instruction queue forced BACLEAR
Early Branch Prediction Unit clears
Late Branch Prediction Unit clears
Branch prediction unit missed call or return
Branch instructions decoded
Branch instructions executed
Conditional branch instructions executed
Unconditional branches executed
Unconditional call branches executed
Indirect call branches executed
Indirect non call branches executed
Call branches executed
All non call branches executed
Indirect return branches executed
Taken branches executed
Retired branch instructions (Precise Event)
Retired conditional branch instructions (Precise Event)
Retired near call instructions (Precise Event)
Mispredicted branches executed
Mispredicted conditional branches executed
Mispredicted unconditional branches executed
Mispredicted non call branches executed
Mispredicted indirect call branches executed
Mispredicted indirect non call branches executed
Mispredicted call branches executed
Mispredicted non call branches executed
Mispredicted return branches executed
Mispredicted taken branches executed
Mispredicted near retired calls (Precise Event)
Cycles L1D locked
Cycles L1D and L2 locked
Reference base clock (133 MHz) cycles when thread is not halted (programmable counter)
Cycles when thread is not halted (programmable counter)
Total CPU cycles
DTLB load misses
DTLB load miss caused by low part of address
DTLB second level hit
DTLB load miss page walks complete
DTLB misses
DTLB first level misses but second level hit
DTLB miss page walks
ES segment renames
X87 Floating point assists (Precise Event)
X87 Floating point assists for invalid input value (Precise Event)
X87 Floating point assists for invalid output value (Precise Event)
MMX Uops
SSE* FP double precision Uops
SSE and SSE2 FP Uops
SSE FP packed Uops
SSE FP scalar Uops
SSE* FP single precision Uops
SSE2 integer Uops
Computational floating-point operations executed
All Floating Point to and from MMX transitions
Transitions from MMX to Floating Point instructions
Transitions from Floating Point to MMX instructions
Any Instruction Length Decoder stall cycles
Instruction Queue full stall cycles
Length Change Prefix stall cycles
Stall cycles due to BPU MRU bypass
Regen stall cycles
Instructions that must be decoded by decoder 0
Cycles instructions are written to the instruction queue
Instructions written to instruction queue
Instructions retired (Programmable counter and Precise Event)
Retired MMX instructions (Precise Event)
Total cycles (Precise Event)
Retired floating-point operations (Precise Event)
I/O transactions
ITLB flushes
Retired instructions that missed the ITLB (Precise Event)
ITLB miss
ITLB miss page walks
L1D cache lines replaced in M state
L1D cache lines allocated in the M state
L1D snoop eviction of cache lines in M state
L1 data cache lines allocated
All references to the L1 data cache
L1 data cacheable reads and writes
L1 data cache read in E state
L1 data cache read in I state (misses)
L1 data cache read in M state
L1 data cache reads
L1 data cache read in S state
L1 data cache load locks in E state
L1 data cache load lock hits
L1 data cache load locks in M state
L1 data cache load locks in S state
L1D load lock accepted in fill buffer
L1D prefetch load lock accepted in fill buffer
L1 data cache stores in E state
L1 data cache stores in M state
L1 data cache stores in S state
L1D hardware prefetch misses
L1D hardware prefetch requests
L1D hardware prefetch requests triggered
L1 writebacks to L2 in E state
L1 writebacks to L2 in I state (misses)
L1 writebacks to L2 in M state
All L1 writebacks to L2
L1 writebacks to L2 in S state
L1I instruction fetch stall cycles
L1I instruction fetch hits
L1I instruction fetch misses
L1I instruction fetches
All L2 data requests
L2 data demand loads in E state
L2 data demand loads in I state (misses)
L2 data demand loads in M state
L2 data demand requests
L2 data demand loads in S state
L2 data prefetches in E state
L2 data prefetches in the I state (misses)
L2 data prefetches in M state
All L2 data prefetches
L2 data prefetches in the S state
L2 lines allocated
L2 lines allocated in the E state
L2 lines allocated in the S state
L2 lines evicted
L2 lines evicted by a demand request
L2 modified lines evicted by a demand request
L2 lines evicted by a prefetch request
L2 modified lines evicted by a prefetch request
L2 instruction fetch hits
L2 instruction fetch misses
L2 instruction fetches
L2 load hits
L2 load misses
L2 requests
All L2 misses
L2 prefetch hits
L2 prefetch misses
All L2 prefetches
All L2 requests
L2 RFO hits
L2 RFO misses
L2 RFO requests
All L2 transactions
L2 fill transactions
L2 instruction fetch transactions
L1D writeback to L2 transactions
L2 load transactions
L2 prefetch transactions
L2 RFO transactions
L2 writeback to LLC transactions
L2 demand lock RFOs in E state
All demand L2 lock RFOs that hit the cache
L2 demand lock RFOs in I state (misses)
L2 demand lock RFOs in M state
All demand L2 lock RFOs
L2 demand lock RFOs in S state
All L2 demand store RFOs that hit the cache
L2 demand store RFOs in I state (misses)
L2 demand store RFOs in M state
All L2 demand store RFOs
L2 demand store RFOs in S state
Large ITLB hit
All loads dispatched
Loads dispatched from the MOB
Loads dispatched that bypass the MOB
Loads dispatched from stage 305
Load operations conflicting with software prefetches
Longest latency cache miss
Longest latency cache reference
Cycles when uops were delivered by the LSD
Cycles no uops were delivered by the LSD
Loops that can't stream from the instruction queue
Cycles machine clear asserted
Execution pipeline restart due to Memory ordering conflicts
Self-Modifying Code detected
Instructions decoded
Macro-fused instructions decoded
Instructions retired which contain a load (Precise Event)
Instructions retired which contain a store (Precise Event)
Retired loads that miss the DTLB (Precise Event)
Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)
Retired loads that hit the L1 data cache (Precise Event)
Retired loads that hit the L2 cache (Precise Event)
Retired loads that miss the LLC cache (Precise Event)
Retired loads that hit valid versions in the LLC cache (Precise Event)
Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)
Retired stores that miss the DTLB (Precise Event)
Load instructions retired with a data source of local DRAM or locally homed remote HITM (Precise Event)
Load instructions retired that HIT modified data in sibling core (Precise Event)
Load instructions retired remote cache HIT data source (Precise Event)
Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)
Load instructions retired IO (Precise Event)
Offcore L1 data cache writebacks
Offcore requests blocked due to Super Queue full
False dependencies due to partial address aliasing
All RAT stall cycles
Flag stall cycles
Partial register stall cycles
ROB read port stall cycles
Scoreboard stall cycles
Resource related stall cycles
FPU control word write stall cycles
Load buffer stall cycles
MXCSR rename stall cycles
Other Resource related stall cycles
ROB full stall cycles
Reservation Station full stall cycles
Store buffer stall cycles
All Store buffer stall cycles
Segment rename stall cycles
128 bit SIMD integer pack operations
128 bit SIMD integer arithmetic operations
128 bit SIMD integer logical operations
128 bit SIMD integer multiply operations
128 bit SIMD integer shift operations
128 bit SIMD integer shuffle/move operations
128 bit SIMD integer unpack operations
SIMD integer 64 bit pack operations
SIMD integer 64 bit arithmetic operations
SIMD integer 64 bit logical operations
SIMD integer 64 bit packed multiply operations
SIMD integer 64 bit shift operations
SIMD integer 64 bit shuffle/move operations
SIMD integer 64 bit unpack operations
Thread responded HIT to snoop
Thread responded HITE to snoop
Thread responded HITM to snoop
Super Queue full stall cycles
Super Queue lock splits across a cache line
SIMD Packed-Double Uops retired (Precise Event)
SIMD Packed-Single Uops retired (Precise Event)
SIMD Scalar-Double Uops retired (Precise Event)
SIMD Scalar-Single Uops retired (Precise Event)
SIMD Vector Integer Uops retired (Precise Event)
Loads delayed with at-Retirement block code
Cacheable loads delayed with L1D block code
Two Uop instructions decoded
Uop unfusions due to FP exceptions
Stack pointer instructions decoded
Stack pointer sync operations
Uops decoded by Microcode Sequencer
Cycles no Uops are decoded
Cycles Uops executed on any port (core count)
Cycles Uops executed on ports 0-4 (core count)
Uops executed on any port (core count)
Uops executed on ports 0-4 (core count)
Cycles no Uops issued on any port (core count)
Cycles no Uops issued on ports 0-4 (core count)
Uops executed on port 0
Uops issued on ports 0, 1 or 5
Cycles no Uops issued on ports 0, 1 or 5
Uops executed on port 1
Uops executed on port 2 (core count)
Uops issued on ports 2, 3 or 4
Uops executed on port 3 (core count)
Uops executed on port 4 (core count)
Uops executed on port 5
Uops issued
Cycles no Uops were issued on any thread
Cycles Uops were issued on either thread
Fused Uops issued
Cycles no Uops were issued
Cycles Uops are being retired
Uops retired (Precise Event)
Macro-fused Uops retired (Precise Event)
Retirement slots used (Precise Event)
Cycles Uops are not retiring (Precise Event)
Total cycles using precise uop retired event (Precise Event)
Total cycles (Precise Event)

cpc(3CPC)

https://download.01.org/perfmon/index/

June 18, 2018 OmniOS