NAME

amd_f17h_zen2_events — AMD Family 17h Zen2 processor performance monitoring events

DESCRIPTION

This manual page describes events specific to AMD Family 17h Zen2 processors. For more information, please consult the appropriate AMD BIOS and Kernel Developer's Guide or Open-Source Register Reference.

Each of the events listed below includes the AMD mnemonic, which matches the name found in the AMD manual, and a brief summary of the event. If available, a more detailed description of the event follows, and then any additional unit values that modify the event. Each unit can be combined with its event to create a new event in the system by placing the '.' character between the event name and the unit name.
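
As a small illustration of the naming rule above, the following Python sketch joins an event name and a unit name with the '.' character. The helper name qualified_event is hypothetical and not part of any tool; how a particular profiler accepts the resulting string is outside the scope of this page.

```python
from typing import Optional


def qualified_event(event: str, unit: Optional[str] = None) -> str:
    """Return 'event.unit', or just 'event' when no unit is given."""
    return event if unit is None else f"{event}.{unit}"


# Unit names taken from the FpRetSseAvxOps entry on this page.
units = ["MacFLOPs", "DivFLOPs", "MultFLOPs", "AddSubFLOPs"]
names = [qualified_event("FpRetSseAvxOps", u) for u in units]
print(names[0])  # FpRetSseAvxOps.MacFLOPs
```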

The following events are supported:

FpRetSseAvxOps

Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs

This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event is a MergeEvent since it can count above 15.

This event has the following units which may be used to modify the behavior of the event:

MacFLOPs: MacFLOPs count as 2 FLOPs. Does not provide a useful count without use of the MergeEvent feature.
DivFLOPs: Divide/square root FLOPs. Does not provide a useful count without use of the MergeEvent feature.
MultFLOPs: Multiply FLOPs. Does not provide a useful count without use of the MergeEvent feature.
AddSubFLOPs: Add/subtract FLOPs. Does not provide a useful count without use of the MergeEvent feature.

FpRetiredSerOps

Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops

The number of serializing Ops retired.

This event has the following units which may be used to modify the behavior of the event:

SseBotRet: SSE bottom-executing uOps retired.
SseCtrlRet: SSE control word mispredict traps due to mispredictions in RC, FTZ or DAZ, or changes in mask bits.
X87BotRet: x87 bottom-executing uOps retired.
X87CtrlRet: x87 control word mispredict traps due to mispredictions in RC or PC, or changes in mask bits.

FpDispFaults

Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults

Floating Point Dispatch Faults.

This event has the following units which may be used to modify the behavior of the event:

YmmSpillFault: YMM Spill fault.
YmmFillFault: YMM Fill fault.
XmmFillFault: XMM Fill fault.
x87FillFault: x87 Fill fault.

LsBadStatus2

Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2

Store To Load Interlock (STLI) events are loads that were unable to complete because of a possible match with an older store, where the older store could not perform Store To Load Forwarding (STLF) for some reason. There are a number of reasons why this occurs, and this event organizes them into three major groups.

This event has the following units which may be used to modify the behavior of the event:

StliOther: Non-forwardable conflict; used to reduce STLIs via software. All reasons. The most common among these is that there is only a partial overlap between the store and the load, for example an 8B store to address A and a 16B load starting at address A. STLF can't be performed in this case because only some of the load's data is coming from the store, so the load gets StliOther. Another StliOther case is a load that hits a non-cacheable store sitting in the non-cacheable buffers (WCBs).
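
The partial-overlap condition described for StliOther can be sketched as a simple address-range check. This is only a model of the byte-range relationship in the example above (an 8B store and a 16B load at the same address); it is an assumption for illustration and does not capture the actual hardware forwarding rules. All function names are hypothetical.

```python
def store_covers_load(st_addr: int, st_size: int, ld_addr: int, ld_size: int) -> bool:
    """True when every byte the load reads was written by the store."""
    return st_addr <= ld_addr and ld_addr + ld_size <= st_addr + st_size


def ranges_overlap(st_addr: int, st_size: int, ld_addr: int, ld_size: int) -> bool:
    """True when the store and the load share at least one byte."""
    return st_addr < ld_addr + ld_size and ld_addr < st_addr + st_size


def is_partial_overlap(st_addr: int, st_size: int, ld_addr: int, ld_size: int) -> bool:
    """True when the load needs some, but not all, of its bytes from the store."""
    return (ranges_overlap(st_addr, st_size, ld_addr, ld_size)
            and not store_covers_load(st_addr, st_size, ld_addr, ld_size))


A = 0x1000  # arbitrary example address
print(is_partial_overlap(A, 8, A, 16))  # True: 8B store, 16B load at A
print(is_partial_overlap(A, 16, A, 8))  # False: the store covers the whole load
```

Software that wants to reduce this case would typically make the store at least as wide as the later load, or read the data back at the width at which it was written.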

LsLocks

Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions

LsRetClClush

Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH Instructions

The number of retired CLFLUSH instructions. This is a non-speculative event.

LsRetCpuid

Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions

The number of CPUID instructions retired.

LsDispatch

Core::X86::Pmc::Core::LsDispatch - LS Dispatch

Counts the number of operations dispatched to the LS unit.

LsSmiRx

Core::X86::Pmc::Core::LsSmiRx - SMIs Received

Counts the number of SMIs received.

LsIntTaken

Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken

Counts the number of interrupts taken.

LsRdTsc

Core::X86::Pmc::Core::LsRdTsc - Time Stamp Counter Reads

Counts the number of reads of the TSC (RDTSC instructions). The count is speculative.

LsSTLF

Core::X86::Pmc::Core::LsSTLF - Store to Load Forward

Number of STLF hits.

LsStCommitCancel2

Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels 2

This event has the following units which may be used to modify the behavior of the event:

StCommitCancelWcbFull: A non-cacheable store and the non-cacheable commit buffer is full.

LsDcAccesses

Core::X86::Pmc::Core::LsDcAccesses - Data Cache Accesses

The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event.

LsMabAlloc

Core::X86::Pmc::Core::LsMabAlloc - DC Miss By Type

This event has the following units which may be used to modify the behavior of the event:

DcPrefetcher
Stores
Loads

LsRefillsFromSys

Core::X86::Pmc::Core::LsRefillsFromSys - Data Cache Refills from System

Demand Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

LS_MABRESP_RMT_DRAM: DRAM or IO from different die.
LS_MABRESP_RMT_CACHE: Hit in cache; Remote CCX and the address's Home Node is on a different die.
LS_MABRESP_LCL_DRAM: DRAM or IO from this thread's die.
LS_MABRESP_LCL_CACHE: Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.
MABRESP_LCL_L2: Local L2 hit.
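
One way to use these units is to estimate how many demand fills came from another die. The sketch below assumes that each unit was counted separately, that the counts do not overlap, and that grouping by the LCL and RMT parts of the unit names is a meaningful split; none of these assumptions is stated by this page. The function name and the sample counts are hypothetical.

```python
from typing import Mapping

# Grouping by unit name; treating these groups as "local" and "remote"
# is an assumption made for this example.
LOCAL_UNITS = ("MABRESP_LCL_L2", "LS_MABRESP_LCL_CACHE", "LS_MABRESP_LCL_DRAM")
REMOTE_UNITS = ("LS_MABRESP_RMT_CACHE", "LS_MABRESP_RMT_DRAM")


def remote_fill_fraction(counts: Mapping[str, int]) -> float:
    """Fraction of fills attributed to the RMT units; 0.0 when nothing was counted."""
    local = sum(counts.get(u, 0) for u in LOCAL_UNITS)
    remote = sum(counts.get(u, 0) for u in REMOTE_UNITS)
    total = local + remote
    return remote / total if total else 0.0


# Made-up counts, one per LsRefillsFromSys unit.
sample = {
    "MABRESP_LCL_L2": 600,
    "LS_MABRESP_LCL_CACHE": 200,
    "LS_MABRESP_LCL_DRAM": 100,
    "LS_MABRESP_RMT_CACHE": 60,
    "LS_MABRESP_RMT_DRAM": 40,
}
print(remote_fill_fraction(sample))  # 0.1
```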

LsL1DTlbMiss

Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Miss

This event has the following units which may be used to modify the behavior of the event:

TlbReload1GL2Miss: DTLB reload to a 1G page that missed in the L2 TLB.
TlbReload2ML2Miss: DTLB reload to a 2M page that missed in the L2 TLB.
TlbReloadCoalescedPageMiss
TlbReload4KL2Miss: DTLB reload to a 4K page that missed in the L2 TLB.
TlbReload1GL2Hit: DTLB reload to a 1G page that hit in the L2 TLB.
TlbReload2ML2Hit: DTLB reload to a 2M page that hit in the L2 TLB.
TlbReloadCoalescedPageHit
TlbReload4KL2Hit: DTLB reload to a 4K page that hit in the L2 TLB.
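
Because the units split L1 DTLB misses into L2 TLB hits and misses, they can be combined into an L2 TLB hit ratio. The following sketch assumes each unit was counted separately and that the unit counts do not overlap, which this page does not state. The function name and the sample counts are hypothetical.

```python
from typing import Mapping


def l2_dtlb_hit_ratio(counts: Mapping[str, int]) -> float:
    """Share of L1 DTLB misses that hit in the L2 TLB; 0.0 when nothing was counted."""
    # Unit names on this page end in 'Hit' or 'Miss'.
    hits = sum(v for k, v in counts.items() if k.endswith("Hit"))
    misses = sum(v for k, v in counts.items() if k.endswith("Miss"))
    total = hits + misses
    return hits / total if total else 0.0


# Made-up counts keyed by LsL1DTlbMiss unit name.
sample = {
    "TlbReload4KL2Hit": 700,
    "TlbReload2ML2Hit": 50,
    "TlbReload4KL2Miss": 200,
    "TlbReload2ML2Miss": 50,
}
print(l2_dtlb_hit_ratio(sample))  # 0.75
```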

LsMisalAccesses

Core::X86::Pmc::Core::LsMisalAccesses - Misaligned loads

LsPrefInstrDisp

Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions Dispatched

Software Prefetch Instructions Dispatched (Speculative).

This event has the following units which may be used to modify the behavior of the event:

PrefetchNTA: PrefetchNTA instruction. See AMD64 Architecture Programmer's Manual Volume 3: Instruction-Set Reference, order# 24594 PREFETCHlevel.
PrefetchW: PrefetchW instruction. See AMD64 Architecture Programmer's Manual Volume 3: Instruction-Set Reference, order# 24594 PREFETCHlevel.
Prefetch: PrefetchT0, T1 and T2 instructions. See AMD64 Architecture Programmer's Manual Volume 3: Instruction-Set Reference, order# 24594 PREFETCHlevel.

LsInefSwPref

Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software Prefetches

The number of software prefetches that did not fetch data outside of the processor core.

This event has the following units which may be used to modify the behavior of the event:

MabMchCnt: Software PREFETCH instruction saw a match on an already-allocated miss request buffer.
DataPipeSwPfDcHit: Software PREFETCH instruction saw a DC hit.

LsSwPfDcFills

Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data Cache Fills

Software Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

LS_MABRESP_RMT_DRAM: DRAM or IO from different die.
LS_MABRESP_RMT_CACHE: Hit in cache; Remote CCX and the address's Home Node is on a different die.
LS_MABRESP_LCL_DRAM: DRAM or IO from this thread's die.
LS_MABRESP_LCL_CACHE: Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.
MABRESP_LCL_L2: Local L2 hit.

LsHwPfDcFills

Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data Cache Fills

Hardware Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

LS_MABRESP_RMT_DRAM: DRAM or IO from different die.
LS_MABRESP_RMT_CACHE: Hit in cache; Remote CCX and the address's Home Node is on a different die.
LS_MABRESP_LCL_DRAM: DRAM or IO from this thread's die.
LS_MABRESP_LCL_CACHE: Hit in cache; local CCX (not Local L2), or Remote CCX and the address's Home Node is on this thread's die.
MABRESP_LCL_L2: Local L2 hit.

LsNotHaltedCyc

Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt

LsTlbFlush

Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes

IcCacheFillL2

Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills from L2

The number of 64-byte instruction cache lines fulfilled from the L2 cache.

IcCacheFillSys

Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache Refills from System

The number of 64-byte instruction cache lines fulfilled from system memory or another cache.

BpL1TlbMissL2TlbHit

Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2 ITLB Hit

The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB.

BpL1TlbMissL2TlbMiss

Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - L1 ITLB Miss, L2 ITLB Miss

The number of instruction fetches that miss in both the L1 and L2 TLBs.

This event has the following units which may be used to modify the behavior of the event:

IF1G: Instruction fetches to a 1 GB page.
IF2M: Instruction fetches to a 2 MB page.
IF4K: Instruction fetches to a 4 KB page.

BpL1BTBCorrect

Core::X86::Pmc::Core::BpL1BTBCorrect - L1 Branch Prediction Overrides Existing Prediction (speculative)

BpL2BTBCorrect

Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction Overrides Existing Prediction (speculative)

BpDynIndPred

Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect Predictions

Indirect Branch Prediction for potential multi-target branch (speculative)

BpDeReDirect

Core::X86::Pmc::Core::BpDeReDirect - Decoder Overrides Existing Branch Prediction (speculative)

BpL1TlbFetchHit

Core::X86::Pmc::Core::BpL1TlbFetchHit - ITLB Instruction Fetch Hits

The number of instruction fetches that hit in the L1 ITLB.

This event has the following units which may be used to modify the behavior of the event:

IF1G: Instruction fetches to a 1 GB page.
IF2M: Instruction fetches to a 2 MB page.
IF4K: Instruction fetches to a 4 KB page.

DeDisUopQueueEmptyDi0

Core::X86::Pmc::Core::DeDisUopQueueEmptyDi0 - Micro-Op Queue Empty

Cycles where the Micro-Op Queue is empty.

DeDisUopsFromDecoder

Core::X86::Pmc::Core::DeDisUopsFromDecoder - UOps Dispatched From Decoder

Ops dispatched from either the decoders, OpCache or both.

This event has the following units which may be used to modify the behavior of the event:

OpCacheDispatched: Count of dispatched Ops from OpCache.
DecoderDispatched: Count of dispatched Ops from Decoder.
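
The two units can be compared to see what share of dispatched Ops came from the OpCache. This is a sketch that assumes the two unit counts were collected separately over the same interval; the function name and the sample counts are hypothetical.

```python
def op_cache_share(op_cache_dispatched: int, decoder_dispatched: int) -> float:
    """Fraction of dispatched Ops that came from the OpCache; 0.0 when both are zero."""
    total = op_cache_dispatched + decoder_dispatched
    return op_cache_dispatched / total if total else 0.0


# Made-up counts for DeDisUopsFromDecoder.OpCacheDispatched
# and DeDisUopsFromDecoder.DecoderDispatched.
print(op_cache_share(300, 100))  # 0.75
```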

DeDisDispatchTokenStalls1

Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch Resource Stall Cycles 1

Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall.

This event has the following units which may be used to modify the behavior of the event:

FPMiscRsrcStall: FP Miscellaneous resource unavailable. Applies to the recovery of mispredicts with FP ops.
FPSchRsrcStall: FP scheduler resource stall. Applies to ops that use the FP scheduler.
FpRegFileRsrcStall: Floating point register file resource stall. Applies to all FP ops that have a destination register.
TakenBrnchBufferRsrc: Taken branch buffer resource stall.
IntSchedulerMiscRsrcStall: Integer Scheduler miscellaneous resource stall.
StoreQueueRsrcStall: Store Queue resource stall. Applies to all ops with store semantics.
LoadQueueRsrcStall: Load Queue resource stall. Applies to all ops with load semantics.
IntPhyRegFileRsrcStall: Integer Physical Register File resource stall. Applies to all ops that have an integer destination register.

DeDisDispatchTokenStalls0

Core::X86::Pmc::Core::DeDisDispatchTokenStalls0 - Dispatch Resource Stall Cycles 0

Cycles where a dispatch group is valid but does not get dispatched due to a token stall.

This event has the following units which may be used to modify the behavior of the event:

ScAguDispatchStall: SC AGU dispatch stall.
RetireTokenStall: RETIRE Tokens unavailable.
AGSQTokenStall: AGSQ Tokens unavailable.
ALUTokenStall: ALU tokens total unavailable.
ALSQ3_0_TokenStall
ALSQ2RsrcStall: ALSQ 2 Resources unavailable.
ALSQ1RsrcStall: ALSQ 1 Resources unavailable.

ExRetInstr

Core::X86::Pmc::Core::ExRetInstr - Retired Instructions

ExRetCops

Core::X86::Pmc::Core::ExRetCops - Retired Uops

The number of micro-ops retired. This count includes all processor activity (instructions, exceptions, interrupts, microcode assists, etc.). The number of events logged per cycle can vary from 0 to 8.

ExRetBrn

Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions

The number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts.

ExRetBrnMisp

Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch Instructions Mispredicted

The number of branch instructions retired, of any type, that were not correctly predicted. This includes those for which prediction is not attempted (far control transfers, exceptions and interrupts).
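
ExRetBrnMisp is commonly read against ExRetBrn and ExRetInstr. The sketch below derives a misprediction rate and a mispredictions-per-thousand-instructions figure; these are conventional derived metrics rather than anything defined by this page, and the function names and sample counts are hypothetical. Since both branch events include exceptions and interrupts, the rate covers those as well.

```python
def mispredict_rate(ex_ret_brn_misp: int, ex_ret_brn: int) -> float:
    """Mispredicted retired branches divided by all retired branches."""
    return ex_ret_brn_misp / ex_ret_brn if ex_ret_brn else 0.0


def mispredicts_per_kilo_instr(ex_ret_brn_misp: int, ex_ret_instr: int) -> float:
    """Mispredicted retired branches per 1000 retired instructions."""
    return 1000 * ex_ret_brn_misp / ex_ret_instr if ex_ret_instr else 0.0


# Made-up counts for ExRetBrnMisp, ExRetBrn and ExRetInstr.
print(mispredict_rate(5, 1000))              # 0.005
print(mispredicts_per_kilo_instr(5, 10000))  # 0.5
```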

ExRetBrnTkn

Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch Instructions

The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts.

ExRetBrnTknMisp

Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch Instructions Mispredicted

The number of retired taken branch instructions that were mispredicted.

ExRetBrnFar

Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control Transfers

The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction.

ExRetNearRet

Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns

The number of near return instructions (RET or RET Iw) retired.

ExRetNearRetMispred

Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near Returns Mispredicted

The number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction.

ExRetBrnIndMisp

Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch Instructions Mispredicted

The number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Note that only EX mispredicts are counted.

ExRetMmxFpInstr

Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP Instructions

The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPs.

This event has the following units which may be used to modify the behavior of the event:

SseInstr: SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).
MmxInstr: MMX instructions.
X87Instr: x87 instructions.

ExRetCond

Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch Instructions

ExDivBusy

Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count

ExDivCount

Core::X86::Pmc::Core::ExDivCount - Div Op Count

ExTaggedIbsOps

Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops

This event has the following units which may be used to modify the behavior of the event:

IbsCountRollover: Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired.
IbsTaggedOpsRet: Number of Ops tagged by IBS that retired.
IbsTaggedOps: Number of Ops tagged by IBS.

ExRetFusBrnchInst

Core::X86::Pmc::Core::ExRetFusBrnchInst - Retired Fused Branch Instructions

The number of fused branch instructions retired per cycle. The number of events logged per cycle can vary from 0 to 8.

L2RequestG1

Core::X86::Pmc::Core::L2RequestG1 - Requests to L2 Group1

All L2 Cache Requests (Breakdown 1 - Common).

This event has the following units which may be used to modify the behavior of the event:

RdBlkL: Data Cache Reads (including hardware and software prefetch).
RdBlkX: Data Cache Stores.
LsRdBlkC_S: Data Cache Shared Reads.
CacheableIcRead: Instruction Cache Reads.
ChangeToX: Data Cache State Change Requests. Request change to writable, check L2 for current state.
PrefetchL2Cmd
L2HwPf: L2 Prefetcher. All prefetches accepted by L2 pipeline, hit or miss. Types of PF and L2 hit/miss broken out in a separate perfmon event.
Group2: Miscellaneous events covered in more detail by Core::X86::Pmc::Core::L2RequestG2 (PMCx061).

L2RequestG2

Core::X86::Pmc::Core::L2RequestG2 - Requests to L2 Group2

All L2 Cache Requests (Breakdown 2 - Rare).

This event has the following units which may be used to modify the behavior of the event:

Group1: Miscellaneous events covered in more detail by Core::X86::Pmc::Core::L2RequestG1 (PMCx060).
LsRdSized: Data cache read sized.
LsRdSizedNC: Data cache read sized non-cacheable.
IcRdSized: Instruction cache read sized.
IcRdSizedNC: Instruction cache read sized non-cacheable.
SmcInval: Self-modifying code invalidates.
BusLocksOriginator: Bus locks.
BusLocksResponses: Bus Lock Response.

L2CacheReqStat

Core::X86::Pmc::Core::L2CacheReqStat - Core to L2 Cacheable Request Access Status

L2 Cache Request Outcomes (not including L2 Prefetch).

This event has the following units which may be used to modify the behavior of the event:

LsRdBlkCS: Data Cache Shared Read Hit in L2.
LsRdBlkLHitX: Data Cache Read Hit in L2.
LsRdBlkLHitS: Data Cache Read Hit on Shared Line in L2.
LsRdBlkX: Data Cache Store or State Change Hit in L2.
LsRdBlkC: Data Cache Req Miss in L2 (all types).
IcFillHitX: Instruction Cache Hit Modifiable Line in L2.
IcFillHitS: Instruction Cache Hit Clean Line in L2.
IcFillMiss: Instruction Cache Req Miss in L2.

L2PfHitL2

Core::X86::Pmc::Core::L2PfHitL2 - L2 Prefetch Hit in L2

L2PfMissL2HitL2

Core::X86::Pmc::Core::L2PfMissL2HitL2 - L2 Prefetcher Hits in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.

L2PfMissL2L3

Core::X86::Pmc::Core::L2PfMissL2L3 - L2 Prefetcher Misses in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.
