NAME

amd_f19h_zen3_events — AMD Family 19h Zen3 processor performance monitoring events

DESCRIPTION

This manual page describes events specific to AMD Family 19h Zen3 processors. For more information, please consult the appropriate AMD BIOS and Kernel Developer's Guide or Open-Source Register Reference.

Each of the events listed below includes the AMD mnemonic which matches the name found in the AMD manual and a brief summary of the event. If available, a more detailed description of the event follows and then any additional unit values that modify the event. Each unit can be combined to create a new event in the system by placing the '.' character between the event name and the unit name.
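
As a minimal sketch of the naming rule above, the following Python helper joins an event name and a unit name with the '.' character. The helper name qualified_event is hypothetical and does not belong to any tool or library; it only illustrates the resulting string form.

```python
def qualified_event(event: str, unit: str) -> str:
    """Join an event name and a unit name with '.' (illustrative helper only)."""
    return f"{event}.{unit}"


# Event and unit names are taken from the list in this manual page.
name = qualified_event("FpRetSseAvxOps", "MacFLOPs")
print(name)
```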

The following events are supported:

FpRetSseAvxOps

Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs

This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.

This event has the following units which may be used to modify the behavior of the event:

MacFLOPs: Multiply-Accumulate FLOPs. Each MAC operation is counted as 2 FLOPs.
DivFLOPs: Divide/square root FLOPs.
MultFLOPs: Multiply FLOPs.
AddSubFLOPs: Add/subtract FLOPs.
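
The arithmetic implied by the units above can be sketched as follows. The function names and the sample counts are hypothetical, and the total assumes that the four units count disjoint classes of operations, which this page does not state explicitly.

```python
def mac_operations(mac_flops: int) -> int:
    """Each MAC operation is counted as 2 FLOPs, so halve the MacFLOPs count."""
    return mac_flops // 2


def total_flops(counts: dict) -> int:
    """Sum the four unit counts (assumes the units count disjoint op classes)."""
    units = ("MacFLOPs", "DivFLOPs", "MultFLOPs", "AddSubFLOPs")
    return sum(counts.get(unit, 0) for unit in units)


# Made-up counter readings, one per unit.
sample = {"MacFLOPs": 2000, "DivFLOPs": 10, "MultFLOPs": 300, "AddSubFLOPs": 500}
print(mac_operations(sample["MacFLOPs"]), total_flops(sample))
```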

FpRetiredSerOps

Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops

The number of serializing Ops retired.

This event has the following units which may be used to modify the behavior of the event:

SseBotRet: SSE/AVX bottom-executing ops retired.
SseCtrlRet: SSE/AVX control word mispredict traps.
X87BotRet: x87 bottom-executing ops retired.
X87CtrlRet: x87 control word mispredict traps due to mispredictions in RC or PC, or changes in Exception Mask bits.

FpDispFaults

Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults

Floating Point Dispatch Faults.

This event has the following units which may be used to modify the behavior of the event:

YmmSpillFault: YMM Spill fault.
YmmFillFault: YMM Fill fault.
XmmFillFault: XMM Fill fault.
x87FillFault: x87 Fill fault.

LsBadStatus2

Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2

This event has the following units which may be used to modify the behavior of the event:

StliOther: Store-to-load conflicts: A load was unable to complete due to a non-forwardable conflict with an older store. Most commonly, a load's address range partially but not completely overlaps with an uncompleted older store. Software can avoid this problem by using same-size and same-alignment loads and stores when accessing the same data. Vector/SIMD code is particularly susceptible to this problem; software should construct wide vector stores by manipulating vector elements in registers using shuffle/blend/swap instructions prior to storing to memory, instead of using narrow element-by-element stores.
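
The most common case described above, a load that partially but not completely overlaps an older store, can be sketched as a simple range check. This is a simplified model under stated assumptions: the function name is hypothetical, addresses are plain integers, and the real forwarding rules of the hardware are not documented here and may differ.

```python
def partially_overlaps(load_addr: int, load_size: int,
                       store_addr: int, store_size: int) -> bool:
    """True if the load touches the store's bytes but is not fully covered by it."""
    load_end = load_addr + load_size
    store_end = store_addr + store_size
    overlaps = load_addr < store_end and store_addr < load_end
    contained = store_addr <= load_addr and load_end <= store_end
    return overlaps and not contained


# A 4-byte load after a 1-byte store to its first byte: partial overlap.
print(partially_overlaps(0x1000, 4, 0x1000, 1))
# A 4-byte load fully inside an older 8-byte store: not a partial overlap.
print(partially_overlaps(0x1004, 4, 0x1000, 8))
```

The first case mirrors the narrow element-by-element stores that the text advises against; the second mirrors same-size or wider stores covering the load.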

LsLocks

Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions

This event has the following units which may be used to modify the behavior of the event:

BusLock: Comparable to legacy bus lock.

LsRetClClush

Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH Instructions

The number of retired CLFLUSH instructions. This is a non-speculative event.

LsRetCpuid

Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions

The number of CPUID instructions retired.

LsDispatch

Core::X86::Pmc::Core::LsDispatch - LS Dispatch

Counts the number of operations dispatched to the LS unit.

LsSmiRx

Core::X86::Pmc::Core::LsSmiRx - SMIs Received

Counts the number of SMIs received.

LsIntTaken

Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken

Counts the number of interrupts taken.

LsSTLF

Core::X86::Pmc::Core::LsSTLF - Store to Load Forward

Number of STLF hits.

LsStCommitCancel2

Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels 2

This event has the following units which may be used to modify the behavior of the event:

StCommitCancelWcbFull: A non-cacheable store and the non-cacheable commit buffer is full.

LsMabAlloc

Core::X86::Pmc::Core::LsMabAlloc - LS MAB Allocates by Type

Counts when an LS pipe allocates a MAB entry.

LsDmndFillsFromSys

Core::X86::Pmc::Core::LsDmndFillsFromSys - Demand Data Cache Fills by Data Source

Demand Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

MemIoRemote: From DRAM or IO connected in different Node.
ExtCacheRemote: From CCX Cache in different Node.
MemIoLocal: From DRAM or IO connected in same node.
ExtCacheLocal: From cache of different CCX in same node.
IntCache: From L3 or different L2 in same CCX.
LclL2: From Local L2 to the core.
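
One way to use the per-source units above is to estimate what share of demand fills came from a different node. The sketch below assumes each unit was counted separately and that the units do not overlap; the function name and the sample counts are hypothetical.

```python
def remote_fill_fraction(counts: dict) -> float:
    """Fraction of fills sourced from a different node (remote DRAM/IO or cache)."""
    remote = counts.get("MemIoRemote", 0) + counts.get("ExtCacheRemote", 0)
    total = sum(counts.values())
    return remote / total if total else 0.0


# Made-up counter readings, one per unit of LsDmndFillsFromSys.
sample_fills = {
    "MemIoRemote": 50,
    "ExtCacheRemote": 50,
    "MemIoLocal": 400,
    "ExtCacheLocal": 100,
    "IntCache": 300,
    "LclL2": 100,
}
print(remote_fill_fraction(sample_fills))
```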

LsAnyFillsFromSys

Core::X86::Pmc::Core::LsAnyFillsFromSys - Any Data Cache Fills by Data Source

Any Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

MemIoRemote: From DRAM or IO connected in different Node.
ExtCacheRemote: From CCX Cache in different Node.
MemIoLocal: From DRAM or IO connected in same node.
ExtCacheLocal: From cache of different CCX in same node.
IntCache: From L3 or different L2 in same CCX.
LclL2: From Local L2 to the core.

LsL1DTlbMiss

Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Misses

This event has the following units which may be used to modify the behavior of the event:

TlbReload1GL2Miss: DTLB reload to a 1G page that also missed in the L2 TLB.
TlbReload2ML2Miss: DTLB reload to a 2M page that also missed in the L2 TLB.
TlbReloadCoalescedPageMiss: DTLB reload to a coalesced page that also missed in the L2 TLB.
TlbReload4KL2Miss: DTLB reload to a 4K page that also missed in the L2 TLB.
TlbReload1GL2Hit: DTLB reload to a 1G page that hit in the L2 TLB.
TlbReload2ML2Hit: DTLB reload to a 2M page that hit in the L2 TLB.
TlbReloadCoalescedPageHit: DTLB reload to a coalesced page that hit in the L2 TLB.
TlbReload4KL2Hit: DTLB reload to a 4K page that hit in the L2 TLB.
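
Because the units above split L1 DTLB misses into L2 TLB hits and L2 TLB misses, a hit ratio can be derived from them. The following is a sketch only: the function name and sample counts are hypothetical, and units are classified by whether their name ends in "Hit".

```python
def l2_tlb_hit_ratio(counts: dict) -> float:
    """Share of L1 DTLB misses that hit in the L2 TLB, from per-unit counts."""
    hits = sum(value for unit, value in counts.items() if unit.endswith("Hit"))
    total = sum(counts.values())
    return hits / total if total else 0.0


# Made-up counter readings for a few LsL1DTlbMiss units.
sample_tlb = {
    "TlbReload4KL2Hit": 900,
    "TlbReload2ML2Hit": 50,
    "TlbReload4KL2Miss": 40,
    "TlbReload2ML2Miss": 10,
}
print(l2_tlb_hit_ratio(sample_tlb))
```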

LsMisalLoads

Core::X86::Pmc::Core::LsMisalLoads - Misaligned loads

This event has the following units which may be used to modify the behavior of the event:

MA4K: The number of 4KB misaligned (i.e., page crossing) loads.
MA64: The number of 64B misaligned (i.e., cacheline crossing) loads.
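
The two kinds of misalignment above can be illustrated with a boundary-crossing check. This is a minimal sketch: the function name is hypothetical, and it assumes a positive access size and a flat integer address.

```python
def crosses_boundary(addr: int, size: int, boundary: int) -> bool:
    """True if the access [addr, addr + size) spans two boundary-aligned blocks."""
    return (addr // boundary) != ((addr + size - 1) // boundary)


# An 8-byte load at 0x103C crosses a 64B cacheline but not a 4KB page.
print(crosses_boundary(0x103C, 8, 64), crosses_boundary(0x103C, 8, 4096))
# An 8-byte load at 0x1FFC crosses both a cacheline and a page.
print(crosses_boundary(0x1FFC, 8, 64), crosses_boundary(0x1FFC, 8, 4096))
```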

LsPrefInstrDisp

Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions Dispatched

Software Prefetch Instructions Dispatched (Speculative).

This event has the following units which may be used to modify the behavior of the event:

PREFETCHNTA: PrefetchNTA instruction. See docAPM3 PREFETCHlevel.
PREFETCHW: PrefetchW instruction. See docAPM3 PREFETCHW.
PREFETCH: PrefetchT0, T1 and T2 instructions. See docAPM3 PREFETCHlevel.

LsInefSwPref

Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software Prefetches

The number of software prefetches that did not fetch data outside of the processor core.

This event has the following units which may be used to modify the behavior of the event:

MabMchCnt: Software PREFETCH instruction saw a match on an already-allocated miss request buffer.
DataPipeSwPfDcHit: Software PREFETCH instruction saw a DC hit.

LsSwPfDcFills

Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data Cache Fills

Software Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

MemIoRemote: From DRAM or IO connected in different Node.
ExtCacheRemote: From CCX Cache in different Node.
MemIoLocal: From DRAM or IO connected in same node.
ExtCacheLocal: From cache of different CCX in same node.
IntCache: From L3 or different L2 in same CCX.
LclL2: From Local L2 to the core.

LsHwPfDcFills

Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data Cache Fills

Hardware Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the behavior of the event:

MemIoRemote: From DRAM or IO connected in different Node.
ExtCacheRemote: From CCX Cache in different Node.
MemIoLocal: From DRAM or IO connected in same node.
ExtCacheLocal: From cache of different CCX in same node.
IntCache: From L3 or different L2 in same CCX.
LclL2: From Local L2 to the core.

LsAllocMabCount

Core::X86::Pmc::Core::LsAllocMabCount - Count of Allocated Mabs

This event counts the in-flight L1 data cache misses (allocated Miss Address Buffers) divided by 4 and rounded down each cycle unless used with the MergeEvent functionality. If the MergeEvent is used, it counts the exact number of outstanding L1 data cache misses. See 2.1.17.3 [Large Increment per Cycle Events].
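
The difference between the two counting modes described above can be sketched as follows. The function names are hypothetical; the only behavior modeled is the divide-by-4, round-down rule stated in the text.

```python
def increment_without_merge(in_flight_misses: int) -> int:
    """Per-cycle increment without MergeEvent: in-flight misses / 4, rounded down."""
    return in_flight_misses // 4


def increment_with_merge(in_flight_misses: int) -> int:
    """Per-cycle increment with MergeEvent: the exact number of in-flight misses."""
    return in_flight_misses


# Compare both modes for a few per-cycle occupancy values.
for misses in (3, 7, 8):
    print(misses, increment_without_merge(misses), increment_with_merge(misses))
```

Under this rule, cycles with fewer than four outstanding misses contribute nothing to the count unless the MergeEvent is used.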

LsNotHaltedCyc

Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt

LsTlbFlush

Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes

Requires unit mask 0xFF to engage event for counting.

IcCacheFillL2

Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills from L2

The number of 64-byte instruction cache lines fulfilled from the L2 cache.

IcCacheFillSys

Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache Refills from System

The number of 64-byte instruction cache lines fulfilled from system memory or another cache.

BpL1TlbMissL2TlbHit

Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2 ITLB Hit

The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB.

BpL1TlbMissL2TlbMiss

Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - ITLB Reload from Page-Table walk

The number of valid fills into the ITLB originating from the LS Page-Table Walker. Tablewalk requests are issued for L1-ITLB and L2-ITLB misses.

This event has the following units which may be used to modify the behavior of the event:

Coalesced4K: Walk for >4K Coalesced page.
IF1G: Walk for 1G page.
IF2M: Walk for 2M page.
IF4K: Walk for 4K page.

BpL2BTBCorrect

Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction Overrides Existing Prediction (speculative)

BpDynIndPred

Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect Predictions

The number of times a branch used the indirect predictor to make a prediction.

BpDeReDirect

Core::X86::Pmc::Core::BpDeReDirect - Decode Redirects

The number of times the instruction decoder overrides the predicted target.

BpL1TlbFetchHit

Core::X86::Pmc::Core::BpL1TlbFetchHit - L1 TLB Hits for Instruction Fetch

The number of instruction fetches that hit in the L1 ITLB.

This event has the following units which may be used to modify the behavior of the event:

IF1G: L1 Instruction TLB hit (1G page size).
IF2M: L1 Instruction TLB hit (2M page size).
IF4K: L1 Instruction TLB hit (4K or 16K page size).

IcTagHitMiss

Core::X86::Pmc::Core::IcTagHitMiss - IC Tag Hit/Miss Events

Counts various IC tag related hit and miss events.

OpCacheHitMiss

Core::X86::Pmc::Core::OpCacheHitMiss - Op Cache Hit/Miss

Counts Op Cache micro-tag hit/miss events.

DeSrcOpDisp

Core::X86::Pmc::Core::DeSrcOpDisp - Source of Op Dispatched From Decoder

Counts the number of ops dispatched from the decoder classified by op source. See docRevG erratum #1287.

This event has the following units which may be used to modify the behavior of the event:

OpCache: Count of ops fetched from Op Cache and dispatched.
x86Decoder: Count of ops fetched from Instruction Cache and dispatched.
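
A derived figure that follows from the two units above is the share of dispatched ops that came from the Op Cache. The sketch below uses a hypothetical function name and made-up counts, and it does not account for the erratum referenced in the event description.

```python
def op_cache_share(op_cache_ops: int, x86_decoder_ops: int) -> float:
    """Fraction of dispatched ops that were fetched from the Op Cache."""
    total = op_cache_ops + x86_decoder_ops
    return op_cache_ops / total if total else 0.0


# Made-up readings of DeSrcOpDisp.OpCache and DeSrcOpDisp.x86Decoder.
print(op_cache_share(750, 250))
```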

DeDisCopsFromDecoder

Core::X86::Pmc::Core::DeDisCopsFromDecoder - Types of Ops Dispatched From Decoder

Counts the number of ops dispatched from the decoder classified by op type. The UnitMask value encodes which types of ops are counted.

DeDisDispatchTokenStalls1

Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch Resource Stall Cycles 1

Cycles where a dispatch group is valid but does not get dispatched due to a Token Stall. Also counts cycles when the thread is not selected to dispatch but would have been stalled due to a Token Stall.

This event has the following units which may be used to modify the behavior of the event:

FpFlushRecoveryStall: FP Flush recovery stall.
FPSchRsrcStall: FP scheduler resource stall. Applies to ops that use the FP scheduler.
FpRegFileRsrcStall: Floating point register file resource stall. Applies to all FP ops that have a destination register.
TakenBrnchBufferRsrc: Taken branch buffer resource stall.
StoreQueueRsrcStall: Store Queue resource stall. Applies to all ops with store semantics.
LoadQueueRsrcStall: Load Queue resource stall. Applies to all ops with load semantics.
IntPhyRegFileRsrcStall: Integer Physical Register File resource stall. Applies to all ops that have an integer destination register.

DeDisDispatchTokenStalls2

Core::X86::Pmc::Core::DeDisDispatchTokenStalls2 - Dynamic Tokens Dispatch Stall Cycles 2

Cycles where a dispatch group is valid but does not get dispatched due to a token stall.

This event has the following units which may be used to modify the behavior of the event:

RetireTokenStall: Insufficient Retire Queue tokens available.
IntSch3TokenStall: No tokens for Integer Scheduler Queue 3 available.
IntSch2TokenStall: No tokens for Integer Scheduler Queue 2 available.
IntSch1TokenStall: No tokens for Integer Scheduler Queue 1 available.
IntSch0TokenStall: No tokens for Integer Scheduler Queue 0 available.

ExRetInstr

Core::X86::Pmc::Core::ExRetInstr - Retired Instructions

The number of instructions retired.
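
A common derived metric is instructions per cycle, which could be computed from this event together with LsNotHaltedCyc. This is a sketch under the assumption that both counters were sampled over the same interval on the same thread; the function name and the sample values are hypothetical.

```python
def instructions_per_cycle(ex_ret_instr: int, ls_not_halted_cyc: int) -> float:
    """Retired instructions divided by cycles not in halt."""
    return ex_ret_instr / ls_not_halted_cyc if ls_not_halted_cyc else 0.0


# Made-up readings of ExRetInstr and LsNotHaltedCyc.
print(instructions_per_cycle(3_000_000, 1_500_000))
```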

ExRetOps

Core::X86::Pmc::Core::ExRetOps - Retired Ops

The number of macro-ops retired.

ExRetBrn

Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions

The number of branch instructions retired. This includes all types of architectural control flow changes, including exceptions and interrupts.

ExRetBrnMisp

Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch Instructions Mispredicted

The number of retired branch instructions that were mispredicted.
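
Combined with ExRetBrn, this event gives a branch misprediction rate. The sketch below assumes both counters cover the same interval; the function name and the sample values are hypothetical.

```python
def branch_mispredict_rate(ex_ret_brn_misp: int, ex_ret_brn: int) -> float:
    """Mispredicted retired branches as a fraction of all retired branches."""
    return ex_ret_brn_misp / ex_ret_brn if ex_ret_brn else 0.0


# Made-up readings of ExRetBrnMisp and ExRetBrn.
print(branch_mispredict_rate(2_500, 100_000))
```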

ExRetBrnTkn

Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch Instructions

The number of taken branches that were retired. This includes all types of architectural control flow changes, including exceptions and interrupts.

ExRetBrnTknMisp

Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch Instructions Mispredicted

The number of retired taken branch instructions that were mispredicted.

ExRetBrnFar

Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control Transfers

The number of far control transfers retired including far call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and interrupts. Far control transfers are not subject to branch prediction.

ExRetNearRet

Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns

The number of near return instructions (RET or RET Iw) retired.

ExRetNearRetMispred

Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near Returns Mispredicted

The number of near returns retired that were not correctly predicted by the return address predictor. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction.

ExRetBrnIndMisp

Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch Instructions Mispredicted

The number of indirect branches retired that were not correctly predicted. Each such mispredict incurs the same penalty as a mispredicted conditional branch instruction. Note that only EX mispredicts are counted.

ExRetMmxFpInstr

Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP Instructions

The number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions it is not suitable for measuring MFLOPs.

This event has the following units which may be used to modify the behavior of the event:

SseInstr: SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41, SSE42, AVX).
MmxInstr: MMX instructions.
X87Instr: x87 instructions.

ExRetIndBrchInstr

Core::X86::Pmc::Core::ExRetIndBrchInstr - Retired Indirect Branch Instructions

The number of indirect branches retired.

ExRetCond

Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch Instructions

ExDivBusy

Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count

ExDivCount

Core::X86::Pmc::Core::ExDivCount - Div Op Count

ExRetMsprdBrnchInstrDirMsmtch

Core::X86::Pmc::Core::ExRetMsprdBrnchInstrDirMsmtch - Retired Mispredicted Branch Instructions due to Direction Mismatch

The number of retired conditional branch instructions that were not correctly predicted because of a branch direction mismatch.

ExTaggedIbsOps

Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops

Counts Op IBS related events.

This event has the following units which may be used to modify the behavior of the event:

IbsCountRollover: Number of times an op could not be tagged by IBS because of a previous tagged op that has not retired.
IbsTaggedOpsRet: Number of Ops tagged by IBS that retired.
IbsTaggedOps: Number of Ops tagged by IBS.

ExRetFusedInstr

Core::X86::Pmc::Core::ExRetFusedInstr - Retired Fused Instructions

Counts retired fused instructions.

L2RequestG1

Core::X86::Pmc::Core::L2RequestG1 - Requests to L2 Group1

All L2 Cache Requests.

This event has the following units which may be used to modify the behavior of the event:

RdBlkL: Data Cache Reads (including hardware and software prefetch).
RdBlkX: Data Cache Stores.
LsRdBlkC_S: Data Cache Shared Reads.
CacheableIcRead: Instruction Cache Reads.
ChangeToX: Data Cache State Change Requests. Request change to writable, check L2 for current state.
PrefetchL2Cmd
L2HwPf: L2 Prefetcher. All prefetches accepted by L2 pipeline, hit or miss. Types of PF and L2 hit/miss broken out in a separate perfmon event.

L2CacheReqStat

Core::X86::Pmc::Core::L2CacheReqStat - Core to L2 Cacheable Request Access Status

L2 Cache Request Outcomes (not including L2 Prefetch).

This event has the following units which may be used to modify the behavior of the event:

LsRdBlkCS: Data Cache Shared Read Hit in L2.
LsRdBlkLHitX: Data Cache Read Hit in L2.
LsRdBlkLHitS: Data Cache Read Hit Non-Modifiable Line in L2.
LsRdBlkX: Data Cache Store or State Change Hit in L2.
LsRdBlkC: Data Cache Req Miss in L2 (all types).
IcFillHitX: Instruction Cache Hit Modifiable Line in L2.
IcFillHitS: Instruction Cache Hit Non-Modifiable Line in L2.
IcFillMiss: Instruction Cache Req Miss in L2.

L2PfHitL2

Core::X86::Pmc::Core::L2PfHitL2 - L2 Prefetch Hit in L2

L2PfMissL2HitL3

Core::X86::Pmc::Core::L2PfMissL2HitL3 - L2 Prefetcher Hits in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.

L2PfMissL2L3

Core::X86::Pmc::Core::L2PfMissL2L3 - L2 Prefetcher Misses in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.
