Lines Matching +full:per +full:- +full:rate

3 "BriefDescription": "C1 residency percent per core",
4 "MetricExpr": "cstate_core@c1\\-residency@ / TSC",
10 "BriefDescription": "C2 residency percent per package",
11 "MetricExpr": "cstate_pkg@c2\\-residency@ / TSC",
17 "BriefDescription": "C6 residency percent per core",
18 "MetricExpr": "cstate_core@c6\\-residency@ / TSC",
24 "BriefDescription": "C6 residency percent per package",
25 "MetricExpr": "cstate_pkg@c6\\-residency@ / TSC",
31 "BriefDescription": "Uncore frequency per die [GHZ]",
37 …"BriefDescription": "Cycles per instruction retired; indicating how much time each executed instru…
282 …"BriefDescription": "Uops delivered from legacy decode pipeline (Micro-instruction Translation Eng…
313 "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
339 …sible; which incur a few cycles load re-issue. However; the short re-issue duration is often hidde…
356 …er-cases for operations that cannot be handled natively by the execution pipeline. For example; wh…
362 …"MetricExpr": "topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-ret…
367 …-of-order scheduler dispatches ready uops into their respective execution units; and once complete…
373 "MetricExpr": "max(1 - (tma_frontend_bound + tma_backend_bound + tma_retiring), 0)",
378 …s for which the issue-pipeline was blocked due to recovery from earlier incorrect speculation. For…
382 …of instruction fetch related bottlenecks by large code footprint programs (i-side cache; TLB and B…
389 …"BriefDescription": "Total pipeline cost of instructions used for program control-flow - a subset …
394 …"PublicDescription": "Total pipeline cost of instructions used for program control-flow - a subset…
397 …"BriefDescription": "Total pipeline cost of external Memory- or Cache-Bandwidth related bottleneck…
402 …"PublicDescription": "Total pipeline cost of external Memory- or Cache-Bandwidth related bottlenec…
405 …"BriefDescription": "Total pipeline cost of external Memory- or Cache-Latency related bottlenecks",
410 …"PublicDescription": "Total pipeline cost of external Memory- or Cache-Latency related bottlenecks…
413 … "BriefDescription": "Total pipeline cost when the execution is compute-bound - an estimation",
418 …ine cost when the execution is compute-bound - an estimation. Covers Core Bound when High ILP as w…
421 …tch bandwidth related bottlenecks (when the front-end could not sustain operations delivery to the…
422 …- (1 - 10 * tma_microcode_sequencer * tma_other_mispredicts / tma_branch_mispredicts) * tma_fetch_…
433 …"PublicDescription": "Total pipeline cost of irregular execution (e.g. FP-assists in HPC, Wait tim…
436 …ription": "Total pipeline cost of Memory Address Translation related bottlenecks (data-side TLBs)",
441 …"Total pipeline cost of Memory Address Translation related bottlenecks (data-side TLBs). Related m…
445 …t_stores + tma_store_latency + tma_streaming_stores - tma_store_latency)) + tma_machine_clears * (…
453 …"MetricExpr": "100 * (1 - 10 * tma_microcode_sequencer * tma_other_mispredicts / tma_branch_mispre…
460 "BriefDescription": "Total pipeline cost of remaining bottlenecks in the back-end",
461 …"MetricExpr": "100 - (tma_bottleneck_big_code + tma_bottleneck_instruction_fetch_bw + tma_bottlene…
465 …aining bottlenecks in the back-end. Examples include data-dependencies (Core Bound when Low ILP) a…
468 …"BriefDescription": "Total pipeline cost of \"useful operations\" - the portion of Retiring catego…
469 … "100 * (tma_retiring - (BR_INST_RETIRED.ALL_BRANCHES + 2 * BR_INST_RETIRED.NEAR_CALL + INST_RETIR…
489 …etched from an incorrectly speculated program path; or stalls when the out-of-order part of the ma…
498 … corrected path; following all sorts of miss-predicted branches. For example; branchy code with lo…
503 "MetricExpr": "max(0, tma_microcode_sequencer - tma_assists)",
507 … as in the case of read-modify-write as an example. Since these instructions require multiple uops…
512 …"MetricExpr": "(1 - BR_MISP_RETIRED.ALL_BRANCHES / (BR_MISP_RETIRED.ALL_BRANCHES + MACHINE_CLEARS.…
521 "MetricExpr": "max(0, tma_icache_misses - tma_code_l2_miss)",
536 …irst level) ITLB was missed by instructions fetches, that later on hit in second-level TLB (STLB)",
537 "MetricExpr": "max(0, tma_itlb_misses - tma_code_stlb_miss)",
544 …"BriefDescription": "This metric estimates the fraction of cycles where the Second-level TLB (STLB…
578 …"BriefDescription": "This metric represents fraction of slots where Core non-memory issues were of…
579 "MetricExpr": "max(0, tma_backend_bound - tma_memory_bound)",
584 …-memory issues were of a bottleneck. Shortage in hardware compute resources; or dependencies in s…
588 …n of cycles while the memory subsystem was handling synchronizations due to data-sharing accesses",
590 … (MEM_LOAD_L3_HIT_RETIRED.XSNP_HIT + MEM_LOAD_L3_HIT_RETIRED.XSNP_HITM * (1 - OCR.DEMAND_DATA_RD.L…
594 … cycles while the memory subsystem was handling synchronizations due to data-sharing accesses. Dat…
598 …"BriefDescription": "This metric represents fraction of cycles where decoder-0 was the only active…
599 …"MetricExpr": "(cpu@INST_DECODED.DECODERS\\,cmask\\=1@ - cpu@INST_DECODED.DECODERS\\,cmask\\=2@) /…
603 …"PublicDescription": "This metric represents fraction of cycles where decoder-0 was the only activ…
618 …o_thread_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_thread…
627 "MetricExpr": "(IDQ.DSB_CYCLES_ANY - IDQ.DSB_CYCLES_OK) / tma_info_core_core_clks / 2",
640 …o switches from DSB to MITE pipelines. The DSB (decoded i-cache) is a Uop Cache where the front-en…
645 …mask\\=1@ + DTLB_LOAD_MISSES.WALK_ACTIVE, max(CYCLE_ACTIVITY.CYCLES_MEM_ANY - CYCLE_ACTIVITY.CYCLE…
649 …-aside Buffers) are processor caches for recently used entries out of the Page Tables that are use…
653 …: "This metric roughly estimates the fraction of cycles spent handling first-level data TLB store …
658 …-level data TLB store misses. As with ordinary data caching; focus on improving data locality and…
667 …hreading hiccup; where multiple Logical Processors contend on different data-elements mapped into …
681 "MetricExpr": "max(0, tma_frontend_bound - tma_fetch_latency)",
691 …"MetricExpr": "(5 * IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE - INT_MISC.UOP_DROPPING) / tma…
696 …he CPU was stalled due to Frontend latency issues. For example; instruction-cache misses; iTLB mi…
701 "MetricExpr": "tma_heavy_operations - tma_microcode_sequencer",
705 …tiring instructions that that are decoder into two or more uops. This highly-correlates with the n…
709 …"BriefDescription": "This metric represents overall arithmetic floating-point (FP) operations frac…
714 …-point (FP) operations fraction the CPU has executed (retired). Note this metric's value may excee…
723 …ts. FP Assist may apply when working with very small floating point values (so-called Denormals).",
727 …"BriefDescription": "This metric represents fraction of cycles where the Floating-Point Divider un…
735 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction …
740 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction…
744 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction …
749 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction…
753 …tric approximates arithmetic FP vector uops fraction the CPU has retired for 128-bit wide vectors",
758 … approximates arithmetic FP vector uops fraction the CPU has retired for 128-bit wide vectors. May…
762 …tric approximates arithmetic FP vector uops fraction the CPU has retired for 256-bit wide vectors",
767 … approximates arithmetic FP vector uops fraction the CPU has retired for 256-bit wide vectors. May…
771 …tric approximates arithmetic FP vector uops fraction the CPU has retired for 512-bit wide vectors",
776 … approximates arithmetic FP vector uops fraction the CPU has retired for 512-bit wide vectors. May…
782 …"MetricExpr": "topdown\\-fe\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-ret…
787 …-lines are fetched from the memory subsystem; parsed into instructions; and lastly decoded into mi…
791 … slots where the CPU was retiring heavy-weight operations -- instructions that require two or more…
792 …"MetricExpr": "tma_microcode_sequencer + tma_retiring * (UOPS_DECODED.DEC0 - cpu@UOPS_DECODED.DEC0…
797 …he CPU was retiring heavy-weight operations -- instructions that require two or more uops or micro
810 …ch Misprediction Cost: Cycles representing fraction of TMA slots wasted per non-speculative branch…
815 …ch Misprediction Cost: Cycles representing fraction of TMA slots wasted per non-speculative branch…
818 …scription": "Instructions per retired Mispredicts for conditional non-taken branches (lower number…
825 …Description": "Instructions per retired Mispredicts for conditional taken branches (lower number m…
832 …scription": "Instructions per retired Mispredicts for indirect CALL or JMP branches (lower number …
839 …BriefDescription": "Instructions per retired Mispredicts for return branches (lower number means h…
846 …ion": "Number of Instructions per non-speculative Branch Misprediction (JEClear) (lower number mea…
859 … "BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
861 …"MetricExpr": "(100 * (1 - tma_core_bound / tma_ports_utilization if tma_core_bound < tma_ports_ut…
867 …"BriefDescription": "Total pipeline cost of DSB (uop cache) hits - subset of the Instruction_Fetch…
872 …"PublicDescription": "Total pipeline cost of DSB (uop cache) hits - subset of the Instruction_Fetc…
875 …"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fet…
881 …"PublicDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fe…
884 …"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bott…
890 …"PublicDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bot…
899 "BriefDescription": "Fraction of branches that are non-taken conditionals",
912 …"MetricExpr": "(BR_INST_RETIRED.NEAR_TAKEN - BR_INST_RETIRED.COND_TAKEN - 2 * BR_INST_RETIRED.NEAR…
918 …"MetricExpr": "1 - (tma_info_branches_cond_nt + tma_info_branches_cond_tk + tma_info_branches_call…
929 "BriefDescription": "Instructions Per Cycle across hyper-threads (per physical core)",
935 "BriefDescription": "uops Executed per Cycle",
941 "BriefDescription": "Floating Point Operations Per Cycle",
947 …"BriefDescription": "Actual per-core usage of the Floating Point non-X87 execution units (regardle…
951 …per-core usage of the Floating Point non-X87 execution units (regardless of precision or vector-wi…
954 …efDescription": "Instruction-Level-Parallelism (average number of uops executed when there is exec…
968 …tion": "Average number of cycles of a switch from the DSB fetch-unit to MITE fetch unit - see DSB_…
974 "BriefDescription": "Average number of Uops issued by front-end when it issued something",
986 …"BriefDescription": "Instructions per non-speculative DSB miss (lower number means higher occurren…
993 …Description": "Instructions per speculative Unknown Branch Misprediction (BAClear) (lower number m…
999 "BriefDescription": "L2 cache true code cacheline misses per kilo instruction",
1005 "BriefDescription": "L2 cache speculative code cacheline misses per kilo instruction",
1011 "BriefDescription": "Taken Branches retired Per Cycle",
1017 "BriefDescription": "Branch instructions per taken branch.",
1030 …"BriefDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurre…
1035 …"PublicDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurr…
1038 …riefDescription": "Instructions per FP Arithmetic AVX/SSE 128-bit instruction (lower number means …
1043 …blicDescription": "Instructions per FP Arithmetic AVX/SSE 128-bit instruction (lower number means …
1046 …"BriefDescription": "Instructions per FP Arithmetic AVX* 256-bit instruction (lower number means h…
1051 …PublicDescription": "Instructions per FP Arithmetic AVX* 256-bit instruction (lower number means h…
1054 …"BriefDescription": "Instructions per FP Arithmetic AVX 512-bit instruction (lower number means hi…
1059 …PublicDescription": "Instructions per FP Arithmetic AVX 512-bit instruction (lower number means hi…
1062 …Description": "Instructions per FP Arithmetic Scalar Double-Precision instruction (lower number me…
1067 …Description": "Instructions per FP Arithmetic Scalar Double-Precision instruction (lower number me…
1070 …Description": "Instructions per FP Arithmetic Scalar Single-Precision instruction (lower number me…
1075 …Description": "Instructions per FP Arithmetic Scalar Single-Precision instruction (lower number me…
1078 "BriefDescription": "Instructions per Branch (lower number means higher occurrence rate)",
1085 … "BriefDescription": "Instructions per (near) call (lower number means higher occurrence rate)",
1092 …"BriefDescription": "Instructions per Floating Point (FP) Operation (lower number means higher occ…
1099 "BriefDescription": "Instructions per Load (lower number means higher occurrence rate)",
1106 "BriefDescription": "Instructions per PAUSE (lower number means higher occurrence rate)",
1112 "BriefDescription": "Instructions per Store (lower number means higher occurrence rate)",
1119 …ion": "Instructions per Software prefetch instruction (of any type: NTA/T0/T1/T2/Prefetch) (lower …
1126 "BriefDescription": "Instructions per taken branch",
1131 …"PublicDescription": "Instructions per taken branch. Related metrics: tma_dsb_switches, tma_fetch_…
1134 "BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
1140 "BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
1146 "BriefDescription": "Rate of non silent evictions from the L2 cache per Kilo instruction",
1152 …"BriefDescription": "Rate of silent evictions from the L2 cache per Kilo instruction where the evi…
1158 "BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
1164 "BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
1170 …iption": "Fill Buffer (FB) hits per kilo instructions for retired demand loads (L1D misses that me…
1176 … "BriefDescription": "Average per-thread data fill bandwidth to the L1 data cache [GB / sec]",
1182 "BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
1188 …"BriefDescription": "L1 cache true misses per kilo instruction for all demand loads (including spe…
1194 "BriefDescription": "Average per-thread data fill bandwidth to the L2 cache [GB / sec]",
1200 …"BriefDescription": "L2 cache hits per kilo instruction for all demand loads (including speculati…
1206 "BriefDescription": "L2 cache true misses per kilo instruction for retired demand loads",
1212 …"BriefDescription": "L2 cache ([RKL+] true) misses per kilo instruction for all request types (inc…
1213 …"MetricExpr": "1e3 * (OFFCORE_REQUESTS.ALL_DATA_RD - OFFCORE_REQUESTS.DEMAND_DATA_RD + L2_RQSTS.AL…
1218 …"BriefDescription": "L2 cache ([RKL+] true) misses per kilo instruction for all demand loads (inc…
1224 "BriefDescription": "Offcore requests (L2 cache miss) per kilo instruction for demand RFOs",
1230 "BriefDescription": "Average per-thread data access bandwidth to the L3 cache [GB / sec]",
1236 "BriefDescription": "Average per-thread data fill bandwidth to the L3 cache [GB / sec]",
1242 "BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
1266 …"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core…
1272 "BriefDescription": "\"Bus lock\" per kilo instruction",
1278 "BriefDescription": "Un-cacheable retired load per kilo instruction",
1284 …"BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is…
1288 …ublicDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is …
1291 "BriefDescription": "Rate of L2 HW prefetched lines that were not used by demand accesses",
1298 …ription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-s…
1304 …on": "STLB (2nd level TLB) data load speculative misses per kilo instruction (misses of any page-s…
1317 …n": "STLB (2nd level TLB) data store speculative misses per kilo instruction (misses of any page-s…
1329 "BriefDescription": "Average number of uops fetched from DSB per cycle",
1335 "BriefDescription": "Average number of uops fetched from MITE per cycle",
1341 "BriefDescription": "Instructions per a microcode Assist invocation",
1346 …tion": "Instructions per a microcode Assist invocation. See Assists tree node for details (lower n…
1380 "BriefDescription": "Giga Floating Point Operations Per Second",
1384 …ting Point Operations Per Second. Aggregate across all supported options of: FP precisions, scalar…
1401 …per Far Branch ( Far Branches apply upon transition from application to operating system, handling…
1408 "BriefDescription": "Cycles Per Instruction for the Operating System (OS) Kernel mode",
1425 …to external DRAM memory [in nanoseconds]. Accounts for demand loads and L1/L2 data-read prefetches"
1446 …y (in nanoseconds). Accounts for demand loads and L1/L2 prefetches. ([RKL+]memory-controller only)"
1457 … "MetricExpr": "(power@energy\\-pkg@ * 61 + 15.6 * power@energy\\-ram@) / (duration_time * 1e6)",
1462 …"BriefDescription": "Fraction of Core cycles where the core was running with power-delivery for ba…
1466 …s running with power-delivery for baseline license level 0. This includes non-AVX codes, SSE, AVX…
1469 …"BriefDescription": "Fraction of Core cycles where the core was running with power-delivery for li…
1474 … running with power-delivery for license level 1. This includes high current AVX 256-bit instruct…
1477 …"BriefDescription": "Fraction of Core cycles where the core was running with power-delivery for li…
1482 …e the core was running with power-delivery for license level 2 (introduced in SKX). This includes…
1486 …"MetricExpr": "(1 - CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_DISTRIBUTED if #SMT_…
1516 … "BriefDescription": "Per-Logical Processor actual clocks when the Logical Processor is active.",
1522 "BriefDescription": "Cycles Per Instruction (per Logical Processor)",
1528 "BriefDescription": "The ratio of Executed- by Issued-Uops",
1532 …ion": "The ratio of Executed- by Issued-Uops. Ratio > 1 suggests high rate of uop micro-fusions. R…
1535 "BriefDescription": "Instructions Per Cycle (per Logical Processor)",
1541 …"BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor…
1547 … "BriefDescription": "Fraction of Physical Core issue-slots utilized by this Logical Processor",
1553 "BriefDescription": "Uops Per Instruction",
1560 "BriefDescription": "Uops per taken branch",
1568 "MetricExpr": "tma_divider - tma_fp_divider",
1585 …"MetricExpr": "max((CYCLE_ACTIVITY.STALLS_MEM_ANY - CYCLE_ACTIVITY.STALLS_L1D_MISS) / tma_info_thr…
1589 … TLB. These cases are characterized by execution unit stalls; while some non-completed demand load…
1594 …EM_INST_RETIRED.ALL_LOADS - MEM_LOAD_RETIRED.FB_HIT - MEM_LOAD_RETIRED.L1_MISS) * 20 / 100, max(CY…
1598 … the L1D cache. The short latency of the L1D cache may be exposed in pointer-chasing memory access…
1604 …1_MISS) + L1D_PEND_MISS.FB_FULL_PERIODS) * ((CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALL…
1623 …"MetricExpr": "(CYCLE_ACTIVITY.STALLS_L2_MISS - CYCLE_ACTIVITY.STALLS_L3_MISS) / tma_info_thread_c…
1649 …slots where the CPU was retiring light-weight operations -- instructions that require no more than…
1650 "MetricExpr": "max(0, tma_retiring - tma_heavy_operations)",
1655 …-weight operations -- instructions that require no more than one uop (micro-operation). This corre…
1668 … the (first level) DTLB was missed by load accesses, that later on hit in second-level TLB (STLB)",
1669 "MetricExpr": "tma_dtlb_load - tma_load_stlb_miss",
1676 …"BriefDescription": "This metric estimates the fraction of cycles where the Second-level TLB (STLB…
1719 …"MetricExpr": "(16 * max(0, MEM_INST_RETIRED.LOCK_LOADS - L2_RQSTS.ALL_RFO) + MEM_INST_RETIRED.LOC…
1728 "MetricExpr": "max(0, tma_bad_speculation - tma_branch_mispredicts)",
1733 …-of-order portion of the machine needs to recover its state after the clear. For example; this can…
1737 …as likely hurt due to approaching bandwidth limits of external memory - DRAM ([SPR-HBM] and/or HBM…
1742 …- DRAM ([SPR-HBM] and/or HBM). The underlying heuristic assumes that a similar off-core traffic i…
1746 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
1747 …EAD, OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD) / tma_info_thread_clks - tma_mem_bandwidth",
1751 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
1761 …o demand load or store instructions. This accounts mainly for (1) non-completed in-flight memory d…
1765 … represents fraction of slots where the CPU was retiring memory operations -- uops for memory load…
1793 "MetricExpr": "(IDQ.MITE_CYCLES_ANY - IDQ.MITE_CYCLES_OK) / tma_info_core_core_clks / 2",
1797 …the legacy decode pipeline). This pipeline is used for code that was not pre-cached in the DSB or …
1802 …"MetricExpr": "(cpu@IDQ.MITE_UOPS\\,cmask\\=4@ - cpu@IDQ.MITE_UOPS\\,cmask\\=5@) / tma_info_thread…
1809 …n terms of percentage of([SKL+] injected blend uops out of all Uops Issued -- the Count Domain; [A…
1814 …n terms of percentage of([SKL+] injected blend uops out of all Uops Issued -- the Count Domain; [A…
1818 …es in which CPU was likely limited due to the Microcode Sequencer (MS) unit - see Microcode_Sequen…
1831 … Commonly used instructions are optimized for delivery by the DSB (decoded i-cache) or MITE (legac…
1840 …o op) instructions. Compilers often use NOPs for certain address alignments - e.g. start address o…
1844 …is metric represents the remaining light uops fraction the CPU has executed - remaining means not …
1846 …"MetricExpr": "max(0, tma_light_operations - (tma_fp_arith + tma_memory_operations + tma_branch_in…
1850 …is metric represents the remaining light uops fraction the CPU has executed - remaining means not …
1854 …action of slots the CPU was stalled due to other cases of misprediction (non-retired x86 branches …
1855 …"MetricExpr": "max(tma_branch_mispredicts * (1 - BR_MISP_RETIRED.ALL_BRANCHES / (INT_MISC.CLEARS_C…
1863 …"MetricExpr": "max(tma_machine_clears * (1 - MACHINE_CLEARS.MEMORY_ORDERING / MACHINE_CLEARS.COUNT…
1906 … the CPU performance was potentially limited due to Core computation issues (non divider-related)",
1907 … tma_info_thread_clks if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALL…
1911 …-related). Two distinct categories can be attributed into this metric: (1) heavy data-dependency …
1920 …t (Logical Processor cycles since ICL, Physical Core cycles otherwise). Long-latency instructions …
1924 …metric represents fraction of cycles where the CPU executed total of 1 uop per cycle on all execut…
1929 …per cycle on all execution ports (Logical Processor cycles since ICL, Physical Core cycles otherwi…
1933 …": "This metric represents fraction of cycles CPU executed total of 2 uops per cycle on all execut…
1938 …per cycle on all execution ports (Logical Processor cycles since ICL, Physical Core cycles otherwi…
1942 … metric represents fraction of cycles CPU executed total of 3 or more uops per cycle on all execut…
1947 … metric represents fraction of cycles CPU executed total of 3 or more uops per cycle on all execut…
1956 …r sockets including synchronizations issues. This is caused often due to non-optimal NUMA allocati…
1965 …ystem was handling loads from remote memory. This is caused often due to non-optimal NUMA allocati…
1971 …"MetricExpr": "topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retir…
1976 …ions-per-cycle (see IPC metric). Note that a high Retiring value does not necessary mean there is …
1980 …"BriefDescription": "This metric represents fraction of cycles the CPU issue-pipeline was stalled …
1985 …ycles the CPU issue-pipeline was stalled due to serializing operations. Instructions like CPUID; W…
1998 … estimates fraction of cycles handling memory load split accesses - load that cross 64-byte cache …
2003 … estimates fraction of cycles handling memory load split accesses - load that cross 64-byte cache …
2007 "BriefDescription": "This metric represents rate of split store accesses",
2013 …blicDescription": "This metric represents rate of split store accesses. Consider aligning your da…
2017 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
2022 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
2026 … CPU was stalled due to RFO store memory accesses; RFO store issue a read-for-ownership request b…
2031 …ses; RFO store issue a read-for-ownership request before the write. Even though store accesses do …
2041 …perations in the pipeline; a load can avoid waiting for memory if a prior in-flight store is writi…
2046 …"MetricExpr": "(L2_RQSTS.RFO_HIT * 10 * (1 - MEM_INST_RETIRED.LOCK_LOADS / MEM_INST_RETIRED.ALL_ST…
2050 …-of-order core performance; however; holding resources for longer time can lead into undesired imp…
2063 …tion of cycles where the TLB was missed by store accesses, hitting in the second-level TLB (STLB)",
2064 "MetricExpr": "tma_dtlb_store - tma_store_stlb_miss",
2108 …uired by RFO stores. Even though store accesses do not typically stall out-of-order CPUs; there ar…
2131 "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
2138 "MetricExpr": "(cycles\\-t / el\\-start if has_event(el\\-start) else 0)",
2145 "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
2152 "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",