============================================================
NVIDIA Tegra241 SoC Uncore Performance Monitoring Unit (PMU)
============================================================

The NVIDIA Tegra241 SoC includes various system PMUs to measure key performance
metrics like memory bandwidth, latency, and utilization:

* Scalable Coherency Fabric (SCF)
* NVLink-C2C0
* NVLink-C2C1
* CNVLink
* PCIE

PMU Driver
----------

The PMUs in this document are based on the ARM CoreSight PMU Architecture as
described in the document ARM IHI 0091. Since this is a standard architecture,
the PMUs are managed by a common driver, "arm-cs-arch-pmu". This driver
describes the available events and configuration of each PMU in sysfs. Please
see the sections below to get the sysfs path of each PMU. Like other uncore PMU
drivers, the driver provides a "cpumask" sysfs attribute to show the CPU id used
to handle the PMU event. There is also an "associated_cpus" sysfs attribute,
which contains a list of CPUs associated with the PMU instance.

.. _SCF_PMU_Section:

SCF PMU
-------

The SCF PMU monitors system level cache events, CPU traffic, and
strongly-ordered (SO) PCIE write traffic to local/remote memory. Please see
:ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about the PMU
traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_scf_pmu_<socket-id>.

Example usage:

* Count event id 0x0 in socket 0::

   perf stat -a -e nvidia_scf_pmu_0/event=0x0/

* Count event id 0x0 in socket 1::

   perf stat -a -e nvidia_scf_pmu_1/event=0x0/

NVLink-C2C0 PMU
---------------

The NVLink-C2C0 PMU monitors incoming traffic from a GPU/CPU connected with
NVLink-C2C (Chip-2-Chip) interconnect. The type of traffic captured by this PMU
varies depending on the chip configuration:

* NVIDIA Grace Hopper Superchip: Hopper GPU is connected with Grace SoC.

  In this config, the PMU captures GPU ATS translated or EGM traffic from the GPU.

* NVIDIA Grace CPU Superchip: two Grace CPU SoCs are connected.

  In this config, the PMU captures read and relaxed ordered (RO) writes from
  PCIE devices of the remote SoC.

Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about
the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_nvlink_c2c0_pmu_<socket-id>.

Example usage:

* Count event id 0x0 from the GPU/CPU connected with socket 0::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 1::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_1/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 2::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_2/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 3::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_3/event=0x0/

The NVLink-C2C has two ports that can be connected to one GPU (occupying both
ports) or to two GPUs (one GPU per port). The user can use the "port" bitmap
parameter to select the port(s) to monitor. Each bit represents the port number,
e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. If
"port" is not specified, the PMU monitors both ports.

Example for port filtering:

* Count event id 0x0 from the GPU connected with socket 0 on port 0::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x1/

* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x3/

NVLink-C2C1 PMU
---------------

The NVLink-C2C1 PMU monitors incoming traffic from a GPU connected with
NVLink-C2C (Chip-2-Chip) interconnect. This PMU captures untranslated GPU
traffic, in contrast with the NVLink-C2C0 PMU, which captures ATS translated
traffic. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more
info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_nvlink_c2c1_pmu_<socket-id>.

Example usage:

* Count event id 0x0 from the GPU connected with socket 0::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0/

* Count event id 0x0 from the GPU connected with socket 1::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_1/event=0x0/

* Count event id 0x0 from the GPU connected with socket 2::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_2/event=0x0/

* Count event id 0x0 from the GPU connected with socket 3::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_3/event=0x0/

The NVLink-C2C has two ports that can be connected to one GPU (occupying both
ports) or to two GPUs (one GPU per port). The user can use the "port" bitmap
parameter to select the port(s) to monitor. Each bit represents the port number,
e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. If
"port" is not specified, the PMU monitors both ports.
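The port bitmap arithmetic can be sketched as a small shell helper. This is only an illustration, not part of the driver or of perf: the function names ``pmu_bitmask`` and ``c2c1_event`` are hypothetical, and the script prints the perf command instead of running it, so it can be tried on a machine without these PMUs.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of bit positions (port numbers) into the
# hexadecimal bitmap used by the "port" parameter.
# Callers are expected to pass at least one port number.
pmu_bitmask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%X\n' "$mask"
}

# Hypothetical helper: build the event string for one NVLink-C2C1 PMU instance.
# $1 = socket id, $2 = event id, remaining arguments = port numbers.
c2c1_event() {
    socket=$1
    event=$2
    shift 2
    printf 'nvidia_nvlink_c2c1_pmu_%s/event=%s,port=%s/\n' \
        "$socket" "$event" "$(pmu_bitmask "$@")"
}

# Print (do not run) the perf command for event 0x0, socket 0, ports 0 and 1.
echo "perf stat -a -e $(c2c1_event 0 0x0 0 1)"
```

The same bit arithmetic should carry over to the other bitmap parameters in this document, such as "rem_socket" and "root_port", since each of them also assigns one bit per numbered instance.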

Example for port filtering:

* Count event id 0x0 from the GPU connected with socket 0 on port 0::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x1/

* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x3/

CNVLink PMU
-----------

The CNVLink PMU monitors traffic from GPUs and PCIE devices on remote sockets
to local memory. For PCIE traffic, this PMU captures read and relaxed ordered
(RO) write traffic. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section`
for more info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_cnvlink_pmu_<socket-id>.

Each SoC socket can be connected to one or more sockets via CNVLink. The user can
use the "rem_socket" bitmap parameter to select the remote socket(s) to monitor.
Each bit represents the socket number, e.g. "rem_socket=0xE" corresponds to
socket 1 to 3. If "rem_socket" is not specified, the PMU monitors all remote
sockets.
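As a rough illustration of the "rem_socket" bitmap, the sketch below computes the mask that selects every socket except the local one. The function name ``remote_socket_mask`` and the four-socket system are assumptions made for this example; the valid bits on a real system come from sysfs.

```shell
#!/bin/sh
# Hypothetical helper: bitmap of every socket except the local one.
# $1 = local socket id, $2 = total number of sockets (assumed known).
remote_socket_mask() {
    local_id=$1
    nr_sockets=$2
    all=$(( (1 << nr_sockets) - 1 ))
    printf '0x%X\n' $(( all & ~(1 << local_id) ))
}

# Print the "rem_socket" value for each socket of an assumed 4-socket system.
for s in 0 1 2 3; do
    echo "socket $s: rem_socket=$(remote_socket_mask "$s" 4)"
done
```

For socket 0 this yields 0xE, which selects remote sockets 1 to 3 as described above.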
/sys/bus/event_source/devices/nvidia_cnvlink_pmu_<socket-id>/format/rem_socket
shows the valid bits that can be set in the "rem_socket" parameter.

The PMU cannot distinguish the remote traffic initiator, therefore it does not
provide a filter to select the traffic source to monitor. It reports combined
traffic from remote GPU and PCIE devices.

Example usage:

* Count event id 0x0 for the traffic from remote socket 1, 2, and 3 to socket 0::

   perf stat -a -e nvidia_cnvlink_pmu_0/event=0x0,rem_socket=0xE/

* Count event id 0x0 for the traffic from remote socket 0, 2, and 3 to socket 1::

   perf stat -a -e nvidia_cnvlink_pmu_1/event=0x0,rem_socket=0xD/

* Count event id 0x0 for the traffic from remote socket 0, 1, and 3 to socket 2::

   perf stat -a -e nvidia_cnvlink_pmu_2/event=0x0,rem_socket=0xB/

* Count event id 0x0 for the traffic from remote socket 0, 1, and 2 to socket 3::

   perf stat -a -e nvidia_cnvlink_pmu_3/event=0x0,rem_socket=0x7/


PCIE PMU
--------

The PCIE PMU monitors all read/write traffic from PCIE root ports to
local/remote memory.
Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section`
for more info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>.

Each SoC socket can support multiple root ports. The user can use the
"root_port" bitmap parameter to select the port(s) to monitor, e.g.
"root_port=0xF" corresponds to root port 0 to 3. If "root_port" is not
specified, the PMU monitors all root ports.
/sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>/format/root_port
shows the valid bits that can be set in the "root_port" parameter.

Example usage:

* Count event id 0x0 from root port 0 and 1 of socket 0::

   perf stat -a -e nvidia_pcie_pmu_0/event=0x0,root_port=0x3/

* Count event id 0x0 from root port 0 and 1 of socket 1::

   perf stat -a -e nvidia_pcie_pmu_1/event=0x0,root_port=0x3/

.. _NVIDIA_Uncore_PMU_Traffic_Coverage_Section:

Traffic Coverage
----------------

The PMU traffic coverage may vary depending on the chip configuration:

* **NVIDIA Grace Hopper Superchip**: Hopper GPU is connected with Grace SoC.

  Example configuration with two Grace SoCs::

   *********************************          *********************************
   * SOCKET-A                      *          * SOCKET-B                      *
   *                               *          *                               *
   *                     ::::::::  *          *  ::::::::                     *
   *                     : PCIE :  *          *  : PCIE :                     *
   *                     ::::::::  *          *  ::::::::                     *
   *                         |     *          *      |                        *
   *                         |     *          *      |                        *
   *  :::::::            ::::::::: *          *  :::::::::            ::::::: *
   *  :     :            :       : *          *  :       :            :     : *
   *  : GPU :<--NVLink-->: Grace :<---CNVLink--->: Grace :<--NVLink-->: GPU : *
   *  :     :    C2C     :  SoC  : *          *  :  SoC  :    C2C     :     : *
   *  :::::::            ::::::::: *          *  :::::::::            ::::::: *
   *     |                   |     *          *      |                   |    *
   *     |                   |     *          *      |                   |    *
   *  &&&&&&&&           &&&&&&&&  *          *  &&&&&&&&            &&&&&&&& *
   *  & GMEM &           & CMEM &  *          *  & CMEM &            & GMEM & *
   *  &&&&&&&&           &&&&&&&&  *          *  &&&&&&&&            &&&&&&&& *
   *                               *          *                               *
   *********************************          *********************************

   GMEM = GPU Memory (e.g. HBM)
   CMEM = CPU Memory (e.g. LPDDR5X)

  |
  | Following table contains traffic coverage of Grace SoC PMU in socket-A:

  ::

   +--------------+-------+-----------+-----------+-----+----------+----------+
   |              |                          Source                           |
   +              +-------+-----------+-----------+-----+----------+----------+
   | Destination  |       |GPU ATS    |GPU Not-ATS|     | Socket-B | Socket-B |
   |              |PCI R/W|Translated,|Translated | CPU | CPU/PCIE1| GPU/PCIE2|
   |              |       |EGM        |           |     |          |          |
   +==============+=======+===========+===========+=====+==========+==========+
   | Local        | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF | SCF PMU  | CNVLink  |
   | SYSRAM/CMEM  | PMU   |PMU        |PMU        | PMU |          | PMU      |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Local GMEM   | PCIE  |    N/A    |NVLink-C2C1| SCF | SCF PMU  | CNVLink  |
   |              | PMU   |           |PMU        | PMU |          | PMU      |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Remote       | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF |          |          |
   | SYSRAM/CMEM  | PMU   |PMU        |PMU        | PMU |   N/A    |   N/A    |
   | over CNVLink |       |           |           |     |          |          |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Remote GMEM  | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF |          |          |
   | over CNVLink | PMU   |PMU        |PMU        | PMU |   N/A    |   N/A    |
   +--------------+-------+-----------+-----------+-----+----------+----------+

   PCIE1 traffic represents strongly ordered (SO) writes.
   PCIE2 traffic represents reads and relaxed ordered (RO) writes.

* **NVIDIA Grace CPU Superchip**: two Grace CPU SoCs are connected.

  Example configuration with two Grace SoCs::

   *******************                   *******************
   * SOCKET-A        *                   * SOCKET-B        *
   *                 *                   *                 *
   *       ::::::::  *                   * ::::::::        *
   *       : PCIE :  *                   * : PCIE :        *
   *       ::::::::  *                   * ::::::::        *
   *           |     *                   *     |           *
   *           |     *                   *     |           *
   *       ::::::::: *                   * :::::::::       *
   *       :       : *                   * :       :       *
   *       : Grace :<--------NVLink------->: Grace :       *
   *       :  SoC  : *        C2C        * :  SoC  :       *
   *       ::::::::: *                   * :::::::::       *
   *           |     *                   *     |           *
   *           |     *                   *     |           *
   *       &&&&&&&&  *                   * &&&&&&&&        *
   *       & CMEM &  *                   * & CMEM &        *
   *       &&&&&&&&  *                   * &&&&&&&&        *
   *                 *                   *                 *
   *******************                   *******************

   GMEM = GPU Memory (e.g. HBM)
   CMEM = CPU Memory (e.g. LPDDR5X)

  |
  | Following table contains traffic coverage of Grace SoC PMU in socket-A:

  ::

   +-----------------+-----------+---------+----------+-------------+
   |                 |                    Source                    |
   +                 +-----------+---------+----------+-------------+
   | Destination     |           |         | Socket-B | Socket-B    |
   |                 |  PCI R/W  |   CPU   | CPU/PCIE1| PCIE2       |
   |                 |           |         |          |             |
   +=================+===========+=========+==========+=============+
   | Local           |  PCIE PMU | SCF PMU | SCF PMU  | NVLink-C2C0 |
   | SYSRAM/CMEM     |           |         |          | PMU         |
   +-----------------+-----------+---------+----------+-------------+
   | Remote          |           |         |          |             |
   | SYSRAM/CMEM     |  PCIE PMU | SCF PMU |   N/A    |     N/A     |
   | over NVLink-C2C |           |         |          |             |
   +-----------------+-----------+---------+----------+-------------+

   PCIE1 traffic represents strongly ordered (SO) writes.
   PCIE2 traffic represents reads and relaxed ordered (RO) writes.
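The Grace CPU Superchip coverage table can also be read as a simple lookup from (source, destination) to the PMU that observes the traffic. The sketch below encodes that table for socket-A. The function name ``coverage_pmu`` and the source/destination keywords are hypothetical labels chosen for this example, not identifiers used by the driver.

```shell
#!/bin/sh
# Hypothetical lookup encoding the Grace CPU Superchip table for socket-A.
# $1 = source:
#        pcie            local PCIE reads/writes
#        cpu             local CPU
#        remote_cpu      Socket-B CPU
#        remote_pcie_so  Socket-B PCIE strongly ordered writes (PCIE1)
#        remote_pcie_ro  Socket-B PCIE reads and relaxed ordered writes (PCIE2)
# $2 = destination: local (local SYSRAM/CMEM) or remote (over NVLink-C2C)
coverage_pmu() {
    case "$1:$2" in
        pcie:local|pcie:remote)
            echo "PCIE PMU" ;;
        cpu:local|cpu:remote)
            echo "SCF PMU" ;;
        remote_cpu:local|remote_pcie_so:local)
            echo "SCF PMU" ;;
        remote_pcie_ro:local)
            echo "NVLink-C2C0 PMU" ;;
        remote_cpu:remote|remote_pcie_so:remote|remote_pcie_ro:remote)
            echo "N/A" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}

# Which PMU in socket-A sees relaxed ordered PCIE traffic from Socket-B?
coverage_pmu remote_pcie_ro local
```

A similar table-driven lookup could be written for the Grace Hopper Superchip configuration by adding the GPU sources and the GMEM destinations from its table.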