============================================================
NVIDIA Tegra241 SoC Uncore Performance Monitoring Unit (PMU)
============================================================

The NVIDIA Tegra241 SoC includes various system PMUs to measure key performance
metrics like memory bandwidth, latency, and utilization:

* Scalable Coherency Fabric (SCF)
* NVLink-C2C0
* NVLink-C2C1
* CNVLink
* PCIE

PMU Driver
----------

The PMUs in this document are based on the ARM CoreSight PMU Architecture, as
described in the document ARM IHI 0091. Since this is a standard architecture,
the PMUs are managed by a common driver, "arm-cs-arch-pmu". This driver
describes the available events and configuration of each PMU in sysfs. Please
see the sections below for the sysfs path of each PMU. Like other uncore PMU
drivers, the driver provides a "cpumask" sysfs attribute to show the CPU id
used to handle the PMU event. There is also an "associated_cpus" sysfs
attribute, which contains a list of CPUs associated with the PMU instance.

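The two attributes described above can be read for every PMU instance with a
short shell loop. This is a minimal sketch rather than a supported tool: the
function name ``list_nvidia_pmus`` and the ``SYSFS_ROOT`` override are
hypothetical, and the glob assumes the ``nvidia_<name>_pmu_<socket-id>`` device
names used in the sections below.

```shell
# Print each NVIDIA uncore PMU instance with its cpumask and associated_cpus.
# SYSFS_ROOT is a hypothetical override for trying the function against a
# fake directory tree; by default the real sysfs location is used.
list_nvidia_pmus() {
    root="${SYSFS_ROOT:-/sys/bus/event_source/devices}"
    for dev in "$root"/nvidia_*_pmu_*; do
        # Skip the unexpanded glob pattern when no device matches.
        [ -d "$dev" ] || continue
        name=$(basename "$dev")
        cpumask=$(cat "$dev/cpumask" 2>/dev/null)
        assoc=$(cat "$dev/associated_cpus" 2>/dev/null)
        printf '%s cpumask=%s associated_cpus=%s\n' "$name" "$cpumask" "$assoc"
    done
}
```

Calling ``list_nvidia_pmus`` with no arguments prints one line per PMU instance
found, and prints nothing on a system without these devices.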
.. _SCF_PMU_Section:

SCF PMU
-------

The SCF PMU monitors system level cache events, CPU traffic, and
strongly-ordered (SO) PCIE write traffic to local/remote memory. Please see
:ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about the PMU
traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_scf_pmu_<socket-id>.

Example usage:

* Count event id 0x0 in socket 0::

   perf stat -a -e nvidia_scf_pmu_0/event=0x0/

* Count event id 0x0 in socket 1::

   perf stat -a -e nvidia_scf_pmu_1/event=0x0/

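The per-socket commands above can be combined into a single ``perf stat`` run,
since ``-e`` accepts a comma-separated event list. The sketch below builds such
a list and only prints the resulting command. The helper name
``scf_event_list`` is hypothetical, and the socket count of 2 is an assumption
for illustration.

```shell
# Build a comma-separated perf event list covering sockets 0..N-1.
# scf_event_list is a hypothetical helper, not part of perf.
scf_event_list() {
    nsockets=$1
    event=$2
    list=""
    s=0
    while [ "$s" -lt "$nsockets" ]; do
        # Append a comma only when the list already has an entry.
        list="${list:+$list,}nvidia_scf_pmu_${s}/event=${event}/"
        s=$((s + 1))
    done
    printf '%s\n' "$list"
}

# Print the command instead of running it, so the sketch is safe anywhere.
echo "perf stat -a -e $(scf_event_list 2 0x0) sleep 1"
```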
NVLink-C2C0 PMU
---------------

The NVLink-C2C0 PMU monitors incoming traffic from a GPU/CPU connected with
NVLink-C2C (Chip-2-Chip) interconnect. The type of traffic captured by this PMU
varies depending on the chip configuration:

* NVIDIA Grace Hopper Superchip: Hopper GPU is connected with Grace SoC.

  In this config, the PMU captures GPU ATS translated or EGM traffic from the GPU.

* NVIDIA Grace CPU Superchip: two Grace CPU SoCs are connected.

  In this config, the PMU captures reads and relaxed ordered (RO) writes from
  PCIE devices of the remote SoC.

Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about
the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_nvlink_c2c0_pmu_<socket-id>.

Example usage:

* Count event id 0x0 from the GPU/CPU connected with socket 0::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 1::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_1/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 2::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_2/event=0x0/

* Count event id 0x0 from the GPU/CPU connected with socket 3::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_3/event=0x0/

The NVLink-C2C has two ports that can be connected to one GPU (occupying both
ports) or to two GPUs (one GPU per port). The user can use the "port" bitmap
parameter to select the port(s) to monitor. Each bit represents a port number,
e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for ports 0 and 1. The
PMU will monitor both ports by default if the parameter is not specified.

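As an illustration of the bitmap encoding, the value for "port" can be computed
by setting one bit per selected port number. The helper name ``bits_to_mask``
below is hypothetical and only does the arithmetic; it does not query the
hardware for which ports exist.

```shell
# Turn a list of bit positions (e.g. port numbers) into a hex bitmap.
# bits_to_mask is a hypothetical helper for illustration.
bits_to_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%X\n' "$mask"
}

bits_to_mask 0      # port 0 only
bits_to_mask 0 1    # ports 0 and 1
```

The two calls print ``0x1`` and ``0x3``, matching the "port" values quoted
above.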
Example for port filtering:

* Count event id 0x0 from the GPU connected with socket 0 on port 0::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x1/

* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1::

   perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x3/

NVLink-C2C1 PMU
---------------

The NVLink-C2C1 PMU monitors incoming traffic from a GPU connected with
NVLink-C2C (Chip-2-Chip) interconnect. This PMU captures untranslated GPU
traffic, in contrast with the NVLink-C2C0 PMU, which captures ATS translated
traffic. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more
info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_nvlink_c2c1_pmu_<socket-id>.

Example usage:

* Count event id 0x0 from the GPU connected with socket 0::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0/

* Count event id 0x0 from the GPU connected with socket 1::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_1/event=0x0/

* Count event id 0x0 from the GPU connected with socket 2::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_2/event=0x0/

* Count event id 0x0 from the GPU connected with socket 3::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_3/event=0x0/

The NVLink-C2C has two ports that can be connected to one GPU (occupying both
ports) or to two GPUs (one GPU per port). The user can use the "port" bitmap
parameter to select the port(s) to monitor. Each bit represents a port number,
e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for ports 0 and 1. The
PMU will monitor both ports by default if the parameter is not specified.

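Going the other way, a "port" value taken from an existing command line can be
decoded back into the port numbers it selects. This is a small sketch of that
decoding; the helper name ``mask_to_bits`` is hypothetical.

```shell
# List the bit positions that are set in a bitmap such as "port=0x3".
# mask_to_bits is a hypothetical helper for illustration.
mask_to_bits() {
    mask=$(( $1 ))
    bit=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$bit"
        fi
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "$out"
}

mask_to_bits 0x3    # ports selected by port=0x3
```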
Example for port filtering:

* Count event id 0x0 from the GPU connected with socket 0 on port 0::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x1/

* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1::

   perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x3/

CNVLink PMU
-----------

The CNVLink PMU monitors traffic from GPUs and PCIE devices on remote sockets
to local memory. For PCIE traffic, this PMU captures read and relaxed ordered
(RO) write traffic. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section`
for more info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_cnvlink_pmu_<socket-id>.

Each SoC socket can be connected to one or more sockets via CNVLink. The user can
use the "rem_socket" bitmap parameter to select the remote socket(s) to monitor.
Each bit represents a socket number, e.g. "rem_socket=0xE" corresponds to
sockets 1 to 3. The PMU will monitor all remote sockets by default if the
parameter is not specified.
/sys/bus/event_source/devices/nvidia_cnvlink_pmu_<socket-id>/format/rem_socket
shows the valid bits that can be set in the "rem_socket" parameter.

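For a given local socket, the "rem_socket" value that selects every other
socket is the all-sockets bitmap with the local socket's bit cleared. The
sketch below shows that arithmetic. The helper name ``rem_socket_mask`` is
hypothetical and the four-socket system is an assumption; the valid bits on a
real system should still be checked against the format file mentioned above.

```shell
# Compute the bitmap that selects every socket except the local one.
# rem_socket_mask is a hypothetical helper; arguments are the local socket
# number and the total number of sockets.
rem_socket_mask() {
    local_socket=$1
    nsockets=$2
    all=$(( (1 << nsockets) - 1 ))
    printf '0x%X\n' $(( all & ~(1 << local_socket) ))
}

rem_socket_mask 0 4    # remote sockets 1, 2, and 3 as seen from socket 0
rem_socket_mask 1 4    # remote sockets 0, 2, and 3 as seen from socket 1
```

The two calls print ``0xE`` and ``0xD``.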
The PMU cannot distinguish the remote traffic initiator, therefore it does not
provide a filter to select the traffic source to monitor. It reports combined
traffic from remote GPU and PCIE devices.

Example usage:

* Count event id 0x0 for the traffic from remote socket 1, 2, and 3 to socket 0::

   perf stat -a -e nvidia_cnvlink_pmu_0/event=0x0,rem_socket=0xE/

* Count event id 0x0 for the traffic from remote socket 0, 2, and 3 to socket 1::

   perf stat -a -e nvidia_cnvlink_pmu_1/event=0x0,rem_socket=0xD/

* Count event id 0x0 for the traffic from remote socket 0, 1, and 3 to socket 2::

   perf stat -a -e nvidia_cnvlink_pmu_2/event=0x0,rem_socket=0xB/

* Count event id 0x0 for the traffic from remote socket 0, 1, and 2 to socket 3::

   perf stat -a -e nvidia_cnvlink_pmu_3/event=0x0,rem_socket=0x7/

PCIE PMU
--------

The PCIE PMU monitors all read/write traffic from PCIE root ports to
local/remote memory. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section`
for more info about the PMU traffic coverage.

The events and configuration options of this PMU device are described in sysfs,
see /sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>.

Each SoC socket can support multiple root ports. The user can use the
"root_port" bitmap parameter to select the port(s) to monitor. Each bit
represents a root port number, e.g. "root_port=0xF" corresponds to root ports
0 to 3. The PMU will monitor all root ports by default if the parameter is not
specified.
/sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>/format/root_port
shows the valid bits that can be set in the "root_port" parameter.

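The largest usable "root_port" value can be derived from the format file. The
sketch below assumes the attribute follows the usual perf format syntax of a
single ``config<N>:<lo>-<hi>`` bit range (or ``config<N>:<bit>`` for one bit);
it does not handle comma-separated ranges. The helper name
``format_max_value`` and the sample strings are hypothetical and are not taken
from Tegra241 hardware.

```shell
# Given a perf format spec such as "config1:0-3", print the largest value
# that fits in the field. format_max_value is a hypothetical helper.
format_max_value() {
    spec=$1
    range=${spec#*:}    # drop the "configN:" prefix
    lo=${range%-*}      # text before "-", or the whole string for one bit
    hi=${range#*-}      # text after "-", or the whole string for one bit
    width=$(( hi - lo + 1 ))
    printf '0x%X\n' $(( (1 << width) - 1 ))
}

format_max_value "config1:0-3"    # made-up spec with a 4-bit field
```

On a real system the argument would be the content of the format file, for
example ``format_max_value "$(cat .../format/root_port)"`` with the path given
above.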
Example usage:

* Count event id 0x0 from root port 0 and 1 of socket 0::

   perf stat -a -e nvidia_pcie_pmu_0/event=0x0,root_port=0x3/

* Count event id 0x0 from root port 0 and 1 of socket 1::

   perf stat -a -e nvidia_pcie_pmu_1/event=0x0,root_port=0x3/

.. _NVIDIA_Uncore_PMU_Traffic_Coverage_Section:

Traffic Coverage
----------------

The PMU traffic coverage may vary depending on the chip configuration:

* **NVIDIA Grace Hopper Superchip**: Hopper GPU is connected with Grace SoC.

  Example configuration with two Grace SoCs::

   *********************************          *********************************
   * SOCKET-A                      *          * SOCKET-B                      *
   *                               *          *                               *
   *                     ::::::::  *          *  ::::::::                     *
   *                     : PCIE :  *          *  : PCIE :                     *
   *                     ::::::::  *          *  ::::::::                     *
   *                         |     *          *      |                        *
   *                         |     *          *      |                        *
   *  :::::::            ::::::::: *          *  :::::::::            ::::::: *
   *  :     :            :       : *          *  :       :            :     : *
   *  : GPU :<--NVLink-->: Grace :<---CNVLink--->: Grace :<--NVLink-->: GPU : *
   *  :     :    C2C     :  SoC  : *          *  :  SoC  :    C2C     :     : *
   *  :::::::            ::::::::: *          *  :::::::::            ::::::: *
   *     |                   |     *          *      |                   |    *
   *     |                   |     *          *      |                   |    *
   *  &&&&&&&&           &&&&&&&&  *          *   &&&&&&&&           &&&&&&&& *
   *  & GMEM &           & CMEM &  *          *   & CMEM &           & GMEM & *
   *  &&&&&&&&           &&&&&&&&  *          *   &&&&&&&&           &&&&&&&& *
   *                               *          *                               *
   *********************************          *********************************

   GMEM = GPU Memory (e.g. HBM)
   CMEM = CPU Memory (e.g. LPDDR5X)

  |
  | The following table shows the traffic coverage of Grace SoC PMUs in socket-A:

  ::

   +--------------+-------+-----------+-----------+-----+----------+----------+
   |              |                        Source                             |
   +              +-------+-----------+-----------+-----+----------+----------+
   | Destination  |       |GPU ATS    |GPU Not-ATS|     | Socket-B | Socket-B |
   |              |PCI R/W|Translated,|Translated | CPU | CPU/PCIE1| GPU/PCIE2|
   |              |       |EGM        |           |     |          |          |
   +==============+=======+===========+===========+=====+==========+==========+
   | Local        | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF | SCF PMU  | CNVLink  |
   | SYSRAM/CMEM  | PMU   |PMU        |PMU        | PMU |          | PMU      |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Local GMEM   | PCIE  |    N/A    |NVLink-C2C1| SCF | SCF PMU  | CNVLink  |
   |              | PMU   |           |PMU        | PMU |          | PMU      |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Remote       | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF |          |          |
   | SYSRAM/CMEM  | PMU   |PMU        |PMU        | PMU |   N/A    |   N/A    |
   | over CNVLink |       |           |           |     |          |          |
   +--------------+-------+-----------+-----------+-----+----------+----------+
   | Remote GMEM  | PCIE  |NVLink-C2C0|NVLink-C2C1| SCF |          |          |
   | over CNVLink | PMU   |PMU        |PMU        | PMU |   N/A    |   N/A    |
   +--------------+-------+-----------+-----------+-----+----------+----------+

   PCIE1 traffic represents strongly ordered (SO) writes.
   PCIE2 traffic represents reads and relaxed ordered (RO) writes.

* **NVIDIA Grace CPU Superchip**: two Grace CPU SoCs are connected.

  Example configuration with two Grace SoCs::

   *******************             *******************
   * SOCKET-A        *             * SOCKET-B        *
   *                 *             *                 *
   *    ::::::::     *             *    ::::::::     *
   *    : PCIE :     *             *    : PCIE :     *
   *    ::::::::     *             *    ::::::::     *
   *        |        *             *        |        *
   *        |        *             *        |        *
   *    :::::::::    *             *    :::::::::    *
   *    :       :    *             *    :       :    *
   *    : Grace :<--------NVLink------->: Grace :    *
   *    :  SoC  :    *     C2C     *    :  SoC  :    *
   *    :::::::::    *             *    :::::::::    *
   *        |        *             *        |        *
   *        |        *             *        |        *
   *     &&&&&&&&    *             *     &&&&&&&&    *
   *     & CMEM &    *             *     & CMEM &    *
   *     &&&&&&&&    *             *     &&&&&&&&    *
   *                 *             *                 *
   *******************             *******************

   GMEM = GPU Memory (e.g. HBM)
   CMEM = CPU Memory (e.g. LPDDR5X)

  |
  | The following table shows the traffic coverage of Grace SoC PMUs in socket-A:

  ::

   +-----------------+-----------+---------+----------+-------------+
   |                 |                      Source                  |
   +                 +-----------+---------+----------+-------------+
   | Destination     |           |         | Socket-B | Socket-B    |
   |                 |  PCI R/W  |   CPU   | CPU/PCIE1| PCIE2       |
   |                 |           |         |          |             |
   +=================+===========+=========+==========+=============+
   | Local           |  PCIE PMU | SCF PMU | SCF PMU  | NVLink-C2C0 |
   | SYSRAM/CMEM     |           |         |          | PMU         |
   +-----------------+-----------+---------+----------+-------------+
   | Remote          |           |         |          |             |
   | SYSRAM/CMEM     |  PCIE PMU | SCF PMU |   N/A    |     N/A     |
   | over NVLink-C2C |           |         |          |             |
   +-----------------+-----------+---------+----------+-------------+

   PCIE1 traffic represents strongly ordered (SO) writes.
   PCIE2 traffic represents reads and relaxed ordered (RO) writes.