.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)

=========
Task List
=========

Tasks may have the following fields:

- ``Complexity``: Describes the required familiarity with Rust and / or the
  corresponding kernel APIs or subsystems. There are four different complexities,
  ``Beginner``, ``Intermediate``, ``Advanced`` and ``Expert``.
- ``Reference``: References to other tasks.
- ``Link``: Links to external resources.
- ``Contact``: The person that can be contacted for further information about
  the task.

A task might have a `[ABCD]` code after its name. This code can be used to grep
the source for `TODO` entries related to it.

Enablement (Rust)
=================

Tasks that are not directly related to nova-core, but are preconditions in terms
of required APIs.

FromPrimitive API [FPRI]
------------------------

Sometimes the need arises to convert a number to a value of an enum or a
structure.

A good example from nova-core would be the ``Chipset`` enum type, which defines
the value ``AD102``. When probing the GPU the value ``0x192`` can be read from a
certain register indicating the chipset AD102. Hence, the enum value ``AD102``
should be derived from the number ``0x192``. Currently, nova-core uses a custom
implementation (``Chipset::from_u32``) for this.

Instead, it would be desirable to have something like the ``FromPrimitive``
trait [1] from the num crate.

Having this generalization also helps with implementing a generic macro that
automatically generates the corresponding mappings between a value and a number.
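
As a rough illustration of the kind of mapping meant here, the sketch below uses
the standard ``TryFrom`` trait as a stand-in for a ``FromPrimitive``-like API.
Only ``AD102`` and ``0x192`` are taken from the text above; the single-variant
enum and the ``UnknownChipset`` error type are assumptions made for this example
and do not reflect nova-core's actual ``Chipset`` definition.

.. code-block:: rust

	use std::convert::TryFrom;

	/// Illustrative subset of a chipset enum; not nova-core's real definition.
	#[derive(Debug, Clone, Copy, PartialEq, Eq)]
	#[repr(u32)]
	enum Chipset {
	    AD102 = 0x192,
	}

	/// Hypothetical error for register values that map to no variant.
	#[derive(Debug, PartialEq, Eq)]
	struct UnknownChipset(u32);

	impl TryFrom<u32> for Chipset {
	    type Error = UnknownChipset;

	    fn try_from(value: u32) -> Result<Self, Self::Error> {
	        // A generic macro could generate one match arm per enum variant.
	        match value {
	            0x192 => Ok(Chipset::AD102),
	            other => Err(UnknownChipset(other)),
	        }
	    }
	}

With such an implementation, probing code could write
``Chipset::try_from(raw_value)?`` instead of calling a driver-specific helper.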

| Complexity: Beginner
| Link: https://docs.rs/num/latest/num/trait.FromPrimitive.html [1]

Conversion from byte slices for types implementing FromBytes [TRSM]
-------------------------------------------------------------------

We retrieve several structures from byte streams coming from the BIOS or loaded
firmware. At the moment, converting the byte slice into the proper type requires
an inelegant `unsafe` operation; this will go away once `FromBytes` implements
a proper `from_bytes` method.
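
To make the goal more concrete, here is a minimal userspace sketch of what a
checked conversion could look like. The trait below is a hypothetical stand-in,
not the kernel's ``FromBytes``; the method name ``from_bytes_copy`` and the
``ExampleHeader`` layout are assumptions made purely for illustration. The point
is that the single ``unsafe`` operation lives in one audited place instead of
at every call site.

.. code-block:: rust

	use std::mem::size_of;
	use std::ptr;

	/// Hypothetical stand-in for a `FromBytes`-like marker trait.
	///
	/// # Safety
	///
	/// Implementers must be valid for every possible bit pattern.
	unsafe trait FromBytes: Sized {
	    /// Copies a value out of `bytes`, or returns `None` on a size mismatch.
	    fn from_bytes_copy(bytes: &[u8]) -> Option<Self> {
	        if bytes.len() != size_of::<Self>() {
	            return None;
	        }
	        // SAFETY: `bytes` holds exactly `size_of::<Self>()` readable bytes,
	        // the read tolerates any alignment, and the trait contract
	        // guarantees that any bit pattern is a valid `Self`.
	        Some(unsafe { ptr::read_unaligned(bytes.as_ptr().cast::<Self>()) })
	    }
	}

	/// Made-up header layout, purely for illustration.
	#[repr(C)]
	#[derive(Debug, Clone, Copy, PartialEq, Eq)]
	struct ExampleHeader {
	    signature: u16,
	    version: u8,
	    flags: u8,
	}

	// SAFETY: `ExampleHeader` consists only of integers and has no padding.
	unsafe impl FromBytes for ExampleHeader {}

Callers would then use ``ExampleHeader::from_bytes_copy(slice)`` and handle the
``None`` case, without writing any ``unsafe`` code themselves.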

| Complexity: Beginner

CoherentAllocation improvements [COHA]
--------------------------------------

`CoherentAllocation` needs a safe way to write into the allocation, and to
obtain slices within the allocation.

| Complexity: Beginner
| Contact: Abdiel Janulgue

Generic register abstraction [REGA]
-----------------------------------

Work out how register constants and structures can be automatically generated
through generalized macros.

Example:

.. code-block:: rust

	register!(BOOT0, 0x0, u32, pci::Bar<SIZE>, Fields [
	   MINOR_REVISION(3:0, RO),
	   MAJOR_REVISION(7:4, RO),
	   REVISION(7:0, RO), // Virtual register combining major and minor rev.
	])

This could expand to something like:

.. code-block:: rust

	const BOOT0_OFFSET: usize = 0x00000000;
	const BOOT0_MINOR_REVISION_SHIFT: u8 = 0;
	const BOOT0_MINOR_REVISION_MASK: u32 = 0x0000000f;
	const BOOT0_MAJOR_REVISION_SHIFT: u8 = 4;
	const BOOT0_MAJOR_REVISION_MASK: u32 = 0x000000f0;
	const BOOT0_REVISION_SHIFT: u8 = BOOT0_MINOR_REVISION_SHIFT;
	const BOOT0_REVISION_MASK: u32 = BOOT0_MINOR_REVISION_MASK | BOOT0_MAJOR_REVISION_MASK;

	struct Boot0(u32);

	impl Boot0 {
	   #[inline]
	   fn read(bar: &RevocableGuard<'_, pci::Bar<SIZE>>) -> Self {
	      Self(bar.readl(BOOT0_OFFSET))
	   }

	   #[inline]
	   fn minor_revision(&self) -> u32 {
	      (self.0 & BOOT0_MINOR_REVISION_MASK) >> BOOT0_MINOR_REVISION_SHIFT
	   }

	   #[inline]
	   fn major_revision(&self) -> u32 {
	      (self.0 & BOOT0_MAJOR_REVISION_MASK) >> BOOT0_MAJOR_REVISION_SHIFT
	   }

	   #[inline]
	   fn revision(&self) -> u32 {
	      (self.0 & BOOT0_REVISION_MASK) >> BOOT0_REVISION_SHIFT
	   }
	}

Usage:

.. code-block:: rust

	let bar = bar.try_access().ok_or(ENXIO)?;

	let boot0 = Boot0::read(&bar);
	pr_info!("Revision: {}\n", boot0.revision());

A work-in-progress implementation currently resides in
`drivers/gpu/nova-core/regs/macros.rs` and is used in nova-core. It would be
nice to improve it (possibly using proc macros) and move it to the `kernel`
crate so it can be used by other components as well.

Features desired before this happens:

* Relative registers with build-time base address validation,
* Arrays of registers with build-time index validation,
* Make I/O optional (for field values that are not registers),
* Support other sizes than `u32`,
* Allow visibility control for registers and individual fields,
* Use Rust slice syntax to express field ranges.
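
As one possible approach to the build-time index validation mentioned in the
list above, the following sketch uses const generics and a constant assertion.
All names (``RegisterArray``, ``IndexInBounds``, ``ExampleScratch``) and the
base / stride values are hypothetical and unrelated to the existing
``register!`` macro; this only demonstrates the technique.

.. code-block:: rust

	/// Hypothetical description of an array of `LEN` registers, `STRIDE` bytes
	/// apart, starting at byte offset `BASE`.
	struct RegisterArray<const BASE: usize, const STRIDE: usize, const LEN: usize>;

	/// Helper whose associated constant fails to evaluate when `IDX >= LEN`.
	struct IndexInBounds<const IDX: usize, const LEN: usize>;

	impl<const IDX: usize, const LEN: usize> IndexInBounds<IDX, LEN> {
	    const CHECK: () = assert!(IDX < LEN, "register index out of bounds");
	}

	impl<const BASE: usize, const STRIDE: usize, const LEN: usize>
	    RegisterArray<BASE, STRIDE, LEN>
	{
	    /// Returns the byte offset of register `IDX`. An out-of-range `IDX` is
	    /// rejected when the call is compiled, not when it runs.
	    fn offset<const IDX: usize>() -> usize {
	        // Referencing the constant forces its evaluation at build time.
	        let () = IndexInBounds::<IDX, LEN>::CHECK;
	        BASE + IDX * STRIDE
	    }
	}

	/// Made-up register array (8 registers of 4 bytes), for illustration only.
	type ExampleScratch = RegisterArray<0x1000, 4, 8>;

In this sketch ``ExampleScratch::offset::<7>()`` builds, while
``ExampleScratch::offset::<8>()`` would fail the build with the assertion
message instead of producing an invalid offset at runtime.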

| Complexity: Advanced
| Contact: Alexandre Courbot

Numerical operations [NUMM]
---------------------------

Nova uses integer operations that are not part of the standard library (or not
implemented in an optimized way for the kernel). These include:

- The "Find Last Set Bit" (`fls` function of the C part of the kernel)
  operation.

A `num` core kernel module is being designed to provide these operations.
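
For reference, the semantics of the C ``fls()`` can be expressed with standard
integer methods as sketched below. This is a plain, unoptimized illustration
and not the API of the planned ``num`` module; the function name and signature
are assumptions.

.. code-block:: rust

	/// Returns the 1-based position of the most significant set bit of `value`,
	/// or 0 if no bit is set, following the convention of the C `fls()`.
	const fn fls(value: u32) -> u32 {
	    u32::BITS - value.leading_zeros()
	}

For instance, ``fls(1)`` is 1 and ``fls(0x192)`` is 9, since bit 8 (counting
from zero) is the highest bit set in ``0x192``.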

| Complexity: Intermediate
| Contact: Alexandre Courbot

Delay / Sleep abstractions [DLAY]
---------------------------------

Rust abstractions for the kernel's delay() and sleep() functions.

FUJITA Tomonori plans to work on abstractions for read_poll_timeout_atomic()
(and friends) [1].
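
The general shape of such a polling helper can be sketched in userspace Rust as
follows. This is not the proposed kernel API: the function signature, the
``TimedOut`` error type and the use of ``std::thread::sleep`` are assumptions
chosen to keep the example self-contained.

.. code-block:: rust

	use std::thread;
	use std::time::{Duration, Instant};

	/// Hypothetical error returned when the condition is not met in time.
	#[derive(Debug, PartialEq, Eq)]
	struct TimedOut;

	/// Repeatedly calls `op` until `cond` accepts its result or `timeout`
	/// elapses, sleeping for `sleep` between attempts.
	fn read_poll_timeout<T>(
	    mut op: impl FnMut() -> T,
	    cond: impl Fn(&T) -> bool,
	    sleep: Duration,
	    timeout: Duration,
	) -> Result<T, TimedOut> {
	    let deadline = Instant::now() + timeout;
	    loop {
	        let value = op();
	        if cond(&value) {
	            return Ok(value);
	        }
	        // The value is checked before the deadline, so a read that succeeds
	        // on the last attempt is still reported as a success.
	        if Instant::now() >= deadline {
	            return Err(TimedOut);
	        }
	        thread::sleep(sleep);
	    }
	}

A driver would typically pass a register read as ``op`` and a check on a status
bit as ``cond``.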

| Complexity: Beginner
| Link: https://lore.kernel.org/netdev/20250228.080550.354359820929821928.fujita.tomonori@gmail.com/ [1]

IRQ abstractions
----------------

Rust abstractions for IRQ handling.

There is active ongoing work from Daniel Almeida [1] for the "core" abstractions
to request IRQs.

Besides optional review and testing work, the required ``pci::Device`` code
around those core abstractions needs to be worked out.

| Complexity: Intermediate
| Link: https://lore.kernel.org/lkml/20250122163932.46697-1-daniel.almeida@collabora.com/ [1]
| Contact: Daniel Almeida

Page abstraction for foreign pages
----------------------------------

Rust abstractions for pages not created by the Rust page abstraction without
direct ownership.

There is active ongoing work from Abdiel Janulgue [1] and Lina [2].

| Complexity: Advanced
| Link: https://lore.kernel.org/linux-mm/20241119112408.779243-1-abdiel.janulgue@gmail.com/ [1]
| Link: https://lore.kernel.org/rust-for-linux/20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net/ [2]

Scatterlist / sg_table abstractions
-----------------------------------

Rust abstractions for scatterlist / sg_table.

There is preceding work from Abdiel Janulgue, which hasn't made it to the
mailing list yet.

| Complexity: Intermediate
| Contact: Abdiel Janulgue

PCI MISC APIs
-------------

Extend the existing PCI device / driver abstractions with SR-IOV, config space,
capability and MSI API abstractions.

| Complexity: Beginner

XArray bindings [XARR]
----------------------

We need bindings for `xa_alloc`/`xa_alloc_cyclic` in order to generate the
auxiliary device IDs.

| Complexity: Intermediate

Debugfs abstractions
--------------------

Rust abstraction for debugfs APIs.

| Reference: Export GSP log buffers
| Complexity: Intermediate

GPU (general)
=============

Parse firmware headers
----------------------

Parse ELF headers from the firmware files loaded from the filesystem.

| Reference: ELF utils
| Complexity: Beginner
| Contact: Abdiel Janulgue

Build radix3 page table
-----------------------

Build the radix3 page table to map the firmware.

| Complexity: Intermediate
| Contact: Abdiel Janulgue

Initial Devinit support
-----------------------

Implement BIOS Device Initialization, i.e. memory sizing, waiting, PLL
configuration.

| Contact: Dave Airlie
| Complexity: Beginner

MMU / PT management
-------------------

Work out the architecture for MMU / page table management.

We need to consider that nova-drm will need rather fine-grained control,
especially in terms of locking, in order to be able to implement asynchronous
Vulkan queues.

While generally sharing the corresponding code is desirable, it needs to be
evaluated how (and if at all) sharing the corresponding code is expedient.

| Complexity: Expert

VRAM memory allocator
---------------------

Investigate options for a VRAM memory allocator.

Some possible options:

- Rust abstractions for

  - RB tree (interval tree) / drm_mm
  - maple_tree

- native Rust collections

| Complexity: Advanced

Instance Memory
---------------

Implement support for instmem (bar2) used to store page tables.

| Complexity: Intermediate
| Contact: Dave Airlie

GPU System Processor (GSP)
==========================

Export GSP log buffers
----------------------

Recent patches from Timur Tabi [1] added support to expose GSP-RM log buffers
(even after failure to probe the driver) through debugfs.

This is also an interesting feature for nova-core, especially in the early days.

| Link: https://lore.kernel.org/nouveau/20241030202952.694055-2-ttabi@nvidia.com/ [1]
| Reference: Debugfs abstractions
| Complexity: Intermediate

GSP firmware abstraction
------------------------

The GSP-RM firmware API is unstable and may incompatibly change from version to
version, in terms of data structures and semantics.

This problem is one of the big motivations for using Rust for nova-core, since
it turns out that Rust's procedural macro feature provides a rather elegant way
to address this issue:

1. generate Rust structures from the C headers in a separate namespace per version
2. build abstraction structures (within a generic namespace) that implement the
   firmware interfaces; annotate the differences in implementation with version
   identifiers
3. use a procedural macro to generate the actual per version implementation out
   of this abstraction
4. instantiate the correct version type at runtime (all versions are guaranteed
   to have the same interface because it's defined by a common trait)
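
Steps 1, 2 and 4 of the list above can be sketched in plain Rust as follows.
This is only an illustration under stated assumptions: the version modules
``v1`` / ``v2``, the ``InitMsg`` layouts and the ``GspInterface`` trait are made
up for this example and are not taken from the PoC driver, and the per-version
implementations are written by hand where step 3 would use a procedural macro.

.. code-block:: rust

	use std::mem::size_of;

	/// Common interface that every firmware version has to provide (step 4).
	trait GspInterface {
	    /// Size in bytes of this version's (made-up) init message.
	    fn init_msg_size(&self) -> usize;
	}

	/// Stand-ins for structures generated from the C headers, one namespace
	/// per version (step 1). Names and layouts are invented.
	mod v1 {
	    #[repr(C)]
	    pub struct InitMsg {
	        pub flags: u32,
	    }
	}

	mod v2 {
	    #[repr(C)]
	    pub struct InitMsg {
	        pub flags: u32,
	        pub heap_size: u64,
	    }
	}

	struct GspV1;
	struct GspV2;

	// Hand-written here; a procedural macro would generate these from a single
	// annotated abstraction (steps 2 and 3).
	impl GspInterface for GspV1 {
	    fn init_msg_size(&self) -> usize {
	        size_of::<v1::InitMsg>()
	    }
	}

	impl GspInterface for GspV2 {
	    fn init_msg_size(&self) -> usize {
	        size_of::<v2::InitMsg>()
	    }
	}

	/// Picks the implementation matching the firmware version at runtime.
	fn gsp_for_version(version: u32) -> Option<Box<dyn GspInterface>> {
	    match version {
	        1 => Some(Box::new(GspV1)),
	        2 => Some(Box::new(GspV2)),
	        _ => None,
	    }
	}

Code above this layer only talks to ``dyn GspInterface`` and therefore does not
need to know which firmware version was loaded.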

There is a PoC implementation of this pattern, in the context of the nova-core
PoC driver.

This task aims at refining the feature and ideally generalizing it, to be usable
by other drivers as well.

| Complexity: Expert

GSP message queue
-----------------

Implement low level GSP message queue (command, status) for communication
between the kernel driver and GSP.

| Complexity: Advanced
| Contact: Dave Airlie

Bootstrap GSP
-------------

Call the boot firmware to boot the GSP processor; execute initial control
messages.

| Complexity: Intermediate
| Contact: Dave Airlie

Client / Device APIs
--------------------

Implement the GSP message interface for client / device allocation and the
corresponding client and device allocation APIs.

| Complexity: Intermediate
| Contact: Dave Airlie

Bar PDE handling
----------------

Synchronize page table handling for BARs between the kernel driver and GSP.

| Complexity: Beginner
| Contact: Dave Airlie

FIFO engine
-----------

Implement support for the FIFO engine, i.e. the corresponding GSP message
interface and provide an API for chid allocation and channel handling.

| Complexity: Advanced
| Contact: Dave Airlie

GR engine
---------

Implement support for the graphics engine, i.e. the corresponding GSP message
interface and provide an API for (golden) context creation and promotion.

| Complexity: Advanced
| Contact: Dave Airlie

CE engine
---------

Implement support for the copy engine, i.e. the corresponding GSP message
interface.

| Complexity: Intermediate
| Contact: Dave Airlie

VFN IRQ controller
------------------

Support for the VFN interrupt controller.

| Complexity: Intermediate
| Contact: Dave Airlie

External APIs
=============

nova-core base API
------------------

Work out the common pieces of the API to connect 2nd level drivers, i.e. vGPU
manager and nova-drm.

| Complexity: Advanced

vGPU manager API
----------------

Work out the API parts required by the vGPU manager, which are not covered by
the base API.

| Complexity: Advanced

nova-core C API
---------------

Implement a C wrapper for the APIs required by the vGPU manager driver.

| Complexity: Intermediate

Testing
=======

CI pipeline
-----------

Investigate options for continuous integration testing.

This can range from something as simple as running KUnit tests, through running
the (graphics) CTS, to booting up (multiple) guest VMs to test VFIO use-cases.

It might also be worth considering the introduction of a new test suite sitting
directly on top of the uAPI for more targeted testing and debugging. There may be
options for collaboration / shared code with the Mesa project.

| Complexity: Advanced