=========================================
I915 GuC Submission/DRM Scheduler Section
=========================================

Upstream plan
=============
For upstream the overall plan for landing GuC submission and integrating the
i915 with the DRM scheduler is:

* Merge basic GuC submission
	* Basic submission support for all gen11+ platforms
	* Not enabled by default on any current platforms but can be enabled via
	  modparam enable_guc
	* Lots of rework will need to be done to integrate with DRM scheduler so
	  there is no need to nitpick everything in the code; it just needs to
	  be functional, have no major coding style / layering errors, and not
	  regress execlists
	* Update IGTs / selftests as needed to work with GuC submission
	* Enable CI on supported platforms for a baseline
	* Rework / get CI healthy for GuC submission in place as needed
* Merge new parallel submission uAPI
	* Bonding uAPI is completely incompatible with GuC submission, plus it
	  has severe design issues in general, which is why we want to retire
	  it no matter what
	* New uAPI adds I915_CONTEXT_ENGINES_EXT_PARALLEL context setup step
	  which configures a slot with N contexts
	* After I915_CONTEXT_ENGINES_EXT_PARALLEL a user can submit N batches to
	  a slot in a single execbuf IOCTL and the batches run on the GPU in
	  parallel
	* Initially only for GuC submission but execlists can be supported if
	  needed
* Convert the i915 to use the DRM scheduler
	* GuC submission backend fully integrated with DRM scheduler
		* All request queues removed from backend (e.g. all backpressure
		  handled in DRM scheduler)
		* Resets / cancels hook into DRM scheduler
		* Watchdog hooks into DRM scheduler
		* Lots of complexity of the GuC backend can be pulled out once
		  integrated with DRM scheduler (e.g. state machine gets
		  simpler, locking gets simpler, etc...)
	* Execlists backend will do the minimum required to hook into the DRM
	  scheduler
		* Legacy interface
		* Features like timeslicing / preemption / virtual engines would
		  be difficult to integrate with the DRM scheduler and these
		  features are not required for GuC submission as the GuC does
		  these things for us
		* ROI low on fully integrating into DRM scheduler
		* Fully integrating would add lots of complexity to DRM
		  scheduler
	* Port i915 priority inheritance / boosting feature in DRM scheduler
		* Used for i915 page flip, may be useful to other DRM drivers as
		  well
		* Will be an optional feature in the DRM scheduler
	* Remove in-order completion assumptions from DRM scheduler
		* Even when using the DRM scheduler the backends will handle
		  preemption, timeslicing, etc... so it is possible for jobs to
		  finish out of order
	* Pull out i915 priority levels and use DRM priority levels
	* Optimize DRM scheduler as needed

TODOs for GuC submission upstream
=================================

* Need an update to GuC firmware / i915 to enable error state capture
* Open source tool to decode GuC logs
* Public GuC spec

New uAPI for basic GuC submission
=================================
No major changes are required to the uAPI for basic GuC submission. The only
change is a new scheduler attribute: I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP.
This attribute indicates the 2k i915 user priority levels are statically mapped
into 3 levels as follows:

* -1k to -1 Low priority
* 0 Medium priority
* 1 to 1k High priority

This is needed because the GuC only has 4 priority bands. The highest priority
band is reserved for the kernel. This aligns with the DRM scheduler priority
levels too.

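The static mapping above can be sketched as follows. This is a minimal
illustration, not the i915 implementation: the enum, the function name and the
exact bounds of the user priority range (taken here as -1023 to 1023 from the
"-1k to 1k" wording) are assumptions made for the example.

.. code-block:: c

	#include <assert.h>

	/* Hypothetical band names; the GuC's own band values are not shown here. */
	enum sketch_prio_band {
		SKETCH_PRIO_LOW,
		SKETCH_PRIO_MEDIUM,
		SKETCH_PRIO_HIGH,
	};

	/* Assumed bounds of the user priority range ("-1k to 1k"). */
	#define SKETCH_USER_PRIO_MIN (-1023)
	#define SKETCH_USER_PRIO_MAX 1023

	/* Collapse a user priority into one of the 3 statically mapped bands. */
	static enum sketch_prio_band sketch_map_user_prio(int prio)
	{
		assert(prio >= SKETCH_USER_PRIO_MIN && prio <= SKETCH_USER_PRIO_MAX);

		if (prio < 0)
			return SKETCH_PRIO_LOW;
		if (prio == 0)
			return SKETCH_PRIO_MEDIUM;
		return SKETCH_PRIO_HIGH;
	}
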
Spec references:
----------------
* https://www.khronos.org/registry/EGL/extensions/IMG/EGL_IMG_context_priority.txt
* https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/chap5.html#devsandqueues-priority
* https://spec.oneapi.com/level-zero/latest/core/api.html#ze-command-queue-priority-t

New parallel submission uAPI
============================
The existing bonding uAPI is completely broken with GuC submission because
whether a submission is a single context submit or parallel submit isn't known
until execbuf time, when it is activated via I915_SUBMIT_FENCE. To submit
multiple contexts in parallel with the GuC the context must be explicitly
registered with N contexts and all N contexts must be submitted in a single
command to the GuC. The GuC interfaces do not support dynamically changing
between N contexts as the bonding uAPI does. Hence the need for a new parallel
submission interface. Also the legacy bonding uAPI is quite confusing and not
intuitive at all. Furthermore I915_SUBMIT_FENCE is by design a future fence, so
not really something we should continue to support.

The new parallel submission uAPI consists of 3 parts:

* Export engines logical mapping
* A 'set_parallel' extension to configure contexts for parallel
  submission
* Extend execbuf2 IOCTL to support submitting N BBs in a single IOCTL

Export engines logical mapping
------------------------------
Certain use cases require BBs to be placed on engine instances in logical order
(e.g. split-frame on gen11+). The logical mapping of engine instances can change
based on fusing. Rather than making UMDs aware of fusing, simply expose the
logical mapping with the existing query engine info IOCTL. Also the GuC
submission interface currently only supports submitting multiple contexts to
engines in logical order, which is a new requirement compared to execlists.
Lastly, all current platforms have at most 2 engine instances and the logical
order is the same as uAPI order. This will change on platforms with more than 2
engine instances.

A single bit will be added to drm_i915_engine_info.flags indicating that the
logical instance has been returned and a new field,
drm_i915_engine_info.logical_instance, returns the logical instance.

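A UMD could consume the flag bit and the new field roughly as sketched below.
The struct is a simplified stand-in for drm_i915_engine_info, not its real
layout, and the flag name, its bit position and both helper names are
hypothetical. Treating a missing flag as "logical instance unknown" is likewise
an assumption of this example.

.. code-block:: c

	#include <stddef.h>
	#include <stdint.h>

	/* Simplified stand-in for drm_i915_engine_info; not the real layout. */
	struct sketch_engine_info {
		uint16_t engine_class;
		uint16_t engine_instance;	/* uAPI instance */
		uint64_t flags;
		uint16_t logical_instance;	/* valid only if the flag bit is set */
	};

	/* Hypothetical name and bit for "logical instance has been returned". */
	#define SKETCH_ENGINE_INFO_HAS_LOGICAL_INSTANCE (1ull << 0)

	/* Return the logical instance, or -1 if the kernel did not report one. */
	static int sketch_logical_instance(const struct sketch_engine_info *info)
	{
		if (!(info->flags & SKETCH_ENGINE_INFO_HAS_LOGICAL_INSTANCE))
			return -1;
		return info->logical_instance;
	}

	/* Find the engine of a class that sits at a given logical position. */
	static const struct sketch_engine_info *
	sketch_find_logical(const struct sketch_engine_info *engines, size_t count,
			    uint16_t engine_class, int logical)
	{
		size_t i;

		for (i = 0; i < count; i++) {
			if (engines[i].engine_class != engine_class)
				continue;
			if (sketch_logical_instance(&engines[i]) == logical)
				return &engines[i];
		}
		return NULL;
	}
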
A 'set_parallel' extension to configure contexts for parallel submission
------------------------------------------------------------------------
The 'set_parallel' extension configures a slot for parallel submission of N BBs.
It is a setup step that must be called before using any of the contexts. See
I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE or I915_CONTEXT_ENGINES_EXT_BOND for
similar existing examples. Once a slot is configured for parallel submission the
execbuf2 IOCTL can be called submitting N BBs in a single IOCTL. Initially this
is only supported with GuC submission. Execlists support can be added later if
needed.

Add I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT and
drm_i915_context_engines_parallel_submit to the uAPI to implement this
extension.

.. c:namespace-push:: rfc

.. kernel-doc:: include/uapi/drm/i915_drm.h
        :functions: i915_context_engines_parallel_submit

.. c:namespace-pop::

Extend execbuf2 IOCTL to support submitting N BBs in a single IOCTL
-------------------------------------------------------------------
Contexts that have been configured with the 'set_parallel' extension can only
submit N BBs in a single execbuf2 IOCTL. The BBs are either the last N objects
in the drm_i915_gem_exec_object2 list or the first N if I915_EXEC_BATCH_FIRST is
set. The number of BBs is implicit based on the slot submitted and how it has
been configured by 'set_parallel' or other extensions. No uAPI changes are
required to the execbuf2 IOCTL.
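
The placement rule for the N BBs can be sketched as a small index calculation.
This is only an illustration of the rule as described above:
SKETCH_EXEC_BATCH_FIRST is a stand-in for I915_EXEC_BATCH_FIRST with an
arbitrary bit value, and the function name and the -1 error convention are
hypothetical.

.. code-block:: c

	#include <stdint.h>

	/* Stand-in for I915_EXEC_BATCH_FIRST; the bit value here is arbitrary. */
	#define SKETCH_EXEC_BATCH_FIRST (1ull << 0)

	/*
	 * Return the index of the first BB in the exec object list for a slot
	 * configured with 'width' BBs, or -1 if the list cannot hold them.
	 * The BBs then occupy indices [first, first + width).
	 */
	static int sketch_first_batch_index(uint32_t buffer_count, uint32_t width,
					    uint64_t flags)
	{
		if (width == 0 || width > buffer_count)
			return -1;

		/* Batch-first: the BBs are the first N objects in the list. */
		if (flags & SKETCH_EXEC_BATCH_FIRST)
			return 0;

		/* Default: the BBs are the last N objects in the list. */
		return (int)(buffer_count - width);
	}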