3ced71a8 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xsk: add XSkFQ refill and XSk wakeup helpers
XSkFQ refill is pretty generic across the drivers minus FQ descriptor filling and can easily be unified with one inline callback. XSk wakeup is usually not, but here, instead of the commonly used "SW interrupts", I picked firing an IPI. In most tests, it showed better performance; it also gives userspace better control over which CPU will handle the xmit, as SW interrupts honor IRQ affinity no matter which core produces XSk xmit descs (while XDPSQs are associated 1:1 with cores having the same ID).
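A minimal sketch of the IPI variant, assuming a hypothetical my_xdpsq with an embedded call_single_data_t and a 1:1 queue:core mapping; only INIT_CSD(), smp_call_function_single_async() and napi_schedule() are the real kernel API:

static void my_xsk_ipi(void *data)
{
	struct my_xdpsq *sq = data;

	/* runs on the target core: schedule the NAPI doing XSk xmit */
	napi_schedule(sq->napi);
}

/* at queue init */
INIT_CSD(&sq->csd, my_xsk_ipi, sq);

/* in .ndo_xsk_wakeup(): fire an IPI on the core whose ID matches the
 * XDPSQ's, instead of raising a SW interrupt
 */
smp_call_function_single_async(sq->idx, &sq->csd);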
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
5495c58c | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xsk: add XSk Rx processing support
Add XSk counterparts for preparing an XSk &libeth_xdp_buff (adding head and frags), running the program, and handling the verdict, incl. XDP_PASS. Shortcuts in comparison with regular Rx: frags and all verdicts except XDP_REDIRECT are under unlikely() and out of line; there are no checks for XDP program presence as it's always true for XSk.
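A sketch of that shape with hypothetical names (my_rq, my_xsk_run_slow(), the ->base field assumed to hold the embedded &xdp_buff); bpf_prog_run_xdp() and xdp_do_redirect() are the real kernel API:

static __always_inline u32 my_xsk_run_prog(struct my_rq *rq,
					   struct libeth_xdp_buff *xdp)
{
	u32 act;

	act = bpf_prog_run_xdp(rq->xdp_prog, &xdp->base);
	if (likely(act == XDP_REDIRECT) &&
	    likely(!xdp_do_redirect(rq->dev, &xdp->base, rq->xdp_prog)))
		return XDP_REDIRECT;

	/* everything else, incl. XDP_PASS, goes out of line */
	return my_xsk_run_slow(rq, xdp, act);
}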
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # optimizations
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
40e846d1 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xsk: add XSk xmit functions
Reuse the core sending functions to send XSk xmit frames. Both metadata and non-metadata pools/drivers are supported. libeth_xdp also provides generic XSk metadata ops, currently with checksum offload only, for cases when the HW doesn't require supplying L3/L4 checksum offsets. Drivers are free to pass their own ops. &libeth_xdp_tx_bulk is not used here as it would be redundant; pool->tx_descs are accessed directly.

Fake "libeth_xsktmo" is needed to hide implementation details from the drivers when they want to use the generic ops: the original struct is defined in the same file where dev->xsk_tx_metadata_ops gets set, to avoid duplicating the slowpath; at the same time, XSk xmit functions use a local "fast" copy to inline the XMO callbacks. The Tx descriptor filling loop is unrolled by 8.
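A sketch of the unrolled fill, assuming the budget is aligned so that the batch size stays a multiple of 8; fill_desc() stands in for the driver's inline callback, while xsk_tx_peek_release_desc_batch() and pool->tx_descs are the real XSk core API:

n = xsk_tx_peek_release_desc_batch(pool, budget);

for (i = 0; i < n; i += 8) {
	fill_desc(sq, &pool->tx_descs[i + 0]);
	fill_desc(sq, &pool->tx_descs[i + 1]);
	/* ... six more, up to tx_descs[i + 7] ... */
	fill_desc(sq, &pool->tx_descs[i + 7]);
}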
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # optimizations
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
b3ad8450 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xsk: add XSk XDP_TX sending helpers
Add XSk counterparts for XDP_TX buffer sending and completion. The same base structures and functions are used from the libeth_xdp core, with adjustments for the fact that XSk Rx always operates on &xdp_buff_xsk for both head and frags. And unlike regular Rx, here unlikely() is used for frags, as header split gives no benefits for XSk Rx, at least for now.
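&xdp_buff_xsk embeds the &xdp_buff as its first member (named `xdp`), so the conversion such helpers rely on is a plain container_of(); a sketch with a hypothetical wrapper name:

static struct xdp_buff_xsk *my_xsk_buff(struct xdp_buff *xdp)
{
	return container_of(xdp, struct xdp_buff_xsk, xdp);
}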
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
576cc5c1 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add RSS hash hint and XDP features setup helpers
End the XDP section by adding helpers to set up XDP features, flip .ndo_xdp_xmit() support at runtime (in case it's not always on), and calculate the queue clean/refill threshold.
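The feature part wraps the core netdev XDP feature API; a hedged sketch of what the setup can boil down to (the exact flag set here is an example, not libeth's choice):

xdp_set_features_flag(dev, NETDEV_XDP_ACT_BASIC |
			   NETDEV_XDP_ACT_REDIRECT |
			   NETDEV_XDP_ACT_XSK_ZEROCOPY);

/* flip .ndo_xdp_xmit() support at runtime, e.g. once XDPSQs exist */
xdp_features_set_redirect_target(dev, false);
/* ...and back off when they're destroyed */
xdp_features_clear_redirect_target(dev);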
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4c805f7a | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add XDP prog run and verdict result handling
Running a prog and handling the verdicts, up to napi_gro_receive(), is also pretty generic code, not really differing between vendors (except for Tx descriptor filling and Rx descriptor parsing).
Define a couple of inlines to do that. The inline callbacks a driver needs to pass are mentioned above: Tx descriptor filling for XDP_TX, populating an skb with the descriptor data for XDP_PASS, and finalizing XDPSQs after the polling loop for XDP_TX (kicking the HW to start sending). The populate callback receives only a &libeth_xdp_buff, assuming the buff::desc pointer is enough; plus, you can always get the corresponding Rx queue structure via container_of(buff::rxq). If not, a driver can extend the buff with more fields directly on the stack without touching the libeth_xdp definitions.
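A sketch of the populate contract with hypothetical driver types and a hypothetical signature; only the container_of() trick named above is assumed:

static bool my_xdp_populate_skb(struct sk_buff *skb,
				const struct libeth_xdp_buff *xdp)
{
	struct my_rx_queue *rxq;

	rxq = container_of(xdp->base.rxq, typeof(*rxq), xdp_rxq);

	/* xdp->desc points at the Rx descriptor: fill hash, csum etc. */
	return my_parse_rx_desc(rxq, skb, xdp->desc);
}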
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3ef2b019 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add helpers for preparing/processing &libeth_xdp_buff
Add convenience helpers to build an &xdp_buff. This means: general initialization before the NAPI loop, adding head, adding frags etc. libeth_xdp_process_buff() is the same as what everybody has in their drivers:
dma_sync_for_cpu();
if (!frag) { add_head(); prefetch(); } else { add_frag(); }
Note that I don't use net_prefetch(), sticking to the original prefetch(). In none of my tests did prefetching 128 bytes yield better perf than 64 bytes. That might differ if the headers are huge enough, but then additional tunneling etc. overhead takes place and you won't win a lot either way.
&libeth_xdp_stash is for cases when you exit the polling loop without finishing building the buff. If that happens, you need to store the buffer in the queue structure until the next loop and then restore it. It makes no sense to place a whole &xdp_buff there. Define a minimal structure which stores only the fields essential to restore it. I was able to pack it into 16 bytes, which is only 8 bytes bigger than `struct sk_buff *skb` on x86_64.
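One way to pack such a stash into 16 bytes on x86_64 is bitfields for the fields that don't need their full width; a sketch, the actual &libeth_xdp_stash layout may differ:

struct my_xdp_stash {
	void	*data;		/* xdp_buff::data */
	u16	headroom;	/* data - data_hard_start */
	u16	len;		/* data_end - data */
	u32	frame_sz:24;	/* xdp_buff::frame_sz */
	u32	flags:8;	/* xdp_buff::flags */
};
static_assert(sizeof(struct my_xdp_stash) <= 16);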
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
819bbaef | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add XDPSQ cleanup timers
When XDP Tx queues are not interrupt-driven but use lazy cleaning, i.e. cleaning only when there are fewer than `threshold` free descriptors left, we also need cleanup timers to avoid &xdp_buff and &xdp_frame entries stalling for too long, especially with Page Pool (which warns about inflight pages every 60 seconds). Let's say we sent 256 frames and don't need to send more, but we clean only when the number of pending items >= 384. In that case, those 256 will stall until 128 more are sent.

For this, add simple helpers to run a timer which will clean the queue regardless, 1 second after the last send. The timer is rearmed when finalizing the queue, so as long as there is regular active traffic, it never fires.
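A sketch of such a timer with hypothetical queue/helper names; timer_setup(), from_timer() and mod_timer() are the real timer API:

static void my_xdpsq_timer(struct timer_list *t)
{
	struct my_xdpsq *sq = from_timer(sq, t, timer);

	my_xdpsq_clean(sq);	/* complete whatever is still pending */
}

/* at queue init */
timer_setup(&sq->timer, my_xdpsq_timer, 0);

/* when finalizing the queue: as long as sends keep coming, the
 * deadline keeps being pushed back and the timer never fires
 */
mod_timer(&sq->timer, jiffies + HZ);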
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
c4ba6a9b | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add XDPSQ locking helpers
Unfortunately, it's not always possible to allocate max(num_rxqs, nr_cpu_ids) XDPSQs even on high-end NICs. To mitigate this, add simple locking helpers to libeth_xdp. As long as XDPSQs are not shared, the whole functionality is gated behind a static key. Otherwise, each bulk flush locks the queue for the time of cleaning and filling the descriptors. As long as a particular queue is not used by more than 1 CPU, the impact is minimal (a runtime boolean check twice per 16+ descriptors).
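A sketch of the gating with hypothetical names; DEFINE_STATIC_KEY_FALSE() and static_branch_unlikely() are the real static-key API:

static DEFINE_STATIC_KEY_FALSE(my_xdpsq_share);

static inline void my_xdpsq_lock(struct my_xdpsq *sq)
{
	/* compiles to a NOP until some device actually shares XDPSQs */
	if (static_branch_unlikely(&my_xdpsq_share) && sq->shared)
		spin_lock(&sq->lock);
}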
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # static key
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
26ce8eb0 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add XDPSQE completion helpers
Similarly to libeth_tx_complete(), add libeth_xdp_complete_tx() to handle XDP_TX and xmit buffers. Both use bulk return under the hood.
Also add an out-of-line libeth_tx_complete_any(), which handles both regular and XDP frames (if libeth_xdp is loaded), for example to call on queue destroy, where we don't need inlining, just convenience.
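A sketch of what such a dispatch can look like, assuming the SQE type enum keeps the XDP types after the skb ones (the ordering, names and signature here are illustrative):

static void my_tx_complete_any(struct libeth_sqe *sqe,
			       const struct libeth_cq_pp *cp)
{
	if (sqe->type >= LIBETH_SQE_XDP_TX)	/* assumed enum ordering */
		libeth_xdp_complete_tx(sqe, cp);
	else
		libeth_tx_complete(sqe, cp);
}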
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
084ceda7 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add .ndo_xdp_xmit() helpers
Add helpers for implementing .ndo_xdp_xmit(). Same as for XDP_TX, accumulate up to 16 DMA-mapped frames on the stack, then flush. If DMA mapping fails for some reason, don't try mapping further frames, but still flush what was already prepared. The DMA address of a head frame is stored in its headroom, assuming it has enough of it for an 8 (or 4) byte value. In addition to the @prep and @xmit driver callbacks used for XDP_TX, xmit also needs @finalize to kick the XDPSQ after filling.
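A sketch of the mapping step, assuming `dev` is the struct device used for DMA and the frame has at least sizeof(dma_addr_t) bytes of headroom:

dma_addr_t dma;

dma = dma_map_single(dev, xdpf->data, xdpf->len, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
	return false;	/* stop mapping, but still flush what's prepared */

/* stash the address in the headroom, right before the data */
*((dma_addr_t *)xdpf->data - 1) = dma;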
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
8591c3af | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: xdp: add XDP_TX buffers sending
Start adding XDP-specific code to libeth, namely handling XDP_TX buffers (only sending). The idea is that we accumulate up to 16 buffers on the stack, then, if either the limit is reached or the polling is finished, flush them at once with only one XDPSQ cleaning (if needed). The main sending function will be aware of the sending budget and already have all the info to send the buffers, so it can't fail.

Drivers need to provide 2 inline callbacks to the main sending function: one for cleaning an XDPSQ and one for filling descriptors; the library code takes care of the rest. Note that, unlike the generic code, multi-buffer support here is not wrapped in unlikely(), to not hurt header split setups.
&libeth_xdp_buff is a simple extension over &xdp_buff which has a direct pointer to the corresponding Rx descriptor (and is, luckily, precisely 1 cacheline in size with 16-byte alignment on x86_64).
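A sketch of the accumulate-then-flush idea with hypothetical bulk/helper names; the on-stack capacity of 16 is the one described above:

bq->bufs[bq->count++] = *xdp;
if (unlikely(bq->count == 16))
	my_xdp_flush_bulk(bq);	/* clean the XDPSQ if short, then fill */

/* ...and once more when the polling loop is finished */
my_xdp_finalize(bq);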
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # xmit logic
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
35c64b65 | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: support native XDP and register memory model
Expand libeth's Page Pool functionality by adding native XDP support. This means picking the appropriate headroom and DMA direction. Also, register all the created &page_pools as XDP memory models. A driver can then call xdp_rxq_info_attach_page_pool() when registering its RxQ info.
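A sketch of the driver side, with hypothetical rxq/fq fields; xdp_rxq_info_reg() and xdp_rxq_info_attach_page_pool() are the real core API:

err = xdp_rxq_info_reg(&rxq->xdp_rxq, dev, rxq->idx, napi_id);
if (err)
	return err;

/* the PP was already registered as an XDP memory model by libeth */
xdp_rxq_info_attach_page_pool(&rxq->xdp_rxq, fq->pp);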
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
6ad5ff6e | 12-Jun-2025 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: convert to netmem
Back when the libeth Rx core was initially written, devmem was a draft and netmem_ref didn't exist in the mainline. Now that it's here, make libeth MP-agnostic before introducing any new code or any new library users. When it's known that the created PP/FQ is for header buffers, use the faster "unsafe" underscored netmem <--> virt accessors, as netmem_is_net_iov() is always false in that case but still consumes some cycles (a bit test + a branch).
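A sketch of the distinction, assuming the underscored accessor from include/net/netmem.h:

/* header FQ: netmem is always host memory here, skip the check */
addr = __netmem_address(netmem);

/* payload FQ: may be a net_iov, use the checked accessor */
addr = netmem_address(netmem);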
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
e6c91556 | 18-Apr-2024 |
Alexander Lobakin <aleksander.lobakin@intel.com> |
libeth: add Rx buffer management
Add a couple of intuitive helpers to hide the Rx buffer implementation details in the library instead of duplicating them between drivers. The settings are sorta optimized for 100G+ NICs, but nothing is really HW-specific here.

Use the new page_pool_dev_alloc() to dynamically switch between split-page and full-page modes depending on MTU, page size, required headroom etc. For example, on x86_64 with the default driver settings, each page is shared between 2 buffers. Turning on XDP (not in this series) increases the headroom requirement, which pushes truesize out of the 2048 boundary, so each buffer starts getting a full page. The "ceiling" limit is %PAGE_SIZE, as only order-0 pages are used to avoid compound overhead. For the above architecture, this means a maximum linear frame size of 3712 w/o XDP.

Note that &libeth_buf_queue is not a complete queue/ring structure for now, rather a shim, but eventually the libeth-enabled drivers will move to it, with iavf being the first one.
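A sketch of the allocation side, with hypothetical fq fields; page_pool_dev_alloc() is the real core helper:

unsigned int offset, size = fq->buf_len;
struct page *page;

/* returns either a page fragment or a whole page depending on
 * truesize; the caller doesn't need to know which mode is active
 */
page = page_pool_dev_alloc(fq->pp, &offset, &size);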
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>