#
1fe7cd02 |
| 21-Jul-2024 |
Vladimir Kondratyev <wulf@FreeBSD.org> |
LinuxKPI: Remove owner argument from class_create function on KBI layer
To chase Linux 6.4
Sponsored by: Serenity Cyber Security, LLC Differential Revision: https://reviews.freebsd.org/D45849
|
#
19782e5b |
| 12-Jun-2024 |
Andrew Turner <andrew@FreeBSD.org> |
ibcore: Mark write-only variables
Some LinuxKPI lock macros need a flags field passed in. This is written to but never read from, so gcc complains.
Fix this by marking the flags variables as unused to quieten the compiler.
Reviewed by: brooks (earlier version), kib Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D45303
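A minimal sketch of the pattern this commit describes, assuming a hypothetical lock macro (DEMO_LOCK_IRQSAVE) that stores into its flags argument without anything reading the value back. The names below are illustrative and are not the LinuxKPI macros; DEMO_UNUSED stands in for the kind of "unused" annotation the commit applies.

```c
#include <assert.h>

/* Hypothetical stand-in for a lock; not a LinuxKPI type. */
struct demo_lock {
	int held;
};

/* Hypothetical macros: the lock macro writes `flags`, nothing reads it. */
#define DEMO_LOCK_IRQSAVE(l, flags)	do { (flags) = 0; (l)->held = 1; } while (0)
#define DEMO_UNLOCK_IRQRESTORE(l, flags)	do { (l)->held = 0; } while (0)

/* Illustrative annotation telling gcc/clang the variable may go unread. */
#define DEMO_UNUSED	__attribute__((__unused__))

/* Bumps *counter under the lock and returns the lock state afterwards. */
int
demo_critical_section(struct demo_lock *l, int *counter)
{
	unsigned long flags DEMO_UNUSED;	/* written by the macro, never read */

	DEMO_LOCK_IRQSAVE(l, flags);
	assert(l->held);
	(*counter)++;
	DEMO_UNLOCK_IRQRESTORE(l, flags);
	return (l->held);
}
```

Without the annotation, a compiler run with -Wunused-but-set-variable may warn about flags; with it, the write-only use is declared intentional.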
|
Revision tags: release/14.1.0, release/13.3.0, release/14.0.0 |
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
Revision tags: release/13.2.0 |
|
#
3e142e07 |
| 09-Feb-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
ofed: Mechanically convert to IfAPI
Summary: Because of the intricacies of this code it wasn't purely scripted, but instead hand-mechanical.
Reviewed by: hselasky Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D38560
|
Revision tags: release/12.4.0 |
|
#
c4a41550 |
| 30-May-2022 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
The algorithm pre-allocates a cm_id since allocation cannot be done while holding the cm.lock spinlock, however it doesn't free it on one error path, leading to a memory leak.
Linux commit: c14dfddbd869bf0c2bafb7ef260c41d9cebbcfec
PR: 264248 MFC after: 1 week Sponsored by: NVIDIA Networking
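The leak described above can be sketched as follows. This is only an illustration under stated assumptions: demo_insert_listen(), demo_id_alloc() and demo_id_destroy() are hypothetical stand-ins, the spinlock is modelled by a plain flag, and the table holds a single listener. The point is that the object allocated before taking the lock is released on every path that does not hand it over.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_id {
	int service;
};

struct demo_table {
	struct demo_id *listener;	/* at most one listener in this sketch */
};

static int demo_tbl_lock_held;	/* stand-in for the spinlock */
static int demo_live_ids;	/* outstanding allocations, for leak checks */

static struct demo_id *
demo_id_alloc(int service)
{
	struct demo_id *id;

	assert(!demo_tbl_lock_held);	/* must not allocate under the lock */
	id = malloc(sizeof(*id));
	if (id != NULL) {
		id->service = service;
		demo_live_ids++;
	}
	return (id);
}

static void
demo_id_destroy(struct demo_id *id)
{
	if (id != NULL) {
		free(id);
		demo_live_ids--;
	}
}

/* Returns 0 on success, -1 if allocation fails or a listener exists. */
int
demo_insert_listen(struct demo_table *t, int service)
{
	struct demo_id *prealloc = demo_id_alloc(service);	/* before locking */
	int err = 0;

	if (prealloc == NULL)
		return (-1);

	demo_tbl_lock_held = 1;
	if (t->listener != NULL) {
		err = -1;		/* error path: prealloc is still ours */
	} else {
		t->listener = prealloc;
		prealloc = NULL;	/* ownership moved to the table */
	}
	demo_tbl_lock_held = 0;

	demo_id_destroy(prealloc);	/* frees on the error path, no-op otherwise */
	return (err);
}
```

Clearing the local pointer when ownership moves lets one unconditional destroy call after the unlock cover both outcomes.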
|
Revision tags: release/13.1.0, release/12.3.0 |
|
#
b633e08c |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Kernel space update based on Linux 5.7-rc1.
Overview:
This is the first stage of a RDMA stack upgrade introducing kernel changes only based on Linux 5.7-rc1.
This patch is based on about four main areas of work:
- Update of the IB uobjects system:
  - The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects is now managed by ibcore. This also requires some changes in the kernel verbs API. The updated verbs changes are typically about initializing and deinitializing objects, and removing the allocation and freeing of memory.
- Update of the uverbs IOCTL framework:
  - The parsing and handling of user-space commands has been completely refactored to integrate with the updated IB uobjects system.
- Various changes and updates to the generic uverbs interfaces in device drivers including the new uAPI surface.
- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
Dependencies:
- The mlx4ib driver code has been updated with the minimum changes needed.
- The mlx5ib driver code has been updated with the minimum changes needed including DV support.
Compatibility:
- All user-space facing APIs are backwards compatible after this change.
- All kernel-space facing RDMA APIs are backwards compatible after this change, with the exception of ib_create_ah() and ib_destroy_ah(), which take a new flag.
- The "ib_device_ops" structure exists, but only contains the driver ID and some structure sizes.
Differences from Linux:
- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set the sizes needed for allocating various IB objects, when adding IB device instances.
Security:
- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.
Based on upstream Linux, Torvalds (5.7-rc1): 8632e9b5645bbc2331d21d892b0d6961c1a08429
MFC after: 1 week Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D31149 Sponsored by: NVIDIA Networking
|
#
12a913d2 |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Issue DREQ when receiving REQ/REP for stale QP.
From "InfiniBand Architecture Specifications Volume 1":
A QP is said to have a stale connection when only one side has connection information. A stale connection may result if the remote CM had dropped the connection and sent a DREQ but the DREQ was never received by the local CM. Alternatively the remote CM may have lost all record of past connections because its node crashed and rebooted, while the local CM did not become aware of the remote node's reboot and therefore did not clean up stale connections.
And:
A local CM may receive a REQ/REP for a stale connection. It shall abort the connection issuing REJ to the REQ/REP. It shall then issue DREQ with "DREQ:remote QPN" set to the remote QPN from the REQ/REP.
This patch solves a problem with reuse of QPN. The current codebase, that is IPoIB, relies on a REAP mechanism to clean up the structures in CM. A problem with this is the time constants governing this mechanism; they are up to 768 seconds, and the interface may look unresponsive in that period. Issuing a DREQ (and receiving a DREP) does the necessary cleanup and the interface comes up.
Linux commit: 9315bc9a133011fdb084f2626b86db3ebb64661f
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
|
#
2f054187 |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Fix memory leak in cm_req_handler error flows.
In the cm_req_handler() error flows, sometimes cm_id_priv->timewait_info isn't free'd.
Linux commit: 8b00914654ef56ff5473f4fe1f1168254dbb8a17
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
|
#
f48e85df |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Move debug counters to be under relevant IB device
The sysfs layout created by CM incorrectly presented RDMA devices with InfiniBand link layer. Layout of such devices represents device tree of connections. By moving CM statistics to be under relevant port of IB device, we will fix the following issues:
* Symlink name - It used device name instead of specific identifier.
* Target location - It was supposed to point to PCI-ID/infiniband_cm/ instead of PCI-ID/infiniband/
* Target name - It created extra device file under already existing device folder, e.g. mlx5_0/mlx5_0
* Crash during boot with RDMA persistent naming patches.
  sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0'
  CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178
  Call Trace:
   dump_stack+0xcc/0x180
   sysfs_warn_dup.cold.3+0x17/0x2d
   sysfs_do_create_link_sd.isra.2+0xd0/0xf0
   device_add+0x7cb/0x1450
   device_create_groups_vargs+0x1ae/0x220
   device_create+0x93/0xc0
   cm_add_one+0x38f/0xf60 [ib_cm]
   add_client_context+0x167/0x210 [ib_core]
   enable_device_and_get+0x230/0x3f0 [ib_core]
   ib_register_device+0x823/0xbf0 [ib_core]
   __mlx5_ib_add+0x45/0x150 [mlx5_ib]
   mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib]
   mlx5_add_device+0x130/0x3a0 [mlx5_core]
   mlx5_register_interface+0x1a9/0x270 [mlx5_core]
   do_one_initcall+0x14f/0x5de
   do_init_module+0x247/0x7c0
   load_module+0x4c2f/0x60d0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
After this change:
  [leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/
  drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_duplicates
  drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_rx_msgs
  drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_msgs
  drwxr-xr-x 2 root root 0 Mar 11 11:17 cm_tx_retries
Linux commit: c87e65cfb97c7f325132a68288ed76ba7bdcd2c6
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
|
#
8d04583d |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Fix memory leak in cm_add/remove_one.
In the process of moving the debug counters sysfs entries, the commit mentioned below eliminated the cm_infiniband sysfs directory.
This sysfs directory was tied to the cm_port object allocated in procedure cm_add_one().
Before the commit below, this cm_port object was freed via a call to kobject_put(port->kobj) in procedure cm_remove_port_fs().
Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated. This, however, meant that kfree was never called for the cm_port buffers.
Fix this by adding explicit kfree(port) calls to functions cm_add_one() and cm_remove_one().
Note that the kfree call in the first chunk below, in the cm_add_one error flow, fixes an old, undetected memory leak.
Linux commit: 94635c36f3854934a46d9e812e028d4721bbb0e6
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
|
#
0f2e5b43 |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Block processing of alternate path handling in RoCE RX CM messages.
Due to the below reasons, it is better to not support alternate path receive messages for RoCE in near term.
1. Alternate path for RoCE is not supported at rdmacm layer.
2. It is not supported in uverbs/core layer for RoCE.
3. Alternate path for IPv6 for link local address cannot resolve route deterministically without a valid incoming interface ID, whose use case makes sense only with dual port mode.
4. init_av_from_path while processing LAP messages for IB and RoCE can lead to adding a duplicate entry of AV into the port list, which leads to list corruption.
5. rdma-core, a well-known userspace implementation, has removed support for libucm, which uses the ucm.ko module, the only module that can trigger alternate path related messages.
6. The ucm kernel module is requested to be removed from the IB core in the following patch: https://patchwork.kernel.org/patch/10268503/ .
Linux commit: 97c45c2c28cd291e06778d9d36a0f60ee74726bc
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
|
#
bf5075e4 |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Store and restore ah_attr during LAP msg processing.
During CM LAP processing, ah_attr is reinitialized on receiving a LAP request; it was most likely first initialized during CM request processing.
ah_attr might get zeroed out if LAP processing fails. Therefore, try to create a new ah_attr for the LAP message. If the initialization fails, continue with older ah_attr. If the initialization passes, consider the new ah_attr by overwriting the older one.
Linux commit: 0e225dcb7681c0a8e52fb9dc68bd8ab973de4ca2
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
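The store-and-restore approach can be sketched roughly as below, assuming a hypothetical attribute type (struct demo_ah_attr) and a hypothetical initializer (demo_init_attr()) that zeroes its output before it can fail. None of these names come from ibcore; the sketch only shows building the new value in a temporary and overwriting the old one on success.

```c
#include <assert.h>

/* Hypothetical, simplified attribute; not the ibcore structure. */
struct demo_ah_attr {
	int dlid;
	int sl;
};

/*
 * Hypothetical initializer: clears *out first, then fails with -1 for an
 * invalid path, leaving *out zeroed.
 */
static int
demo_init_attr(int path_dlid, struct demo_ah_attr *out)
{
	out->dlid = 0;
	out->sl = 0;
	if (path_dlid <= 0)
		return (-1);
	out->dlid = path_dlid;
	out->sl = 1;
	return (0);
}

/* Builds the new attribute on the side; commits it only on success. */
int
demo_handle_lap(struct demo_ah_attr *current, int path_dlid)
{
	struct demo_ah_attr tmp;
	int err;

	err = demo_init_attr(path_dlid, &tmp);
	if (err == 0)
		*current = tmp;	/* success: overwrite the older attribute */
	/* failure: *current is untouched and processing can continue */
	return (err);
}
```

Initializing into the live attribute directly would have left it zeroed after a failed attempt, which is the situation the commit message describes.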
|
#
e25bcf8d |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Add rdma_reject_msg() helper function.
rdma_reject_msg() returns a pointer to a string message associated with the transport reject reason codes.
Linux commit: 77a5db13153906a7e00740b10b2730e53385c5a8
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
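A helper of this kind is commonly a table lookup from reason code to string. The sketch below illustrates that shape only: the reason codes, the strings and the name demo_reject_msg() are invented for the example and do not reflect the actual rdma_reject_msg() tables or signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reason codes; the real transport codes differ. */
enum demo_reject_reason {
	DEMO_REJ_NO_QP = 1,
	DEMO_REJ_STALE_CONN = 2,
	DEMO_REJ_TIMEOUT = 3
};

/* Sparse table indexed by reason code; unlisted slots stay NULL. */
static const char *const demo_reject_text[] = {
	[DEMO_REJ_NO_QP] = "no QP available",
	[DEMO_REJ_STALE_CONN] = "stale connection",
	[DEMO_REJ_TIMEOUT] = "timeout",
};

/* Always returns a printable string, even for unknown codes. */
const char *
demo_reject_msg(int reason)
{
	size_t n = sizeof(demo_reject_text) / sizeof(demo_reject_text[0]);

	if (reason >= 0 && (size_t)reason < n &&
	    demo_reject_text[reason] != NULL)
		return (demo_reject_text[reason]);
	return ("unrecognized reason");
}
```

Returning a fallback string instead of NULL lets callers pass the result straight to a logging call.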
|
#
315627b7 |
| 16-Jun-2021 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
ibcore: Remove unused and erroneous msg sequence encoding.
In cm_form_tid(), a two bit message sequence number is OR'ed into bit 31-30 of the lower TID value.
After Linux commit f06d26537559 ("IB/cm: Randomize starting comm ID"), the local_id is XOR'ed with a 32-bit random value. Hence, bits 31-30 in the lower TID now have an arbitrary value and it makes no sense to OR in the message sequence number.
Adding to that, the evolution in use of IDR routines in cm_alloc_id() has always had the possibility of returning a value with bit 30 set.
In addition, said bits are never checked.
Hence, remove the encoding and the corresponding enum.
Linux commit: 87a37ce9e400e40daee537ff95343e3c94743c6d
MFC after: 1 week Reviewed by: kib Sponsored by: Mellanox Technologies // NVIDIA Networking
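The bit arithmetic behind this reasoning can be illustrated with a small sketch. The function names and the shift constant are assumptions made for the example; only the two facts stated above are modelled: a two-bit sequence number OR'ed into bits 31-30, and a local ID that has been XOR'ed with a 32-bit random value.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SEQ_SHIFT	30	/* sequence number occupies bits 31-30 */

/* OR a two-bit sequence number into the top bits of the lower TID. */
uint32_t
demo_low_tid_with_seq(uint32_t local_id, uint32_t seq)
{
	return (local_id | ((seq & 0x3u) << DEMO_SEQ_SHIFT));
}

/* Read back whatever sits in bits 31-30. */
uint32_t
demo_extract_seq(uint32_t low_tid)
{
	return (low_tid >> DEMO_SEQ_SHIFT);
}

/* Model of ID randomization: XOR with a 32-bit random operand. */
uint32_t
demo_randomized_id(uint32_t id, uint32_t random_operand)
{
	return (id ^ random_operand);
}
```

With a small ID such as 5, the top bits start out clear and a sequence number of 2 reads back as 2. Once the ID is XOR'ed with an operand whose top bits are set (0xC0000000 here), OR'ing in sequence number 1 leaves bits 31-30 at 3, so the encoded value can no longer be recovered.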
|
#
1411f52f |
| 04-Jun-2021 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
mlx4/OFED: replace the struct net_device with struct ifnet
Given all the code does operate on struct ifnet, the last step in this longer series of changes now is to rename struct net_device to struct ifnet (that is what it was defined to in the LinuxKPI code). While mlx4 and OFED are "shared" code, the decision was made years ago to not write it based on the netdevice KPI but the native ifnet KPI for most of it. This commit simply spells this out and with that frees "struct netdevice" to be re-done on LinuxKPI to become a more native/mixed implementation over time as needed by, e.g., wireless drivers.
Sponsored by: The FreeBSD Foundation MFC after: 10 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D30515
|
#
825b7d4c |
| 26-May-2021 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
OFED: migrate LinuxKPI net_device/ifnet macros into ofed
The LinuxKPI net_device actually is an ifnet; in order to further clean that up so we can extend "net_device" migrate the few macros left into ofed and make sure the header is included in all files which need access to the macros.
Sponsored by: The FreeBSD Foundation MFC after: 12 days Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D30477
|
Revision tags: release/13.0.0, release/12.2.0 |
|
#
1866c98e |
| 06-Jul-2020 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decides in which order the clients are attached and detached. For example, one IB client may use resources from another IB client. This can lead to a potential deadlock at shutdown. For example, if ipoib is unregistered after the ib_multicast client is detached, then if ipoib is using multicast addresses a deadlock may happen, because ib_multicast will wait for all its resources to be freed before returning from the remove method.
Fix this by using module_xxx_order() instead of module_xxx().
Differential Revision: https://reviews.freebsd.org/D23973 MFC after: 1 week Sponsored by: Mellanox Technologies
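The idea of an explicit attach and detach order can be sketched as follows. This is an illustration only and does not reproduce module_xxx_order(): the struct, the numeric order field and the choice to detach in the reverse of the attach order are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical client record with an explicit ordering key. */
struct demo_client {
	const char *name;
	int order;	/* lower values attach first */
};

static int
demo_cmp_order(const void *a, const void *b)
{
	const struct demo_client *ca = a, *cb = b;

	return ((ca->order > cb->order) - (ca->order < cb->order));
}

/* Sort by the explicit key so link order no longer matters. */
void
demo_sort_clients(struct demo_client *c, size_t n)
{
	assert(c != NULL || n == 0);
	qsort(c, n, sizeof(*c), demo_cmp_order);
}

/* Records attach order: providers first, users last. */
void
demo_attach_all(const struct demo_client *c, size_t n, const char **log)
{
	for (size_t i = 0; i < n; i++)
		log[i] = c[i].name;
}

/* Records detach order: the reverse, so users go before providers. */
void
demo_detach_all(const struct demo_client *c, size_t n, const char **log)
{
	for (size_t i = 0; i < n; i++)
		log[i] = c[n - 1 - i].name;
}
```

In this model a multicast provider with order 1 attaches before an ipoib user with order 2 and detaches after it, which avoids the provider waiting on resources that its user still holds.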
|
Revision tags: release/11.4.0, release/12.1.0, release/11.3.0 |
|
#
67350cb5 |
| 09-Dec-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r340918 through r341763.
|
Revision tags: release/12.0.0 |
|
#
5ace00df |
| 05-Dec-2018 |
Slava Shwartsman <slavash@FreeBSD.org> |
ibcore: Fix sleeping in atomic when RoCE is used
A couple of places in the CM do
    spin_lock_irq(&cm_id_priv->lock);
    ...
    if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
However when the underlying transport is RoCE, this leads to a sleeping function being called with the lock held - the callchain is
    cm_alloc_response_msg()
    -> ib_create_ah_from_wc()
    -> ib_init_ah_from_wc()
    -> rdma_addr_find_l2_eth_by_grh()
    -> rdma_resolve_ip()
and rdma_resolve_ip() starts out by doing
req = kzalloc(sizeof *req, GFP_KERNEL);
not to mention rdma_addr_find_l2_eth_by_grh() doing
wait_for_completion(&ctx.comp);
to wait for the task that rdma_resolve_ip() queues up.
Fix this by moving the AH creation out of the lock.
Linux commit: c76161181193985087cd716fdf69b5cb6cf9ee85
Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies
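The shape of the fix, doing the potentially sleeping work before taking the lock, can be sketched as below. Everything here is a simplified model: the lock is a flag, demo_create_ah() stands in for a call that may sleep and simply refuses to run with the lock held, and none of the names are from ibcore.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_cm_lock_held;	/* stand-in for the spinlock */

static void
demo_cm_lock(void)
{
	assert(!demo_cm_lock_held);
	demo_cm_lock_held = true;
}

static void
demo_cm_unlock(void)
{
	assert(demo_cm_lock_held);
	demo_cm_lock_held = false;
}

/*
 * Stand-in for a call that may sleep. In this model it fails when invoked
 * with the lock held, instead of sleeping in atomic context.
 */
static bool
demo_create_ah(int *ah)
{
	if (demo_cm_lock_held)
		return (false);
	*ah = 42;	/* arbitrary handle value for the sketch */
	return (true);
}

/* Creates the handle first, then takes the lock only to publish it. */
bool
demo_send_response(int *state)
{
	int ah;

	if (!demo_create_ah(&ah))	/* outside the lock */
		return (false);

	demo_cm_lock();
	*state = ah;			/* short, non-sleeping critical section */
	demo_cm_unlock();
	return (true);
}
```

Keeping only the state update inside the critical section means nothing under the lock can block.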
|
#
af609745 |
| 05-Dec-2018 |
Slava Shwartsman <slavash@FreeBSD.org> |
ibcore: Always check return value from ib_init_ah_from_wc().
This prevents code from accepting RoCEv1 connections when only RoCEv2 is enabled and vice versa.
Linux commit: 0c4386ec77cfcd0ccbdbe8c2e67dd3a49b2a4c7f
Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies
|
#
9fc98100 |
| 05-Dec-2018 |
Slava Shwartsman <slavash@FreeBSD.org> |
ibcore: Add missing check for failure.
Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies
|
#
3af64f03 |
| 11-Sep-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r338392 through r338594.
|
#
7877f593 |
| 09-Sep-2018 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Introduce and use sgid_index in CM requests in ibcore.
For RoCE, when CM requests are received for RC and UD connections, netdevice of the incoming request is unavailable. Because of that CM requests are always forwarded to init_net namespace.
Now that we have the GID index available, introduce SGID index in incoming CM requests and refer to the netdevice of it.
While at it fix some incorrect uses of init_net and make sure the rdma_create_id() function stores the VNET it is passed.
Based on linux commit: cee104334c98dd04e9dd4d9a4fa4784f7f6aada9
MFC after: 3 days Approved by: re (gjb) Sponsored by: Mellanox Technologies
|
#
cda1e10c |
| 17-Jul-2018 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Use __FBSDID() for RCS tags in ibcore.
MFC after: 1 week Sponsored by: Mellanox Technologies
|
Revision tags: release/11.2.0 |
|
#
03ae76a6 |
| 07-Mar-2018 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Fix for use-after-free when using delayed work structures in ibcore.
It is not enough to cancel delayed work structures before freeing. Always cancel delayed work synchronously before freeing!
MFC after: 1 week Sponsored by: Mellanox Technologies
|