a1eda741 | 03-May-2024 | John Baldwin <jhb@FreeBSD.org>
nvmf: The in-kernel NVMe over Fabrics host
This is the client (initiator in SCSI terms) for NVMe over Fabrics. Userland is responsible for creating a set of queue pairs and then handing them off via an ioctl to this driver, e.g. via the 'connect' command from nvmecontrol(8). An nvmeX new-bus device is created at the top level to represent the remote controller, similar to the PCI nvmeX devices for PCI-express controllers.
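In rough outline, the userland side of that handoff looks like the sketch below. The control-device path, ioctl request, and structure are hypothetical placeholders rather than the driver's actual interface; in practice nvmecontrol(8) connects the sockets and issues the real ioctl.

    /*
     * Hypothetical sketch only: the device node, ioctl request, and
     * structure names are illustrative placeholders, not the driver's
     * real handoff interface.
     */
    #include <sys/ioctl.h>

    #include <err.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define EXAMPLE_MAX_IO_QUEUES 4

    struct example_handoff {
        int admin_fd;                       /* connected admin queue socket */
        int io_fds[EXAMPLE_MAX_IO_QUEUES];  /* connected I/O queue sockets */
        int num_io_queues;
    };

    #define EXAMPLE_HANDOFF _IOW('n', 100, struct example_handoff)

    /* Hand a set of already-connected queue pair sockets to the kernel. */
    static void
    handoff(struct example_handoff *h)
    {
        int fd;

        fd = open("/dev/nvmf_example", O_RDWR);
        if (fd == -1)
            err(1, "open");
        if (ioctl(fd, EXAMPLE_HANDOFF, h) == -1)
            err(1, "ioctl");
        close(fd);
    }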
As with nvme(4), namespace devices named /dev/nvmeXnsY are created, and pass-through commands can be submitted to either the namespace devices or the controller device. For example, 'nvmecontrol identify nvmeX' works for a remote Fabrics controller the same as for a PCI-express controller.
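For instance, an identify-controller pass-through from userland could look roughly like this, assuming the stock nvme(4) pass-through ioctl (struct nvme_pt_command and NVME_PASSTHROUGH_CMD) keeps its usual shape; the exact field names should be double-checked against <dev/nvme/nvme.h>.

    /*
     * Sketch of an identify-controller pass-through using nvme(4)'s
     * existing ioctl interface; verify the field names against
     * <dev/nvme/nvme.h> before relying on this.
     */
    #include <sys/ioctl.h>
    #include <sys/endian.h>
    #include <dev/nvme/nvme.h>

    #include <err.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static void
    identify_controller(const char *dev, void *buf)
    {
        struct nvme_pt_command pt;
        int fd;

        /* Works for /dev/nvmeX and /dev/nvmeXnsY alike, local or remote. */
        fd = open(dev, O_RDWR);
        if (fd == -1)
            err(1, "open(%s)", dev);

        memset(&pt, 0, sizeof(pt));
        pt.cmd.opc = NVME_OPC_IDENTIFY;
        pt.cmd.cdw10 = htole32(1);      /* CNS 1: identify controller */
        pt.buf = buf;
        pt.len = 4096;                  /* identify data is one 4 KB page */
        pt.is_read = 1;                 /* data flows controller -> host */

        if (ioctl(fd, NVME_PASSTHROUGH_CMD, &pt) == -1)
            err(1, "NVME_PASSTHROUGH_CMD");
        close(fd);
    }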
nvmf exports remote namespaces via nda(4) devices using the new NVMF CAM transport. nvmf does not support nvd(4), only nda(4).
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44714
59144db3 | 03-May-2024 | John Baldwin <jhb@FreeBSD.org>
nvmf_tcp: Add a TCP transport for NVMe over Fabrics
Structurally this is very similar to the TCP transport for iSCSI (icl_soft.c). One key difference is that NVMeoF transports use a more abstract interface that works with NVMe commands rather than transport PDUs. Thus, the data transfer for a given command is managed entirely in the transport backend.
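One way to picture that abstraction is a per-transport table of capsule-level operations; the type and member names below are purely illustrative and are not the driver's actual definitions.

    /*
     * Illustration only: a capsule-oriented transport interface in the
     * spirit described above.  All names here are hypothetical.
     */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct example_qpair;       /* opaque, owned by the transport backend */
    struct example_capsule;     /* NVMe SQE or CQE plus transport state */
    struct mbuf;

    struct example_transport_ops {
        /* Queue pair lifetime. */
        struct example_qpair *(*allocate_qpair)(bool controller, void *params);
        void (*free_qpair)(struct example_qpair *qp);

        /* Upper layers hand over capsules; PDUs never leave the backend. */
        struct example_capsule *(*allocate_capsule)(struct example_qpair *qp);
        void (*free_capsule)(struct example_capsule *nc);
        int (*transmit_capsule)(struct example_capsule *nc);

        /* Data transfer for a command is driven entirely by the backend. */
        int (*receive_data)(struct example_capsule *nc, uint32_t data_offset,
            void *buf, size_t len);
        int (*send_data)(struct example_capsule *nc, uint32_t data_offset,
            struct mbuf *m, size_t len);
    };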
Similar to icl_soft.c, separate kthreads are used to handle transmit and receive for each queue pair. On the transmit side, when a capsule is transmitted by an upper layer, it is placed on a queue for processing by the transmit thread. The transmit thread converts command and response capsules into suitable TCP PDUs, where each PDU is described by an mbuf chain that is then queued to the backing socket's send buffer. Command capsules can embed data along with the NVMe command.
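The transmit side can be pictured with a sketch along these lines. The softc layout and the capsule_to_pdus() helper are invented names for illustration, and the real code manages the socket send buffer more carefully than a plain sosend(); only the queue + condvar + socket shape is the point.

    /*
     * Sketch of the transmit-thread pattern described above.  The softc
     * layout and capsule_to_pdus() are hypothetical.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/condvar.h>
    #include <sys/kthread.h>
    #include <sys/lock.h>
    #include <sys/mbuf.h>
    #include <sys/mutex.h>
    #include <sys/proc.h>
    #include <sys/queue.h>
    #include <sys/socket.h>
    #include <sys/socketvar.h>

    struct xmit_capsule {
        STAILQ_ENTRY(xmit_capsule) link;
        /* NVMe SQE/CQE and any in-capsule data would live here. */
    };

    struct xmit_softc {
        struct mtx lock;
        struct cv cv;
        bool shutdown;
        STAILQ_HEAD(, xmit_capsule) queue;  /* capsules queued by upper layers */
        struct socket *so;
    };

    /* Hypothetical helper: build the PDU(s) for one capsule as an mbuf chain. */
    struct mbuf *capsule_to_pdus(struct xmit_capsule *xc);

    static void
    xmit_thread(void *arg)
    {
        struct xmit_softc *sc = arg;
        struct xmit_capsule *xc;
        struct mbuf *m;

        mtx_lock(&sc->lock);
        while (!sc->shutdown) {
            xc = STAILQ_FIRST(&sc->queue);
            if (xc == NULL) {
                cv_wait(&sc->cv, &sc->lock);
                continue;
            }
            STAILQ_REMOVE_HEAD(&sc->queue, link);
            mtx_unlock(&sc->lock);

            /* Convert the capsule into PDUs and queue them to the socket. */
            m = capsule_to_pdus(xc);
            (void)sosend(sc->so, NULL, NULL, m, NULL, 0, curthread);

            mtx_lock(&sc->lock);
        }
        mtx_unlock(&sc->lock);
        kthread_exit();
    }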
On the receive side, a socket upcall notifies the receive kthread when more data arrives. Once enough data has arrived for a PDU, the PDU is handled synchronously in the kthread. R2T and data-related PDUs are handled internally, with callbacks invoked if a data transfer encounters an error or once the data transfer has completed. Received capsule PDUs invoke the upper layer's capsule_received callback.
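A stripped-down version of that receive-side wiring might look like the following. The softc and the process_pdus() helper are invented names, and the real driver tracks PDU boundaries and residual data rather than draining the socket this bluntly.

    /*
     * Sketch of the socket upcall + receive kthread pattern.  The softc
     * layout and process_pdus() are hypothetical names for illustration.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/condvar.h>
    #include <sys/kthread.h>
    #include <sys/lock.h>
    #include <sys/mbuf.h>
    #include <sys/mutex.h>
    #include <sys/socket.h>
    #include <sys/socketvar.h>
    #include <sys/uio.h>

    struct recv_softc {
        struct mtx lock;
        struct cv cv;
        bool shutdown;
        struct socket *so;
    };

    /* Hypothetical helper: parse and handle any complete PDUs in the chain. */
    void process_pdus(struct recv_softc *sc, struct mbuf *m);

    /* Socket upcall: just wake the kthread; all real work happens there. */
    static int
    recv_upcall(struct socket *so, void *arg, int waitflag)
    {
        struct recv_softc *sc = arg;

        mtx_lock(&sc->lock);
        cv_signal(&sc->cv);
        mtx_unlock(&sc->lock);
        return (SU_OK);
    }

    static void
    recv_thread(void *arg)
    {
        struct recv_softc *sc = arg;
        struct uio uio;
        struct mbuf *m;
        int error, flags;

        SOCKBUF_LOCK(&sc->so->so_rcv);
        soupcall_set(sc->so, SO_RCV, recv_upcall, sc);
        SOCKBUF_UNLOCK(&sc->so->so_rcv);

        for (;;) {
            mtx_lock(&sc->lock);
            while (!sc->shutdown && sbavail(&sc->so->so_rcv) == 0)
                cv_wait(&sc->cv, &sc->lock);
            if (sc->shutdown) {
                mtx_unlock(&sc->lock);
                break;
            }
            mtx_unlock(&sc->lock);

            /* Pull whatever has arrived without blocking. */
            memset(&uio, 0, sizeof(uio));
            uio.uio_resid = 1024 * 1024;
            flags = MSG_DONTWAIT;
            m = NULL;
            error = soreceive(sc->so, NULL, &uio, &m, NULL, &flags);
            if (error == 0 && m != NULL)
                process_pdus(sc, m);    /* PDUs handled synchronously here */
        }
        kthread_exit();
    }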
struct nvmf_tcp_command_buffer manages a TCP command buffer for data transfers that do not use in-capsule data as described in the NVMeoF spec. Data-related PDUs such as R2T, C2H, and H2C are associated with a command buffer, except in the case of the send_controller_data transport method, which simply constructs one or more C2H PDUs from the caller's mbuf chain.
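Read that way, a per-command buffer carries roughly the following state; this is only a plausible reading of the description above, not the actual layout of struct nvmf_tcp_command_buffer.

    /*
     * Illustrative only: a plausible shape for a per-command data buffer,
     * not the real struct nvmf_tcp_command_buffer.
     */
    #include <sys/queue.h>
    #include <stddef.h>
    #include <stdint.h>

    struct example_command_buffer {
        TAILQ_ENTRY(example_command_buffer) link;   /* per-queue-pair list */

        /* Host memory backing the transfer (not in-capsule data). */
        void *data;
        uint32_t data_len;
        uint32_t data_offset;   /* progress within the transfer so far */

        uint16_t cid;           /* NVMe command identifier */
        uint16_t ttag;          /* transfer tag carried by R2T/H2C data PDUs */

        /* Invoked on error or once the transfer has completed. */
        void (*io_complete)(void *arg, size_t xfered, int error);
        void *io_complete_arg;
    };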
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44712