/freebsd/sys/dev/ena/ |
H A D | ena_sysctl.h | diff 04cf2b885d7dc385ed8e48df1d0218b5e4162869 Thu May 07 13:28:39 CEST 2020 Marcin Wojtas <mw@FreeBSD.org> Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left, allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time and it is not guaranteed that it'll succeed. In that situation, the fallback will work, but if the refill needs to take a place for a lot of descriptors at once, the time spent in m_getjcl looking for memory can cause system unresponsiveness due to high priority of the Rx task. This can also lead to driver reset, because Tx cleanup routine is being blocked and timer service could detect that Tx packets aren't cleaned up. The reset routine can further create another unresponsiveness - Rx rings are being refilled there, so m_getjcl will again burn the CPU. This was causing NVMe driver timeouts and resets, because network driver is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are enough - ENA MTU is being set to 9K anyway, so it's very unlikely that more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the page size mbuf cluster will be used for the Rx descriptors. This can have a small (~2%) impact on the throughput of the device, so to restore original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to "1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver was updated to v2.1.2.
Submitted by: cperciva Reviewed by: Michal Krawczyk <mk@semihalf.com> Reviewed by: Ido Segev <idose@amazon.com> Reviewed by: Guy Tzalik <gtzalik@amazon.com> MFC after: 3 days PR: 225791, 234838, 235856, 236989, 243531 Differential Revision: https://reviews.freebsd.org/D24546
|
H A D | ena_sysctl.c | diff 04cf2b885d7dc385ed8e48df1d0218b5e4162869 Thu May 07 13:28:39 CEST 2020 Marcin Wojtas <mw@FreeBSD.org> Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left, allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time and it is not guaranteed that it'll succeed. In that situation, the fallback will work, but if the refill needs to take a place for a lot of descriptors at once, the time spent in m_getjcl looking for memory can cause system unresponsiveness due to high priority of the Rx task. This can also lead to driver reset, because Tx cleanup routine is being blocked and timer service could detect that Tx packets aren't cleaned up. The reset routine can further create another unresponsiveness - Rx rings are being refilled there, so m_getjcl will again burn the CPU. This was causing NVMe driver timeouts and resets, because network driver is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are enough - ENA MTU is being set to 9K anyway, so it's very unlikely that more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the page size mbuf cluster will be used for the Rx descriptors. This can have a small (~2%) impact on the throughput of the device, so to restore original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to "1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver was updated to v2.1.2.
Submitted by: cperciva Reviewed by: Michal Krawczyk <mk@semihalf.com> Reviewed by: Ido Segev <idose@amazon.com> Reviewed by: Guy Tzalik <gtzalik@amazon.com> MFC after: 3 days PR: 225791, 234838, 235856, 236989, 243531 Differential Revision: https://reviews.freebsd.org/D24546
|
H A D | ena.h | diff 04cf2b885d7dc385ed8e48df1d0218b5e4162869 Thu May 07 13:28:39 CEST 2020 Marcin Wojtas <mw@FreeBSD.org> Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left, allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time and it is not guaranteed that it'll succeed. In that situation, the fallback will work, but if the refill needs to take a place for a lot of descriptors at once, the time spent in m_getjcl looking for memory can cause system unresponsiveness due to high priority of the Rx task. This can also lead to driver reset, because Tx cleanup routine is being blocked and timer service could detect that Tx packets aren't cleaned up. The reset routine can further create another unresponsiveness - Rx rings are being refilled there, so m_getjcl will again burn the CPU. This was causing NVMe driver timeouts and resets, because network driver is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are enough - ENA MTU is being set to 9K anyway, so it's very unlikely that more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the page size mbuf cluster will be used for the Rx descriptors. This can have a small (~2%) impact on the throughput of the device, so to restore original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to "1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver was updated to v2.1.2.
Submitted by: cperciva Reviewed by: Michal Krawczyk <mk@semihalf.com> Reviewed by: Ido Segev <idose@amazon.com> Reviewed by: Guy Tzalik <gtzalik@amazon.com> MFC after: 3 days PR: 225791, 234838, 235856, 236989, 243531 Differential Revision: https://reviews.freebsd.org/D24546
|
H A D | ena.c | diff 04cf2b885d7dc385ed8e48df1d0218b5e4162869 Thu May 07 13:28:39 CEST 2020 Marcin Wojtas <mw@FreeBSD.org> Optimize ENA Rx refill for low memory conditions
Sometimes, especially when there is not much memory in the system left, allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time and it is not guaranteed that it'll succeed. In that situation, the fallback will work, but if the refill needs to take a place for a lot of descriptors at once, the time spent in m_getjcl looking for memory can cause system unresponsiveness due to high priority of the Rx task. This can also lead to driver reset, because Tx cleanup routine is being blocked and timer service could detect that Tx packets aren't cleaned up. The reset routine can further create another unresponsiveness - Rx rings are being refilled there, so m_getjcl will again burn the CPU. This was causing NVMe driver timeouts and resets, because network driver is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are enough - ENA MTU is being set to 9K anyway, so it's very unlikely that more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the page size mbuf cluster will be used for the Rx descriptors. This can have a small (~2%) impact on the throughput of the device, so to restore original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to "1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver was updated to v2.1.2.
Submitted by: cperciva Reviewed by: Michal Krawczyk <mk@semihalf.com> Reviewed by: Ido Segev <idose@amazon.com> Reviewed by: Guy Tzalik <gtzalik@amazon.com> MFC after: 3 days PR: 225791, 234838, 235856, 236989, 243531 Differential Revision: https://reviews.freebsd.org/D24546
|