History log of /freebsd/sys/net/iflib.c (Results 126 – 150 of 370)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 56614414 17-Aug-2019 Eric Joyner <erj@FreeBSD.org>

iflib: add iflib_deregister to help cleanup on exit

Commit message by Jake:
The iflib_register function exists to allocate and setup some common
structures used by both iflib_device_register and ifl

iflib: add iflib_deregister to help cleanup on exit

Commit message by Jake:
The iflib_register function exists to allocate and setup some common
structures used by both iflib_device_register and iflib_pseudo_register.

There is no associated cleanup function used to undo the steps taken in
this function.

Both iflib_device_deregister and iflib_pseudo_deregister have some of
the necessary steps scattered in their flow. However, most of the
necessary cleanup is not done during the error path of
iflib_device_register and iflib_pseudo_register.

Some examples of missed cleanup include:

the ifp pointer is not free'd during error cleanup
the STATE and CTX locks are not destroyed during error cleanup
the vlan event handlers are not removed during error cleanup
media added to the ifmedia structure is not removed
the kobject reference is never deleted
Additionally, when initializing the kobject class reference counter is
increased even though kobj_init already increases it. This results in
the class never being free'd again because the reference count would
never hit zero even after all driver instances are unloaded.

To aid in proper cleanup, implement an iflib_deregister function that
goes through the reverse steps taken by iflib_register.

Call this function during the error cleanup for iflib_device_register
and iflib_pseudo_register. Additionally call the function in the
iflib_device_deregister and iflib_pseudo_deregister functions near the
end of their flow. This helps reduce code duplication and ensures that
proper steps are taken to cleanup allocations and references in both the
regular and error cleanup flows.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: shurd@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21005

show more ...


# 197c6798 01-Aug-2019 Eric Joyner <erj@FreeBSD.org>

iflib: Prevent kernel panic caused by loading driver with a specific interrupt configuration

If a device has only 1 MSI-X interrupt available and does not support either
MSI or legacy interrupts, if

iflib: Prevent kernel panic caused by loading driver with a specific interrupt configuration

If a device has only 1 MSI-X interrupt available and does not support either
MSI or legacy interrupts, iflib_device_register() will fail, leak memory and
MSI resources, and the driver will not load. Worse, if another iflib-using
driver tries to unload afterwards, a kernel panic will occur because the
previous failed iflib driver loead did not properly call "taskqgroup_detach()"
during it's cleanup.

This patch is band-aid for this situation -- don't try allocating MSI or legacy
interrupts if a single MSI-X interrupt was allocated, but fail to load instead.
As well, during the cleanup, properly call taskqgroup_detach() on the admin
task to prevent panics when other iflib drivers unload.

This whole interrupt allocation process actually needs re-doing to properly
support devices with only a single MSI-X interrupt, devices that only support
MSI-X, non-PCI devices, and multiple non-MSIX interrupts, as well.

Signed-off-by: Eric Joyner <erj@freebsd.org>

Reviewed by: marius@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D20747

show more ...


# 6a3f243b 01-Aug-2019 Eric Joyner <erj@FreeBSD.org>

iflib: remove kobject class reference increment

Commit message from Jake:
In iflib_register, the context is initialized as a kobject using the
device driver's "driver" kobject class. As part of this

iflib: remove kobject class reference increment

Commit message from Jake:
In iflib_register, the context is initialized as a kobject using the
device driver's "driver" kobject class. As part of this, the function
mistakenly increments the ref counter.

The ref counter is incremented twice, once in the code directly, and
once again by kobj_class_compile. However, there is no associated
decrement in the detach path. Because of this, the ref counter will
never go back down to zero, and thus the kobject method table will never
be released.

Remove this unnecessary reference count increment.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: jhb@, erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21125

show more ...


# a63915c2 28-Jul-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @r350386

Sponsored by: The FreeBSD Foundation


# 7f3f6aad 24-Jul-2019 Eric Joyner <erj@FreeBSD.org>

iflib: fix dangling device softc pointer

Commit text by Jake:
If a driver's IFDI_ATTACH_PRE function fails, the iflib_device_register
function will free the ctx pointer. However, it does not reset t

iflib: fix dangling device softc pointer

Commit text by Jake:
If a driver's IFDI_ATTACH_PRE function fails, the iflib_device_register
function will free the ctx pointer. However, it does not reset the
device softc pointer to NULL.

This will result in memory corruption as a future access to the now
invalid pointer will corrupt memory that is later allocated on top of
the same memory location.

The iflib_device_deregister function correctly resets the softc pointer
by using device_set_softc().

This clears up the invalid dangling pointer and prevents memory
corruption that could lead to a panic or undefined behavior if the
device's driver failed to attach.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: erj@, gallatin@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21003

show more ...


Revision tags: release/11.3.0
# 7f49ce7a 28-Jun-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @349476

Sponsored by: The FreeBSD Foundation


# c2c5d1e7 26-Jun-2019 Marius Strobl <marius@FreeBSD.org>

o In iflib_txq_drain():
- Remove desc_used, which is only ever written to.
- Remove a dead store to reclaimed.
- Don't recycle avail.
- Sort variables according to style(9).
These changes w

o In iflib_txq_drain():
- Remove desc_used, which is only ever written to.
- Remove a dead store to reclaimed.
- Don't recycle avail.
- Sort variables according to style(9).
These changes will make a subsequent commit easier to read.
o In iflib_tx_credits_update(), don't bother checking whether the
ift_txd_credits_update method pointer is NULL; _iflib_pre_assert()
asserts upfront that this method has been assigned and functions
like iflib_{fast_intr_rxtx,netmap_timer_adjust,txq_can_drain}()
and _task_fn_tx() were already unconditionally relying on the
method being callable.

show more ...


# e532a999 20-Jun-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @349234

Sponsored by: The FreeBSD Foundation


# 188adcb7 19-Jun-2019 Marko Zec <zec@FreeBSD.org>

V_ip6_forwarding and V_ipforwarding have been defined in ip6_var.h /
ip_var.h since at least 2008, so make use of those definitions here.

MFC after: 3 days


# 6aee0bfa 19-Jun-2019 Marko Zec <zec@FreeBSD.org>

Evaluating htons() at compile time is more efficient than doing ntohs()
at runtime. This change removes a dependency on a barrel shifter pass
before branch resolution, while reducing the instruction

Evaluating htons() at compile time is more efficient than doing ntohs()
at runtime. This change removes a dependency on a barrel shifter pass
before branch resolution, while reducing the instruction stream size
by 9 bytes on amd64.

MFC after: 3 days

show more ...


# d49e83ea 15-Jun-2019 Marius Strobl <marius@FreeBSD.org>

- Replace unused and only ever written to members of public iflib(9)
structs with placeholders (in the latter case, IFLIB_MAX_TX_BYTES
etc. are also only ever used for these write-only members if

- Replace unused and only ever written to members of public iflib(9)
structs with placeholders (in the latter case, IFLIB_MAX_TX_BYTES
etc. are also only ever used for these write-only members if at all,
so both these macros and members can just go). Using these spares
may render it possible to merge certain iflib(9) fixes to stable/12.
Otherwise, changes extending struct if_irq or struct if_shared_ctx
in any way would break KBI as instances of these are allocated by
the driver front-ends (by contrast, struct if_pkt_info as well as
struct if_softc_ctx instances are provided by iflib(9) and, thus,
may grow at least at the end without breaking KBI).
- Make the pvi_name in struct pci_vendor_info const char * as device
identifiers in hardware lookup tables aren't to be expected to ever
change at runtime.
- Similarly, make the pci_vendor_info_t of struct if_shared_ctx which
is used to point to the struct pci_vendor_info arrays provided by
the driver front-ends const.
- Remove the ETH_ADDR_LEN macro from iflib.h; this was duplicating
ETHER_ADDR_LEN of <net/ethernet.h> with iflib(9) actually only
consuming the latter macro.
- Make the name argument of iflib_io_tqg_attach(9) const, matching
the taskqgroup_attach_cpu(9) this function wraps as well as e. g.
iflib_config_gtask_init(9).
- Remove the orphaned iflib_qset_lock_get() prototype.
- Remove some extraneous empty lines.

show more ...


# 0269ae4c 06-Jun-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @348740

Sponsored by: The FreeBSD Foundation


# 668d6dbb 30-May-2019 Eric Joyner <erj@FreeBSD.org>

iflib: provide probe wrapper for vendor drivers

From Jake:
Vendor drivers that exist out-of-tree generally should return
BUS_PROBE_VENDOR from their device probe functions. This helps ensure
that a

iflib: provide probe wrapper for vendor drivers

From Jake:
Vendor drivers that exist out-of-tree generally should return
BUS_PROBE_VENDOR from their device probe functions. This helps ensure
that a vendor replacement driver will supersede the in-kernel driver for
a given device.

Currently, if a vendor wants to implement a driver based on iflib, it
will always report BUS_PROBE_DEFAULT.

Add a wrapper function, iflib_device_probe_vendor() which can be used in
place of iflib_device_probe(). This function will just return
BUS_PROBE_VENDOR whenever iflib_device_probe() would return
BUS_PROBE_DEFAULT.

While vendor drivers can already implement such a wrapper themselves,
providing it in the iflib.h header makes it easier for the vendor driver
to do the right thing.

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: erj@, gallatin@, marius@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D20221

show more ...


# 7648bc9f 13-May-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @347527

Sponsored by: The FreeBSD Foundation


# afb77372 10-May-2019 Eric Joyner <erj@FreeBSD.org>

iflib: use default ntxd and nrxd when user value is not power of 2

From Jake:
A user may set a sysctl to override the default number of Tx or Rx
descriptors. However, certain calculations in the ifl

iflib: use default ntxd and nrxd when user value is not power of 2

From Jake:
A user may set a sysctl to override the default number of Tx or Rx
descriptors. However, certain calculations in the iflib core expect the
number of descriptors to be a power of 2.

Update _iflib_assert to verify that all of the shared context parameters
for the number of descriptors are powers of 2.

Modify iflib_reset_qvalues to check that the provided isc_nrxd value is
a power of 2. If it's not, print a warning message and then use the
default value.

An alternative might be to try rounding the number down instead.
However, this creates problems in case the rounded down value is below
the minimum value that the driver would support.

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: marius@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19880

show more ...


# 007b804f 08-May-2019 Marius Strobl <marius@FreeBSD.org>

Allow to build without INET and INET6 again after r347221.

Submitted by: cam


# 3d10e9ed 07-May-2019 Marius Strobl <marius@FreeBSD.org>

o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue
_task_fn_tx() in addition to _task_fn_rx() if app

o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue
_task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
TX throughput of EM-class devices on par with the MSI-X case and, thus,
close to wirespeed/pre-iflib(4) times again. [1]
Note that independently of the interrupt type, the UDP performance with
these MACs still is abysmal and nowhere near to where it was before the
conversion of em(4) to iflib(4).
o In iflib_init_locked(), announce which free list failed to set up.
o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
as the latter is valid with MSI-X only.
o Instead of adding the missing - and apparently convoluted enough that a
DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
make the checks fail gracefully rather than panic. This avoids invoking
the checks at runtime over and over again in iflib_fast_intr_rxtx() and
_task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
these functions more readable.
o In iflib_rx_structures_setup(), only initialize LRO resources if device
and driver have LRO capability in order to not waste memory. Also, free
the LRO resources again if setting them up fails for one of the queues.
However, don't bother invoking iflib_rx_sds_free() in that case because
iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
of failure via iflib_rx_structures_free(), but there definitely is some
asymmetry left to be fixed, though).
o Similarly, free LRO resources again in iflib_rx_structures_free().
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
instead of panicing (but only in case of INVARIANTS). This is a follow-
up to r344132, as such driver bugs shouldn't be fatal.
o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
gracefully, too.
o Bring yet more sanity to iflib_msix_init():
- If the device doesn't provide enough MSI-X vectors or not all vectors
can be allocate so the expected number of queues in addition to admin
interrupts can't be supported, try MSI next (and then INTx) as proper
MSI-X vector distribution can't be assured in such cases. In essence,
this change brings r254008 forward to iflib(4). Also, this is the fix
alluded to in the commit message of r343934.
- If the MSI-X allocation has failed, don't prematurely announce MSI is
going to be used as the latter in fact may not be available either.
- When falling back to MSI, only release the MSI-X table resource again
if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
driver, in the first place.
o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.

PR: 235031 (likely) [1]
Reviewed by: shurd
Differential Revision: https://reviews.freebsd.org/D20175

show more ...


# 1722eeac 06-May-2019 Marius Strobl <marius@FreeBSD.org>

- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
- Remove the only ever written to ift_db_mtx_name member of struct iflib_txq.
- Remove the unused or only ever written t

- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
- Remove the only ever written to ift_db_mtx_name member of struct iflib_txq.
- Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen
and ifr_lro_enabled members of struct iflib_rxq.
- Consistently spell DMA, RX and TX uppercase in comments, messages etc.
instead of mixing with some lowercase variants.
- Consistently use if_t instead of a mix of if_t and struct ifnet pointers.
- Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and
iflib_fl_setup() in line with reality.
- Judging problem reports, people are wondering what on earth messages like:
"TX(0) desc avail = 1024, pidx = 0"
are trying to indicate. Thus, extend this string to be more like that of
non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due
to which the interface will be reset.
- Take advantage of the M_HAS_VLANTAG macro.
- Use false/true rather than FALSE/TRUE for variables of type bool.
- Use FALLTHROUGH as advocated by style(9).

show more ...


# e2621d96 03-May-2019 Matt Macy <mmacy@FreeBSD.org>

Allow iflib drivers to pass a pointer to their own ifmedia structure.

Tested by: emaste@

Differential Revision: https://reviews.freebsd.org/D19946


# ce3da455 02-May-2019 Ed Maste <emaste@FreeBSD.org>

iflib: remove assertion that isc_capabilities is nonzero

It's atypical, but not invalid, for a driver to pass no capabilities.

Submitted by: Gerald Aryeetey <aryeeteygerald_rogers.com>
Reviewed by:

iflib: remove assertion that isc_capabilities is nonzero

It's atypical, but not invalid, for a driver to pass no capabilities.

Submitted by: Gerald Aryeetey <aryeeteygerald_rogers.com>
Reviewed by: shurd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20142

show more ...


# f154ece0 25-Apr-2019 Stephen Hurd <shurd@FreeBSD.org>

iflib: Better control over queue core assignment

By default, cores are now assigned to queues in a sequential
manner rather than all NICs starting at the first core. On a four-core
system with two N

iflib: Better control over queue core assignment

By default, cores are now assigned to queues in a sequential
manner rather than all NICs starting at the first core. On a four-core
system with two NICs each using two queue pairs, the nic:queue -> core
mapping has changed from this:

0:0 -> 0, 0:1 -> 1
1:0 -> 0, 1:1 -> 1

To this:

0:0 -> 0, 0:1 -> 1
1:0 -> 2, 1:1 -> 3

Additionally, a device can now be configured to use separate cores for TX
and RX queues.

Two new tunables have been added, dev.X.Y.iflib.separate_txrx and
dev.X.Y.iflib.core_offset. If core_offset is set, the NIC is not part
of the auto-assigned sequence.

Reviewed by: marius
MFC after: 2 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D20029

show more ...


# 6d49b41e 24-Apr-2019 Andrew Gallatin <gallatin@FreeBSD.org>

iflib: Add pfil hooks

As with mlx5en, the idea is to drop unwanted traffic as early
in receive as possible, before mbufs are allocated and anything
is passed up the stack. This can save considerabl

iflib: Add pfil hooks

As with mlx5en, the idea is to drop unwanted traffic as early
in receive as possible, before mbufs are allocated and anything
is passed up the stack. This can save considerable CPU time
when a machine is under a flooding style DOS attack.

The major change here is to remove the unneeded abstraction where
callers of rxd_frag_to_sd() get back a pointer to the mbuf ring, and
are responsible for NULL'ing that mbuf themselves. Now this happens
directly in rxd_frag_to_sd(), and it returns an mbuf. This allows us
to use the decision (and potentially mbuf) returned by the pfil
hooks. The driver can now recycle mbufs to avoid re-allocation when
packets are dropped.

Reviewed by: marius (shurd and erj also provided feedback)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19645

show more ...


# 1fd8c72c 17-Apr-2019 Kyle Evans <kevans@FreeBSD.org>

iflib: Use new ether_gen_addr, restricting addresses to that subset

Differential Revision: https://reviews.freebsd.org/D19587


# 415e34c4 29-Mar-2019 Alan Somers <asomers@FreeBSD.org>

MFHead@r345677


# 225eae1b 28-Mar-2019 Eric Joyner <erj@FreeBSD.org>

iflib: return ENETDOWN when the network device is down

From Jake:
iflib_if_transmit returns ENOBUFS when the device is down, or when the
link isn't active.

This was changed in r308792 from return (

iflib: return ENETDOWN when the network device is down

From Jake:
iflib_if_transmit returns ENOBUFS when the device is down, or when the
link isn't active.

This was changed in r308792 from return (0), so that the function
correctly reports an error that it was unable to transmit.

However, using ENOBUFS can cause some network applications to produce
the following or similar errors:

"ping: sendto: No buffer space available"

This is a bit confusing as the real cause of the issue is that the
network device is down.

Replace the ENOBUFS return with ENETDOWN to indicate more clearly that
the reason for the failure to send is due to the network device is
offline.

This will cause the error message to be reported as

"ping: sendto: Network is down"

Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: shurd@, sbruno@, bz@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D19652

show more ...


12345678910>>...15