History log of /freebsd/sys/net/if.c (Results 151 – 175 of 1301)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 8b3bc70a 08-Oct-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r352764 through r353315.


# 1e80e4f2 08-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove epoch assertion from if_setlladdr(). Originally this function was
protected by IF_ADDR_LOCK(), which was a mutex, so that two simultaneous
if_setlladdr() can't execute. Later it was switched

Remove epoch assertion from if_setlladdr(). Originally this function was
protected by IF_ADDR_LOCK(), which was a mutex, so that two simultaneous
if_setlladdr() can't execute. Later it was switched to IF_ADDR_RLOCK(),
likely by a mistake. Later it was switched to NET_EPOCH_ENTER(). Then I
incorrectly added NET_EPOCH_ASSERT() here.

In reality ifp->if_addr never goes away and never changes its length. So,
doing bcopy() in it is always "safe", meaning it won't dereference a wrong
pointer or write into someone's else memory. Of course doing two bcopy() in
parallel would result in a mess of two addresses, but net epoch doesn't
protect against that, neither IF_ADDR_RLOCK() did.

So for now, just remove the assertion and leave for later a proper fix.

Reported by: markj

show more ...


# e9dc46cc 08-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

In DIAGNOSTIC block of if_delmulti_ifma_flags() enter the network epoch.
This quickly plugs the regression from r353292. The locking of multicast
definitely needs a broader review today...

Reported

In DIAGNOSTIC block of if_delmulti_ifma_flags() enter the network epoch.
This quickly plugs the regression from r353292. The locking of multicast
definitely needs a broader review today...

Reported by: pho, dhw

show more ...


# b8a6e03f 08-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Widen NET_EPOCH coverage.

When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex cover

Widen NET_EPOCH coverage.

When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by: gallatin, hselasky, cy, adrian, kristof
Differential Revision: https://reviews.freebsd.org/D19111

show more ...


# 204e2f30 07-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Factor out VNET shutdown check into an own vnet structure field.
Remove the now obsolete vnet_state field. This greatly simplifies the
detection of VNET shutdown and avoids code duplication.

Discuss

Factor out VNET shutdown check into an own vnet structure field.
Remove the now obsolete vnet_state field. This greatly simplifies the
detection of VNET shutdown and avoids code duplication.

Discussed with: bz@
MFC after: 1 week
Sponsored by: Mellanox Technologies

show more ...


# 668ee101 26-Sep-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r352587 through r352763.


# dd902d01 25-Sep-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Add debugging facility EPOCH_TRACE that checks that epochs entered are
properly nested and warns about recursive entrances. Unlike with locks,
there is nothing fundamentally wrong with such use, the

Add debugging facility EPOCH_TRACE that checks that epochs entered are
properly nested and warns about recursive entrances. Unlike with locks,
there is nothing fundamentally wrong with such use, the intent of tracer
is to help to review complex epoch-protected code paths, and we mean the
network stack here.

Reviewed by: hselasky
Sponsored by: Netflix
Pull Request: https://reviews.freebsd.org/D21610

show more ...


# 0f80acb9 19-Sep-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r352436 through r352536.


# 247cf566 17-Sep-2019 Konstantin Belousov <kib@FreeBSD.org>

Add SIOCGIFDOWNREASON.

The ioctl(2) is intended to provide more details about the cause of
the down for the link.

Eventually we might define a comprehensive list of codes for the
situations. But i

Add SIOCGIFDOWNREASON.

The ioctl(2) is intended to provide more details about the cause of
the down for the link.

Eventually we might define a comprehensive list of codes for the
situations. But interface also allows the driver to provide free-form
null-terminated ASCII string to provide arbitrary non-formalized
information. Sample implementation exists for mlx5(4), where the
string is fetched from firmware controlling the port.

Reviewed by: hselasky, rrs
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21527

show more ...


# 61c1328e 13-Sep-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r352105 through r352307.


# 40b1c921 12-Sep-2019 Kyle Evans <kevans@FreeBSD.org>

SIOCSIFNAME: Do nothing if we're not actually changing

Instead of throwing EEXIST, just succeed if the name isn't actually
changing. We don't need to trigger departure or any of that because there's

SIOCSIFNAME: Do nothing if we're not actually changing

Instead of throwing EEXIST, just succeed if the name isn't actually
changing. We don't need to trigger departure or any of that because there's
no change from consumers' perspective.

PR: 240539
Reviewed by: brooks
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D21618

show more ...


# a63915c2 28-Jul-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @r350386

Sponsored by: The FreeBSD Foundation


Revision tags: release/11.3.0
# 0dbdf041 28-Jun-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Need to wait for epoch callbacks to complete before detaching a
network interface.

This particularly manifests itself when an INP has multicast options
attached during a network interface detach. Th

Need to wait for epoch callbacks to complete before detaching a
network interface.

This particularly manifests itself when an INP has multicast options
attached during a network interface detach. Then the IPv4 and IPv6
leave group call which results from freeing the multicast address, may
access a freed ifnet structure. These are the steps to reproduce:

service mdnsd onestart # installed from ports

ifconfig epair create
ifconfig epair0a 0/24 up
ifconfig epair0a destroy

Tested by: pho @
MFC after: 1 week
Sponsored by: Mellanox Technologies

show more ...


# 0269ae4c 06-Jun-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @348740

Sponsored by: The FreeBSD Foundation


# fb3bc596 25-May-2019 John Baldwin <jhb@FreeBSD.org>

Restructure mbuf send tags to provide stronger guarantees.

- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), i

Restructure mbuf send tags to provide stronger guarantees.

- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), in
ip_output() and ip6_output(). This avoids sending packets with send
tags to ifnet drivers that don't support send tags.

Since we are now checking for ifp mismatches before invoking
if_output, we can now try to allocate a new tag before invoking
if_output sending the original packet on the new tag if allocation
succeeds.

To avoid code duplication for the fragment and unfragmented cases,
add ip_output_send() and ip6_output_send() as wrappers around
if_output and nd6_output_ifp, respectively. All of the logic for
setting send tags and dealing with send tag-related errors is done
in these wrapper functions.

For pseudo interfaces that wrap other network interfaces (vlan and
lagg), wrapper send tags are now allocated so that ip*_output see
the wrapper ifp as the ifp in the send tag. The if_transmit
routines rewrite the send tags after performing an ifp mismatch
check. If an ifp mismatch is detected, the transmit routines fail
with EAGAIN.

- To provide clearer life cycle management of send tags, especially
in the presence of vlan and lagg wrapper tags, add a reference count
to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
Provide a helper function (m_snd_tag_init()) for use by drivers
supporting send tags. m_snd_tag_init() takes care of the if_ref
on the ifp meaning that code alloating send tags via if_snd_tag_alloc
no longer has to manage that manually. Similarly, m_snd_tag_rele
drops the refcount on the ifp after invoking if_snd_tag_free when
the last reference to a send tag is dropped.

This also closes use after free races if there are pending packets in
driver tx rings after the socket is closed (e.g. from tcpdrop).

In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
Drivers now also check this flag instead of checking snd_tag against
NULL. This avoids false positive matches when a forwarded packet
has a non-NULL rcvif that was treated as a send tag.

- cxgbe was relying on snd_tag_free being called when the inp was
detached so that it could kick the firmware to flush any pending
work on the flow. This is because the driver doesn't require ACK
messages from the firmware for every request, but instead does a
kind of manual interrupt coalescing by only setting a flag to
request a completion on a subset of requests. If all of the
in-flight requests don't have the flag when the tag is detached from
the inp, the flow might never return the credits. The current
snd_tag_free command issues a flush command to force the credits to
return. However, the credit return is what also frees the mbufs,
and since those mbufs now hold references on the tag, this meant
that snd_tag_free would never be called.

To fix, explicitly drop the mbuf's reference on the snd tag when the
mbuf is queued in the firmware work queue. This means that once the
inp's reference on the tag goes away and all in-flight mbufs have
been queued to the firmware, tag's refcount will drop to zero and
snd_tag_free will kick in and send the flush request. Note that we
need to avoid doing this in the middle of ethofld_tx(), so the
driver grabs a temporary reference on the tag around that loop to
defer the free to the end of the function in case it sends the last
mbuf to the queue after the inp has dropped its reference on the
tag.

- mlx5 preallocates send tags and was using the ifp pointer even when
the send tag wasn't in use. Explicitly use the ifp from other data
structures instead.

- Sprinkle some assertions in various places to assert that received
packets don't have a send tag, and that other places that overwrite
rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.

Reviewed by: gallatin, hselasky, rgrimes, ae
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20117

show more ...


# e2e050c8 20-May-2019 Conrad Meyer <cem@FreeBSD.org>

Extract eventfilter declarations to sys/_eventfilter.h

This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces hea

Extract eventfilter declarations to sys/_eventfilter.h

This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.

show more ...


# 2ad7ed6e 19-May-2019 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix rt_ifa selection during loopback route insertion process.
Currently such routes are added with a link-level IFA, which is
plain wrong. Only after the insertion they get fixed by the special

Fix rt_ifa selection during loopback route insertion process.
Currently such routes are added with a link-level IFA, which is
plain wrong. Only after the insertion they get fixed by the special
link_rtrequest() ifa handler. This behaviour complicates routing code
and makes ifa selection more complex.
Streamline this process by explicitly moving link_rtrequest() logic
to the pre-insertion rt_getifa_fib() ifa selector. Avoid calling all
this logic in the loopback route case by explicitly specifying
proper rt_ifa inside the ifa_maintain_loopback_route().§

MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20076

show more ...


# 7648bc9f 13-May-2019 Alan Somers <asomers@FreeBSD.org>

MFHead @347527

Sponsored by: The FreeBSD Foundation


# 7687707d 22-Apr-2019 Andrew Gallatin <gallatin@FreeBSD.org>

Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory

This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets. When called with a domain on a NUMA ma

Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory

This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets. When called with a domain on a NUMA machine,
ifalloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node. Similarly, if_alloc_dev() is a wrapper for if_alloc_domain
which uses a driver supplied device_t to call ifalloc_domain() with
the appropriate domain.

Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.

Reviewed by: glebius, kib, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19930

show more ...


# 88148a07 22-Jan-2019 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r343202 through r343319.


# bc6f170e 22-Jan-2019 Brooks Davis <brooks@FreeBSD.org>

Rework CASE_IOC_IFGROUPREQ() to require a case before the macro.

This is more compatible with formatting tools and looks more normal.

Reported by: jhb (on a different review)
Sponsored by: DARPA, A

Rework CASE_IOC_IFGROUPREQ() to require a case before the macro.

This is more compatible with formatting tools and looks more normal.

Reported by: jhb (on a different review)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18442

show more ...


# a68cc388 09-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanical cleanup of epoch(9) usage in network stack.

- Remove macros that covertly create epoch_tracker on thread stack. Such
macros a quite unsafe, e.g. will produce a buggy code if same macro

Mechanical cleanup of epoch(9) usage in network stack.

- Remove macros that covertly create epoch_tracker on thread stack. Such
macros a quite unsafe, e.g. will produce a buggy code if same macro is
used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
locking macros to what they actually are - the net_epoch.
Keeping them as is is very misleading. They all are named FOO_RLOCK(),
while they no longer have lock semantics. Now they allow recursion and
what's more important they now no longer guarantee protection against
their companion WLOCK macros.
Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with: jtl, gallatin

show more ...


# 21ee61a6 09-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove part of comment that doesn't match reality.


# 67350cb5 09-Dec-2018 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r340918 through r341763.


Revision tags: release/12.0.0
# 85107355 30-Nov-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Adapt the fix in r341008 to correctly work with EBR.

IFNET_RLOCK_NOSLEEP() is epoch_enter_preempt() in FreeBSD 12+. Holding
it in sysctl_rtsock() doesn't protect us from ifnet unlinking, because
unl

Adapt the fix in r341008 to correctly work with EBR.

IFNET_RLOCK_NOSLEEP() is epoch_enter_preempt() in FreeBSD 12+. Holding
it in sysctl_rtsock() doesn't protect us from ifnet unlinking, because
unlinking occurs with IFNET_WLOCK(), that is rw_wlock+sx_xlock, and it
doesn check that concurrent code is running in epoch section. But while
we are in epoch section, we should be able to do access to ifnet's
fields, even it was unlinked. Thus do not change if_addr and if_hw_addr
fields in ifnet_detach_internal() to NULL, since rtsock code can do
access to these fields and this is allowed while it is running in epoch
section.

This should fix the race, when ifnet_detach_internal() unlinks ifnet
after we checked it for IFF_DYING in sysctl_dumpentry.

Move free(ifp->if_hw_addr) into ifnet_free_internal(). Also remove the
NULL check for ifp->if_description, since free(9) can correctly handle
NULL pointer.

MFC after: 1 week

show more ...


12345678910>>...53