History log of /freebsd/sys/net/route.c (Results 1 – 25 of 691)
Revision Date Author Comments
# 3360a158 24-Oct-2024 Kyle Evans <kevans@FreeBSD.org>

net: route: convert routing statistics to a sysctl

Exporting the relevant pcpustat is trivial, so let's do that. We will
use it in a near-future change in netstat to avoid having to dig around
in mem(4) for live kernel statistics.

Differential Revision: https://reviews.freebsd.org/D47231


Revision tags: release/13.4.0, release/14.1.0, release/13.3.0
# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree via automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


Revision tags: release/14.0.0
# 2ff63af9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .h pattern

Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/


Revision tags: release/13.2.0
# 19e43c16 27-Mar-2023 Alexander V. Chernikov <melifaro@FreeBSD.org>

netlink: add netlink KPI to the kernel by default

This change does the following:

Base Netlink KPIs (ability to register the family, parse and/or
write a Netlink message) are always present in the kernel. Specifically,
* Implementation of genetlink family/group registration/removal,
some base accessors (netlink_generic_kpi.c, 260 LoC) are compiled in
unconditionally.
* Basic TLV parser functions (netlink_message_parser.c, 507 LoC) are
compiled in unconditionally.
* Glue functions (netlink<>rtsock), malloc/core sysctl definitions
(netlink_glue.c, 259 LoC) are compiled in unconditionally.
* The rest of the KPI _functions_ are defined in the netlink_glue.c,
but their implementation calls a pointer to either the stub function
or the actual function, depending on whether the module is loaded or not.

This approach keeps only about 1k LoC out of ~3.7k LoC (the current
sys/netlink implementation) in the kernel, and that portion will not grow
further. It also allows generic netlink kernel customers to load
successfully without requiring the Netlink module and to operate correctly
once the Netlink module is loaded.

Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D39269
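
The stub-or-actual function pointer arrangement described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from netlink_glue.c: `nl_ops`, `nl_set_ops()`, `nl_send_msg()`, `stub_send_msg()`, `real_send_msg()` and `real_ops` are all hypothetical names, and the sketch ignores locking and module reference handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation table; every name here is illustrative only. */
struct nl_ops {
	int (*send_msg)(const void *buf, size_t len);
};

/* Stub used while the module is not loaded: always refuses the request. */
static int
stub_send_msg(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return (ENOTSUP);
}

/* Stand-in for the implementation a loaded module would provide. */
static int
real_send_msg(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return (0);
}

static const struct nl_ops stub_ops = { .send_msg = stub_send_msg };
const struct nl_ops real_ops = { .send_msg = real_send_msg };

/* Always-present pointer; it starts out aimed at the stubs. */
static const struct nl_ops *cur_ops = &stub_ops;

/* Would be called on module load (ops != NULL) and unload (ops == NULL). */
void
nl_set_ops(const struct nl_ops *ops)
{
	cur_ops = (ops != NULL) ? ops : &stub_ops;
}

/* Always-compiled-in wrapper: callers link against this symbol only. */
int
nl_send_msg(const void *buf, size_t len)
{
	assert(cur_ops->send_msg != NULL);
	return (cur_ops->send_msg(buf, len));
}
```

In this sketch a caller of `nl_send_msg()` gets `ENOTSUP` until something calls `nl_set_ops(&real_ops)`, and falls back to the stub again after `nl_set_ops(NULL)`, so the caller never needs the module to be present in order to link and load.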


# 2c2b37ad 13-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

ifnet/API: Move struct ifnet definition to a <net/if_private.h>

Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail. Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.

Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046


# 3636a967 15-Dec-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

route: allow RTM_CHANGE notifications in rt_routemsg().

MFC after: 2 weeks


# 1bcd230f 03-Dec-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

netlink: add interface notification on link status / flags change.

* Add link-state change notifications by subscribing to ifnet_link_event.
In the Linux netlink model, link state is reported in 2 places: first is
the IFLA_OPERSTATE, which stores state per RFC2863.
The second is an IFF_LOWER_UP interface flag. As many applications rely
on the latter, reserve 1 bit from if_flags, named as IFF_NETLINK_1.
This flag is mapped to IFF_LOWER_UP in the netlink headers. This is done
to avoid making applications think this flag is actually
supported / presented in non-netlink outputs.
* Add flag change notifications, by hooking into rt_ifmsg().
In the netlink model, a notification should include the bitmask of the
changed flags. Update rt_ifmsg() to include such a bitmask.

Differential Revision: https://reviews.freebsd.org/D37597
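
A bitmask of changed flags, as mentioned in this commit message, is commonly derived by XOR-ing the old and new flag words. The sketch below illustrates that idea only; `flag_change`, `flag_change_make()`, `flag_toggled()` and the `X_IFF_*` values are hypothetical and are not taken from rt_ifmsg() or the real IFF_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the real IFF_* constants. */
#define X_IFF_UP      0x1u
#define X_IFF_RUNNING 0x2u
#define X_IFF_PROMISC 0x4u

struct flag_change {
	uint32_t flags;		/* flag word after the change */
	uint32_t change;	/* bits that differ between old and new */
};

/* Build the notification payload: a bit is set in .change iff it toggled. */
struct flag_change
flag_change_make(uint32_t old_flags, uint32_t new_flags)
{
	struct flag_change fc = {
		.flags = new_flags,
		.change = old_flags ^ new_flags,
	};
	return (fc);
}

/* Report whether any of the bits in `mask` toggled in this notification. */
bool
flag_toggled(const struct flag_change *fc, uint32_t mask)
{
	assert(mask != 0);
	return ((fc->change & mask) != 0);
}
```

A receiver can then test `flag_toggled(&fc, X_IFF_RUNNING)` to react only to the bits it cares about, instead of diffing the flag word against its own cached copy.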


Revision tags: release/12.4.0, release/13.1.0
# 7e5bf684 20-Jan-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

netlink: add netlink support

Netlink is a communication protocol currently used in the Linux kernel to modify,
read and subscribe to nearly all networking state. Interfaces, addresses, routes,
firewall, fibs, vnets, etc. are controlled via netlink.
It is an async, TLV-based protocol, providing 1-1 and 1-many communications.

The current implementation supports a subset of the NETLINK_ROUTE
family. To be more specific, the following is supported:
* Dumps:
 - routes
 - nexthops / nexthop groups
 - interfaces
 - interface addresses
 - neighbors (arp/ndp)
* Notifications:
 - interface arrival/departure
 - interface address arrival/departure
 - route addition/deletion
* Modifications:
 - adding/deleting routes
 - adding/deleting nexthops/nexthop groups
 - adding/deleting neighbors
 - adding/deleting interfaces (basic support only)
* Rtsock interaction
 - route events are bridged both ways

The implementation also supports the NETLINK_GENERIC family framework.

Implementation notes:
Netlink is implemented as a loadable/unloadable kernel module,
not touching many kernel parts.
Each netlink socket uses a dedicated taskqueue to support async operations
that can sleep, such as interface creation. All message processing is
performed within these taskqueues.

Compatibility:
Most of the Netlink data models specified above map to FreeBSD concepts
nicely. An unmodified ip(8) binary works correctly with
interfaces, addresses, routes, nexthops and nexthop groups. Some
software such as net/bird requires header-only modifications to compile
and work with FreeBSD netlink.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D36002
MFC after: 2 months


# 000250be 08-Sep-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add ability to set the protocol that installed route/nexthop.

Routing daemons such as bird need to know whether they installed a certain route
so they can clean it up on startup, as a form of achieving consistent
state during crash recovery.
Currently they use a combination of routing flags (RTF_PROTO1) to detect
these routes when interacting via the route(4) rtsock protocol.
The Netlink protocol has a special "rtm_protocol" field that is filled and
checked by the route originator. To prepare for the upcoming netlink
introduction, add the ability to record the origin in both nexthops and
nexthop groups via the <nhop|nhgrp>_<get|set>_origin() KPI. The actual
calls will be used in the followup commits.

MFC after: 1 month


# 6d4f6e4c 09-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: make rib_add_redirect() use new nhop-based KPI

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D36169


# 88a782fc 15-Aug-2022 Mateusz Guzik <mjg@FreeBSD.org>

routing: G/C rt_exportinfo declaration

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 036f1bc6 14-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: retire rib_lookup_info()

This function was added in the pre-epoch era ( 9a1b64d5a0224 ) to
provide a public rtentry access interface & hide rtentry internals.
The implementation is based on large on-stack copying and
refcounting of the referenced objects (ifa/ifp).
It has become obsolete after the epoch & nexthop introduction. Convert
the last remaining user and remove the function itself.

Differential Revision: https://reviews.freebsd.org/D36197


# 66230639 04-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: split nexthop creation and rtentry creation.

This change is required for the upcoming introduction of the next
nexthop-based operations KPI, as it will create rtentry and nexthops
at different stages of route table modification.

Differential Revision: https://reviews.freebsd.org/D36072
MFC after: 2 weeks


# 800c6846 29-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add nhop(9) kpi.

Differential Revision: https://reviews.freebsd.org/D35985
MFC after: 1 month


Revision tags: release/12.3.0
# 4b631fc8 07-Sep-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: fix source address selection rules for IPv4 over IPv6.

The current logic always selects an IFA of the same family from the
outgoing interfaces. In an IPv4 over IPv6 setup there can be just a
single non-127.0.0.1 ifa, attached to the loopback interface.

Create a separate rt_getifa_family() to handle the entire ifa selection
for IPv4 over IPv6.

Differential Revision: https://reviews.freebsd.org/D31868
MFC after: 1 week


# d98954e2 29-Aug-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: Bring back the ability to specify transmit interface via its name.

Some software references outgoing interfaces by specifying name instead of
index.

Use rti_ifp from rt_addrinfo if provided instead of always using
address interface when constructing nexthop.

PR: 255678
Reported by: martin.larsson2 at gmail.com
MFC after: 1 week


# a7581946 23-Jun-2021 Rozhuk Ivan <rozhuk.im@gmail.com>

devctl: add ADDR_ADD and ADDR_DEL devctl event for IFNET

Add devd event on network iface address add/remove. Can be used to
automate actions on any address change.

Reviewed by: imp@ (and minor style tweaks)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30840


# 8e8f1cc9 23-Apr-2021 Mark Johnston <markj@FreeBSD.org>

Re-enable network ioctls in capability mode

This reverts a portion of 274579831b61 ("capsicum: Limit socket
operations in capability mode") as at least rtsol and dhcpcd rely on
being able to configure network interfaces while in capability mode.

Reported by: bapt, Greg V
Sponsored by: The FreeBSD Foundation


Revision tags: release/13.0.0
# 27457983 07-Apr-2021 Mark Johnston <markj@FreeBSD.org>

capsicum: Limit socket operations in capability mode

Capsicum did not prevent certain privileged networking operations,
specifically creation of raw sockets and network configuration ioctls.
However, these facilities can be used to circumvent some of the
restrictions that capability mode is supposed to enforce.

Add capability mode checks to disallow network configuration ioctls and
creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET
internet sockets.

Reviewed by: oshogbo
Discussed with: emaste
Reported by: manu
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29423
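
The socket-creation rule stated in this commit message can be read as a simple predicate over the requested domain and type. The sketch below is one interpretation of that sentence, not the kernel's actual check: `capmode_socket_allowed()` is a hypothetical name, treating every PF_LOCAL socket type as allowed is an assumption, and "internet sockets" is taken to mean PF_INET and PF_INET6.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Hypothetical predicate: would creating a socket with this domain and
 * type be permitted while in capability mode, under the reading above?
 */
bool
capmode_socket_allowed(int domain, int type)
{
	/* Local (UNIX-domain) sockets are allowed. */
	if (domain == PF_LOCAL)
		return (true);

	/* Internet sockets are allowed only for the three listed types. */
	if (domain == PF_INET || domain == PF_INET6) {
		return (type == SOCK_DGRAM || type == SOCK_STREAM ||
		    type == SOCK_SEQPACKET);
	}

	/* Everything else, including raw internet sockets, is refused. */
	return (false);
}
```

Under this reading, `capmode_socket_allowed(PF_INET, SOCK_RAW)` is false, which matches the commit's stated goal of preventing raw socket creation in capability mode.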


# b1d63265 08-Mar-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Flush remaining routes from the routing table during VNET shutdown.

Summary:
This fixes rtentry leak for the cloned interfaces created inside the
VNET.

PR: 253998
Reported by: rashey at superbox.pl
MFC after: 3 days

Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown).
Thus, any route table operations are too late to schedule.
As the intent of the vnet teardown procedures is to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`.
It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.

Test Plan:
```
set_skip:set_skip_group_lo -> passed [0.053s]
tail -n 200 /var/log/messages | grep rtentry
```

Reviewers: #network, kp, bz

Reviewed By: kp

Subscribers: imp, ae

Differential Revision: https://reviews.freebsd.org/D29116


# 59641728 22-Feb-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Simplify ifa/ifp refcounting in the routing stack.

The routing stack control depends on quite a tree of functions to
determine the proper attributes of a route such as a source address (ifa)
or transmit ifp of a route.

When actually inserting a route, the stack needs to ensure that ifa and ifp
point to entities that are still valid.
Validity means slightly more than just pointer validity: the stack needs a guarantee
that the provided objects are not scheduled for deletion.

Currently, callers either ignore it (most ifp parts, historically) or try to
use refcounting (ifa parts). Even in the case of ifa refcounting it's not always
implemented in a fully safe manner. For example, some codepaths inside
rt_getifa_fib() are referencing ifa while not holding any locks, resulting in
the possibility of referencing a scheduled-for-deletion ifa.

Instead of trying to fix all of the callers by enforcing proper refcounting,
switch to a different model.
As the rib_action() already requires epoch, do not require any stability guarantees
other than the epoch-provided one.
Use newly-added conditional versions of the refcounting functions
(ifa_try_ref(), if_try_ref()) and fail if any of these fails.

Reviewed by: donner
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28837
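
The "conditional" refcounting mentioned in this commit message can be illustrated with a try-acquire that refuses to resurrect an object whose count has already reached zero. This is a generic C11 sketch of that technique, not the implementation of ifa_try_ref() or if_try_ref(): `struct obj`, `obj_try_ref()` and `obj_unref()` are hypothetical, and the sketch assumes the memory itself stays valid (for example, protected by an epoch section) while the attempt is made.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; a count of 0 means "being destroyed". */
struct obj {
	atomic_uint refcnt;
};

/*
 * Take a reference only if the object is still live.
 * Returns false, leaving the count untouched, once it has dropped to 0.
 */
bool
obj_try_ref(struct obj *o)
{
	unsigned int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure `old` is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return (true);
	}
	return (false);
}

/* Drop a reference; returns true when the caller released the last one. */
bool
obj_unref(struct obj *o)
{
	unsigned int prev = atomic_fetch_sub(&o->refcnt, 1);

	assert(prev != 0);	/* catch underflow in this sketch */
	return (prev == 1);
}
```

The point of the pattern is that a caller who loses the race simply fails the operation, rather than every caller having to prove in advance that the object cannot go away.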


# cb984c62 30-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix multipath support for rib_lookup_info().

The initial plan was to remove rib_lookup_info() before
FreeBSD 13. As several customers are still remaining,
fix rib_lookup_info() for the multipath use case.


# 81728a53 09-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Split rtinit() into multiple functions.

rtinit[1]() is a function used to add or remove interface address prefix routes,
similar to ifa_maintain_loopback_route().
It was intended to be family-agnostic. There is a problem with this approach
in reality.

1) IPv6 code does not use it for the ifa routes. There is a separate layer,
nd6_prelist_(), providing interface for maintaining interface routes. Its part,
responsible for the actual route table interaction, mimics rtenty() code.

2) rtinit tries to combine multiple actions in the same function: constructing
proper route attributes and handling iterations over multiple fibs, for the
non-zero net.add_addr_allfibs use case. It notably increases the code complexity.

3) dstaddr handling. The flags parameter re-uses RTF_ flags. As there is no special flag
for p2p connections, host routes and p2p routes are handled in the same way.
Additionally, mapping IFA flags to RTF flags makes the interface pretty messy.
It makes rtinit() clash with ifa_maintain_loopback_route() for IPv4 interface
aliases.

4) rtinit() is the last customer passing non-masked prefixes to rib_action(),
complicating rib_action() implementation.

5) rtinit() coupled ifa announce/withdrawal notifications, producing "false positive"
ifa messages in certain corner cases.

To address all these points, the following has been done:

* rtinit() has been split into multiple functions:
- Route attribute construction was moved to the per-address-family functions,
dealing with (2), (3) and (4).
- A function providing net.add_addr_allfibs handling and route rtsock notifications
is the new routing table interface.
- rtsock ifa notification has been moved out as well. The resulting set of functions is only
responsible for the actual route notifications.

Side effects:
* /32 alias does not result in interface routes (/32 route and "host" route)
* RTF_PINNED is now set for IPv6 prefixes corresponding to the interface addresses

Differential revision: https://reviews.freebsd.org/D28186


# d68cf57b 07-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Refactor rt_addrmsg() and rt_routemsg().

Summary:
* Refactor rt_addrmsg(): make V_rt_add_addr_allfibs decision locally.
* Fix rt_routemsg() and multipath by accepting nexthop instead of interface pointer.
* Refactor rtsock_routemsg(): avoid accessing rtentry fields directly.
* Simplify in_addprefix() by moving prefix search to a separate function.

Reviewers: #network

Subscribers: imp, ae, bz

Differential Revision: https://reviews.freebsd.org/D28011


# f5baf8bb 25-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add modular fib lookup framework.

This change introduces a framework that allows dynamically
attaching or detaching longest prefix match (lpm) lookup algorithms
to speed up datapath route table lookups.

The framework takes care of handling initial synchronisation,
route subscription, nhop/nhop groups reference and indexing,
dataplane attachments and fib instance algorithm setup/teardown.
The framework features automatic algorithm selection, allowing for
picking the best matching algorithm on-the-fly based on the
number of routes in the routing table.

Currently the framework code is guarded by the FIB_ALGO config option.
The idea is to enable it by default in the next couple of weeks.

The following algorithms are provided by default:
IPv4:
* bsearch4 (lockless binary search in a special IP array), tailored for
  small-fib (<16 routes)
* radix4_lockless (lockless immutable radix, re-created on every rtable change),
  tailored for small-fib (<1000 routes)
* radix4 (base system radix backend)
* dpdk_lpm4 (DPDK DIR24-8-based lookups), lockless data structure, optimized
  for large-fib (D27412)
IPv6:
* radix6_lockless (lockless immutable radix, re-created on every rtable change),
  tailored for small-fib (<1000 routes)
* radix6 (base system radix backend)
* dpdk_lpm6 (DPDK DIR24-8-based lookups), lockless data structure, optimized
  for large-fib (D27412)

Performance changes:
Micro benchmarks (I7-7660U, single-core lookups, 2048k dst, code in D27604):
IPv4:
  8 routes:
    radix4: ~20mpps
    radix4_lockless: ~24.8mpps
    bsearch4: ~69mpps
    dpdk_lpm4: ~67mpps
  700k routes:
    radix4_lockless: 3.3mpps
    dpdk_lpm4: 46mpps

IPv6:
  8 routes:
    radix6_lockless: ~20mpps
    dpdk_lpm6: ~70mpps
  100k routes:
    radix6_lockless: 13.9mpps
    dpdk_lpm6: 57mpps

Forwarding benchmarks:
+ 10-15% IPv4 forwarding performance (small-fib, bsearch4)
+ 25% IPv4 forwarding performance (full-view, dpdk_lpm4)
+ 20% IPv6 forwarding performance (full-view, dpdk_lpm6)

Control:
The framework adds the following runtime sysctls:

List algos
* net.route.algo.inet.algo_list: bsearch4, radix4_lockless, radix4
* net.route.algo.inet6.algo_list: radix6_lockless, radix6, dpdk_lpm6
Debug level (7=LOG_DEBUG, per-route)
net.route.algo.debug_level: 5
Algo selection (currently only for fib 0):
net.route.algo.inet.algo: bsearch4
net.route.algo.inet6.algo: radix6_lockless

Support for manually changing algos in non-default fib will be added
soon. Some sysctl names will be changed in the near future.

Differential Revision: https://reviews.freebsd.org/D27401
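
Selecting an algorithm by routing table size, as this commit message describes, could look roughly like the sketch below. It is an illustration only: the thresholds are copied from the "tailored for" notes in the IPv4 algorithm list, while `pick_ipv4_algo()`, `algo_name()`, the enum and the `have_dpdk_lpm` parameter are hypothetical. Falling back to radix4 when dpdk_lpm4 is unavailable is also an assumption, and the framework's real selection logic may weigh other factors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the IPv4 algorithms named in the commit. */
enum fib4_algo {
	FIB4_BSEARCH4,
	FIB4_RADIX4_LOCKLESS,
	FIB4_RADIX4,
	FIB4_DPDK_LPM4,
	FIB4_ALGO_COUNT
};

/* Map an identifier to the algorithm name used in the sysctl output. */
const char *
algo_name(enum fib4_algo a)
{
	static const char *const names[FIB4_ALGO_COUNT] = {
		"bsearch4", "radix4_lockless", "radix4", "dpdk_lpm4",
	};

	assert(a >= 0 && a < FIB4_ALGO_COUNT);
	return (names[a]);
}

/* Pick an algorithm from the route count, using the commit's size hints. */
enum fib4_algo
pick_ipv4_algo(size_t num_routes, bool have_dpdk_lpm)
{
	if (num_routes < 16)
		return (FIB4_BSEARCH4);
	if (num_routes < 1000)
		return (FIB4_RADIX4_LOCKLESS);
	/* Large table: prefer dpdk_lpm4 if present (assumed fallback: radix4). */
	return (have_dpdk_lpm ? FIB4_DPDK_LPM4 : FIB4_RADIX4);
}
```

Re-running such a function whenever the route count crosses a threshold is one way to get the "on-the-fly" behaviour the message mentions, at the cost of rebuilding the lookup structure on each switch.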

