History log of /freebsd/sys/net/route/route_var.h (Results 1 – 25 of 49)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: release/14.0.0
# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


Revision tags: release/13.2.0, release/12.4.0
# fe05d1dd 29-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: extend nhop(9) kpi

* add nhop_get_unlinked() used to prepare referenced but not
linked nexthop, that can later be used as a clone source.
* add nhop_check_gateway() to check for allowed ad

routing: extend nhop(9) kpi

* add nhop_get_unlinked() used to prepare referenced but not
linked nexthop, that can later be used as a clone source.
* add nhop_check_gateway() to check for allowed address family
combinations between the rib family and neighbor family (useful
for 4o6 or direct routes)
* add nhop_set_upper_family() to allow copying IPv6 nexthops to
IPv4 rib.
* add rt_get_rnd() wrapper, returning both nexthop/group and its
weight attached to the rtentry.
* Add CHT_SLIST_FOREACH_SAFE(), allowing to delete items during
iteration.

MFC after: 2 weeks

show more ...


# db4ca190 29-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add ability to store opaque indentifiers in nhops/nhgs

This is a pre-requisite for the direct nexthop/nexhop group operations
via netlink.

MFC after: 2 weeks


# 40503b79 07-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: populate fibs with interface routes after growing net.fibs.

Currently it is possible to extend number of fibs in runtime, but this
functionality is of limited use when net.add_addrs_all_fi

routing: populate fibs with interface routes after growing net.fibs.

Currently it is possible to extend number of fibs in runtime, but this
functionality is of limited use when net.add_addrs_all_fibs is
non-zero, as the routing tables are created empty.

This change automatically populate newly-created fibs with the kernel-originated
interface routes (filtered by RTF_PINNED flag) if net.add_addrs_all_fibs
is set.

```
-> sysctl net.add_addr_allfibs=1
net.add_addr_allfibs: 0 -> 1
-> sysctl net.fibs
net.fibs: 2
-> sysctl net.fibs=3
net.fibs: 2 -> 3

BEFORE:
-> setfib 2 netstat -rn
Routing tables (fib: 2)

AFTER:
-> setfib 2 netstat -rn
Routing tables (fib: 2)

Internet:
Destination Gateway Flags Netif Expire
10.0.0.0/24 link#1 U vtnet0
10.0.0.5 link#1 UHS lo0
127.0.0.1 link#2 UH lo0

Internet6:
Destination Gateway Flags Netif Expire
::1 link#2 UHS lo0
2a01:4f9:3a:fa00::/64 link#1 U vtnet0
2a01:4f9:3a:fa00:5054:ff:fe15:4a3b link#1 UHS lo0
fe80::%vtnet0/64 link#1 U vtnet0
fe80::5054:ff:fe15:4a3b%vtnet0 link#1 UHS lo0
fe80::%lo0/64 link#2 U lo0
fe80::1%lo0 link#2 UHS lo0
```

Differential Revision: https://reviews.freebsd.org/D36075
MFC after: 1 month

show more ...


# 5c4d2252 08-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: move rtentry and subscription code out of route_ctl.c

route_ctl.c size has grown considerably since initial introduction.
Factor out non-relevant parts:
* all rtentry logic, such as creatio

routing: move rtentry and subscription code out of route_ctl.c

route_ctl.c size has grown considerably since initial introduction.
Factor out non-relevant parts:
* all rtentry logic, such as creation/destruction and accessors
goes to net/route/route_rtentry.c
* all rtable subscription logic goes to net/route/route_subscription.c

Differential Revision: https://reviews.freebsd.org/D36074
MFC after: 1 month

show more ...


# dedeec11 03-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: refactor #2

* Use same filter func (rib_filter_f_t) for nexhtop groups to
simplify callbacks.
* simplify conditional route deletion & remove the need to pass
rt_addrinfo to the low-level

routing: refactor #2

* Use same filter func (rib_filter_f_t) for nexhtop groups to
simplify callbacks.
* simplify conditional route deletion & remove the need to pass
rt_addrinfo to the low-level deletion functions
* speedup rib_walk_del() by removing an additional per-prefix lookup

Differential Revision: https://reviews.freebsd.org/D36071
MFC after: 1 month

show more ...


# 0d60e88b 02-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: refactor control cmds #1

This and the follow-up routing-related changes target to remove or
reduce `struct rt_addrinfo` usage and use recently-landed nhop(9)
KPI instead.
Traditionally `r

routing: refactor control cmds #1

This and the follow-up routing-related changes target to remove or
reduce `struct rt_addrinfo` usage and use recently-landed nhop(9)
KPI instead.
Traditionally `rt_addrinfo` structure has been used to propagate all necessary
information between the protocol/rtsock and a routing layer. Many
functions inside routing subsystem uses it internally. However, using
this structure became somewhat complicated, as there are too many ways
of specifying a single state and verifying data consistency is hard.
For example, arerouting flgs consistent with mask/gateway sockaddr pointers?
Is mask really a host mask? Are sockaddr "valid" (e.g. properly zeroed, masked,
have proper length)? Are they mutable? Is the suggested interface specified
by the interface index embedded into the sockadd_dl gateway, or passed
as RTAX_IFP parameter, or directly provided by rti_ifp or it needs to
be derived from the ifa?
These (and other similar) questions have to be considered every time when
a function has `rt_addrinfo` pointer as an argument.

The new approach is to bring more control back to the protocols and
construct the desired routing objects themselves - in the end, it's the
protocol/subsystem who knows the desired outcome.

This specific diff changes the following:
* add explicit basic low-level radix operations:
add_route() (renamed from add_route_nhop())
delete_route() (factored from change_route_nhop())
change_route() (renamed from change_route_nhop)
* remove "info" parameter from change_route_conditional() as a part
of reducing rt_addrinfo usage in the internal KPIs
* add lookup_prefix_rt() wrapper for doing re-lookups after
RIB lock/unlock

Differential Revision: https://reviews.freebsd.org/D36070
MFC after: 2 weeks

show more ...


# ae6bfd12 01-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: refactor private KPI
* Make nhgrp_get_nhops() return const struct weightened_nhop to
indicate that the list is immutable
* Make nhgrp_get_group() return the actual group, instead of
group+

routing: refactor private KPI
* Make nhgrp_get_nhops() return const struct weightened_nhop to
indicate that the list is immutable
* Make nhgrp_get_group() return the actual group, instead of
group+weight.

MFC after: 2 weeks

show more ...


# 5c23343b 29-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: convert remnants of DPRINTF to FIB_CTL_LOG().

Convert the last remaining pieces of old-style debug messages
to the new debugging framework.

Differential Revision: https://reviews.freebsd.

routing: convert remnants of DPRINTF to FIB_CTL_LOG().

Convert the last remaining pieces of old-style debug messages
to the new debugging framework.

Differential Revision: https://reviews.freebsd.org/D35994
MFC after: 2 weeks

show more ...


# 800c6846 29-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add nhop(9) kpi.

Differential Revision: https://reviews.freebsd.org/D35985
MFC after: 1 month


# 29029b06 28-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: remove info argument from add/change_route_nhop().

Currently, rt_addrinfo(info) serves as a main "transport" moving
state between various functions inside the routing subsystem.
As all of

routing: remove info argument from add/change_route_nhop().

Currently, rt_addrinfo(info) serves as a main "transport" moving
state between various functions inside the routing subsystem.
As all of the fields are filled in directly by the customers, it
is problematic to maintain consistency, resulting in repeated checks
inside many functions. Additionally, there are multiple ways of
specifying the same value (RTAX_IFP vs rti_ifp / rti_ifa) and so on.
With the upcoming nhop(9) kpi it is possible to store all of the
required state in the nexthops in the consistent fashion, reducing the
need to use "info" in the KPI calls.
Finally, rt_addrinfo structure format was derived from the rtsock wire
format, which is different from other kernel routing users or netlink.

This cleanup simplifies upcoming nhop(9) kpi and netlink introduction.

Reviewed by: zlei.huang@gmail.com
Differential Revision: https://reviews.freebsd.org/D35972
MFC after: 2 weeks

show more ...


# 2717e958 28-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: move route expiration time to its nexthop

Expiration time is actually a path property, not a route property.
Move its storage to nexthop to simplify upcoming nhop(9) KPI changes
and netlin

routing: move route expiration time to its nexthop

Expiration time is actually a path property, not a route property.
Move its storage to nexthop to simplify upcoming nhop(9) KPI changes
and netlink introduction.

Differential Revision: https://reviews.freebsd.org/D35970
MFC after: 2 weeks

show more ...


Revision tags: release/13.1.0, release/12.3.0
# 8a0d57ba 25-Apr-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

[fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.

Currently, most of the rib(9) KPI does not use rnh pointers, using
fibnum and family parameters to determine the rib po

[fib algo] Delay algo init at fib growth to to allow to reliably use rib KPI.

Currently, most of the rib(9) KPI does not use rnh pointers, using
fibnum and family parameters to determine the rib pointer instead.
This works well except for the case when we initialize new rib pointers
during fib growth.
In that case, there is no mapping between fib/family and the new rib,
as an entirely new rib pointer array is populated.

Address this by delaying fib algo initialization till after switching
to the new pointer array and updating the number of fibs.
Set datapath pointer to the dummy function, so the potential callers
won't crash the kernel in the brief moment when the rib exists, but
no fib algo is attached.

This change allows to avoid creating duplicates of existing rib functions,
with altered signature.

Differential Revision: https://reviews.freebsd.org/D29969
MFC after: 1 week

show more ...


# bc5ef45a 27-Apr-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix drace CTF for the rib_head.

33cb3cb2e321 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
of the files including `route_var.h` do

Fix drace CTF for the rib_head.

33cb3cb2e321 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
of the files including `route_var.h` do not have `fib_algo`
defined.

Make dtrace happy by making the field unconditional.

Suggested by: markj

show more ...


# 33cb3cb2 17-Apr-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix rib generation count for fib algo.

Currently, PCB caching mechanism relies on the rib generation
counter (rnh_gen) to invalidate cached nhops/LLE entries.

With certain fib algorithms, it is no

Fix rib generation count for fib algo.

Currently, PCB caching mechanism relies on the rib generation
counter (rnh_gen) to invalidate cached nhops/LLE entries.

With certain fib algorithms, it is now possible that the
datapath lookup state applies RIB changes with some delay.
In that scenario, PCB cache will invalidate on the RIB change,
but the new lookup may result in the same nexthop being returned.
When fib algo finally gets in sync with the RIB changes, PCB cache
will not receive any notification and will end up caching the stale data.

To fix this, introduce additional counter, rnh_gen_rib, which is used
only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time when fib algo
synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.

Differential Revision: https://reviews.freebsd.org/D29812
Reviewed by: donner
MFC after: 2 weeks

show more ...


Revision tags: release/13.0.0
# d68cf57b 07-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Refactor rt_addrmsg() and rt_routemsg().

Summary:
* Refactor rt_addrmsg(): make V_rt_add_addr_allfibs decision locally.
* Fix rt_routemsg() and multipath by accepting nexthop instead of interface po

Refactor rt_addrmsg() and rt_routemsg().

Summary:
* Refactor rt_addrmsg(): make V_rt_add_addr_allfibs decision locally.
* Fix rt_routemsg() and multipath by accepting nexthop instead of interface pointer.
* Refactor rtsock_routemsg(): avoid accessing rtentry fields directly.
* Simplify in_addprefix() by moving prefix search to a separate function.

Reviewers: #network

Subscribers: imp, ae, bz

Differential Revision: https://reviews.freebsd.org/D28011

show more ...


# 833dbf1e 28-Dec-2020 Ryan Libby <rlibby@FreeBSD.org>

route: quiet -Wredundant-decls

Remove declaration duplicated in
f5baf8bb12f39d0e8d64508c47eb6c4386ef716d

Reviewed by: melifaro
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.f

route: quiet -Wredundant-decls

Remove declaration duplicated in
f5baf8bb12f39d0e8d64508c47eb6c4386ef716d

Reviewed by: melifaro
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27790

show more ...


# f5baf8bb 25-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add modular fib lookup framework.

This change introduces framework that allows to dynamically
attach or detach longest prefix match (lpm) lookup algorithms
to speed up datapath route tables lookup

Add modular fib lookup framework.

This change introduces framework that allows to dynamically
attach or detach longest prefix match (lpm) lookup algorithms
to speed up datapath route tables lookups.

Framework takes care of handling initial synchronisation,
route subscription, nhop/nhop groups reference and indexing,
dataplane attachments and fib instance algorithm setup/teardown.
Framework features automatic algorithm selection, allowing for
picking the best matching algorithm on-the-fly based on the
amount of routes in the routing table.

Currently framework code is guarded under FIB_ALGO config option.
An idea is to enable it by default in the next couple of weeks.

The following algorithms are provided by default:
IPv4:
* bsearch4 (lockless binary search in a special IP array), tailored for
small-fib (<16 routes)
* radix4_lockless (lockless immutable radix, re-created on every rtable change),
tailored for small-fib (<1000 routes)
* radix4 (base system radix backend)
* dpdk_lpm4 (DPDK DIR24-8-based lookups), lockless datastrucure, optimized
for large-fib (D27412)
IPv6:
* radix6_lockless (lockless immutable radix, re-created on every rtable change),
tailed for small-fib (<1000 routes)
* radix6 (base system radix backend)
* dpdk_lpm6 (DPDK DIR24-8-based lookups), lockless datastrucure, optimized
for large-fib (D27412)

Performance changes:
Micro benchmarks (I7-7660U, single-core lookups, 2048k dst, code in D27604):
IPv4:
8 routes:
radix4: ~20mpps
radix4_lockless: ~24.8mpps
bsearch4: ~69mpps
dpdk_lpm4: ~67 mpps
700k routes:
radix4_lockless: 3.3mpps
dpdk_lpm4: 46mpps

IPv6:
8 routes:
radix6_lockless: ~20mpps
dpdk_lpm6: ~70mpps
100k routes:
radix6_lockless: 13.9mpps
dpdk_lpm6: 57mpps

Forwarding benchmarks:
+ 10-15% IPv4 forwarding performance (small-fib, bsearch4)
+ 25% IPv4 forwarding performance (full-view, dpdk_lpm4)
+ 20% IPv6 forwarding performance (full-view, dpdk_lpm6)

Control:
Framwork adds the following runtime sysctls:

List algos
* net.route.algo.inet.algo_list: bsearch4, radix4_lockless, radix4
* net.route.algo.inet6.algo_list: radix6_lockless, radix6, dpdk_lpm6
Debug level (7=LOG_DEBUG, per-route)
net.route.algo.debug_level: 5
Algo selection (currently only for fib 0):
net.route.algo.inet.algo: bsearch4
net.route.algo.inet6.algo: radix6_lockless

Support for manually changing algos in non-default fib will be added
soon. Some sysctl names will be changed in the near future.

Differential Revision: https://reviews.freebsd.org/D27401

show more ...


# df905392 03-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add IPv4/IPv6 rtentry prefix accessors.

Multiple consumers like ipfw, netflow or new route lookup algorithms
need to get the prefix data out of struct rtentry.
Instead of providing direct access to

Add IPv4/IPv6 rtentry prefix accessors.

Multiple consumers like ipfw, netflow or new route lookup algorithms
need to get the prefix data out of struct rtentry.
Instead of providing direct access to the rtentry, create IPv4/IPv6
accessors to abstract struct rtentry internals and avoid including
internal routing headers for external consumers.

While here, move struct route_nhop_data to the public header, so external
customers can actually use lookup functions returning rt&nhop data.

Differential Revision: https://reviews.freebsd.org/D27416

show more ...


# f47fa260 29-Nov-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add nhop_ref_any() to unify referencing nhop or nexthop group.

It allows code within routing subsystem to transparently reference nexthops
and nexthop groups, similar to nhop_free_any(), abstractin

Add nhop_ref_any() to unify referencing nhop or nexthop group.

It allows code within routing subsystem to transparently reference nexthops
and nexthop groups, similar to nhop_free_any(), abstracting ROUTE_MPATH
details.

Differential Revision: https://reviews.freebsd.org/D27410

show more ...


# 98d5c4e5 29-Nov-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add tracking for rib/nhops/nhgrp objects and provide cumulative number accessors.

The resulting KPI can be used by routing table consumers to estimate the required
scale for route table export.

*

Add tracking for rib/nhops/nhgrp objects and provide cumulative number accessors.

The resulting KPI can be used by routing table consumers to estimate the required
scale for route table export.

* Add tracking for rib routes
* Add accessors for number of nexthops/nexthop objects
* Simplify rib_unsubscribe: store rnh we're attached to instead of requiring it up
again on destruction. This helps in the cases when rnh is not linked yet/already unlinked.

Differential Revision: https://reviews.freebsd.org/D27404

show more ...


# ef6ef7e5 28-Nov-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add nhgrp_get_idx() as a counterpart for nhop_get_idx().

It allows the routing-related code to reference nexthop groups by index
instead of storing a pointer.


Revision tags: release/12.2.0
# 0c325f53 18-Oct-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Implement flowid calculation for outbound connections to balance
connections over multiple paths.

Multipath routing relies on mbuf flowid data for both transit
and outbound traffic. Current code f

Implement flowid calculation for outbound connections to balance
connections over multiple paths.

Multipath routing relies on mbuf flowid data for both transit
and outbound traffic. Current code fills mbuf flowid from inp_flowid
for connection-oriented sockets. However, inp_flowid is currently
not calculated for outbound connections.

This change creates simple hashing functions and starts calculating hashes
for TCP,UDP/UDP-Lite and raw IP if multipath routes are present in the
system.

Reviewed by: glebius (previous version),ae
Differential Revision: https://reviews.freebsd.org/D26523

show more ...


# fedeb08b 03-Oct-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Introduce scalable route multipath.

This change is based on the nexthop objects landed in D24232.

The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops

Introduce scalable route multipath.

This change is based on the nexthop objects landed in D24232.

The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops with their
relative weights and a dataplane-optimized structure to enable
efficient nexthop selection.

Simular to the nexthops, nexthop groups are immutable. Dataplane part
gets compiled during group creation and is basically an array of
nexthop pointers, compiled w.r.t their weights.

With this change, `rt_nhop` field of `struct rtentry` contains either
nexthop or nexthop group. They are distinguished by the presense of
NHF_MULTIPATH flag.
All dataplane lookup functions returns pointer to the nexthop object,
leaving nexhop groups details inside routing subsystem.

User-visible changes:

The change is intended to be backward-compatible: all non-mpath operations
should work as before with ROUTE_MPATH and net.route.multipath=1.

All routes now comes with weight, default weight is 1, maximum is 2^24-1.

Current maximum multipath group width is statically set to 64.
This will become sysctl-tunable in the followup changes.

Using functionality:
* Recompile kernel with ROUTE_MPATH
* set net.route.multipath to 1

route add -6 2001:db8::/32 2001:db8::2 -weight 10
route add -6 2001:db8::/32 2001:db8::3 -weight 20

netstat -6On

Nexthop groups data

Internet6:
GrpIdx NhIdx Weight Slots Gateway Netif Refcnt
1 ------- ------- ------- --------------------------------------- --------- 1
13 10 1 2001:db8::2 vlan2
14 20 2 2001:db8::3 vlan2

Next steps:
* Land outbound hashing for locally-originated routes ( D26523 ).
* Fix net/bird multipath (net/frr seems to work fine)
* Add ROUTE_MPATH to GENERIC
* Set net.route.multipath=1 by default

Tested by: olivier
Reviewed by: glebius
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D26449

show more ...


# 2259a030 21-Sep-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Rework part of routing code to reduce difference to D26449.

* Split rt_setmetrics into get_info_weight() and rt_set_expire_info(),
as these two can be applied at different entities and at different

Rework part of routing code to reduce difference to D26449.

* Split rt_setmetrics into get_info_weight() and rt_set_expire_info(),
as these two can be applied at different entities and at different times.
* Start filling route weight in route change notifications
* Pass flowid to UDP/raw IP route lookups
* Rework nd6_subscription_cb() and sysctl_dumpentry() to prepare for the fact
that rtentry can contain multiple nexthops.

Differential Revision: https://reviews.freebsd.org/D26497

show more ...


12