.\"
.\" Copyright (C) 2022 Alexander Chernikov <melifaro@FreeBSD.org>.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd November 1, 2022
.Dt RTNETLINK 4
.Os
.Sh NAME
.Nm RTNetlink
.Nd Network configuration-specific Netlink family
.Sh SYNOPSIS
.In netlink/netlink.h
.In netlink/netlink_route.h
.Ft int
.Fn socket AF_NETLINK SOCK_RAW NETLINK_ROUTE
.Sh DESCRIPTION
The
.Dv NETLINK_ROUTE
family aims to be the primary configuration mechanism for all
network-related tasks.
Currently it supports configuring interfaces, interface addresses, routes,
nexthops and ARP/NDP neighbors.
.Sh ROUTES
All route configuration messages share the common header:
.Bd -literal
struct rtmsg {
	unsigned char	rtm_family;	/* Address family */
	unsigned char	rtm_dst_len;	/* Prefix length */
	unsigned char	rtm_src_len;	/* Deprecated, set to 0 */
	unsigned char	rtm_tos;	/* Type of service (not used) */
	unsigned char	rtm_table;	/* Deprecated, set to 0 */
	unsigned char	rtm_protocol;	/* Routing protocol id (RTPROT_) */
	unsigned char	rtm_scope;	/* Route distance (RT_SCOPE_) */
	unsigned char	rtm_type;	/* Route type (RTN_) */
	unsigned 	rtm_flags;	/* Route flags (not supported) */
};
.Ed
.Pp
The
.Va rtm_family
field specifies the route family to be operated on.
Currently,
.Dv AF_INET6
and
.Dv AF_INET
are the only supported families.
The route prefix length is stored in
.Va rtm_dst_len .
The caller should set the originator identity (one of the
.Dv RTPROT_
values) in
.Va rtm_protocol .
It is useful for users and for the application itself, allowing for easy
identification of self-originated routes.
The route scope has to be set via the
.Va rtm_scope
field.
The supported values are:
.Bd -literal -offset indent -compact
RT_SCOPE_UNIVERSE	Global scope
RT_SCOPE_LINK		Link scope
.Ed
.Pp
The route type needs to be set in
.Va rtm_type .
The defined values are:
.Bd -literal -offset indent -compact
RTN_UNICAST	Unicast route
RTN_MULTICAST	Multicast route
RTN_BLACKHOLE	Drops traffic towards destination
RTN_PROHIBIT	Drops traffic and sends reject
.Ed
.Pp
The following messages are supported:
.Ss RTM_NEWROUTE
Adds a new route.
All NL flags are supported.
Extending a multipath route requires the
.Dv NLM_F_APPEND
flag.
.Ss RTM_DELROUTE
Tries to delete a route.
The route is specified using a combination of the
.Dv RTA_DST
TLV and
.Va rtm_dst_len .
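.Pp
As a rough illustration of how a TLV such as
.Dv RTA_DST
is laid out after the
.Vt struct rtmsg
header, the C sketch below appends one attribute to a message buffer.
It assumes the usual Netlink attribute encoding: a 4-byte header holding
the length and the type, followed by the payload, padded to a 4-byte boundary.
The names
.Vt struct tlv_hdr ,
.Dv TLV_ALIGN
and
.Fn tlv_append
are hypothetical helpers for this example, not part of any system header.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of the 4-byte TLV header; not a system type. */
struct tlv_hdr {
	uint16_t	len;	/* header plus payload, without padding */
	uint16_t	type;	/* attribute type, for example RTA_DST */
};

/* Round a length up to the assumed 4-byte TLV alignment. */
#define	TLV_ALIGN(n)	(((size_t)(n) + 3) & ~(size_t)3)

/*
 * Append one TLV at offset off in buf and return the new offset,
 * or 0 if the TLV does not fit.
 */
static size_t
tlv_append(uint8_t *buf, size_t bufsz, size_t off, uint16_t type,
    const void *data, uint16_t datalen)
{
	struct tlv_hdr hdr;
	size_t total;

	assert(off % 4 == 0);
	if (datalen > UINT16_MAX - sizeof(hdr))
		return (0);
	total = TLV_ALIGN(sizeof(hdr) + datalen);
	if (off > bufsz || total > bufsz - off)
		return (0);
	hdr.len = (uint16_t)(sizeof(hdr) + datalen);
	hdr.type = type;
	memset(buf + off, 0, total);	/* also zeroes the padding bytes */
	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), data, datalen);
	return (off + total);
}
```
.Ed
.Pp
For an IPv4 route the payload would be the 4-byte destination address, while
the prefix length travels separately in
.Va rtm_dst_len .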
.Ss RTM_GETROUTE
Fetches a single route or all routes in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each route is reported as an
.Dv RTM_NEWROUTE
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
rtm_family	required family or AF_UNSPEC
RTA_TABLE	fib number or RT_TABLE_UNSPEC to return all fibs
.Ed
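.Pp
A request of this kind can be sketched as a generic Netlink message header
followed by
.Vt struct rtmsg .
The example below assumes the 16-byte header layout described in
.Xr netlink 4 ;
.Vt struct nlmsghdr_sketch ,
.Vt struct rtmsg_sketch
and
.Fn build_route_request
are hypothetical local names, and the caller is expected to pass
.Dv RTM_GETROUTE
and the desired flags, such as
.Dv NLM_F_REQUEST
combined with
.Dv NLM_F_DUMP ,
taken from the system headers.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the generic Netlink message header. */
struct nlmsghdr_sketch {
	uint32_t	nlmsg_len;	/* total length including header */
	uint16_t	nlmsg_type;	/* message type */
	uint16_t	nlmsg_flags;	/* request flags */
	uint32_t	nlmsg_seq;	/* sequence number */
	uint32_t	nlmsg_pid;	/* sender port id */
};

/* Local copy of the route header shown in this page. */
struct rtmsg_sketch {
	unsigned char	rtm_family;
	unsigned char	rtm_dst_len;
	unsigned char	rtm_src_len;
	unsigned char	rtm_tos;
	unsigned char	rtm_table;
	unsigned char	rtm_protocol;
	unsigned char	rtm_scope;
	unsigned char	rtm_type;
	unsigned	rtm_flags;
};

/*
 * Write a route request without TLVs into buf.
 * Returns the message length, or 0 if buf is too small.
 */
static size_t
build_route_request(uint8_t *buf, size_t bufsz, uint16_t type,
    uint16_t flags, uint8_t family, uint32_t seq)
{
	struct nlmsghdr_sketch hdr;
	struct rtmsg_sketch rtm;
	size_t total = sizeof(hdr) + sizeof(rtm);

	if (bufsz < total)
		return (0);
	memset(&hdr, 0, sizeof(hdr));
	memset(&rtm, 0, sizeof(rtm));
	hdr.nlmsg_len = (uint32_t)total;
	hdr.nlmsg_type = type;
	hdr.nlmsg_flags = flags;
	hdr.nlmsg_seq = seq;
	rtm.rtm_family = family;	/* AF_UNSPEC selects all families */
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), &rtm, sizeof(rtm));
	return (total);
}
```
.Ed
.Pp
The resulting buffer would then be written to a socket opened as shown in
the SYNOPSIS; restricting the dump to one fib would additionally require an
.Dv RTA_TABLE
TLV after the header.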
.Ss TLVs
.Bl -tag -width indent
.It Dv RTA_DST
(binary) IPv4/IPv6 address, depending on the
.Va rtm_family .
.It Dv RTA_OIF
(uint32_t) Transmit interface index.
.It Dv RTA_GATEWAY
(binary) IPv4/IPv6 gateway address, depending on the
.Va rtm_family .
.It Dv RTA_METRICS
(nested) Container attribute, listing route properties.
The only supported sub-attribute is
.Dv RTAX_MTU ,
which stores the path MTU as a uint32_t.
.It Dv RTA_MULTIPATH
This attribute contains multipath route nexthops with their weights.
These nexthops are represented as a sequence of
.Va rtnexthop
structures, each followed by
.Dv RTA_GATEWAY
or
.Dv RTA_VIA
attributes.
.Bd -literal
struct rtnexthop {
	unsigned short		rtnh_len;
	unsigned char		rtnh_flags;
	unsigned char		rtnh_hops;	/* nexthop weight */
	int			rtnh_ifindex;
};
.Ed
.Pp
The
.Va rtnh_len
field specifies the total nexthop info length, including both
.Va struct rtnexthop
and the following TLVs.
The
.Va rtnh_hops
field stores the relative nexthop weight, used for load balancing between
group members.
The
.Va rtnh_ifindex
field contains the index of the transmit interface.
.Pp
The following TLVs can follow the structure:
.Bd -literal -offset indent -compact
RTA_GATEWAY	IPv4/IPv6 nexthop address of the gateway
RTA_VIA		IPv6 nexthop address for IPv4 route
RTA_KNH_ID	Kernel-specific index of the nexthop
.Ed
.It Dv RTA_KNH_ID
(uint32_t) (FreeBSD-specific) Auto-allocated kernel index of the nexthop.
.It Dv RTA_RTFLAGS
(uint32_t) (FreeBSD-specific) rtsock route flags.
.It Dv RTA_TABLE
(uint32_t) Fib number of the route.
The default route table is
.Dv RT_TABLE_MAIN .
To explicitly specify "all tables" one needs to set the value to
.Dv RT_TABLE_UNSPEC .
.It Dv RTA_EXPIRES
(uint32_t) Seconds until path expiration.
.It Dv RTA_NH_ID
(uint32_t) Userland nexthop or nexthop group index.
.El
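.Pp
The
.Dv RTA_MULTIPATH
payload described above can be walked by stepping from one
.Va rtnexthop
structure to the next using
.Va rtnh_len .
The following C sketch counts the nexthops and sums their weights.
It assumes that each entry starts on a 4-byte boundary, as Netlink TLVs do;
.Vt struct rtnexthop_sketch
and
.Fn mpath_walk
are hypothetical names used only here.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the layout shown above, renamed to avoid header clashes. */
struct rtnexthop_sketch {
	unsigned short	rtnh_len;	/* structure plus following TLVs */
	unsigned char	rtnh_flags;
	unsigned char	rtnh_hops;	/* nexthop weight */
	int		rtnh_ifindex;
};

/*
 * Count the nexthops in an RTA_MULTIPATH payload and sum their weights.
 * Returns -1 if a length field is inconsistent with the payload size.
 */
static int
mpath_walk(const uint8_t *payload, size_t len, unsigned *weight_sum)
{
	struct rtnexthop_sketch nh;
	size_t off = 0, step;
	int count = 0;

	assert(weight_sum != NULL);
	*weight_sum = 0;
	while (len - off >= sizeof(nh)) {
		memcpy(&nh, payload + off, sizeof(nh));
		if (nh.rtnh_len < sizeof(nh) || nh.rtnh_len > len - off)
			return (-1);
		*weight_sum += nh.rtnh_hops;
		count++;
		/* Assumed: the next entry starts at the aligned length. */
		step = ((size_t)nh.rtnh_len + 3) & ~(size_t)3;
		if (step > len - off)
			break;	/* last entry without trailing padding */
		off += step;
	}
	return (count);
}
```
.Ed
.Pp
The TLVs of each nexthop occupy the bytes between the end of the structure
and
.Va rtnh_len ,
and could be parsed with an ordinary attribute loop.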
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_IPV4_ROUTE	Notifies on IPv4 route arrival/removal/change
RTNLGRP_IPV6_ROUTE	Notifies on IPv6 route arrival/removal/change
.Ed
.Sh NEXTHOPS
All nexthop/nexthop group configuration messages share the common header:
.Bd -literal
struct nhmsg {
	unsigned char	nh_family;	/* transport family */
	unsigned char	nh_scope;	/* ignored on RX, filled by kernel */
	unsigned char	nh_protocol;	/* Routing protocol that installed nh */
	unsigned char	resvd;
	unsigned int	nh_flags;	/* RTNH_F_* flags from route.h */
};
.Ed
.Pp
The
.Va nh_family
field specifies the gateway address family.
It can be different from the route address family for IPv4 routes with IPv6
nexthops.
The
.Va nh_protocol
field is similar to the
.Va rtm_protocol
field and designates the originator application identity.
.Pp
The following messages are supported:
.Ss RTM_NEWNEXTHOP
Creates a new nexthop or nexthop group.
.Ss RTM_DELNEXTHOP
Deletes a nexthop or nexthop group.
The required object is specified by the
.Dv RTA_NH_ID
attribute.
.Ss RTM_GETNEXTHOP
Fetches a single nexthop or all nexthops/nexthop groups, depending on the
.Dv NLM_F_DUMP
flag.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
RTA_NH_ID	nexthop or nexthop group id
NHA_GROUPS	match only nexthop groups
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv RTA_NH_ID
(uint32_t) Nexthop index used to identify a particular nexthop or nexthop group.
Should be provided by userland at nexthop creation time.
.It Dv NHA_GROUP
This attribute designates the nexthop group and contains all of its nexthops
and their relative weights.
The attribute consists of a list of
.Va nexthop_grp
structures:
.Bd -literal
struct nexthop_grp {
	uint32_t	id;		/* nexthop userland index */
	uint8_t		weight;		/* weight of this nexthop */
	uint8_t		resvd1;
	uint16_t	resvd2;
};
.Ed
.It Dv NHA_GROUP_TYPE
(uint16_t) Nexthop group type, set to one of the following types:
.Bd -literal -offset indent -compact
NEXTHOP_GRP_TYPE_MPATH	default multipath group
.Ed
.It Dv NHA_BLACKHOLE
(flag) Marks the nexthop as blackhole.
.It Dv NHA_OIF
(uint32_t) Transmit interface index of the nexthop.
.It Dv NHA_GATEWAY
(binary) IPv4/IPv6 gateway address.
.It Dv NHA_GROUPS
(flag) Matches nexthop groups during dump.
.El
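.Pp
Since the
.Dv NHA_GROUP
payload is a plain array of
.Va nexthop_grp
structures, it can be prepared as in the following C sketch.
The structure is copied from the definition above under a different name;
.Vt struct nexthop_grp_sketch
and
.Fn nhgrp_fill
are hypothetical and exist only for this example.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of struct nexthop_grp as documented above. */
struct nexthop_grp_sketch {
	uint32_t	id;		/* nexthop userland index */
	uint8_t		weight;		/* weight of this nexthop */
	uint8_t		resvd1;
	uint16_t	resvd2;
};

/*
 * Fill grp with n members and return the attribute payload length
 * in bytes, or 0 if the array cannot hold n entries.
 */
static size_t
nhgrp_fill(struct nexthop_grp_sketch *grp, size_t max,
    const uint32_t *ids, const uint8_t *weights, size_t n)
{
	size_t i;

	assert(grp != NULL && ids != NULL && weights != NULL);
	if (n == 0 || n > max)
		return (0);
	memset(grp, 0, n * sizeof(*grp));	/* reserved fields stay zero */
	for (i = 0; i < n; i++) {
		grp[i].id = ids[i];
		grp[i].weight = weights[i];
	}
	return (n * sizeof(*grp));
}
```
.Ed
.Pp
The returned length is what would go into the TLV header, after adding the
size of the header itself.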
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_NEXTHOP		Notifies on nexthop/groups arrival/removal/change
.Ed
.Sh INTERFACES
All interface configuration messages share the common header:
.Bd -literal
struct ifinfomsg {
	unsigned char	ifi_family;	/* not used, set to 0 */
	unsigned char	__ifi_pad;
	unsigned short	ifi_type;	/* ARPHRD_* */
	int		ifi_index;	/* Interface index */
	unsigned	ifi_flags;	/* IFF_* flags */
	unsigned	ifi_change;	/* IFF_* change mask */
};
.Ed
.Ss RTM_NEWLINK
Creates a new interface.
The only mandatory TLV is
.Dv IFLA_IFNAME .
The following attributes are returned inside the nested
.Dv NLMSGERR_ATTR_COOKIE :
.Pp
.Bd -literal -offset indent -compact
IFLA_NEW_IFINDEX	(uint32_t) created interface index
IFLA_IFNAME		(string) created interface name
.Ed
.Ss RTM_DELLINK
Deletes the interface specified by
.Dv IFLA_IFNAME .
.Ss RTM_GETLINK
Fetches a single interface or all interfaces in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each interface is reported as a
.Dv RTM_NEWLINK
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ifi_index	interface index
IFLA_IFNAME	interface name
IFLA_ALT_IFNAME	interface name
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv IFLA_ADDRESS
(binary) Link-level interface address (MAC).
.It Dv IFLA_BROADCAST
(binary) (readonly) Link-level broadcast address.
.It Dv IFLA_IFNAME
(string) New interface name.
.It Dv IFLA_IFALIAS
(string) Interface description.
.It Dv IFLA_LINK
(uint32_t) (readonly) Interface index.
.It Dv IFLA_MASTER
(uint32_t) Parent interface index.
.It Dv IFLA_LINKINFO
(nested) Interface type-specific attributes:
.Bd -literal -offset indent -compact
IFLA_INFO_KIND		(string) interface type ("vlan")
IFLA_INFO_DATA		(nested) custom attributes
.Ed
The following types and attributes are supported:
.Bl -tag -width indent
.It Dv vlan
.Bd -literal -offset indent -compact
IFLA_VLAN_ID		(uint16_t) 802.1Q VLAN ID
IFLA_VLAN_PROTOCOL	(uint16_t) Protocol: ETHERTYPE_VLAN or ETHERTYPE_QINQ
.Ed
.El
.It Dv IFLA_OPERSTATE
(uint8_t) Interface operational state per RFC 2863.
Can be one of the following:
.Bd -literal -offset indent -compact
IF_OPER_UNKNOWN		status cannot be determined
IF_OPER_NOTPRESENT	some (hardware) component not present
IF_OPER_DOWN		down
IF_OPER_LOWERLAYERDOWN	some lower-level interface is down
IF_OPER_TESTING		in some test mode
IF_OPER_DORMANT		"up" but waiting for some condition (802.1X)
IF_OPER_UP		ready to pass packets
.Ed
.It Dv IFLA_STATS64
(readonly) Consists of the following 64-bit counters structure:
.Bd -literal
struct rtnl_link_stats64 {
	uint64_t rx_packets;	/* total RX packets (IFCOUNTER_IPACKETS) */
	uint64_t tx_packets;	/* total TX packets (IFCOUNTER_OPACKETS) */
	uint64_t rx_bytes;	/* total RX bytes (IFCOUNTER_IBYTES) */
	uint64_t tx_bytes;	/* total TX bytes (IFCOUNTER_OBYTES) */
	uint64_t rx_errors;	/* RX errors (IFCOUNTER_IERRORS) */
	uint64_t tx_errors;	/* TX errors (IFCOUNTER_OERRORS) */
	uint64_t rx_dropped;	/* RX drop (no space in ring/no bufs) (IFCOUNTER_IQDROPS) */
	uint64_t tx_dropped;	/* TX drop (IFCOUNTER_OQDROPS) */
	uint64_t multicast;	/* RX multicast packets (IFCOUNTER_IMCASTS) */
	uint64_t collisions;	/* not supported */
	uint64_t rx_length_errors;	/* not supported */
	uint64_t rx_over_errors;	/* not supported */
	uint64_t rx_crc_errors;		/* not supported */
	uint64_t rx_frame_errors;	/* not supported */
	uint64_t rx_fifo_errors;	/* not supported */
	uint64_t rx_missed_errors;	/* not supported */
	uint64_t tx_aborted_errors;	/* not supported */
	uint64_t tx_carrier_errors;	/* not supported */
	uint64_t tx_fifo_errors;	/* not supported */
	uint64_t tx_heartbeat_errors;	/* not supported */
	uint64_t tx_window_errors;	/* not supported */
	uint64_t rx_compressed;		/* not supported */
	uint64_t tx_compressed;		/* not supported */
	uint64_t rx_nohandler;	/* dropped due to no proto handler (IFCOUNTER_NOPROTO) */
};
.Ed
.El
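.Pp
When printing the
.Dv IFLA_OPERSTATE
value, a small lookup table may be convenient.
The C sketch below assumes that the
.Dv IF_OPER_
constants are numbered 0 through 6 in the order listed above; this numbering
is an assumption of the example and should be checked against the system
headers.
.Fn operstate_name
is a hypothetical helper.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Map an IFLA_OPERSTATE value to a short name.
 * Assumed numbering: IF_OPER_UNKNOWN is 0 and IF_OPER_UP is 6.
 */
static const char *
operstate_name(uint8_t state)
{
	static const char *const names[] = {
		"unknown",		/* IF_OPER_UNKNOWN */
		"notpresent",		/* IF_OPER_NOTPRESENT */
		"down",			/* IF_OPER_DOWN */
		"lowerlayerdown",	/* IF_OPER_LOWERLAYERDOWN */
		"testing",		/* IF_OPER_TESTING */
		"dormant",		/* IF_OPER_DORMANT */
		"up",			/* IF_OPER_UP */
	};

	if (state >= sizeof(names) / sizeof(names[0]))
		return ("invalid");
	return (names[state]);
}
```
.Ed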
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_LINK		Notifies on interface arrival/removal/change
.Ed
.Sh INTERFACE ADDRESSES
All interface address configuration messages share the common header:
.Bd -literal
struct ifaddrmsg {
	uint8_t		ifa_family;	/* Address family */
	uint8_t		ifa_prefixlen;	/* Prefix length */
	uint8_t		ifa_flags;	/* Address-specific flags */
	uint8_t		ifa_scope;	/* Address scope */
	uint32_t	ifa_index;	/* Link ifindex */
};
.Ed
.Pp
The
.Va ifa_family
specifies the address family of the interface address.
The
.Va ifa_prefixlen
specifies the prefix length if applicable for the address family.
The
.Va ifa_index
specifies the interface index of the target interface.
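.Pp
Because
.Va ifa_prefixlen
carries a prefix length rather than a netmask, a consumer that needs the
IPv4 mask has to derive it.
A minimal C sketch follows;
.Fn prefixlen_to_mask4
is a hypothetical helper, and the result is in host byte order.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IPv4 prefix length (0..32) to a netmask in host byte order.
 * The zero case is handled separately because shifting a 32-bit value
 * by 32 bits is undefined in C.
 */
static uint32_t
prefixlen_to_mask4(uint8_t prefixlen)
{
	assert(prefixlen <= 32);
	if (prefixlen == 0)
		return (0);
	return ((uint32_t)(UINT32_MAX << (32 - prefixlen)));
}
```
.Ed
.Pp
For example, a prefix length of 24 yields 0xffffff00, which corresponds to
255.255.255.0.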
.Ss RTM_NEWADDR
Not supported.
.Ss RTM_DELADDR
Not supported.
.Ss RTM_GETADDR
Fetches interface addresses in the current VNET matching conditions.
Each address is reported as a
.Dv RTM_NEWADDR
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ifa_family	required family or AF_UNSPEC
ifa_index	matching interface index or 0
.Ed
.Ss TLVs
.Bl -tag -width indent
.It Dv IFA_ADDRESS
(binary) Masked interface address or destination address for p2p interfaces.
.It Dv IFA_LOCAL
(binary) Local interface address.
Set for IPv4 and p2p addresses.
.It Dv IFA_LABEL
(string) Interface name.
.It Dv IFA_BROADCAST
(binary) Broadcast interface address.
.El
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_IPV4_IFADDR	Notifies on IPv4 ifaddr arrival/removal/change
RTNLGRP_IPV6_IFADDR	Notifies on IPv6 ifaddr arrival/removal/change
.Ed
.Sh NEIGHBORS
All neighbor configuration messages share the common header:
.Bd -literal
struct ndmsg {
	uint8_t		ndm_family;
	uint8_t		ndm_pad1;
	uint16_t	ndm_pad2;
	int32_t		ndm_ifindex;
	uint16_t	ndm_state;
	uint8_t		ndm_flags;
	uint8_t		ndm_type;
};
.Ed
.Pp
The
.Va ndm_family
field specifies the address family (IPv4 or IPv6) of the neighbor.
The
.Va ndm_ifindex
specifies the interface to operate on.
The
.Va ndm_state
represents the entry state according to the neighbor model.
The state can be one of the following:
.Bd -literal -offset indent -compact
NUD_INCOMPLETE		No lladdr, address resolution in progress
NUD_REACHABLE		reachable & recently resolved
NUD_STALE		has lladdr but it's stale
NUD_DELAY		has lladdr, is stale, probes delayed
NUD_PROBE		has lladdr, is stale, probes sent
NUD_FAILED		unused
.Ed
.Pp
The
.Va ndm_flags
field stores the options specific to this entry.
Available flags:
.Bd -literal -offset indent -compact
NTF_SELF		local station (LLE_IFADDR)
NTF_PROXY		proxy entry (LLE_PUB)
NTF_STICKY		permanent entry (LLE_STATIC)
NTF_ROUTER		dst indicated itself as a router
.Ed
.Ss RTM_NEWNEIGH
Creates a new neighbor entry.
The mandatory options are
.Dv NDA_DST ,
.Dv NDA_LLADDR
and
.Dv NDA_IFINDEX .
.Ss RTM_DELNEIGH
Deletes the neighbor entry.
The entry is specified by the combination of
.Dv NDA_DST
and
.Dv NDA_IFINDEX .
.Ss RTM_GETNEIGH
Fetches a single neighbor or all neighbors in the current VNET, depending on the
.Dv NLM_F_DUMP
flag.
Each entry is reported as an
.Dv RTM_NEWNEIGH
message.
The following filters are recognised by the kernel:
.Pp
.Bd -literal -offset indent -compact
ndm_family	required family or AF_UNSPEC
ndm_ifindex	target ifindex
NDA_IFINDEX	target ifindex
.Ed
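.Pp
An application that keeps its own neighbor cache may want to apply the same
kind of filter in userland.
The C sketch below mirrors the family and interface filters listed above.
It assumes that an interface index of 0 means "any interface", which this
page does not state for neighbors;
.Vt struct neigh_filter
and
.Fn neigh_matches
are hypothetical names.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical userland filter, modelled on the kernel filters above. */
struct neigh_filter {
	uint8_t	family;		/* required family or AF_UNSPEC */
	int32_t	ifindex;	/* target ifindex, 0 assumed to match any */
};

/* Return true if an entry with the given family and ifindex matches. */
static bool
neigh_matches(const struct neigh_filter *f, uint8_t family, int32_t ifindex)
{
	assert(f != NULL);
	if (f->family != AF_UNSPEC && f->family != family)
		return (false);
	if (f->ifindex != 0 && f->ifindex != ifindex)
		return (false);
	return (true);
}
```
.Ed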
.Ss TLVs
.Bl -tag -width indent
.It Dv NDA_DST
(binary) Neighbor IPv4/IPv6 address.
.It Dv NDA_LLADDR
(binary) Neighbor link-level address.
.It Dv NDA_IFINDEX
(uint32_t) Interface index.
.It Dv NDA_FLAGS_EXT
(uint32_t) Extended version of
.Va ndm_flags .
.El
.Ss Groups
The following groups are defined:
.Bd -literal -offset indent -compact
RTNLGRP_NEIGH	Notifies on ARP/NDP neighbor arrival/removal/change
.Ed
.Sh SEE ALSO
.Xr netlink 4 ,
.Xr route 4
.Sh HISTORY
The
.Dv NETLINK_ROUTE
protocol family appeared in
.Fx 13.2 .
.Sh AUTHORS
Netlink was implemented by
.An -nosplit
.An Alexander Chernikov Aq Mt melifaro@FreeBSD.org .
It was derived from the Google Summer of Code 2021 project by
.An Ng Peng Nam Sean .