.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd August 9, 2021
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
may be used to set the differentiated services codepoint (DSCP) and the
explicit congestion notification (ECN) codepoint.
Setting the ECN codepoint (the two least significant bits) on a
socket using a transport protocol implementing ECN has no effect.
.Pp
.Dv IP_TTL
configures the time-to-live (TTL) field in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_DSCP_EF;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
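.Pp
As a minimal sketch of how the two codepoints share the TOS byte,
the helper below packs a 6-bit DSCP value and a 2-bit ECN value into the
integer passed to
.Dv IP_TOS .
The name
.Fn make_tos
is hypothetical and is not provided by any system header.
.Bd -literal
```c
/*
 * Hypothetical helper: combine a 6-bit DSCP and a 2-bit ECN codepoint
 * into the value expected by the IP_TOS socket option.
 * The DSCP occupies the upper six bits, the ECN the lower two.
 */
int
make_tos(unsigned int dscp, unsigned int ecn)
{
	return (int)(((dscp & 0x3f) << 2) | (ecn & 0x03));
}
```
.Ed
.Pp
With DSCP 46 (Expedited Forwarding) and ECN 0 the helper yields 0xb8.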
.Pp
.Dv IP_IPSEC_POLICY
controls IPsec policy for sockets.
For example,
.Bd -literal
const char *policy = "in ipsec ah/transport//require";
char *buf = ipsec_set_policy(policy, strlen(policy));
if (buf != NULL) {
	setsockopt(s, IPPROTO_IP, IP_IPSEC_POLICY, buf, ipsec_get_policylen(buf));
	free(buf);	/* buffer is allocated by ipsec_set_policy() */
}
.Ed
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
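.Pp
The following sketch pairs
.Dv IP_MINTTL
with
.Dv IP_TTL
so that a socket both sends with a TTL of 255 and accepts only packets
arriving with a TTL of 255.
It assumes that the peer also sends with a TTL of 255; the function name
.Fn enable_ttl_security
is hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: send with TTL 255 and accept only packets that
 * arrive with TTL 255, that is, packets that crossed no router.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enable_ttl_security(int s)
{
	int ttl = 255;

	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	return (setsockopt(s, IPPROTO_IP, IP_MINTTL, &ttl, sizeof(ttl)));
}
```
.Ed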
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Nm
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_ORIGDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address and destination port for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Vt sockaddr_in
structure.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_in))
cmsg_level = IPPROTO_IP
cmsg_type = IP_ORIGDSTADDR
.Ed
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
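.Pp
A receiver typically walks the ancillary data with the
.Xr CMSG_DATA 3
macros to find the message it is interested in.
The sketch below shows one way to do so for a control message that carries a
.Vt "struct in_addr" ,
such as
.Dv IP_RECVDSTADDR .
The function name
.Fn get_in_addr_cmsg
is hypothetical; the caller passes the control message type to look for.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: scan the ancillary data of a message filled in by
 * recvmsg(2) for an IPPROTO_IP control message of the given type
 * (for example IP_RECVDSTADDR) that carries a struct in_addr.
 * Returns 0 and stores the address in *addr, or -1 if none is present.
 */
int
get_in_addr_cmsg(struct msghdr *msg, int type, struct in_addr *addr)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == type &&
		    cm->cmsg_len >= CMSG_LEN(sizeof(*addr))) {
			/* CMSG_DATA() may be unaligned, so copy it out. */
			memcpy(addr, CMSG_DATA(cm), sizeof(*addr));
			return (0);
		}
	}
	return (-1);
}
```
.Ed
.Pp
The control buffer handed to
.Xr recvmsg 2
should be at least
.Li CMSG_SPACE(sizeof(struct in_addr))
bytes long for this message to fit.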
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Vt msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
.Pp
The socket should be either bound to
.Dv INADDR_ANY
and a local port, and the address supplied with
.Dv IP_SENDSRCADDR
should not be
.Dv INADDR_ANY ,
or the socket should be bound to a local address and the address supplied with
.Dv IP_SENDSRCADDR
should be
.Dv INADDR_ANY .
In the latter case the bound address is overridden by the generic source
address selection logic, which chooses the IP address of the interface
closest to the destination.
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
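.Pp
When the control message is built from scratch rather than reused,
the fields listed above can be filled in as in the following sketch.
The function name
.Fn set_in_addr_cmsg
is hypothetical; the caller passes
.Dv IP_SENDSRCADDR
as the type and supplies a suitably aligned buffer.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: attach one IPPROTO_IP control message of the given
 * type (for example IP_SENDSRCADDR) carrying a struct in_addr to msg.
 * buf must be suitably aligned for struct cmsghdr and at least
 * CMSG_SPACE(sizeof(struct in_addr)) bytes long.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
set_in_addr_cmsg(struct msghdr *msg, void *buf, size_t buflen, int type,
    struct in_addr addr)
{
	struct cmsghdr *cm;

	if (buflen < CMSG_SPACE(sizeof(addr)))
		return (-1);
	memset(buf, 0, CMSG_SPACE(sizeof(addr)));
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(addr));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_len = CMSG_LEN(sizeof(addr));
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = type;
	memcpy(CMSG_DATA(cm), &addr, sizeof(addr));
	return (0);
}
```
.Ed
.Pp
The resulting
.Vt msghdr
still needs its destination address and data buffers set before it is
passed to
.Xr sendmsg 2 .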
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));
sin.sin_len = sizeof(sin);
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_BINDANY
option is enabled on a
.Dv SOCK_STREAM ,
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, one can
.Xr bind 2
to any address, even one not bound to any available network interface in the
system.
This functionality (in conjunction with special firewall rules) can be used for
implementing a transparent proxy.
The
.Dv PRIV_NETINET_BINDANY
privilege is needed to set this option.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The msg_control field in the msghdr structure points to a buffer
that contains a cmsghdr structure followed by the
.Tn TTL .
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVTOS
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TOS
(type of service) field for a
.Tn UDP
datagram.
The msg_control field in the msghdr structure points to a buffer
that contains a cmsghdr structure followed by the
.Tn TOS .
The cmsghdr fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTOS
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Dv IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports which may only be opened by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the kernel switches to
sequential port allocation.
It returns to random allocation only once the current port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
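.Pp
The switching policy described above can be pictured with the small
model below.
It is a simplified sketch of the prose only and is not the kernel
implementation; all names are hypothetical, and a second with exactly
.Va randomcps
allocations is assumed to count as calm.
.Bd -literal
```c
/*
 * Simplified model of the policy described above; this is NOT the kernel
 * implementation. All names here are hypothetical.
 */
struct portrand {
	int randomized;	/* models net.inet.ip.portrange.randomized */
	int randomcps;	/* models net.inet.ip.portrange.randomcps */
	int randomtime;	/* models net.inet.ip.portrange.randomtime */
	int sequential;	/* 1 while allocation is sequential */
	int calm_secs;	/* consecutive seconds at or below randomcps */
};

/*
 * Feed the model the number of ports allocated during the last second.
 * Returns 1 if the next allocations are random, 0 if sequential.
 */
int
portrand_tick(struct portrand *pr, int allocs)
{
	if (!pr->randomized)
		return (0);
	if (allocs > pr->randomcps) {
		/* Burst: fall back to sequential allocation. */
		pr->sequential = 1;
		pr->calm_secs = 0;
	} else if (pr->sequential && ++pr->calm_secs >= pr->randomtime) {
		/* Calm for long enough: resume random allocation. */
		pr->sequential = 0;
		pr->calm_secs = 0;
	}
	return (!pr->sequential);
}
```
.Ed
.Pp
With the default values of 10 and 45, a single second with 11 allocations
makes the model sequential until 45 consecutive calm seconds have passed.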
.Ss "Multicast Options"
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, where an interface has not
been specified for a multicast group membership,
each multicast transmission is sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
may be passed instead.
The
.Vt imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.Pp
The use of
.Dv IP_MULTICAST_IF
is
.Em not recommended ,
as multicast memberships are scoped to each
individual interface.
It is supported for legacy use only by applications,
such as routing daemons, which expect to
be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24)
on multiple interfaces,
without requesting an individual membership for each interface.
.Pp
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a routing daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
The sysctl setting
.Va net.inet.ip.mcast.loop
controls the default setting of the
.Dv IP_MULTICAST_LOOP
socket option for new sockets.
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreqn mreqn;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreqn, sizeof(mreqn));
.Ed
.Pp
where
.Fa mreqn
is the following structure:
.Bd -literal
struct ip_mreqn {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
    int            imr_ifindex;   /* interface index */
};
.Ed
.Pp
.Va imr_ifindex
should be set to the index of a particular multicast-capable interface if
the host is multihomed.
If
.Va imr_ifindex
is non-zero, the value of
.Va imr_interface
is ignored.
Otherwise, if
.Va imr_ifindex
is 0, the kernel will use the IP address from
.Va imr_interface
to look up the interface.
The value of
.Va imr_interface
may be set to
.Dv INADDR_ANY
to choose the default interface, although this is not recommended; this is
considered to be the first interface corresponding to the default route.
Otherwise, the first multicast-capable interface configured in the system
will be used.
.Pp
The legacy
.Vt "struct ip_mreq" ,
which lacks the
.Va imr_ifindex
field, is also supported by the
.Dv IP_ADD_MEMBERSHIP
socket option.
In this case the kernel behaves as if
.Va imr_ifindex
were set to zero:
.Va imr_interface
will be used to look up the interface.
.Pp
Prior to
.Fx 7.0 ,
if the
.Va imr_interface
member was within the network range
.Li 0.0.0.0/8 ,
it was treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
In versions of
.Fx
since 7.0, this behavior is no longer supported.
Developers should
instead use the RFC 3678 multicast source filter APIs; in particular,
.Dv MCAST_JOIN_GROUP .
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
This is the first IP address configured on the interface.
If this address is removed or changed, the results are
undefined, as the IGMP membership state will then be inconsistent.
If multiple IP aliases are configured on the same interface,
they will be ignored.
.Pp
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Ss "Source-Specific Multicast Options"
Since
.Fx 8.0 ,
the use of Source-Specific Multicast (SSM) is supported.
These extensions require an IGMPv3 multicast router in order to
make best use of them.
If a legacy multicast router is present on the link,
.Fx
will simply downgrade to the version of IGMP spoken by the router,
and the benefits of source filtering on the upstream link
will not be present, although the kernel will continue to
squelch transmissions from blocked sources.
.Pp
Each group membership on a socket now has a filter mode:
.Bl -tag -width MCAST_EXCLUDE
.It Dv MCAST_EXCLUDE
Datagrams sent to this group are accepted,
unless the source is in a list of blocked source addresses.
.It Dv MCAST_INCLUDE
Datagrams sent to this group are accepted
only if the source is in a list of accepted source addresses.
.El
.Pp
Groups joined using the legacy
.Dv IP_ADD_MEMBERSHIP
option are placed in exclusive-mode,
and are able to request that certain sources are blocked or allowed.
This is known as the
.Em delta-based API .
.Pp
To block a multicast source on an existing group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
where
.Fa mreqs
is the following structure:
.Bd -literal
struct ip_mreq_source {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_sourceaddr; /* IP address of source */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_sourceaddr
should be set to the address of the source to be blocked.
.Pp
To unblock a multicast source on an existing group:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
The
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE
options are
.Em not permitted
for inclusive-mode group memberships.
.Pp
To join a multicast group in
.Dv MCAST_INCLUDE
mode with a single source,
or add another source to an existing inclusive-mode membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
To leave a single source from an existing group in inclusive mode:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
If this is the last accepted source for the group, the membership
will be dropped.
.Pp
The
.Dv IP_ADD_SOURCE_MEMBERSHIP
and
.Dv IP_DROP_SOURCE_MEMBERSHIP
options are
.Em not accepted
for exclusive-mode group memberships.
However, both exclusive and inclusive mode memberships
support the use of the
.Em full-state API
documented in RFC 3678.
For management of source filter lists using this API,
please refer to
.Xr sourcefilter 3 .
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
.Ss "Raw IP Sockets"
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Unlike in previous
.Bx
releases, incoming packets are received with
.Tn IP
header and options intact, leaving all fields in network byte order.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = htons(offset);
ip->ip_len = htons(len);
.Ed
.Pp
The packet should be provided exactly as it is to be sent over the wire.
This implies that all fields, including
.Va ip_len
and
.Va ip_off ,
must be in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists.
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr CMSG_DATA 3 ,
.Xr sourcefilter 3 ,
.Xr icmp 4 ,
.Xr igmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Rs
.%A D. Thaler
.%A B. Fenner
.%A B. Quinn
.%T "Socket Interface Extensions for Multicast Source Filters"
.%N RFC 3678
.%D Jan 2004
.Re
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
The
.Vt ip_mreqn
structure appeared in
.Tn Linux 2.4 .
.Sh BUGS
Before
.Fx 10.0 ,
packets received on raw IP sockets had the
.Va ip_hl
subtracted from the
.Va ip_len
field.
.Pp
Before
.Fx 11.0 ,
packets received on raw IP sockets had the
.Va ip_len
and
.Va ip_off
fields converted to host byte order.
Packets written to raw IP sockets were expected to have
.Va ip_len
and
.Va ip_off
in host byte order.
970in host byte order.
971