xref: /freebsd/share/man/man4/ip.4 (revision b3d14eaccc5f606690d99b1998bfdf32a22404f6)
.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd August 9, 2021
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
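.Pp
As a minimal sketch of preparing a buffer for
.Dv IP_OPTIONS ,
the following fills in a Record Route option in the RFC-791 layout.
The helper name
.Fn build_record_route
is hypothetical, and the
.Dv IPOPT_*
constants and
.Dv MAX_IPOPTLEN
are assumed to be provided by
.In netinet/ip.h :
.Bd -literal
```c
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <string.h>

/*
 * Hypothetical helper: fill buf with a Record Route option that has
 * room for as many 4-byte address slots as fit, followed by an
 * End of Option List byte.
 * Returns the number of bytes used, or 0 if buf is too small.
 */
size_t
build_record_route(unsigned char *buf, size_t buflen)
{
	size_t len;

	if (buflen > MAX_IPOPTLEN)
		buflen = MAX_IPOPTLEN;
	if (buflen < 3 + 4 + 1)		/* header, one slot, EOL */
		return (0);
	/* 3 header bytes plus a whole number of 4-byte slots */
	len = 3 + ((buflen - 4) / 4) * 4;
	memset(buf, 0, len + 1);
	buf[IPOPT_OPTVAL] = IPOPT_RR;		/* option type */
	buf[IPOPT_OLEN] = (unsigned char)len;	/* option length */
	buf[IPOPT_OFFSET] = IPOPT_MINOFF;	/* pointer to first slot */
	buf[len] = IPOPT_EOL;			/* pad to 4-byte multiple */
	return (len + 1);
}
```
.Ed
.Pp
The returned length is always a multiple of four bytes.
The buffer would then be handed to
.Xr setsockopt 2
with
.Dv IP_OPTIONS
in place of the
.Dv NULL
buffer shown above.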
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live
fields in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
.Dv IP_IPSEC_POLICY
controls IPsec policy for sockets.
For example,
.Bd -literal
const char *policy = "in ipsec ah/transport//require";
char *buf = ipsec_set_policy(policy, strlen(policy));
setsockopt(s, IPPROTO_IP, IP_IPSEC_POLICY, buf, ipsec_get_policylen(buf));
.Ed
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is most useful when set to 255, which prevents packets
from outside the directly connected networks from reaching local listeners
on sockets.
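.Pp
One way to apply
.Dv IP_MINTTL
is a TTL check in the style of the Generalized TTL Security Mechanism
(RFC 5082): both peers send with a TTL of 255 and drop anything that
arrives with less.
The following is a sketch only, assuming the peer is configured the same way;
.Fn enable_ttl_security
is a hypothetical helper name:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: send with TTL 255 and accept only packets that
 * still carry TTL 255, that is, packets no router has forwarded.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enable_ttl_security(int s)
{
	int ttl = 255;

	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	return (setsockopt(s, IPPROTO_IP, IP_MINTTL, &ttl, sizeof(ttl)));
}
```
.Ed
.Pp
Because every router decrements the TTL, a packet that arrives with 255
can only have come from a directly connected network.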
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Nm
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_ORIGDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address and destination port for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Vt sockaddr_in
structure.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_in))
cmsg_level = IPPROTO_IP
cmsg_type = IP_ORIGDSTADDR
.Ed
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
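.Pp
The control data described above is normally read with the
.Dv CMSG_FIRSTHDR ,
.Dv CMSG_NXTHDR
and
.Dv CMSG_DATA
macros.
The following is a sketch of such a lookup;
.Fn cmsg_copy
is a hypothetical helper, not part of any system library:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: walk the control messages attached to msg and
 * copy len bytes of payload from the first one that matches level and
 * type into out.
 * Returns 0 on success, -1 if no matching message of that size exists.
 */
int
cmsg_copy(struct msghdr *msg, int level, int type, void *out, size_t len)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL;
	    cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != level || cm->cmsg_type != type)
			continue;
		if (cm->cmsg_len < CMSG_LEN(len))
			return (-1);	/* payload shorter than expected */
		memcpy(out, CMSG_DATA(cm), len);
		return (0);
	}
	return (-1);
}
```
.Ed
.Pp
With
.Dv IP_RECVDSTADDR
enabled, a caller would pass
.Dv IPPROTO_IP
and
.Dv IP_RECVDSTADDR
as level and type, and a
.Vt "struct in_addr"
as the output, after
.Xr recvmsg 2
returns.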
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Vt msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
.Pp
The socket should either be bound to
.Dv INADDR_ANY
and a local port, with the address supplied with
.Dv IP_SENDSRCADDR
not being
.Dv INADDR_ANY ,
or be bound to a local address, with the address supplied with
.Dv IP_SENDSRCADDR
being
.Dv INADDR_ANY .
In the latter case, the bound address is overridden by the generic source
address selection logic, which chooses the IP address of the interface
closest to the destination.
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
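.Pp
When the source address does not come from a received message,
the control message has to be built by hand.
The following is a sketch under that assumption;
.Fn attach_addr_cmsg
is a hypothetical helper, and the message type is passed in by the caller
so that the same code serves any address-carrying option:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: attach one IPPROTO_IP control message carrying
 * an IPv4 address to msg.  cbuf provides the storage; it must be
 * aligned for struct cmsghdr and hold at least
 * CMSG_SPACE(sizeof(struct in_addr)) bytes.
 * Returns 0 on success, -1 if cbuf is too small.
 */
int
attach_addr_cmsg(struct msghdr *msg, void *cbuf, size_t cbuflen, int type,
    struct in_addr addr)
{
	struct cmsghdr *cm;

	if (cbuflen < CMSG_SPACE(sizeof(addr)))
		return (-1);
	memset(cbuf, 0, CMSG_SPACE(sizeof(addr)));
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(addr));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_len = CMSG_LEN(sizeof(addr));
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = type;
	memcpy(CMSG_DATA(cm), &addr, sizeof(addr));
	return (0);
}
```
.Ed
.Pp
A caller would pass
.Dv IP_SENDSRCADDR
as the type and then hand the
.Vt msghdr
to
.Xr sendmsg 2 .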
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));	/* clear unused fields */
sin.sin_len = sizeof(sin);
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_BINDANY
option is enabled on a
.Dv SOCK_STREAM ,
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, one can
.Xr bind 2
to any address, even one not bound to any available network interface in the
system.
This functionality (in conjunction with special firewall rules) can be used for
implementing a transparent proxy.
The
.Dv PRIV_NETINET_BINDANY
privilege is needed to set this option.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVTOS
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TOS
(type of service) field for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TOS .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTOS
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Dv IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
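.Pp
The effect of
.Dv IP_PORTRANGE
can be observed by binding to port zero and asking the kernel which port
it chose.
The following is a sketch only;
.Fn bind_auto_port
is a hypothetical helper, and the
.Li #ifdef
guard is an assumption made so that the code also builds on systems that
lack the option:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: bind s to the wildcard address with port 0 so
 * that the kernel picks the port, from the high range if high is
 * non-zero.  Returns the chosen port in host byte order, or -1 with
 * errno set on failure.
 */
int
bind_auto_port(int s, int high)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
#ifdef IP_PORTRANGE
	int range = high ? IP_PORTRANGE_HIGH : IP_PORTRANGE_DEFAULT;

	if (setsockopt(s, IPPROTO_IP, IP_PORTRANGE, &range,
	    sizeof(range)) == -1)
		return (-1);
#else
	(void)high;		/* option not available on this system */
#endif
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = htons(0);	/* let the kernel choose */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		return (-1);
	if (getsockname(s, (struct sockaddr *)&sin, &len) == -1)
		return (-1);
	return (ntohs(sin.sin_port));
}
```
.Ed
.Pp
The option has to be set before
.Xr bind 2
is called, since the range is consulted when the port is selected.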
.Pp
The range of privileged ports, which may be opened only by
root-owned processes, may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the kernel reverts to sequential
port allocation.
It returns to random allocation only once the current port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
.Ss "Multicast Options"
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with a TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, where an interface has not
been specified for a multicast group membership,
each multicast transmission is sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
may be passed instead.
The
.Va imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.Pp
The use of
.Dv IP_MULTICAST_IF
is
.Em not recommended ,
as multicast memberships are scoped to each
individual interface.
It is supported for legacy use only by applications,
such as routing daemons, which expect to
be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24)
on multiple interfaces,
without requesting an individual membership for each interface.
.Pp
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
Disabling loopback
improves performance for applications that may have no more than one
instance on a single host (such as a routing daemon), by eliminating
the overhead of receiving their own transmissions.
Loopback should generally not
be disabled by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
The sysctl setting
.Va net.inet.ip.mcast.loop
controls the default setting of the
.Dv IP_MULTICAST_LOOP
socket option for new sockets.
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreqn mreqn;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreqn, sizeof(mreqn));
.Ed
.Pp
where
.Fa mreqn
is the following structure:
.Bd -literal
struct ip_mreqn {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
    int            imr_ifindex;   /* interface index */
};
.Ed
.Pp
.Va imr_ifindex
should be set to the index of a particular multicast-capable interface if
the host is multihomed.
If
.Va imr_ifindex
is non-zero, the value of
.Va imr_interface
is ignored.
Otherwise, if
.Va imr_ifindex
is 0, the kernel will use the IP address from
.Va imr_interface
to look up the interface.
The value of
.Va imr_interface
may be set to
.Dv INADDR_ANY
to choose the default interface, although this is not recommended; this is
considered to be the first interface corresponding to the default route.
Otherwise, the first multicast-capable interface configured in the system
will be used.
.Pp
The legacy
.Vt "struct ip_mreq" ,
which lacks the
.Va imr_ifindex
field, is also supported by the
.Dv IP_ADD_MEMBERSHIP
option.
In this case the kernel behaves as if
.Va imr_ifindex
were set to zero:
.Va imr_interface
is used to look up the interface.
.Pp
Prior to
.Fx 7.0 ,
if the
.Va imr_interface
member was within the network range
.Li 0.0.0.0/8 ,
it was treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
In versions of
.Fx
since 7.0, this behavior is no longer supported.
Developers should
instead use the RFC 3678 multicast source filter APIs; in particular,
.Dv MCAST_JOIN_GROUP .
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
This is the first IP address configured on the interface.
If this address is removed or changed, the results are
undefined, as the IGMP membership state will then be inconsistent.
If multiple IP aliases are configured on the same interface,
they will be ignored.
.Pp
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Ss "Source-Specific Multicast Options"
Since
.Fx 8.0 ,
the use of Source-Specific Multicast (SSM) is supported.
These extensions require an IGMPv3 multicast router in order to
make best use of them.
If a legacy multicast router is present on the link,
.Fx
will simply downgrade to the version of IGMP spoken by the router,
and the benefits of source filtering on the upstream link
will not be present, although the kernel will continue to
squelch transmissions from blocked sources.
.Pp
Each group membership on a socket now has a filter mode:
.Bl -tag -width MCAST_EXCLUDE
.It Dv MCAST_EXCLUDE
Datagrams sent to this group are accepted,
unless the source is in a list of blocked source addresses.
.It Dv MCAST_INCLUDE
Datagrams sent to this group are accepted
only if the source is in a list of accepted source addresses.
.El
.Pp
Groups joined using the legacy
.Dv IP_ADD_MEMBERSHIP
option are placed in exclusive mode,
and are able to request that certain sources are blocked or allowed.
This is known as the
.Em delta-based API .
.Pp
To block a multicast source on an existing group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
where
.Fa mreqs
is the following structure:
.Bd -literal
struct ip_mreq_source {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_sourceaddr; /* IP address of source */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_sourceaddr
should be set to the address of the source to be blocked.
.Pp
To unblock a multicast source on an existing group:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
The
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE
options are
.Em not permitted
for inclusive-mode group memberships.
.Pp
To join a multicast group in
.Dv MCAST_INCLUDE
mode with a single source,
or add another source to an existing inclusive-mode membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
To drop a single source from an existing group in inclusive mode:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
If this is the last accepted source for the group, the membership
will be dropped.
.Pp
The
.Dv IP_ADD_SOURCE_MEMBERSHIP
and
.Dv IP_DROP_SOURCE_MEMBERSHIP
options are
.Em not accepted
for exclusive-mode group memberships.
However, both exclusive and inclusive mode memberships
support the use of the
.Em full-state API
documented in RFC 3678.
For management of source filter lists using this API,
please refer to
.Xr sourcefilter 3 .
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
.Ss "Raw IP Sockets"
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Unlike in previous
.Bx
releases, incoming packets are received with
.Tn IP
header and options intact, leaving all fields in network byte order.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = htons(offset);
ip->ip_len = htons(len);
.Ed
.Pp
The packet should be provided exactly as it is to be sent over the wire.
This implies that all fields, including
.Va ip_len
and
.Va ip_off ,
must be in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr CMSG_DATA 3 ,
.Xr sourcefilter 3 ,
.Xr icmp 4 ,
.Xr igmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Rs
.%A D. Thaler
.%A B. Fenner
.%A B. Quinn
.%T "Socket Interface Extensions for Multicast Source Filters"
.%N RFC 3678
.%D Jan 2004
.Re
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
The
.Vt ip_mreqn
structure appeared in
.Tn Linux 2.4 .
.Sh BUGS
Before
.Fx 10.0 ,
packets received on raw IP sockets had the
.Va ip_hl
subtracted from the
.Va ip_len
field.
.Pp
Before
.Fx 11.0 ,
packets received on raw IP sockets had the
.Va ip_len
and
.Va ip_off
fields converted to host byte order.
Packets written to raw IP sockets were expected to have
.Va ip_len
and
.Va ip_off
in host byte order.
967in host byte order.
968