.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd August 9, 2021
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
may be used to set the differentiated services codepoint (DSCP) and the
explicit congestion notification (ECN) codepoint.
Setting the ECN codepoint (the two least significant bits) on a
socket using a transport protocol implementing ECN has no effect.
.Pp
.Dv IP_TTL
configures the time-to-live (TTL) field in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_DSCP_EF;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
.Dv IP_IPSEC_POLICY
controls IPsec policy for sockets.
For example,
.Bd -literal
const char *policy = "in ipsec ah/transport//require";
char *buf = ipsec_set_policy(policy, strlen(policy));
setsockopt(s, IPPROTO_IP, IP_IPSEC_POLICY, buf, ipsec_get_policylen(buf));
.Ed
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
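.Pp
As a minimal sketch, the option could be set through a small wrapper such as
the one below.
The helper name
.Fn set_min_ttl
is hypothetical and not part of any library, and the sketch assumes the
option value is passed as an
.Vt int .
.Bd -literal
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Hypothetical helper: require that packets received on socket s
 * carry a TTL of at least minttl.
 * Returns 0 on success, or -1 with errno set.
 */
static int
set_min_ttl(int s, int minttl)
{
    /* The TTL field is 8 bits wide; reject values that cannot occur. */
    if (minttl < 0 || minttl > 255) {
        errno = EINVAL;
        return (-1);
    }
    return (setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl,
        sizeof(minttl)));
}
.Ed
.Pp
For a minimum of 255 to be useful, the sending peer has to transmit with a
TTL of 255 as well, for example by setting
.Dv IP_TTL
on its socket.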
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Nm
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_ORIGDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address and destination port for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Vt sockaddr_in
structure.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_in))
cmsg_level = IPPROTO_IP
cmsg_type = IP_ORIGDSTADDR
.Ed
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
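.Pp
The control message layout described above can be walked with the
.Xr CMSG_DATA 3
macros.
The following is a sketch only;
.Fn ip_cmsg_get
is a hypothetical helper name, and the message type is passed as a parameter
so that the same code can be tried with
.Dv IP_RECVDSTADDR
or any of the other
.Dv IPPROTO_IP
control messages.
.Bd -literal
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: copy the payload of the first IPPROTO_IP
 * control message of the given type into buf.
 * Returns 0 if a message with exactly len payload bytes was found,
 * -1 otherwise.
 */
static int
ip_cmsg_get(struct msghdr *msg, int type, void *buf, size_t len)
{
    struct cmsghdr *cm;

    for (cm = CMSG_FIRSTHDR(msg); cm != NULL;
        cm = CMSG_NXTHDR(msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IP &&
            cm->cmsg_type == type &&
            cm->cmsg_len == CMSG_LEN(len)) {
            memcpy(buf, CMSG_DATA(cm), len);
            return (0);
        }
    }
    return (-1);
}
.Ed
.Pp
After a successful
.Xr recvmsg 2
on a socket with
.Dv IP_RECVDSTADDR
enabled and a control buffer of at least
.Li CMSG_SPACE(sizeof(struct in_addr))
bytes, a call such as
.Li ip_cmsg_get(&msg, IP_RECVDSTADDR, &dst, sizeof(dst))
would be expected to fill in the destination address.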
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Va msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
.Pp
The socket should either be bound to
.Dv INADDR_ANY
and a local port, in which case the address supplied with
.Dv IP_SENDSRCADDR
should not be
.Dv INADDR_ANY ,
or be bound to a local address, in which case the address supplied with
.Dv IP_SENDSRCADDR
should be
.Dv INADDR_ANY .
In the latter case the bound address is overridden by the generic source
address selection logic, which chooses the IP address of the interface
closest to the destination.
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
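.Pp
When the control message is not taken from a previous
.Xr recvmsg 2 ,
it has to be built by hand.
The sketch below shows one way this might be done;
.Fn ip_cmsg_set_addr
is a hypothetical helper name, and the caller is assumed to pass
.Dv IP_SENDSRCADDR
as the type.
.Bd -literal
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: attach one IPPROTO_IP control message of the
 * given type, carrying the address src, to msg.
 * cbuf must be aligned for struct cmsghdr and hold at least
 * CMSG_SPACE(sizeof(struct in_addr)) bytes.
 * Returns 0 on success, -1 if cbuf is too small.
 */
static int
ip_cmsg_set_addr(struct msghdr *msg, void *cbuf, size_t cbuflen,
    int type, struct in_addr src)
{
    struct cmsghdr *cm;

    if (cbuflen < CMSG_SPACE(sizeof(src)))
        return (-1);
    memset(cbuf, 0, CMSG_SPACE(sizeof(src)));
    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(sizeof(src));
    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_len = CMSG_LEN(sizeof(src));
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = type;
    memcpy(CMSG_DATA(cm), &src, sizeof(src));
    return (0);
}
.Ed
.Pp
The remaining
.Vt msghdr
fields (destination address and data vector) are filled in by the caller
before the structure is handed to
.Xr sendmsg 2 .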
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_BINDANY
option is enabled on a
.Dv SOCK_STREAM ,
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, one can
.Xr bind 2
to any address, even one not bound to any available network interface in the
system.
This functionality (in conjunction with special firewall rules) can be used for
implementing a transparent proxy.
The
.Dv PRIV_NETINET_BINDANY
privilege is needed to set this option.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVTOS
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TOS
(type of service) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TOS .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTOS
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
Use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
Use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
Use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Li IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports which may be opened only by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
.Ss "Multicast Options"
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, where an interface has not
been specified for a multicast group membership,
each multicast transmission is sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
may be passed instead.
The
.Vt imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.Pp
The use of
.Dv IP_MULTICAST_IF
is
.Em not recommended ,
as multicast memberships are scoped to each
individual interface.
It is supported for legacy use only by applications,
such as routing daemons, which expect to
be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24)
on multiple interfaces,
without requesting an individual membership for each interface.
.Pp
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a routing daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
The sysctl setting
.Va net.inet.ip.mcast.loop
controls the default setting of the
.Dv IP_MULTICAST_LOOP
socket option for new sockets.
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreqn mreqn;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreqn, sizeof(mreqn));
.Ed
.Pp
where
.Fa mreqn
is the following structure:
.Bd -literal
struct ip_mreqn {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
    int            imr_ifindex;   /* interface index */
};
.Ed
.Pp
.Va imr_ifindex
should be set to the index of a particular multicast-capable interface if
the host is multihomed.
If
.Va imr_ifindex
is non-zero, the value of
.Va imr_interface
is ignored.
Otherwise, if
.Va imr_ifindex
is 0, the kernel will use the IP address from
.Va imr_interface
to look up the interface.
The value of
.Va imr_interface
may be set to
.Dv INADDR_ANY
to choose the default interface, although this is not recommended; this is
considered to be the first interface corresponding to the default route.
Otherwise, the first multicast-capable interface configured in the system
will be used.
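.Pp
A join request that selects the interface by index could be prepared roughly
as follows.
This is a sketch under stated assumptions:
.Fn mreqn_init
is a hypothetical helper name, and the local address member is simply left
zeroed because it is ignored when
.Va imr_ifindex
is non-zero.
.Bd -literal
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper: prepare a join request for the dotted-quad
 * group address on the interface with the given index, for example
 * a value returned by if_nametoindex(3).
 * Returns 0 on success, -1 if group is not an IPv4 multicast address.
 */
static int
mreqn_init(struct ip_mreqn *mreqn, const char *group,
    unsigned int ifindex)
{
    /* Zeroing leaves the local address member as INADDR_ANY. */
    memset(mreqn, 0, sizeof(*mreqn));
    if (inet_pton(AF_INET, group, &mreqn->imr_multiaddr) != 1)
        return (-1);
    if (!IN_MULTICAST(ntohl(mreqn->imr_multiaddr.s_addr)))
        return (-1);
    mreqn->imr_ifindex = (int)ifindex;
    return (0);
}
.Ed
.Pp
The initialized structure is then passed to
.Xr setsockopt 2
with
.Dv IP_ADD_MEMBERSHIP ,
and the same values can later be used with
.Dv IP_DROP_MEMBERSHIP .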
.Pp
The legacy
.Vt "struct ip_mreq" ,
which lacks the
.Va imr_ifindex
field, is also supported by the
.Dv IP_ADD_MEMBERSHIP
socket option.
In this case the kernel behaves as if
.Va imr_ifindex
were set to zero:
.Va imr_interface
is used to look up the interface.
.Pp
Prior to
.Fx 7.0 ,
if the
.Va imr_interface
member was within the network range
.Li 0.0.0.0/8 ,
it was treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
In versions of
.Fx
since 7.0, this behavior is no longer supported.
Developers should
instead use the RFC 3678 multicast source filter APIs; in particular,
.Dv MCAST_JOIN_GROUP .
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
This is the first IP address configured on the interface.
If this address is removed or changed, the results are
undefined, as the IGMP membership state will then be inconsistent.
If multiple IP aliases are configured on the same interface,
they will be ignored.
.Pp
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Ss "Source-Specific Multicast Options"
Since
.Fx 8.0 ,
the use of Source-Specific Multicast (SSM) is supported.
These extensions require an IGMPv3 multicast router in order to
make best use of them.
If a legacy multicast router is present on the link,
.Fx
will simply downgrade to the version of IGMP spoken by the router,
and the benefits of source filtering on the upstream link
will not be present, although the kernel will continue to
squelch transmissions from blocked sources.
.Pp
Each group membership on a socket now has a filter mode:
.Bl -tag -width MCAST_EXCLUDE
.It Dv MCAST_EXCLUDE
Datagrams sent to this group are accepted,
unless the source is in a list of blocked source addresses.
.It Dv MCAST_INCLUDE
Datagrams sent to this group are accepted
only if the source is in a list of accepted source addresses.
.El
.Pp
Groups joined using the legacy
.Dv IP_ADD_MEMBERSHIP
option are placed in exclusive mode,
and are able to request that certain sources are blocked or allowed.
This is known as the
.Em delta-based API .
.Pp
To block a multicast source on an existing group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
where
.Fa mreqs
is the following structure:
.Bd -literal
struct ip_mreq_source {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_sourceaddr; /* IP address of source */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_sourceaddr
should be set to the address of the source to be blocked.
.Pp
To unblock a multicast source on an existing group:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
The
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE
options are
.Em not permitted
for inclusive-mode group memberships.
.Pp
To join a multicast group in
.Dv MCAST_INCLUDE
mode with a single source,
or add another source to an existing inclusive-mode membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
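.Pp
As a rough sketch, the
.Fa mreqs
structure for such a join could be populated from textual addresses as shown
below.
.Fn mreqs_init
is a hypothetical helper name, and the addresses a caller passes to it are
its own choice.
.Bd -literal
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper: fill in a source-specific request from
 * dotted-quad strings for the group, the source, and the local
 * address of the interface.
 * Returns 0 on success, -1 if any address fails to parse.
 */
static int
mreqs_init(struct ip_mreq_source *mreqs, const char *group,
    const char *source, const char *ifaddr)
{
    memset(mreqs, 0, sizeof(*mreqs));
    if (inet_pton(AF_INET, group, &mreqs->imr_multiaddr) != 1 ||
        inet_pton(AF_INET, source, &mreqs->imr_sourceaddr) != 1 ||
        inet_pton(AF_INET, ifaddr, &mreqs->imr_interface) != 1)
        return (-1);
    return (0);
}
.Ed
.Pp
The same structure contents apply to
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE ,
with
.Va imr_sourceaddr
naming the source to be blocked or unblocked.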
.Pp
To drop a single source from an existing inclusive-mode group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
If this is the last accepted source for the group, the membership
will be dropped.
.Pp
The
.Dv IP_ADD_SOURCE_MEMBERSHIP
and
.Dv IP_DROP_SOURCE_MEMBERSHIP
options are
.Em not accepted
for exclusive-mode group memberships.
However, both exclusive and inclusive mode memberships
support the use of the
.Em full-state API
documented in RFC 3678.
For management of source filter lists using this API,
please refer to
.Xr sourcefilter 3 .
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
.Ss "Raw IP Sockets"
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Unlike in previous
.Bx
releases, incoming packets are received with
.Tn IP
header and options intact, leaving all fields in network byte order.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = htons(offset);
ip->ip_len = htons(len);
.Ed
.Pp
The packet should be provided exactly as it is to be sent over the wire.
This implies that all fields, including
.Va ip_len
and
.Va ip_off ,
must be in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
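.Pp
Putting these rules together, an option-less header might be initialized as
in the following sketch.
.Fn ip_hdr_init
is a hypothetical helper name, the TTL of 64 is an arbitrary illustrative
value, and the
.Va ip_sum
field is left at zero because checksum handling is outside the scope of this
sketch.
.Bd -literal
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill in an IPv4 header without options for a
 * packet that carries payload_len bytes of protocol proto.
 * Returns 0 on success, -1 if the total length does not fit in the
 * 16-bit ip_len field.
 */
static int
ip_hdr_init(struct ip *ip, uint8_t proto, struct in_addr src,
    struct in_addr dst, size_t payload_len)
{
    if (payload_len > 65535 - sizeof(*ip))
        return (-1);
    memset(ip, 0, sizeof(*ip));
    ip->ip_v = IPVERSION;
    ip->ip_hl = sizeof(*ip) >> 2;   /* length in 32-bit words */
    ip->ip_len = htons((uint16_t)(sizeof(*ip) + payload_len));
    ip->ip_id = 0;                  /* kernel chooses a value */
    ip->ip_off = htons(0);          /* no flags, no offset */
    ip->ip_ttl = 64;                /* arbitrary for this sketch */
    ip->ip_p = proto;
    ip->ip_src = src;               /* INADDR_ANY: kernel chooses */
    ip->ip_dst = dst;
    return (0);
}
.Ed
.Pp
The header is placed at the start of the buffer, immediately followed by the
payload, and the whole buffer is written to the socket in a single call.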
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr CMSG_DATA 3 ,
.Xr sourcefilter 3 ,
.Xr icmp 4 ,
.Xr igmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Rs
.%A D. Thaler
.%A B. Fenner
.%A B. Quinn
.%T "Socket Interface Extensions for Multicast Source Filters"
.%N RFC 3678
.%D Jan 2004
.Re
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
The
.Vt ip_mreqn
structure appeared in
.Tn Linux 2.4 .
.Sh BUGS
Before
.Fx 10.0
packets received on raw IP sockets had the
.Va ip_hl
subtracted from the
.Va ip_len
field.
.Pp
Before
.Fx 11.0
packets received on raw IP sockets had the
.Va ip_len
and
.Va ip_off
fields converted to host byte order.
Packets written to raw IP sockets were expected to have
.Va ip_len
and
.Va ip_off
in host byte order.