.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\"
.Dd August 9, 2021
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
may be used to set the differentiated services codepoint (DSCP) and the
explicit congestion notification (ECN) codepoint.
Setting the ECN codepoint (the two least significant bits) on a
socket using a transport protocol implementing ECN has no effect.
.Pp
.Dv IP_TTL
configures the time-to-live (TTL) field in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_DSCP_EF;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
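.Pp
The layout of the TOS byte described above (DSCP in the six most significant
bits, ECN in the two least significant bits) can be sketched as follows.
The helper names
.Fn make_tos
and
.Fn set_tos_ttl
are hypothetical and exist only for this illustration; they are not part of
any system header.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Combine a 6-bit DSCP and a 2-bit ECN codepoint into one TOS byte. */
int
make_tos(unsigned int dscp, unsigned int ecn)
{
    return (int)(((dscp & 0x3f) << 2) | (ecn & 0x03));
}

/* Apply a TOS byte and a TTL to socket s; returns 0 on success, -1 on error. */
int
set_tos_ttl(int s, int tos, int ttl)
{
    if (setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == -1)
        return (-1);
    return (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)));
}
```

.Pp
With these helpers, DSCP 46 (Expedited Forwarding) and no ECN bits yields
the byte 0xb8, which could then be passed as
.Fa tos
to
.Fn set_tos_ttl .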
.Pp
.Dv IP_IPSEC_POLICY
controls IPsec policy for sockets.
For example,
.Bd -literal
const char *policy = "in ipsec ah/transport//require";
char *buf = ipsec_set_policy(policy, strlen(policy));
setsockopt(s, IPPROTO_IP, IP_IPSEC_POLICY, buf, ipsec_get_policylen(buf));
.Ed
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks, whose TTL has been decremented
by at least one router, from reaching local listeners
on sockets.
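.Pp
A minimal sketch of the 255 case described above, assuming that the peer also
sends with a TTL of 255 (the arrangement that RFC 5082 calls the Generalized
TTL Security Mechanism).
The function name
.Fn enable_ttl_security
is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Send with TTL 255 and accept only packets that still carry TTL 255,
 * that is, packets that have not been forwarded by a router.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enable_ttl_security(int s)
{
    int ttl = 255;
    int minttl = 255;

    if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
        return (-1);
    return (setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl)));
}
```

.Pp
Both ends of the session would need to call such a function for the check to
be meaningful.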
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Nm
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_ORIGDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address and destination port for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Vt sockaddr_in
structure.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_in))
cmsg_level = IPPROTO_IP
cmsg_type = IP_ORIGDSTADDR
.Ed
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
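.Pp
The control buffer returned by
.Xr recvmsg 2
can be walked with the
.Xr CMSG_DATA 3
macros.
The following is a sketch only;
.Fn find_cmsg
is a hypothetical helper that takes the level and type as arguments, so the
same code serves for
.Dv IP_RECVDSTADDR
(payload
.Vt "struct in_addr" )
and
.Dv IP_ORIGDSTADDR
(payload
.Vt "struct sockaddr_in" ) .

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk the control messages attached to msg and copy the payload of the
 * first one matching (level, type) into out.  Returns 0 if a match with
 * at least len bytes of payload was found, -1 otherwise.
 */
int
find_cmsg(struct msghdr *msg, int level, int type, void *out, size_t len)
{
    struct cmsghdr *cm;

    for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
        if (cm->cmsg_level != level || cm->cmsg_type != type)
            continue;
        if (cm->cmsg_len < CMSG_LEN(len))
            return (-1);        /* payload shorter than expected */
        memcpy(out, CMSG_DATA(cm), len);
        return (0);
    }
    return (-1);
}
```

.Pp
After a successful
.Xr recvmsg 2
on a socket with
.Dv IP_RECVDSTADDR
enabled, a call such as
.Li "find_cmsg(&msg, IPPROTO_IP, IP_RECVDSTADDR, &dst, sizeof(dst))"
would be expected to fill in the destination address.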
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Va msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
.Pp
The socket should either be bound to
.Dv INADDR_ANY
and a local port, in which case the address supplied with
.Dv IP_SENDSRCADDR
should not be
.Dv INADDR_ANY ,
or be bound to a local address, in which case the address supplied with
.Dv IP_SENDSRCADDR
should be
.Dv INADDR_ANY .
In the latter case the bound address is overridden by the generic source address
selection logic, which chooses the IP address of the interface closest to the
destination.
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
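.Pp
Building the control message for
.Xr sendmsg 2
might look like the sketch below.
.Fn attach_src_addr
is a hypothetical helper; it takes the control message type as a parameter,
and the caller would pass
.Dv IP_SENDSRCADDR .
The caller is assumed to supply suitably aligned control storage, for example
a union containing a
.Vt "struct cmsghdr" .

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/*
 * Attach one IPPROTO_IP control message of the given type, carrying src,
 * to msg, using cbuf (cbuflen bytes) as control storage.
 * Returns 0 on success, -1 if cbuf is too small.
 */
int
attach_src_addr(struct msghdr *msg, void *cbuf, size_t cbuflen, int type,
    struct in_addr src)
{
    struct cmsghdr *cm;

    if (cbuflen < CMSG_SPACE(sizeof(src)))
        return (-1);
    memset(cbuf, 0, CMSG_SPACE(sizeof(src)));
    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(sizeof(src));
    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_len = CMSG_LEN(sizeof(src));
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = type;
    memcpy(CMSG_DATA(cm), &src, sizeof(src));
    return (0);
}
```

.Pp
The remaining
.Vt msghdr
members
.Pq Va msg_name , msg_iov
are filled in by the caller as for any other
.Xr sendmsg 2
call.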
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));
sin.sin_len = sizeof(sin);
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_BINDANY
option is enabled on a
.Dv SOCK_STREAM ,
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, one can
.Xr bind 2
to any address, even one not bound to any available network interface in the
system.
This functionality (in conjunction with special firewall rules) can be used for
implementing a transparent proxy.
The
.Dv PRIV_NETINET_BINDANY
privilege is needed to set this option.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVTOS
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TOS
(type of service) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TOS .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTOS
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Dv IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
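.Pp
Selecting one of the ranges listed above before an implicit or zero-port
.Xr bind 2
could be sketched as follows.
.Fn use_high_portrange
is a hypothetical name.
The preprocessor guard and the
.Er ENOPROTOOPT
fallback are additions made for this illustration only, so that the fragment
also compiles on systems whose headers lack
.Dv IP_PORTRANGE .

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Ask for the local port of s to be chosen from the high range.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
use_high_portrange(int s)
{
#ifdef IP_PORTRANGE
    int range = IP_PORTRANGE_HIGH;

    return (setsockopt(s, IPPROTO_IP, IP_PORTRANGE, &range, sizeof(range)));
#else
    (void)s;
    errno = ENOPROTOOPT;    /* option not offered by this system's headers */
    return (-1);
#endif
}
```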
.Pp
The range of privileged ports which may be opened only by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
.Ss "Multicast Options"
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, where an interface has not
been specified for a multicast group membership,
each multicast transmission is sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
may be passed instead.
The
.Va imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.Pp
The use of
.Dv IP_MULTICAST_IF
is
.Em not recommended ,
as multicast memberships are scoped to each
individual interface.
It is supported for legacy use only by applications,
such as routing daemons, which expect to
be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24)
on multiple interfaces,
without requesting an individual membership for each interface.
.Pp
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
Disabling loopback
improves performance for applications that may have no more than one
instance on a single host (such as a routing daemon), by eliminating
the overhead of receiving their own transmissions.
Loopback should generally not
be disabled by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
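.Pp
The two sender-side options described above,
.Dv IP_MULTICAST_TTL
and
.Dv IP_MULTICAST_LOOP ,
could be applied together as in the following sketch.
.Fn config_mcast_sender
is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Configure a multicast sender: ttl limits how far datagrams travel,
 * loop selects whether the local host receives a copy.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
config_mcast_sender(int s, unsigned char ttl, int loop)
{
    unsigned char on = loop ? 1 : 0;

    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) == -1)
        return (-1);
    return (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &on, sizeof(on)));
}
```

.Pp
A routing daemon that is the only instance on its host might call
.Li "config_mcast_sender(s, 1, 0)"
to stay on the local network and skip its own transmissions.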
.Pp
The sysctl setting
.Va net.inet.ip.mcast.loop
controls the default setting of the
.Dv IP_MULTICAST_LOOP
socket option for new sockets.
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreqn mreqn;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreqn, sizeof(mreqn));
.Ed
.Pp
where
.Fa mreqn
is the following structure:
.Bd -literal
struct ip_mreqn {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
    int            imr_ifindex;   /* interface index */
};
.Ed
.Pp
.Va imr_ifindex
should be set to the index of a particular multicast-capable interface if
the host is multihomed.
If
.Va imr_ifindex
is non-zero, the value of
.Va imr_interface
is ignored.
Otherwise, if
.Va imr_ifindex
is 0, the kernel will use the IP address from
.Va imr_interface
to look up the interface.
The value of
.Va imr_interface
may be set to
.Dv INADDR_ANY
to choose the default interface, although this is not recommended; this is
considered to be the first interface corresponding to the default route.
If there is no such interface, the first multicast-capable interface configured
in the system will be used.
.Pp
The legacy
.Vt "struct ip_mreq" ,
which lacks the
.Va imr_ifindex
field, is also supported by the
.Dv IP_ADD_MEMBERSHIP
socket option.
In this case the kernel behaves as if
.Va imr_ifindex
were set to zero:
.Va imr_interface
is used to look up the interface.
.Pp
Prior to
.Fx 7.0 ,
if the
.Va imr_interface
member was within the network range
.Li 0.0.0.0/8 ,
it was treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
In versions of
.Fx
since 7.0, this behavior is no longer supported.
Developers should
instead use the RFC 3678 multicast source filter APIs; in particular,
.Dv MCAST_JOIN_GROUP .
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
This is the first IP address configured on the interface.
If this address is removed or changed, the results are
undefined, as the IGMP membership state will then be inconsistent.
If multiple IP aliases are configured on the same interface,
they will be ignored.
.Pp
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Ss "Source-Specific Multicast Options"
Since
.Fx 8.0 ,
the use of Source-Specific Multicast (SSM) is supported.
These extensions require an IGMPv3 multicast router in order to
make best use of them.
If a legacy multicast router is present on the link,
.Fx
will simply downgrade to the version of IGMP spoken by the router,
and the benefits of source filtering on the upstream link
will not be present, although the kernel will continue to
squelch transmissions from blocked sources.
.Pp
Each group membership on a socket now has a filter mode:
.Bl -tag -width MCAST_EXCLUDE
.It Dv MCAST_EXCLUDE
Datagrams sent to this group are accepted,
unless the source is in a list of blocked source addresses.
.It Dv MCAST_INCLUDE
Datagrams sent to this group are accepted
only if the source is in a list of accepted source addresses.
.El
.Pp
Groups joined using the legacy
.Dv IP_ADD_MEMBERSHIP
option are placed in exclusive-mode,
and are able to request that certain sources are blocked or allowed.
This is known as the
.Em delta-based API .
.Pp
To block a multicast source on an existing group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
where
.Fa mreqs
is the following structure:
.Bd -literal
struct ip_mreq_source {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_sourceaddr; /* IP address of source */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_sourceaddr
should be set to the address of the source to be blocked.
.Pp
To unblock a multicast source on an existing group:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
The
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE
options are
.Em not permitted
for inclusive-mode group memberships.
.Pp
To join a multicast group in
.Dv MCAST_INCLUDE
mode with a single source,
or add another source to an existing inclusive-mode membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
To drop a single source from an existing group in inclusive mode:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
If this is the last accepted source for the group, the membership
will be dropped.
.Pp
The
.Dv IP_ADD_SOURCE_MEMBERSHIP
and
.Dv IP_DROP_SOURCE_MEMBERSHIP
options are
.Em not accepted
for exclusive-mode group memberships.
However, both exclusive and inclusive mode memberships
support the use of the
.Em full-state API
documented in RFC 3678.
For management of source filter lists using this API,
please refer to
.Xr sourcefilter 3 .
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
766.\"-----------------------
767.Ss "Raw IP Sockets"
768Raw
769.Tn IP
770sockets are connectionless,
771and are normally used with the
772.Xr sendto 2
773and
774.Xr recvfrom 2
775calls, though the
776.Xr connect 2
777call may also be used to fix the destination for future
778packets (in which case the
779.Xr read 2
780or
781.Xr recv 2
782and
783.Xr write 2
784or
785.Xr send 2
786system calls may be used).
787.Pp
788If
789.Fa proto
790is 0, the default protocol
791.Dv IPPROTO_RAW
792is used for outgoing
793packets, and only incoming packets destined for that protocol
794are received.
795If
796.Fa proto
797is non-zero, that protocol number will be used on outgoing packets
798and to filter incoming packets.
799.Pp
800Outgoing packets automatically have an
801.Tn IP
802header prepended to
803them (based on the destination address and the protocol
804number the socket is created with),
805unless the
806.Dv IP_HDRINCL
807option has been set.
808Unlike in previous
809.Bx
810releases, incoming packets are received with
811.Tn IP
812header and options intact, leaving all fields in network byte order.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = htons(offset);
ip->ip_len = htons(len);
.Ed
.Pp
The packet should be provided exactly as it is to be sent over the wire.
This implies that all fields, including
.Va ip_len
and
.Va ip_off ,
must be in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr CMSG_DATA 3 ,
.Xr sourcefilter 3 ,
.Xr icmp 4 ,
.Xr igmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Rs
.%A D. Thaler
.%A B. Fenner
.%A B. Quinn
.%T "Socket Interface Extensions for Multicast Source Filters"
.%N RFC 3678
.%D Jan 2004
.Re
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
The
.Vt ip_mreqn
structure appeared in
.Tn Linux 2.4 .
.Sh BUGS
Before
.Fx 10.0
packets received on raw IP sockets had the
.Va ip_hl
subtracted from the
.Va ip_len
field.
.Pp
Before
.Fx 11.0
packets received on raw IP sockets had the
.Va ip_len
and
.Va ip_off
fields converted to host byte order.
Packets written to raw IP sockets were expected to have
.Va ip_len
and
.Va ip_off
in host byte order.
954in host byte order.
955