xref: /freebsd/share/man/man4/ip.4 (revision 3b3a8eb937bf8045231e8364bfd1b94cd4a95979)
.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd September 12, 2012
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols, or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live
fields in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
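.Pp
Because these options can be read as well as written, an application can
confirm what the kernel accepted.
The following is a minimal sketch that sets
.Dv IP_TTL
and reads the value back with
.Xr getsockopt 2 ;
the helper name
.Fn set_and_get_ttl
is hypothetical and exists only for this illustration:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>

/*
 * Set the IP_TTL option on socket s and read it back.
 * Returns the TTL reported by the kernel, or -1 on error.
 * (set_and_get_ttl is a hypothetical helper, not a system API.)
 */
int
set_and_get_ttl(int s, int ttl)
{
	int cur = 0;
	socklen_t len = sizeof(cur);

	assert(ttl >= 0 && ttl <= 255);	/* the TTL field is 8 bits wide */
	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	if (getsockopt(s, IPPROTO_IP, IP_TTL, &cur, &len) == -1)
		return (-1);
	return (cur);
}
```
.Ed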
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Xr ip 4
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
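.Pp
One way to locate such a control message after
.Xr recvmsg 2
returns is to walk the control buffer with the standard
.Dv CMSG_FIRSTHDR
and
.Dv CMSG_NXTHDR
macros.
The sketch below is illustrative only; the helper name
.Fn find_cmsg_data
is an assumption of this example, and the level, type and payload size are
passed in so that the same code can serve other ancillary data types:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk the control messages attached to msg and return a pointer to the
 * payload of the first one that matches level and type and whose length
 * is exactly CMSG_LEN(paylen), or NULL if there is none.
 * (find_cmsg_data is a hypothetical helper, not a system API.)
 */
void *
find_cmsg_data(struct msghdr *msg, int level, int type, size_t paylen)
{
	struct cmsghdr *cm;

	assert(msg != NULL);
	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == level && cm->cmsg_type == type &&
		    cm->cmsg_len == CMSG_LEN(paylen))
			return (CMSG_DATA(cm));
	}
	return (NULL);
}
```
.Ed
.Pp
For the option described above, the call would be
find_cmsg_data(&msg, IPPROTO_IP, IP_RECVDSTADDR, sizeof(struct in_addr)),
with the result copied into a
.Vt "struct in_addr"
using
.Xr memcpy 3 .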
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Va msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
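.Pp
A control buffer with these values could be prepared as in the following
sketch.
The helper name
.Fn set_srcaddr_cmsg
is hypothetical, and the message type is a parameter (an assumption of this
example) so that the code also compiles on systems that do not define
.Dv IP_SENDSRCADDR :
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill msg with a single control message of the given type carrying src.
 * buf must hold at least CMSG_SPACE(sizeof(struct in_addr)) bytes and be
 * suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if buf is too small.
 * (set_srcaddr_cmsg is a hypothetical helper, not a system API.)
 */
int
set_srcaddr_cmsg(struct msghdr *msg, void *buf, size_t buflen, int type,
    struct in_addr src)
{
	struct cmsghdr *cm;

	if (buflen < CMSG_SPACE(sizeof(src)))
		return (-1);
	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(src));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_len = CMSG_LEN(sizeof(src));
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = type;		/* IP_SENDSRCADDR on FreeBSD */
	memcpy(CMSG_DATA(cm), &src, sizeof(src));
	return (0);
}
```
.Ed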
.Pp
The socket should be bound to a local port.
The socket may or may not be bound to a local address.
If it is bound, the address supplied with
.Dv IP_SENDSRCADDR
overrides the bound address.
If the socket is bound to a local address and the address supplied with
.Dv IP_SENDSRCADDR
is
.Dv INADDR_ANY ,
then the bound address is overridden by the generic source address selection
logic, which chooses the IP address of the interface closest to the
destination.
If the socket is not bound to a local address, then the address supplied with
.Dv IP_SENDSRCADDR
must not be
.Dv INADDR_ANY .
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
int onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));	/* needs <string.h> */
sin.sin_len = sizeof(sin);
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_BINDANY
option is enabled on a
.Dv SOCK_STREAM ,
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, one can
.Xr bind 2
to any address, even one not bound to any available network interface in the
system.
This functionality (in conjunction with special firewall rules) can be used for
implementing a transparent proxy.
The
.Dv PRIV_NETINET_BINDANY
privilege is needed to set this option.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVTOS
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TOS
(type of service) field for a
.Tn UDP
datagram.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TOS .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTOS
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Li IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports which only may be opened by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the kernel falls back to
sequential port allocation.
It returns to random allocation only once the port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
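.Pp
The interaction of the two rate settings can be pictured with a small
user-space model.
This is a hypothetical sketch of the policy as described in this page, not
the kernel implementation; the structure, the function name and the
once-per-second granularity are all assumptions of the example, and an
allocation rate exactly equal to the threshold is treated as quiet:
.Bd -literal
```c
#include <assert.h>

/* Hypothetical model of the fallback policy; not kernel code. */
struct port_policy {
	int randomcps;	/* allocations per second that trigger fallback */
	int randomtime;	/* quiet seconds required before resuming */
	int holdoff;	/* quiet seconds still needed; 0 means random */
};

/*
 * Account for one elapsed second in which allocs ports were allocated.
 * Returns 1 if allocation is random after this second, 0 if sequential.
 */
int
port_policy_tick(struct port_policy *p, int allocs)
{
	assert(allocs >= 0);
	if (allocs > p->randomcps)
		p->holdoff = p->randomtime;	/* burst: go sequential */
	else if (p->holdoff > 0)
		p->holdoff--;			/* one more quiet second */
	return (p->holdoff == 0);
}
```
.Ed
.Pp
With the default values of 10 and 45, a single second with 11 allocations
switches this model to sequential allocation, and it stays there until 45
consecutive quiet seconds have passed.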
.Ss "Multicast Options"
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
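.Pp
A slightly fuller sketch, which sets the multicast TTL and then reads it back
to confirm the value, might look as follows.
The helper name
.Fn set_mcast_ttl
is hypothetical, and
.Vt "unsigned char"
is used in place of
.Vt u_char
purely for portability of the example:
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>

/*
 * Set the multicast TTL on socket s and return the value read back,
 * or -1 on error.
 * (set_mcast_ttl is a hypothetical helper, not a system API.)
 */
int
set_mcast_ttl(int s, unsigned char ttl)
{
	unsigned char cur = 0;
	socklen_t len = sizeof(cur);

	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl,
	    sizeof(ttl)) == -1)
		return (-1);
	if (getsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &cur, &len) == -1)
		return (-1);
	return (cur);
}
```
.Ed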
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, where an interface has not
been specified for a multicast group membership,
each multicast transmission is sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
may be passed instead.
The
.Vt imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.Pp
The use of
.Dv IP_MULTICAST_IF
is
.Em not recommended ,
as multicast memberships are scoped to each
individual interface.
It is supported for legacy use only by applications,
such as routing daemons, which expect to
be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24)
on multiple interfaces,
without requesting an individual membership for each interface.
.Pp
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a routing daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
The sysctl setting
.Va net.inet.ip.mcast.loop
controls the default setting of the
.Dv IP_MULTICAST_LOOP
socket option for new sockets.
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
is the following structure:
.Bd -literal
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_interface
should be set to the
.Tn IP
address of a particular multicast-capable interface if
the host is multihomed.
It may be set to
.Dv INADDR_ANY
to choose the default interface, although this is not recommended;
this is considered to be the first interface corresponding
to the default route.
Otherwise, the first multicast-capable interface
configured in the system will be used.
.Pp
Prior to
.Fx 7.0 ,
if the
.Va imr_interface
member is within the network range
.Li 0.0.0.0/8 ,
it is treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
In versions of
.Fx
since 7.0, this behavior is no longer supported.
Developers should
instead use the RFC 3678 multicast source filter APIs; in particular,
.Dv MCAST_JOIN_GROUP .
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
This is the first IP address configured on the interface.
If this address is removed or changed, the results are
undefined, as the IGMP membership state will then be inconsistent.
If multiple IP aliases are configured on the same interface,
they will be ignored.
.Pp
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Ss "Source-Specific Multicast Options"
Since
.Fx 8.0 ,
the use of Source-Specific Multicast (SSM) is supported.
These extensions require an IGMPv3 multicast router in order to
make best use of them.
If a legacy multicast router is present on the link,
.Fx
will simply downgrade to the version of IGMP spoken by the router,
and the benefits of source filtering on the upstream link
will not be present, although the kernel will continue to
squelch transmissions from blocked sources.
.Pp
Each group membership on a socket now has a filter mode:
.Bl -tag -width MCAST_EXCLUDE
.It Dv MCAST_EXCLUDE
Datagrams sent to this group are accepted,
unless the source is in a list of blocked source addresses.
.It Dv MCAST_INCLUDE
Datagrams sent to this group are accepted
only if the source is in a list of accepted source addresses.
.El
.Pp
Groups joined using the legacy
.Dv IP_ADD_MEMBERSHIP
option are placed in exclusive-mode,
and are able to request that certain sources are blocked or allowed.
This is known as the
.Em delta-based API .
.Pp
To block a multicast source on an existing group membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
where
.Fa mreqs
is the following structure:
.Bd -literal
struct ip_mreq_source {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_sourceaddr; /* IP address of source */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_sourceaddr
should be set to the address of the source to be blocked.
.Pp
To unblock a multicast source on an existing group:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs));
.Ed
.Pp
The
.Dv IP_BLOCK_SOURCE
and
.Dv IP_UNBLOCK_SOURCE
options are
.Em not permitted
for inclusive-mode group memberships.
.Pp
To join a multicast group in
.Dv MCAST_INCLUDE
mode with a single source,
or add another source to an existing inclusive-mode membership:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
To drop a single source from an existing group in inclusive mode:
.Bd -literal
struct ip_mreq_source mreqs;
setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs));
.Ed
.Pp
If this is the last accepted source for the group, the membership
will be dropped.
.Pp
The
.Dv IP_ADD_SOURCE_MEMBERSHIP
and
.Dv IP_DROP_SOURCE_MEMBERSHIP
options are
.Em not accepted
for exclusive-mode group memberships.
However, both exclusive and inclusive mode memberships
support the use of the
.Em full-state API
documented in RFC 3678.
For management of source filter lists using this API,
please refer to
.Xr sourcefilter 3 .
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
.Ss "Raw IP Sockets"
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Incoming packets are received with
.Tn IP
header and options intact.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = offset;
.Ed
.Pp
The
.Va ip_len
and
.Va ip_off
fields
.Em must
be provided in host byte order.
All other fields must be provided in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
809the kernel will choose an appropriate address.
810.Sh ERRORS
811A socket operation may fail with one of the following errors returned:
812.Bl -tag -width Er
813.It Bq Er EISCONN
814when trying to establish a connection on a socket which
815already has one, or when trying to send a datagram with the destination
816address specified and the socket is already connected;
817.It Bq Er ENOTCONN
818when trying to send a datagram, but
819no destination address is specified, and the socket has not been
820connected;
821.It Bq Er ENOBUFS
822when the system runs out of memory for
823an internal data structure;
824.It Bq Er EADDRNOTAVAIL
825when an attempt is made to create a
826socket with a network address for which no network interface
827exists.
828.It Bq Er EACCES
829when an attempt is made to create
830a raw IP socket by a non-privileged process.
831.El
832.Pp
833The following errors specific to
834.Tn IP
835may occur when setting or getting
836.Tn IP
837options:
838.Bl -tag -width Er
839.It Bq Er EINVAL
840An unknown socket option name was given.
841.It Bq Er EINVAL
842The IP option field was improperly formed;
843an option field was shorter than the minimum value
844or longer than the option buffer provided.
845.El
846.Pp
847The following errors may occur when attempting to send
848.Tn IP
849datagrams via a
850.Dq raw socket
851with the
852.Dv IP_HDRINCL
853option set:
854.Bl -tag -width Er
855.It Bq Er EINVAL
856The user-supplied
857.Va ip_len
858field was not equal to the length of the datagram written to the socket.
859.El
860.Sh SEE ALSO
861.Xr getsockopt 2 ,
862.Xr recv 2 ,
863.Xr send 2 ,
864.Xr byteorder 3 ,
865.Xr icmp 4 ,
866.Xr igmp 4 ,
867.Xr inet 4 ,
868.Xr intro 4 ,
869.Xr multicast 4 ,
870.Xr sourcefilter 3
871.Rs
872.%A D. Thaler
873.%A B. Fenner
874.%A B. Quinn
875.%T "Socket Interface Extensions for Multicast Source Filters"
876.%N RFC 3678
877.%D Jan 2004
878.Re
879.Sh HISTORY
880The
881.Nm
882protocol appeared in
883.Bx 4.2 .
884The
885.Vt ip_mreqn
886structure appeared in
887.Tn Linux 2.4 .
888.Sh BUGS
889Before
890.Fx 10.0
891packets received on raw IP sockets had the
892.Va ip_hl
893subtracted from the
894.Va ip_len
895field.
896