.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd March 18, 2007
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live
fields in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
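.Pp
A setting can be confirmed by reading it back with
.Xr getsockopt 2 .
The following is a minimal sketch of that round trip for
.Dv IP_TTL ;
the helper name
.Fn set_and_check_ttl
is hypothetical and is not part of any system interface.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>

/*
 * Hypothetical helper: set the IP TTL on socket s, then read it back
 * with getsockopt(2) to confirm what the kernel stored.
 * Returns the TTL now in effect, or -1 on error.
 */
static int
set_and_check_ttl(int s, int ttl)
{
	int cur = 0;
	socklen_t len = sizeof(cur);

	assert(ttl >= 0 && ttl <= 255);	/* the TTL field is 8 bits wide */
	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	if (getsockopt(s, IPPROTO_IP, IP_TTL, &cur, &len) == -1)
		return (-1);
	return (cur);
}
```

The same pattern should apply to
.Dv IP_TOS ,
with the caveat that a system may adjust the stored value.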
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Xr ip 4
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
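.Pp
The control buffer is normally walked with the
.Dv CMSG_FIRSTHDR ,
.Dv CMSG_NXTHDR
and
.Dv CMSG_DATA
macros from
.In sys/socket.h .
The following is a minimal sketch of such a walk; the helper name
.Fn find_addr_cmsg
is hypothetical, and the message type is passed as a parameter so that
the same code can be reused for other address-carrying messages.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan the control messages that recvmsg(2) left
 * in msg and copy out the IPv4 address carried by the first message of
 * the given type at level IPPROTO_IP.
 * Returns 0 on success, -1 if no such message is present.
 */
static int
find_addr_cmsg(struct msghdr *msg, int type, struct in_addr *out)
{
	struct cmsghdr *cm;

	assert(msg != NULL && out != NULL);
	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != IPPROTO_IP || cm->cmsg_type != type)
			continue;
		if (cm->cmsg_len < CMSG_LEN(sizeof(*out)))
			continue;	/* too short to hold an address */
		memcpy(out, CMSG_DATA(cm), sizeof(*out));
		return (0);
	}
	return (-1);
}
```

After enabling the option and calling
.Xr recvmsg 2 ,
a caller would pass
.Dv IP_RECVDSTADDR
as the
.Fa type
argument.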
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket that is not bound to a specific
.Tn IP
address can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Vt msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
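.Pp
The following is a minimal sketch of how such a control message might be
laid out before calling
.Xr sendmsg 2 .
The helper name
.Fn attach_addr_cmsg
is hypothetical; the message type is a parameter, and a caller on a
system that defines it would pass
.Dv IP_SENDSRCADDR .

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: lay out one IPPROTO_IP control message of the
 * given type carrying src in buf, and attach it to msg.
 * buf must hold at least CMSG_SPACE(sizeof(struct in_addr)) bytes and
 * be suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
attach_addr_cmsg(struct msghdr *msg, void *buf, size_t buflen, int type,
    struct in_addr src)
{
	struct cmsghdr *cm;

	assert(msg != NULL && buf != NULL);
	if (buflen < CMSG_SPACE(sizeof(src)))
		return (-1);
	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(src));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_len = CMSG_LEN(sizeof(src));	/* header plus payload */
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = type;
	memcpy(CMSG_DATA(cm), &src, sizeof(src));
	return (0);
}
```

Declaring the buffer as a union with a
.Vt "struct cmsghdr"
member is one common way to obtain the required alignment.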
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
u_char onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));	/* requires <string.h> */
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Li IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports which may be opened only by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the system switches to
sequential port allocation.
It returns to random allocation only once the port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
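.Pp
The switch between random and sequential allocation can be pictured as a
small state machine that is advanced once per second.
The following is an illustrative model only and is not the kernel code;
the structure, its fields and the function
.Fn port_rand_tick
are hypothetical, and treating a rate exactly equal to the threshold as
quiet is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model; all names here are hypothetical. */
struct port_rand_state {
	bool	randomized;	/* true: ports are picked at random */
	int	randomcps;	/* threshold, allocations per second */
	int	randomtime;	/* quiet seconds needed to resume */
	int	quiet;		/* consecutive seconds at or below threshold */
};

/*
 * Advance the model by one second, given the number of ports that were
 * allocated during that second.
 * Returns true if random allocation is in effect afterwards.
 */
static bool
port_rand_tick(struct port_rand_state *st, int allocs)
{
	assert(st != NULL && allocs >= 0);
	if (allocs > st->randomcps) {
		st->randomized = false;	/* burst: fall back to sequential */
		st->quiet = 0;
	} else if (!st->randomized && ++st->quiet >= st->randomtime) {
		st->randomized = true;	/* calm for long enough: resume */
		st->quiet = 0;
	}
	return (st->randomized);
}
```

With the default values, a single second with 11 allocations turns
randomization off in this model, and 45 consecutive quiet seconds turn
it back on.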
.Ss "Multicast Options"
.Pp
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, each multicast transmission is
sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where
.Fa addr
is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a router daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
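.Pp
A sender often sets the scope and the loopback behavior together.
The following is a minimal sketch that does so with error checking; the
helper name
.Fn mcast_sender_setup
is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>

/*
 * Hypothetical helper: limit the multicast scope of socket s to ttl
 * hops and turn local loopback on (loop_on != 0) or off.
 * Returns 0 on success, -1 on error with errno set by setsockopt(2).
 */
static int
mcast_sender_setup(int s, unsigned char ttl, int loop_on)
{
	unsigned char loop = loop_on ? 1 : 0;

	assert(s >= 0);
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl,
	    sizeof(ttl)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop,
	    sizeof(loop)) == -1)
		return (-1);
	return (0);
}
```

A single-instance daemon might call this with a small TTL and
.Fa loop_on
set to 0.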
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
is the following structure:
.Bd -literal
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_interface
should be set to
.Dv INADDR_ANY
to choose the default multicast interface,
or the
.Tn IP
address of a particular multicast-capable interface if
the host is multihomed.
.\" TODO: Remove this piece when the RFC 3678 API is implemented and
.\" the RFC 1724 hack is removed.
Since
.Fx 4.4 ,
if the
.Va imr_interface
member is within the network range
.Li 0.0.0.0/8 ,
it is treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
If multiple IP aliases are configured on the same interface,
they will be ignored.
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\"-----------------------
.Ss "Raw IP Sockets"
.Pp
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Incoming packets are received with
.Tn IP
header and options intact.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = offset;
.Ed
.Pp
The
.Va ip_len
and
.Va ip_off
fields
.Em must
be provided in host byte order.
All other fields must be provided in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr icmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .