.\" Copyright (c) 1983, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ip.4	8.2 (Berkeley) 11/30/93
.\" $FreeBSD$
.\"
.Dd April 9, 2007
.Dt IP 4
.Os
.Sh NAME
.Nm ip
.Nd Internet Protocol
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In netinet/in.h
.Ft int
.Fn socket AF_INET SOCK_RAW proto
.Sh DESCRIPTION
.Tn IP
is the network layer protocol used
by the Internet protocol family.
Options may be set at the
.Tn IP
level
when using higher-level protocols that are based on
.Tn IP
(such as
.Tn TCP
and
.Tn UDP ) .
It may also be accessed
through a
.Dq raw socket
when developing new protocols or
special-purpose applications.
.Pp
There are several
.Tn IP-level
.Xr setsockopt 2
and
.Xr getsockopt 2
options.
.Dv IP_OPTIONS
may be used to provide
.Tn IP
options to be transmitted in the
.Tn IP
header of each outgoing packet
or to examine the header options on incoming packets.
.Tn IP
options may be used with any socket type in the Internet family.
The format of
.Tn IP
options to be sent is that specified by the
.Tn IP
protocol specification (RFC-791), with one exception:
the list of addresses for Source Route options must include the first-hop
gateway at the beginning of the list of gateways.
The first-hop gateway address will be extracted from the option list
and the size adjusted accordingly before use.
To disable previously specified options,
use a zero-length buffer:
.Bd -literal
setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0);
.Ed
.Pp
.Dv IP_TOS
and
.Dv IP_TTL
may be used to set the type-of-service and time-to-live
fields in the
.Tn IP
header for
.Dv SOCK_STREAM , SOCK_DGRAM ,
and certain types of
.Dv SOCK_RAW
sockets.
For example,
.Bd -literal
int tos = IPTOS_LOWDELAY;       /* see <netinet/ip.h> */
setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));

int ttl = 60;                   /* max = 255 */
setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
.Ed
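.Pp
The calls above ignore the return value of
.Xr setsockopt 2 .
A minimal sketch of the same
.Dv IP_TTL
setting with error checking and a read-back through
.Xr getsockopt 2
might look as follows; the helper name
.Fn set_and_get_ttl
is hypothetical and not part of any system interface.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Set IP_TTL on socket s, then ask the kernel which value is in effect.
 * Returns the TTL reported by getsockopt(2), or -1 on error (errno is
 * left as set by the failing call).
 * set_and_get_ttl is a hypothetical helper name.
 */
int
set_and_get_ttl(int s, int ttl)
{
	int cur = 0;
	socklen_t len = sizeof(cur);

	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	if (getsockopt(s, IPPROTO_IP, IP_TTL, &cur, &len) == -1)
		return (-1);
	return (cur);
}
```
.Ed
A caller would typically pass a descriptor obtained from
.Xr socket 2
and compare the result with the value it requested.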
.Pp
.Dv IP_MINTTL
may be used to set the minimum acceptable TTL a packet must have when
received on a socket.
All packets with a lower TTL are silently dropped.
This option is only really useful when set to 255, preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
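.Pp
As an illustration, the following sketch sets
.Dv IP_MINTTL
to 255 on a socket.
It also sets the outgoing
.Dv IP_TTL
to 255, on the assumption (not stated above) that the peer applies the
same check to the packets it receives.
The helper name
.Fn enable_ttl_guard
is hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Send with TTL 255 and accept only packets that arrive with TTL 255,
 * that is, packets that have not been forwarded by a router.
 * Returns 0 on success, -1 on error (errno is set by setsockopt(2)).
 * enable_ttl_guard is a hypothetical helper name.
 */
int
enable_ttl_guard(int s)
{
	int ttl = 255;		/* outgoing TTL (assumed peer policy) */
	int minttl = 255;	/* minimum acceptable incoming TTL */

	if (setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl,
	    sizeof(minttl)) == -1)
		return (-1);
	return (0);
}
```
.Ed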
.Pp
.Dv IP_DONTFRAG
may be used to set the Don't Fragment flag on IP packets.
Currently this option is respected only on
.Xr udp 4
and raw
.Xr ip 4
sockets, unless the
.Dv IP_HDRINCL
option has been set.
On
.Xr tcp 4
sockets, the Don't Fragment flag is controlled by the Path
MTU Discovery option.
Sending a packet larger than the MTU size of the egress interface,
determined by the destination address, returns an
.Er EMSGSIZE
error.
.Pp
If the
.Dv IP_RECVDSTADDR
option is enabled on a
.Dv SOCK_DGRAM
socket,
the
.Xr recvmsg 2
call will return the destination
.Tn IP
address for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVDSTADDR
.Ed
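.Pp
Control messages are normally located with the
.Fn CMSG_FIRSTHDR ,
.Fn CMSG_NXTHDR
and
.Fn CMSG_DATA
macros rather than by computing offsets by hand.
The following is a sketch of such a lookup; the helper name
.Fn find_cmsg_data
and its parameters are hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Walk the ancillary data attached to msg and return a pointer to the
 * payload of the first control message that matches level and type and
 * carries at least datalen bytes, or NULL if there is none.
 * find_cmsg_data is a hypothetical helper name.
 */
void *
find_cmsg_data(struct msghdr *msg, int level, int type, size_t datalen)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL;
	    cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == level && cm->cmsg_type == type &&
		    cm->cmsg_len >= CMSG_LEN(datalen))
			return (CMSG_DATA(cm));
	}
	return (NULL);
}
```
.Ed
After a successful
.Xr recvmsg 2 ,
a caller could pass
.Dv IPPROTO_IP ,
.Dv IP_RECVDSTADDR
and
.Li sizeof(struct in_addr)
and copy the result into a
.Vt "struct in_addr" .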
.Pp
The source address to be used for outgoing
.Tn UDP
datagrams on a socket that is not bound to a specific
.Tn IP
address can be specified as ancillary data with a type code of
.Dv IP_SENDSRCADDR .
The
.Vt msg_control
field in the
.Vt msghdr
structure should point to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn IP
address.
The
.Vt cmsghdr
fields should have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct in_addr))
cmsg_level = IPPROTO_IP
cmsg_type = IP_SENDSRCADDR
.Ed
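.Pp
A sketch of how such a control message might be assembled before calling
.Xr sendmsg 2
is shown below.
The helper name
.Fn attach_addr_cmsg
is hypothetical, and the message type is passed in by the caller so that
the fragment does not depend on a particular constant.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/*
 * Attach one IPPROTO_IP control message of the given type, carrying
 * addr, to msg.  buf must be suitably aligned for a struct cmsghdr and
 * at least CMSG_SPACE(sizeof(struct in_addr)) bytes long.
 * Returns 0 on success, -1 if buf is too small.
 * attach_addr_cmsg is a hypothetical helper name.
 */
int
attach_addr_cmsg(struct msghdr *msg, void *buf, size_t buflen, int type,
    struct in_addr addr)
{
	struct cmsghdr *cm;

	if (buflen < CMSG_SPACE(sizeof(addr)))
		return (-1);
	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(addr));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_len = CMSG_LEN(sizeof(addr));
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = type;
	memcpy(CMSG_DATA(cm), &addr, sizeof(addr));
	return (0);
}
```
.Ed
A caller would pass
.Dv IP_SENDSRCADDR
as the type and then fill in the remaining
.Vt msghdr
fields (destination and data) as usual.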
.Pp
For convenience,
.Dv IP_SENDSRCADDR
is defined to have the same value as
.Dv IP_RECVDSTADDR ,
so the
.Dv IP_RECVDSTADDR
control message from
.Xr recvmsg 2
can be used directly as a control message for
.Xr sendmsg 2 .
.\"
.Pp
If the
.Dv IP_ONESBCAST
option is enabled on a
.Dv SOCK_DGRAM
or a
.Dv SOCK_RAW
socket, the destination address of outgoing
broadcast datagrams on that socket will be forced
to the undirected broadcast address,
.Dv INADDR_BROADCAST ,
before transmission.
This is in contrast to the default behavior of the
system, which is to transmit undirected broadcasts
via the first network interface with the
.Dv IFF_BROADCAST
flag set.
.Pp
This option allows applications to choose which
interface is used to transmit an undirected broadcast
datagram.
For example, the following code would force an
undirected broadcast to be transmitted via the interface
configured with the broadcast address 192.168.2.255:
.Bd -literal
char msg[512];
struct sockaddr_in sin;
u_char onesbcast = 1;	/* 0 = disable (default), 1 = enable */

setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast));
memset(&sin, 0, sizeof(sin));	/* start from a clean address */
sin.sin_len = sizeof(sin);
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = inet_addr("192.168.2.255");
sin.sin_port = htons(1234);
sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin));
.Ed
.Pp
It is the application's responsibility to set the
.Dv IP_TTL
option
to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the
.Dv SO_BROADCAST
socket level option, otherwise the
.Dv IP_ONESBCAST
option has no effect.
.Pp
If the
.Dv IP_RECVTTL
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call will return the
.Tn IP
.Tn TTL
(time to live) field for a
.Tn UDP
datagram.
The
.Vt msg_control
field in the
.Vt msghdr
structure points to a buffer
that contains a
.Vt cmsghdr
structure followed by the
.Tn TTL .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(u_char))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVTTL
.Ed
.\"
.Pp
If the
.Dv IP_RECVIF
option is enabled on a
.Dv SOCK_DGRAM
socket, the
.Xr recvmsg 2
call returns a
.Vt "struct sockaddr_dl"
corresponding to the interface on which the
packet was received.
The
.Va msg_control
field in the
.Vt msghdr
structure points to a buffer that contains a
.Vt cmsghdr
structure followed by the
.Vt "struct sockaddr_dl" .
The
.Vt cmsghdr
fields have the following values:
.Bd -literal
cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl))
cmsg_level = IPPROTO_IP
cmsg_type = IP_RECVIF
.Ed
.Pp
.Dv IP_PORTRANGE
may be used to set the port range used for selecting a local port number
on a socket with an unspecified (zero) port number.
It has the following
possible values:
.Bl -tag -width IP_PORTRANGE_DEFAULT
.It Dv IP_PORTRANGE_DEFAULT
use the default range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last .
.It Dv IP_PORTRANGE_HIGH
use a high range of values, normally
.Dv IPPORT_HIFIRSTAUTO
through
.Dv IPPORT_HILASTAUTO .
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.hifirst
and
.Va net.inet.ip.portrange.hilast .
.It Dv IP_PORTRANGE_LOW
use a low range of ports, which are normally restricted to
privileged processes on
.Ux
systems.
The range is normally from
.Dv IPPORT_RESERVED
\- 1 down to
.Dv IPPORT_RESERVEDSTART
in descending order.
This is adjustable through the sysctl settings
.Va net.inet.ip.portrange.lowfirst
and
.Va net.inet.ip.portrange.lowlast .
.El
.Pp
The range of privileged ports that may be opened only by
root-owned processes may be modified by the
.Va net.inet.ip.portrange.reservedlow
and
.Va net.inet.ip.portrange.reservedhigh
sysctl settings.
The values default to the traditional range,
0 through
.Dv IPPORT_RESERVED
\- 1
(0 through 1023), respectively.
Note that these settings do not affect and are not accounted for in the
use or calculation of the other
.Va net.inet.ip.portrange
values above.
Changing these values departs from
.Ux
tradition and has security
consequences that the administrator should carefully evaluate before
modifying these settings.
.Pp
Ports are allocated at random within the specified port range in order
to increase the difficulty of random spoofing attacks.
In scenarios such as benchmarking, this behavior may be undesirable.
In these cases,
.Va net.inet.ip.portrange.randomized
can be used to toggle randomization off.
If more than
.Va net.inet.ip.portrange.randomcps
ports have been allocated in the last second, the kernel reverts to
sequential port allocation.
It returns to random allocation only once the current port allocation rate
drops below
.Va net.inet.ip.portrange.randomcps
for at least
.Va net.inet.ip.portrange.randomtime
seconds.
The default values for
.Va net.inet.ip.portrange.randomcps
and
.Va net.inet.ip.portrange.randomtime
are 10 port allocations per second and 45 seconds, respectively.
.Ss "Multicast Options"
.Pp
.Tn IP
multicasting is supported only on
.Dv AF_INET
sockets of type
.Dv SOCK_DGRAM
and
.Dv SOCK_RAW ,
and only on networks where the interface
driver supports multicasting.
.Pp
The
.Dv IP_MULTICAST_TTL
option changes the time-to-live (TTL)
for outgoing multicast datagrams
in order to control the scope of the multicasts:
.Bd -literal
u_char ttl;	/* range: 0 to 255, default = 1 */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
.Ed
.Pp
Datagrams with a TTL of 1 are not forwarded beyond the local network.
Multicast datagrams with a TTL of 0 will not be transmitted on any network,
but may be delivered locally if the sending host belongs to the destination
group and if multicast loopback has not been disabled on the sending socket
(see below).
Multicast datagrams with TTL greater than 1 may be forwarded
to other networks if a multicast router is attached to the local network.
.Pp
For hosts with multiple interfaces, each multicast transmission is
sent from the primary network interface.
The
.Dv IP_MULTICAST_IF
option overrides the default for
subsequent transmissions from a given socket:
.Bd -literal
struct in_addr addr;
setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr));
.Ed
.Pp
where "addr" is the local
.Tn IP
address of the desired interface or
.Dv INADDR_ANY
to specify the default interface.
.Pp
To specify an interface by index, an instance of
.Vt ip_mreqn
should be passed instead.
The
.Vt imr_ifindex
member should be set to the index of the desired interface,
or 0 to specify the default interface.
The kernel differentiates between these two structures by their size.
.\"
An interface's local IP address and multicast capability can
be obtained via the
.Dv SIOCGIFCONF
and
.Dv SIOCGIFFLAGS
ioctls.
Normal applications should not need to use this option.
.Pp
If a multicast datagram is sent to a group to which the sending host itself
belongs (on the outgoing interface), a copy of the datagram is, by default,
looped back by the IP layer for local delivery.
The
.Dv IP_MULTICAST_LOOP
option gives the sender explicit control
over whether or not subsequent datagrams are looped back:
.Bd -literal
u_char loop;	/* 0 = disable, 1 = enable (default) */
setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
.Ed
.Pp
This option
improves performance for applications that may have no more than one
instance on a single host (such as a router daemon), by eliminating
the overhead of receiving their own transmissions.
It should generally not
be used by applications for which there may be more than one instance on a
single host (such as a conferencing program) or for which the sender does
not belong to the destination group (such as a time querying program).
.Pp
A multicast datagram sent with an initial TTL greater than 1 may be delivered
to the sending host on a different interface from that on which it was sent,
if the host belongs to the destination group on that other interface.
The loopback control option has no effect on such delivery.
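.Pp
The two sender-side options can be combined in one routine.
The following sketch sets the multicast TTL and the loopback flag and
reports the first failure; the helper name
.Fn set_mcast_sender_opts
is hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Configure the multicast scope (ttl, 0 to 255) and local loopback
 * (loop, 0 = disable, 1 = enable) for datagrams sent on socket s.
 * Returns 0 on success, -1 on error (errno is set by setsockopt(2)).
 * set_mcast_sender_opts is a hypothetical helper name.
 */
int
set_mcast_sender_opts(int s, unsigned char ttl, unsigned char loop)
{
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl,
	    sizeof(ttl)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop,
	    sizeof(loop)) == -1)
		return (-1);
	return (0);
}
```
.Ed
A router daemon that is the only instance on its host might, for
example, call it with a TTL of 1 and loopback disabled.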
.Pp
A host must become a member of a multicast group before it can receive
datagrams sent to the group.
To join a multicast group, use the
.Dv IP_ADD_MEMBERSHIP
option:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
is the following structure:
.Bd -literal
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
.Ed
.Pp
.Va imr_interface
should be set to
.Dv INADDR_ANY
to choose the default multicast interface,
or the
.Tn IP
address of a particular multicast-capable interface if
the host is multihomed.
.\" TODO: Remove this piece when the RFC 3678 API is implemented and
.\" the RFC 1724 hack is removed.
Since
.Fx 4.4 ,
if the
.Va imr_interface
member is within the network range
.Li 0.0.0.0/8 ,
it is treated as an interface index in the system interface MIB,
as per the RIP Version 2 MIB Extension (RFC-1724).
.\" TODO: Update this piece when IPv4 source-address selection is implemented.
.Pp
Up to
.Dv IP_MAX_MEMBERSHIPS
memberships may be added on a single socket.
Membership is associated with a single interface;
programs running on multihomed hosts may need to
join the same group on more than one interface.
.Pp
The IGMP protocol uses the primary IP address of the interface
as its identifier for group membership.
If multiple IP aliases are configured on the same interface,
they will be ignored.
This shortcoming was addressed in IPv6; MLDv2 requires
that the unique link-local address for an interface is
used to identify an MLDv2 listener.
.Pp
To drop a membership, use:
.Bd -literal
struct ip_mreq mreq;
setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
.Ed
.Pp
where
.Fa mreq
contains the same values as used to add the membership.
Memberships are dropped when the socket is closed or the process exits.
.\"-----------------------
.Ss "Raw IP Sockets"
.Pp
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Incoming packets are received with
.Tn IP
header and options intact.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all
the fields of the IP header, including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = offset;
.Ed
.Pp
The
.Va ip_len
and
.Va ip_off
fields
.Em must
be provided in host byte order.
All other fields must be provided in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the
.Va ip_id
field is set to 0 then the kernel will choose an
appropriate value.
If the header source address is set to
.Dv INADDR_ANY ,
the kernel will choose an appropriate address.
.Sh ERRORS
A socket operation may fail with one of the following errors returned:
.Bl -tag -width Er
.It Bq Er EISCONN
when trying to establish a connection on a socket which
already has one, or when trying to send a datagram with the destination
address specified and the socket is already connected;
.It Bq Er ENOTCONN
when trying to send a datagram, but
no destination address is specified, and the socket has not been
connected;
.It Bq Er ENOBUFS
when the system runs out of memory for
an internal data structure;
.It Bq Er EADDRNOTAVAIL
when an attempt is made to create a
socket with a network address for which no network interface
exists;
.It Bq Er EACCES
when an attempt is made to create
a raw IP socket by a non-privileged process.
.El
.Pp
The following errors specific to
.Tn IP
may occur when setting or getting
.Tn IP
options:
.Bl -tag -width Er
.It Bq Er EINVAL
An unknown socket option name was given.
.It Bq Er EINVAL
The IP option field was improperly formed;
an option field was shorter than the minimum value
or longer than the option buffer provided.
.El
.Pp
The following errors may occur when attempting to send
.Tn IP
datagrams via a
.Dq raw socket
with the
.Dv IP_HDRINCL
option set:
.Bl -tag -width Er
.It Bq Er EINVAL
The user-supplied
.Va ip_len
field was not equal to the length of the datagram written to the socket.
.El
.Sh SEE ALSO
.Xr getsockopt 2 ,
.Xr recv 2 ,
.Xr send 2 ,
.Xr byteorder 3 ,
.Xr icmp 4 ,
.Xr inet 4 ,
.Xr intro 4 ,
.Xr multicast 4
.Sh HISTORY
The
.Nm
protocol appeared in
.Bx 4.2 .
The
.Vt ip_mreqn
structure appeared in
.Tn Linux 2.4 .
689