'\" te .\" Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved. .\" Copyright 2008 AT&T .\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License. .\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License. .\" When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner] .TH IP 7P "Dec 3, 2008" .SH NAME ip, IP \- Internet Protocol .SH SYNOPSIS .LP .nf \fB#include <sys/socket.h>\fR .fi .LP .nf \fB#include <netinet/in.h>\fR .fi .LP .nf \fBs = socket(AF_INET, SOCK_RAW, proto);\fR .fi .LP .nf \fBt = t_open ("/dev/rawip", O_RDWR);\fR .fi .SH DESCRIPTION .sp .LP IP is the internetwork datagram delivery protocol that is central to the Internet protocol family. Programs may use \fBIP\fR through higher-level protocols such as the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP), or may interface directly to IP. See \fBtcp\fR(7P) and \fBudp\fR(7P). Direct access may be by means of the socket interface, using a "raw socket," or by means of the Transport Level Interface (TLI). The protocol options defined in the IP specification may be set in outgoing datagrams. .sp .LP Packets sent to or from this system may be subject to IPsec policy. See \fBipsec\fR(7P) for more information. .SH APPLICATION PROGRAMMING INTERFACE .sp .LP The STREAMS driver \fB/dev/rawip\fR is the TLI transport provider that provides raw access to IP.
.sp .LP Raw IP sockets are connectionless and are normally used with the \fBsendto()\fR and \fBrecvfrom()\fR calls (see \fBsend\fR(3SOCKET) and \fBrecv\fR(3SOCKET)), although the \fBconnect\fR(3SOCKET) call may also be used to fix the destination for future datagrams. In this case, the \fBread\fR(2) or \fBrecv\fR(3SOCKET) and \fBwrite\fR(2) or \fBsend\fR(3SOCKET) calls may be used. If \fIproto\fR is \fBIPPROTO_RAW\fR or \fBIPPROTO_IGMP\fR, the application is expected to include a complete IP header when sending. Otherwise, that protocol number will be set in outgoing datagrams and used to filter incoming datagrams, and an IP header will be generated and prepended to each outgoing datagram. In either case, received datagrams are returned with the IP header and options intact. .sp .LP If an application uses \fBIP_HDRINCL\fR and provides the IP header contents, the IP stack does not modify the following supplied fields under any conditions: Type of Service, DF Flag, Protocol, and Destination Address. The IP Options and IHL fields are set by use of \fBIP_OPTIONS\fR, and \fBTotal Length\fR is updated to include any options. Version is set to the default. Identification is chosen by the normal IP ID selection logic. The source address is updated if none was specified and the TTL is changed if the packet has a broadcast destination address. Since an application cannot send down fragments (as IP assigns the IP ID), Fragment Offset is always 0. The IP Checksum field is computed by IP. None of the data beyond the IP header are changed, including the application-provided transport header. .sp .LP The socket options supported at the IP level are: .sp .ne 2 .na \fB\fBIP_OPTIONS\fR\fR .ad .RS 22n IP options for outgoing datagrams. This socket option may be used to set IP options to be included in each outgoing datagram. IP options to be sent are set with \fBsetsockopt()\fR (see \fBgetsockopt\fR(3SOCKET)).
The \fBgetsockopt\fR(3SOCKET) call returns the IP options set in the last \fBsetsockopt()\fR call. IP options on received datagrams are visible to user programs only using raw IP sockets. The format of IP options given in \fBsetsockopt()\fR matches those defined in the IP specification with one exception: the list of addresses for the source routing options must include the first-hop gateway at the beginning of the list of gateways. The first-hop gateway address will be extracted from the option list and the size adjusted accordingly before use. IP options may be used with any socket type in the Internet family. .RE .sp .ne 2 .na \fB\fBIP_SEC_OPT\fR\fR .ad .RS 22n Enable or obtain IPsec security settings for this socket. For more details on the protection services of IPsec, see \fBipsec\fR(7P). .RE .sp .ne 2 .na \fB\fBIP_ADD_MEMBERSHIP\fR\fR .ad .RS 22n Join a multicast group. .RE .sp .ne 2 .na \fB\fBIP_DROP_MEMBERSHIP\fR\fR .ad .RS 22n Leave a multicast group. .RE .sp .ne 2 .na \fB\fBIP_BOUND_IF\fR\fR .ad .RS 22n Limit reception and transmission of packets to this interface. Takes an integer as an argument. The integer is the selected interface index. .RE .sp .LP The following options take \fBin_pktinfo_t\fR as the parameter: .sp .ne 2 .na \fB\fBIP_PKTINFO\fR\fR .ad .sp .6 .RS 4n Set the source address and/or transmit interface of the packet(s). Note that the IP_BOUND_IF socket option takes precedence over the interface index passed in IP_PKTINFO. .sp .in +2 .nf typedef struct in_pktinfo { unsigned int ipi_ifindex; /* send/recv interface index */ struct in_addr ipi_spec_dst; /* matched source addr. */ struct in_addr ipi_addr; /* src/dst addr. in IP hdr */ } in_pktinfo_t; .fi .in -2 When passed in (on transmit) via ancillary data with IP_PKTINFO, ipi_spec_dst is used as the source address and ipi_ifindex is used as the interface index to send the packet out.
.RE .sp .ne 2 .na \fB\fBIP_RECVPKTINFO\fR\fR .ad .sp .6 .RS 4n Enable/disable receipt of the index of the interface the packet arrived on, the local address that was matched for reception, and the inbound packet's actual destination address. Takes boolean as the parameter. Returns struct in_pktinfo_t as ancillary data. .RE .sp .LP The following options take a \fBstruct ip_mreq\fR as the parameter. The structure contains a multicast address which must be set to the \fBCLASS-D\fR \fBIP\fR multicast address and an interface address. Normally the interface address is set to \fBINADDR_ANY\fR which causes the kernel to choose the interface on which to join. .sp .ne 2 .na \fB\fBIP_BLOCK_SOURCE\fR\fR .ad .RS 29n Block multicast packets whose source address matches the given source address. The specified group must be joined previously using IP_ADD_MEMBERSHIP or MCAST_JOIN_GROUP. .RE .sp .ne 2 .na \fB\fBIP_UNBLOCK_SOURCE\fR\fR .ad .RS 29n Unblock (begin receiving) multicast packets which were previously blocked using IP_BLOCK_SOURCE. .RE .sp .ne 2 .na \fB\fBIP_ADD_SOURCE_MEMBERSHIP\fR\fR .ad .RS 29n Begin receiving packets for the given multicast group whose source address matches the specified address. .RE .sp .ne 2 .na \fB\fBIP_DROP_SOURCE_MEMBERSHIP\fR\fR .ad .RS 29n Stop receiving packets for the given multicast group whose source address matches the specified address. .RE .sp .LP The following options take a \fBstruct ip_mreq_source\fR as the parameter. The structure contains a multicast address (which must be set to the CLASS-D IP multicast address), an interface address, and a source address. .sp .ne 2 .na \fB\fBMCAST_JOIN_GROUP\fR\fR .ad .RS 28n Join a multicast group. Functionally equivalent to IP_ADD_MEMBERSHIP. .RE .sp .ne 2 .na \fB\fBMCAST_BLOCK_SOURCE\fR\fR .ad .RS 28n Block multicast packets whose source address matches the given source address. The specified group must be joined previously using IP_ADD_MEMBERSHIP or MCAST_JOIN_GROUP. 
.RE .sp .ne 2 .na \fB\fBMCAST_UNBLOCK_SOURCE\fR\fR .ad .RS 28n Unblock (begin receiving) multicast packets which were previously blocked using MCAST_BLOCK_SOURCE. .RE .sp .ne 2 .na \fB\fBMCAST_LEAVE_GROUP\fR\fR .ad .RS 28n Leave a multicast group. Functionally equivalent to IP_DROP_MEMBERSHIP. .RE .sp .ne 2 .na \fB\fBMCAST_JOIN_SOURCE_GROUP\fR\fR .ad .RS 28n Begin receiving packets for the given multicast group whose source address matches the specified address. .RE .sp .ne 2 .na \fB\fBMCAST_LEAVE_SOURCE_GROUP\fR\fR .ad .RS 28n Stop receiving packets for the given multicast group whose source address matches the specified address. .RE .sp .LP The following options take a struct \fBgroup_req\fR or struct \fBgroup_source_req\fR as the parameter. The \fBgroup_req\fR structure contains an interface index and a multicast address which must be set to the CLASS-D multicast address. The \fBgroup_source_req\fR structure is used for those options which include a source address. It contains an interface index, multicast address, and source address. .sp .ne 2 .na \fB\fBIP_MULTICAST_IF\fR\fR .ad .RS 21n The outgoing interface for multicast packets. This option takes a \fBstruct\fR \fBin_addr\fR as an argument, and it selects that interface for outgoing IP multicast packets. If the address specified is \fBINADDR_ANY\fR, it uses the unicast routing table to select the outgoing interface (which is the default behavior). .RE .sp .ne 2 .na \fB\fBIP_MULTICAST_TTL\fR\fR .ad .RS 21n Time to live for multicast datagrams. This option takes an unsigned character as an argument. Its value is the TTL that IP uses on outgoing multicast datagrams. The default is \fB1\fR. .RE .sp .ne 2 .na \fB\fBIP_MULTICAST_LOOP\fR\fR .ad .RS 21n Loopback for multicast datagrams. Normally multicast datagrams are delivered to members on the sending host (or sending zone).
Setting the unsigned character argument to 0 causes the opposite behavior, meaning that when multiple zones are present, the datagrams are delivered to all zones except the sending zone. .RE .sp .ne 2 .na \fB\fBIP_RECVIF\fR\fR .ad .RS 21n Receive the inbound interface index. .RE .sp .ne 2 .na \fB\fBIP_TOS\fR\fR .ad .RS 21n This option takes an integer argument as its input value. The least significant 8 bits of the value are used to set the Type Of Service field in the IP header of the outgoing packets. .RE .sp .ne 2 .na \fB\fBIP_NEXTHOP\fR\fR .ad .RS 21n This option specifies the address of the onlink nexthop for traffic originating from that socket. It causes the routing table to be bypassed and outgoing traffic is sent directly to the specified nexthop. This option takes an ipaddr_t argument representing the IPv4 address of the nexthop as the input value. The IP_NEXTHOP option takes precedence over IPOPT_LSRR. IP_BOUND_IF and SO_DONTROUTE take precedence over IP_NEXTHOP. This option has no meaning for broadcast and multicast packets. The application must ensure that the specified nexthop is alive. An application may want to specify the IP_NEXTHOP option on a TCP listener socket only for incoming requests to a particular IP address. In this case, it must avoid binding the socket to INADDR_ANY and instead must bind the listener socket to the specific IP address. In addition, typically the application may want the incoming and outgoing interface to be the same. In this case, the application must select a suitable nexthop that is onlink and reachable via the desired interface and do a setsockopt (IP_NEXTHOP) on it. Then it must bind to the IP address of the desired interface. Setting the IP_NEXTHOP option requires the PRIV_SYS_NET_CONFIG privilege. .RE .sp .LP The multicast socket options (IP_MULTICAST_IF, IP_MULTICAST_TTL, IP_MULTICAST_LOOP and IP_RECVIF) can be used with any datagram socket type in the Internet family. 
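.sp .LP As a minimal sketch of how the multicast options and IP_TOS described above might be set on a datagram socket, the following C fragment uses only \fBsetsockopt\fR(3SOCKET). The helper name \fBset_mcast_send_opts()\fR and its parameters are illustrative assumptions, not a system interface. The argument types follow the option descriptions above: an unsigned character for IP_MULTICAST_TTL and IP_MULTICAST_LOOP, and an integer for IP_TOS.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not a system interface): apply multicast send
 * options and a Type Of Service value to an AF_INET datagram socket.
 * Returns 0 on success, or -1 with errno set by setsockopt().
 */
int
set_mcast_send_opts(int s, unsigned char ttl, unsigned char loop, int tos)
{
	assert(s >= 0);

	/* TTL placed in outgoing multicast datagrams (the default is 1). */
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL,
	    &ttl, sizeof (ttl)) != 0)
		return (-1);

	/* A value of 0 turns off the default loopback delivery. */
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP,
	    &loop, sizeof (loop)) != 0)
		return (-1);

	/* Only the least significant 8 bits of tos are used. */
	if (setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof (tos)) != 0)
		return (-1);

	return (0);
}
```

A caller would typically create the socket with socket(AF_INET, SOCK_DGRAM, 0) and invoke the helper once before sending, for example set_mcast_send_opts(s, 4, 0, 0x10).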
.sp .LP At the socket level, the socket option \fBSO_DONTROUTE\fR may be applied. This option forces datagrams being sent to bypass routing and forwarding by forcing the IP Time To Live field to \fB1\fR, meaning that the packet will not be forwarded by routers. .sp .LP Raw IP datagrams can also be sent and received using the TLI connectionless primitives. .sp .LP Datagrams flow through the IP layer in two directions: from the network \fIup\fR to user processes and from user processes \fIdown\fR to the network. Using this orientation, IP is layered \fIabove\fR the network interface drivers and \fIbelow\fR the transport protocols such as UDP and TCP. The Internet Control Message Protocol (ICMP) is logically a part of IP. See \fBicmp\fR(7P). .sp .LP IP provides for a checksum of the header part, but not the data part, of the datagram. The checksum value is computed and set in the process of sending datagrams and checked when receiving datagrams. .sp .LP IP options in received datagrams are processed in the IP layer according to the protocol specification. Currently recognized IP options include: security, loose source and record route (LSRR), strict source and record route (SSRR), record route, and internet timestamp. .sp .LP By default, the IP layer will not forward IPv4 packets that are not addressed to it. This behavior can be overridden by using \fBrouteadm\fR(1M) to enable the ipv4-forwarding option. IPv4 forwarding is configured at boot time based on the setting of \fBrouteadm\fR(1M)'s ipv4-forwarding option. .sp .LP For backwards compatibility, IPv4 forwarding can be enabled or disabled using \fBndd\fR(1M)'s ip_forwarding variable. It is set to 1 if IPv4 forwarding is enabled, or 0 if it is disabled. .sp .LP Additionally, finer-grained forwarding can be configured in IP. Each interface can be configured to forward IP packets by setting the IFF_ROUTER interface flag. This flag can be set and cleared using \fBifconfig\fR(1M)'s router and -router options.
If an interface's IFF_ROUTER flag is set, packets can be forwarded to or from the interface. If it is clear, packets will neither be forwarded from this interface to others, nor forwarded to this interface. Setting the ip_forwarding variable sets all of the IPv4 interfaces' IFF_ROUTER flags. .sp .LP For backwards compatibility, each interface creates an \fB<ifname>:ip_forwarding /dev/ip\fR variable that can be modified using \fBndd\fR(1M). An interface's \fB<ifname>:ip_forwarding ndd\fR variable is a boolean variable that mirrors the status of its IFF_ROUTER interface flag. It is set to 1 if the flag is set, or 0 if it is clear. This interface-specific \fB<ifname>:ip_forwarding ndd\fR variable is obsolete and may be removed in a future release of Solaris. The \fBifconfig\fR(1M) router and -router interfaces are preferred. .sp .LP The IP layer sends an ICMP message back to the source host in many cases when it receives a datagram that cannot be handled. A "time exceeded" ICMP message is sent if the "time to live" field in the IP header drops to zero in the process of forwarding a datagram. A "destination unreachable" message is sent if a datagram cannot be forwarded because there is no route to the final destination, or if it cannot be fragmented. If the datagram is addressed to the local host but is destined for a protocol that is not supported or a port that is not in use, a destination unreachable message is also sent. The IP layer may send an ICMP "source quench" message if it is receiving datagrams too quickly. ICMP messages are only sent for the first fragment of a fragmented datagram and are never returned in response to errors in other ICMP messages. .sp .LP The IP layer supports fragmentation and reassembly. Datagrams are fragmented on output if the datagram is larger than the maximum transmission unit (MTU) of the network interface. Fragments of received datagrams are dropped from the reassembly queues if the complete datagram is not reconstructed within a short time period.
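.sp .LP The arithmetic behind output fragmentation can be illustrated with a short C sketch. It follows the rules in RFC 791, where fragment offsets are counted in 8-byte units, so every fragment except the last carries a payload that is a multiple of 8 bytes. It is not taken from the IP module itself, and the function names \fBfrag_payload_max()\fR, \fBfrag_count()\fR, and \fBfrag_offset_units()\fR are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only: largest payload a non-final fragment can carry,
 * given the interface MTU and the IP header length in bytes.
 */
size_t
frag_payload_max(size_t mtu, size_t hlen)
{
	/* The MTU must leave room for the header and one 8-byte unit. */
	assert(mtu >= hlen + 8);
	return (((mtu - hlen) / 8) * 8);
}

/*
 * Number of IP datagrams needed to carry `payload` bytes.
 * A datagram that fits within the MTU is sent as a single piece.
 */
size_t
frag_count(size_t payload, size_t mtu, size_t hlen)
{
	size_t per;

	if (payload + hlen <= mtu)
		return (1);

	per = frag_payload_max(mtu, hlen);
	return ((payload + per - 1) / per);	/* round up */
}

/*
 * Value of the Fragment Offset field (in 8-byte units) for the
 * fragment with zero-based index `idx`.
 */
size_t
frag_offset_units(size_t idx, size_t mtu, size_t hlen)
{
	return (idx * frag_payload_max(mtu, hlen) / 8);
}
```

For a 4000-byte payload, a 20-byte header, and a 1500-byte MTU, frag_count() yields 3, and the third fragment has an offset of 370 units (2960 bytes).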
.sp .LP Errors in sending discovered at the network interface driver layer are passed by IP back up to the user process. .sp .LP Multi-Data Transmit allows more than one packet to be sent from the IP module to another in a given call, thereby reducing the per-packet processing costs. The behavior of Multi-Data Transmit can be overridden by using \fBndd\fR(1M) to set the \fB/dev/ip\fR variable ip_multidata_outbound to 0. Note that the IP module will only initiate Multi-Data Transmit if the network interface driver supports it. .SH PACKET EVENTS .sp .LP Through the netinfo framework, this driver provides the following packet events: .sp .ne 2 .na \fBPhysical in\fR .ad .RS 16n Packets received on a network interface from an external source. .RE .sp .ne 2 .na \fBPhysical out\fR .ad .RS 16n Packets to be sent out a network interface. .RE .sp .ne 2 .na \fBForwarding\fR .ad .RS 16n Packets being forwarded through this host to another network. .RE .sp .ne 2 .na \fBloopback in\fR .ad .RS 16n Packets that have been sent by a local application to another. .RE .sp .ne 2 .na \fBloopback out\fR .ad .RS 16n Packets about to be received by a local application from another. .RE .sp .LP Currently, only a single function may be registered for each event. As a result, if the slot for an event is already occupied by someone else, a second attempt to register a callback fails. .sp .LP To receive packet events in a kernel module, it is first necessary to obtain a handle for either IPv4 or IPv6 traffic. This is achieved by passing NHF_INET or NHF_INET6 through to a net_protocol_lookup() call. The value returned from this call must then be passed into a call to net_hook_register(), along with a description of the hook to add. For a description of the structure passed through to the callback, please see \fBhook_pkt_event\fR(9S).
For IP packets, this structure is filled out as follows: .sp .ne 2 .na \fBhpe_ifp\fR .ad .RS 11n Identifier indicating the inbound interface for packets received with the "physical in" event. .RE .sp .ne 2 .na \fBhpe_ofp\fR .ad .RS 11n Identifier indicating the outbound interface for packets received with the "physical out" event. .RE .sp .ne 2 .na \fBhpe_hdr\fR .ad .RS 11n Pointer to the start of the IP header (not the ethernet header). .RE .sp .ne 2 .na \fBhpe_mp\fR .ad .RS 11n Pointer to the start of the mblk_t chain containing the IP packet. .RE .sp .ne 2 .na \fBhpe_mb\fR .ad .RS 11n Pointer to the mblk_t with the IP header in it. .RE .SH NETWORK INTERFACE EVENTS .sp .LP In addition to events describing packets as they move through the system, it is also possible to receive notification of events relating to network interfaces. These events are all reported back through the same callback. The list of events is as follows: .sp .ne 2 .na \fBplumb\fR .ad .RS 18n A new network interface has been instantiated. .RE .sp .ne 2 .na \fBunplumb\fR .ad .RS 18n A network interface is no longer associated with this protocol. .RE .sp .ne 2 .na \fBup\fR .ad .RS 18n At least one logical interface is now ready to receive packets. .RE .sp .ne 2 .na \fBdown\fR .ad .RS 18n There are no logical interfaces expecting to receive packets. .RE .sp .ne 2 .na \fBaddress change\fR .ad .RS 18n An address has changed on a logical interface. 
.RE .SH SEE ALSO .sp .LP \fBifconfig\fR(1M), \fBrouteadm\fR(1M), \fBndd\fR(1M), \fBread\fR(2), \fBwrite\fR(2), \fBbind\fR(3SOCKET), \fBconnect\fR(3SOCKET), \fBgetsockopt\fR(3SOCKET), \fBrecv\fR(3SOCKET), \fBsend\fR(3SOCKET), \fBdefaultrouter\fR(4), \fBicmp\fR(7P), \fBif_tcp\fR(7P), \fBinet\fR(7P), \fBip6\fR(7P), \fBipsec\fR(7P), \fBrouting\fR(7P), \fBtcp\fR(7P), \fBudp\fR(7P), \fBnet_hook_register\fR(9F), \fBhook_pkt_event\fR(9S) .sp .LP Braden, R., \fIRFC 1122, Requirements for Internet Hosts \(mi Communication Layers\fR, Information Sciences Institute, University of Southern California, October 1989. .sp .LP Postel, J., \fIRFC 791, Internet Protocol \(mi DARPA Internet Program Protocol Specification\fR, Information Sciences Institute, University of Southern California, September 1981. .SH DIAGNOSTICS .sp .LP A socket operation may fail with one of the following errors returned: .sp .ne 2 .na \fB\fBEACCES\fR\fR .ad .RS 17n A \fBbind()\fR operation was attempted with a "reserved" port number and the effective user ID of the process was not the privileged user. .sp Setting the IP_NEXTHOP was attempted by a process lacking the PRIV_SYS_NET_CONFIG privilege. .RE .sp .ne 2 .na \fB\fBEADDRINUSE\fR\fR .ad .RS 17n A \fBbind()\fR operation was attempted on a socket with a network address/port pair that has already been bound to another socket. .RE .sp .ne 2 .na \fB\fBEADDRNOTAVAIL\fR\fR .ad .RS 17n A \fBbind()\fR operation was attempted for an address that is not configured on this machine. .RE .sp .ne 2 .na \fB\fBEINVAL\fR\fR .ad .RS 17n A \fBsendmsg()\fR operation with a non-NULL \fBmsg_accrights\fR was attempted. .RE .sp .ne 2 .na \fB\fBEINVAL\fR\fR .ad .RS 17n A \fBgetsockopt()\fR or \fBsetsockopt()\fR operation with an unknown socket option name was given. 
.RE .sp .ne 2 .na \fB\fBEINVAL\fR\fR .ad .RS 17n A \fBgetsockopt()\fR or \fBsetsockopt()\fR operation was attempted with the \fBIP\fR option field improperly formed; an option field was shorter than the minimum value or longer than the option buffer provided. .RE .sp .ne 2 .na \fB\fBEISCONN\fR\fR .ad .RS 17n A \fBconnect()\fR operation was attempted on a socket on which a \fBconnect()\fR operation had already been performed, and the socket could not be successfully disconnected before making the new connection. .RE .sp .ne 2 .na \fB\fBEISCONN\fR\fR .ad .RS 17n A \fBsendto()\fR or \fBsendmsg()\fR operation specifying an address to which the message should be sent was attempted on a socket on which a \fBconnect()\fR operation had already been performed. .RE .sp .ne 2 .na \fB\fBEMSGSIZE\fR\fR .ad .RS 17n A \fBsend()\fR, \fBsendto()\fR, or \fBsendmsg()\fR operation was attempted to send a datagram that was too large for an interface, but was not allowed to be fragmented (such as broadcasts). .RE .sp .ne 2 .na \fB\fBENETUNREACH\fR\fR .ad .RS 17n An attempt was made to establish a connection by means of \fBconnect()\fR, or to send a datagram by means of \fBsendto()\fR or \fBsendmsg()\fR, where there was no matching entry in the routing table; or if an ICMP "destination unreachable" message was received. .RE .sp .ne 2 .na \fB\fBENOTCONN\fR\fR .ad .RS 17n A \fBsend()\fR or \fBwrite()\fR operation, or a \fBsendto()\fR or \fBsendmsg()\fR operation not specifying an address to which the message should be sent, was attempted on a socket on which a \fBconnect()\fR operation had not already been performed. .RE .sp .ne 2 .na \fB\fBENOBUFS\fR\fR .ad .RS 17n The system ran out of memory for fragmentation buffers or other internal data structures. .RE .sp .ne 2 .na \fB\fBENOBUFS\fR\fR .ad .RS 17n \fBSO_SNDBUF\fR or \fBSO_RCVBUF\fR exceeds a system limit. .RE .sp .ne 2 .na \fB\fBEINVAL\fR\fR .ad .RS 17n Invalid length for \fBIP_OPTIONS\fR. 
.RE .sp .ne 2 .na \fB\fBEHOSTUNREACH\fR\fR .ad .RS 17n Invalid address for \fBIP_MULTICAST_IF\fR. .sp Invalid (offlink) nexthop address for IP_NEXTHOP. .RE .sp .ne 2 .na \fB\fBEINVAL\fR\fR .ad .RS 17n Not a multicast address for \fBIP_ADD_MEMBERSHIP\fR and \fBIP_DROP_MEMBERSHIP\fR. .RE .sp .ne 2 .na \fB\fBEADDRNOTAVAIL\fR\fR .ad .RS 17n Bad interface address for \fBIP_ADD_MEMBERSHIP\fR and \fBIP_DROP_MEMBERSHIP\fR. .RE .sp .ne 2 .na \fB\fBEADDRINUSE\fR\fR .ad .RS 17n Address already joined for \fBIP_ADD_MEMBERSHIP\fR. .RE .sp .ne 2 .na \fB\fBENOENT\fR\fR .ad .RS 17n Address not joined for \fBIP_DROP_MEMBERSHIP\fR. .RE .sp .ne 2 .na \fB\fBENOPROTOOPT\fR\fR .ad .RS 17n Invalid socket type. .RE .sp .ne 2 .na \fB\fBEPERM\fR\fR .ad .RS 17n No permissions. .RE .SH NOTES .sp .LP Raw sockets should receive \fBICMP\fR error packets relating to the protocol; currently such packets are simply discarded. .sp .LP Users of higher-level protocols such as \fBTCP\fR and \fBUDP\fR should be able to see received IP options.
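.SH EXAMPLES .sp .LP The header checksum mentioned in the APPLICATION PROGRAMMING INTERFACE section is the 16-bit one's complement sum defined in RFC 791. The following C sketch shows that computation for illustration only. The name \fBip_hdr_cksum()\fR is hypothetical, and because IP computes this field itself, applications do not normally need such a routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: one's complement checksum over an IP header of
 * `len` bytes (len must be even; a real header is 4 * IHL bytes).
 * The checksum field (bytes 10 and 11) must be zero when computing a
 * new value. The result is the 16-bit word to store there, high byte
 * first.
 */
uint16_t
ip_hdr_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(hdr != NULL && len % 2 == 0);

	/* Add the header as a sequence of big-endian 16-bit words. */
	for (i = 0; i < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

	/* Fold any carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return ((uint16_t)~sum);
}
```

Running the same routine over a header whose checksum field is already filled in returns 0, which is the check a receiver performs.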