CDDL HEADER START

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]

CDDL HEADER END

Copyright 2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

Architectural Overview for the DHCP agent
Peter Memishian
ident "%Z%%M% %I% %E% SMI"

INTRODUCTION
============

The Solaris DHCP agent (dhcpagent) is a DHCP client implementation
compliant with RFCs 2131, 3315, and others.
The major forces shaping its design were:

  * Must be capable of managing multiple network interfaces.
  * Must consume little CPU, since it will always be running.
  * Must have a small memory footprint, since it will always be
    running.
  * Must not rely on any shared libraries outside of /lib, since
    it must run before all filesystems have been mounted.

When a DHCP agent implementation is only required to control a single
interface on a machine, the problem is expressed well as a simple
state-machine, as shown in RFC2131. However, when a DHCP agent is
responsible for managing more than one interface at a time, the
problem becomes much more complicated.

This can be resolved using threads or with an event-driven model.
Given that DHCP's behavior can be expressed concisely as a state
machine, the event-driven model is the closest match.

While tried-and-true, that model is subtle and easy to get wrong.
Indeed, much of the agent's code is there to manage the complexity of
programming in an asynchronous event-driven paradigm.

THE BASICS
==========

The DHCP agent consists of roughly 30 source files, most with a
companion header file. While the largest source file is around 1700
lines, most are much shorter.
The source files can largely be broken up into three groups:

  * Source files that, along with their companion header files,
    define an abstract "object" that is used by other parts of
    the system. Examples include "packet.c", which along with
    "packet.h" provides a Packet object for use by the rest of
    the agent; and "async.c", which along with "async.h" defines
    an interface for managing asynchronous transactions within
    the agent.

  * Source files that implement a given state of the agent; for
    instance, there is a "request.c" which comprises all of
    the procedural "work" which must be done while in the
    REQUESTING state of the agent. By encapsulating states in
    files, it becomes easier to debug errors in the
    client/server protocol and adapt the agent to new
    constraints, since all the relevant code is in one place.

  * Source files that, along with their companion header files,
    encapsulate a given task or related set of tasks. The
    difference between this and the first group is that the
    interfaces exported from these files do not operate on
    an "object", but rather perform a specific task. Examples
    include "dlpi_io.c", which provides a useful interface
    to DLPI-related i/o operations.

OVERVIEW
========

Here we discuss the essential objects and subtle aspects of the
DHCP agent implementation. Note that there is of course much more
that is not discussed here, but after this overview you should be able
to fend for yourself in the source code.

For details on the DHCPv6 aspects of the design, and how this relates
to the implementation present in previous releases of Solaris, see the
README.v6 file.

Event Handlers and Timer Queues
-------------------------------

The most important object in the agent is the event handler, whose
interface is in libinetutil.h and whose implementation is in
libinetutil. The event handler is essentially an object-oriented
wrapper around poll(2): other components of the agent can register to
be called back when specific events on file descriptors happen -- for
instance, to wait for requests to arrive on its IPC socket, the agent
registers a callback function (accept_event()) that will be called
back whenever a new connection arrives on the file descriptor
associated with the IPC socket.
When the agent initially begins in main(), it registers a number of
events with the event handler, and then calls iu_handle_events(),
which proceeds to wait for events to happen -- this function does not
return until the agent is shut down via signal.

When the registered events occur, the callback functions are called
back, which in turn might lead to additional callbacks being
registered -- this is the classic event-driven model. (As an aside,
note that programming in an event-driven model means that callbacks
cannot block, or else the agent will become unresponsive.)

A special kind of "event" is a timeout. Since there are many timers
which must be maintained for each DHCP-controlled interface (such as a
lease expiration timer, time-to-first-renewal (t1) timer, and so
forth), an object-oriented abstraction to timers called a "timer
queue" is provided, whose interface is in libinetutil.h with a
corresponding implementation in libinetutil. The timer queue allows
callback functions to be "scheduled" for callback after a certain
amount of time has passed.
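
The register-then-dispatch pattern described above can be sketched in
a few lines of C. This is only an illustration of a callback wrapper
around poll(2); it is not the libinetutil implementation, and every
name in it (toy_eh_t, toy_register(), toy_dispatch_once(),
toy_count_reads()) is hypothetical. Timers are left out, and the
dispatch routine makes a single pass rather than looping forever.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define TOY_MAX_EVENTS 8

/* Callback: gets the ready descriptor, its events, and the caller's argument. */
typedef void toy_callback_t(int fd, short revents, void *arg);

/* Hypothetical event handler: parallel arrays of descriptors and callbacks. */
typedef struct {
    struct pollfd   fds[TOY_MAX_EVENTS];
    toy_callback_t *callbacks[TOY_MAX_EVENTS];
    void           *args[TOY_MAX_EVENTS];
    nfds_t          count;
} toy_eh_t;

/* Register interest in `events' on `fd'; returns 0, or -1 if the table is full. */
int
toy_register(toy_eh_t *eh, int fd, short events, toy_callback_t *cb, void *arg)
{
    assert(eh != NULL && cb != NULL);
    if (eh->count >= TOY_MAX_EVENTS)
        return (-1);
    eh->fds[eh->count].fd = fd;
    eh->fds[eh->count].events = events;
    eh->fds[eh->count].revents = 0;
    eh->callbacks[eh->count] = cb;
    eh->args[eh->count] = arg;
    eh->count++;
    return (0);
}

/*
 * One pass of the event loop: wait up to timeout_ms milliseconds, then
 * call back every registration whose descriptor is ready.  Returns the
 * number of callbacks run, or -1 if poll(2) failed.
 */
int
toy_dispatch_once(toy_eh_t *eh, int timeout_ms)
{
    int    ran = 0;
    nfds_t i;

    if (poll(eh->fds, eh->count, timeout_ms) == -1)
        return (-1);
    for (i = 0; i < eh->count; i++) {
        if (eh->fds[i].revents != 0) {
            eh->callbacks[i](eh->fds[i].fd, eh->fds[i].revents,
                eh->args[i]);
            ran++;
        }
    }
    return (ran);
}

/* Example callback: consume one byte and count the call through `arg'. */
void
toy_count_reads(int fd, short revents, void *arg)
{
    char byte;

    (void) revents;
    if (read(fd, &byte, 1) == 1)
        (*(int *)arg)++;
}
```

A caller would register toy_count_reads() on the read end of a pipe or
socket and then call toy_dispatch_once() repeatedly. Because the
callback performs only a single read on a descriptor already known to
be ready, it does not block the loop.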

The event handler and timer queue objects work hand-in-hand: the event
handler is passed a pointer to a timer queue in iu_handle_events() --
from there, it can use the iu_earliest_timer() routine to find the
timer which will next fire, and use this to set its timeout value in
its call to poll(2). If poll(2) returns due to a timeout, the event
handler calls iu_expire_timers() to expire all timers that expired
(note that more than one may have expired if, for example, multiple
timers were set to expire at the same time).

Although it is possible to instantiate more than one timer queue or
event handler object, it doesn't make a lot of sense -- these objects
are really "singletons". Accordingly, the agent has two global
variables, `eh' and `tq', which store pointers to the global event
handler and timer queue.

Network Interfaces
------------------

For each network interface managed by the agent, there is a set of
associated state that describes both its general properties (such as
the maximum MTU) and its connections to DHCP-related state (the
protocol state machines). This state is stored in a pair of
structures called `dhcp_pif_t' (the IP physical interface layer or
PIF) and `dhcp_lif_t' (the IP logical interface layer or LIF).
Each dhcp_pif_t represents a single physical interface, such as
"hme0," for a given IP protocol version (4 or 6), and has a list of
dhcp_lif_t structures representing the logical interfaces (such as
"hme0:1") in use by the agent.

This split is important because of differences between IPv4 and IPv6.
For IPv4, each DHCP state machine manages a single IP address and
associated configuration data. This corresponds to a single logical
interface, which must be specified by the user. For IPv6, however,
each DHCP state machine manages a group of addresses, and is
associated with a DUID value rather than with just an interface.

Thus, DHCPv6 behaves more like in.ndpd in its creation of "ADDRCONF"
interfaces. The agent automatically plumbs logical interfaces when
needed and removes them when the addresses expire.

The state for a given session is stored separately in `dhcp_smach_t'.
This state machine then points to the main LIF used for I/O, and to a
list of `dhcp_lease_t' structures representing individual leases, and
each of those points to a list of LIFs corresponding to the individual
addresses being managed.

One point that was brushed over in the preceding discussion of event
handlers and timer queues was context. Recall that the event-driven
nature of the agent requires that functions cannot block, lest they
starve out others and impact the observed responsiveness of the agent.
As an example, consider the process of extending a lease: the agent
must send a REQUEST packet and wait for an ACK or NAK packet in
response. This is done by sending a REQUEST and then returning to the
event handler that waits for an ACK or NAK packet to arrive on the
file descriptor associated with the interface. Note, however, that
when the ACK or NAK does arrive and the callback function is called
back, it must know which state machine this packet is for (it must get
back its context). This could be handled through an ad-hoc mapping of
file descriptors to state machines, but a cleaner approach is to have
the event handler's register function (iu_register_event()) take in an
opaque context pointer, which will then be passed back to the
callback. In the agent, the context pointer used depends on the
nature of the event: events on LIFs use the dhcp_lif_t pointer, events
on the state machine use dhcp_smach_t, and so on.

Note that there is nothing that guarantees the pointer passed into
iu_register_event() or iu_schedule_timer() will still be valid when
the callback is called back (for instance, the memory may have been
freed in the meantime). To solve this problem, all of the data
structures used in this way are reference counted. For more details
on how the reference count scheme is implemented, see the closing
comments in interface.h regarding memory management.
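
To make the lifetime problem concrete, here is a minimal sketch of a
reference-counted context that is handed to a callback as an opaque
pointer. It is not the scheme documented in interface.h; the type and
function names (toy_ctx_t, toy_ctx_hold(), toy_ctx_release(),
toy_on_packet()) are invented for illustration, and the sketch assumes
a single-threaded event loop, so the count needs no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted agent structure. */
typedef struct {
    unsigned int hold_count;    /* number of outstanding references */
    int          packets_seen;
} toy_ctx_t;

/* Allocate a context holding one reference, owned by the creator. */
toy_ctx_t *
toy_ctx_create(void)
{
    toy_ctx_t *ctx = calloc(1, sizeof (*ctx));

    if (ctx != NULL)
        ctx->hold_count = 1;
    return (ctx);
}

/* Take an extra reference, e.g. just before registering a callback. */
void
toy_ctx_hold(toy_ctx_t *ctx)
{
    ctx->hold_count++;
}

/* Drop one reference; the last release frees the memory. */
bool
toy_ctx_release(toy_ctx_t *ctx)
{
    assert(ctx->hold_count > 0);
    if (--ctx->hold_count == 0) {
        free(ctx);
        return (true);      /* memory is gone; do not touch ctx */
    }
    return (false);
}

/*
 * Callback in the style of an event handler: `arg' is the opaque
 * context supplied at registration time.  The callback gives back the
 * reference that was taken on its behalf, and reports whether that
 * was the last one.
 */
bool
toy_on_packet(void *arg)
{
    toy_ctx_t *ctx = arg;

    ctx->packets_seen++;
    return (toy_ctx_release(ctx));
}
```

The point of the extra hold is that the creator may release its own
reference while the callback is still pending; the memory stays valid
until the callback has run and dropped the final reference.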

Transactions
------------

Many operations performed via DHCP must be performed in groups -- for
instance, acquiring a lease requires several steps: sending a
DISCOVER, collecting OFFERs, selecting an OFFER, sending a REQUEST,
and receiving an ACK, assuming everything goes well. Note however
that due to the event-driven model the agent operates in, these
operations are not inherently "grouped" -- instead, the agent sends a
DISCOVER, goes back into the main event loop, waits for events
(perhaps even requests on the IPC channel to begin acquiring a lease
on another state machine), eventually checks to see if an acceptable
OFFER has come in, and so forth. To some degree, the notion of the
state machine's current state (SELECTING, REQUESTING, etc.) helps
control the potential chaos of the event-driven model (for instance,
if while the agent is waiting for an OFFER on a given state machine,
an IPC event comes in requesting that the leases be RELEASED, the
agent knows to send back an error since the state machine must be in
at least the BOUND state before a RELEASE can be performed).

However, states are not enough -- for instance, suppose that the agent
begins trying to renew a lease. This is done by sending a REQUEST
packet and waiting for an ACK or NAK, which might never come.
If, while waiting for the ACK or NAK, the user sends a request to
renew the lease as well, then if the agent were to send another
REQUEST, things could get quite complicated (and this is only the
beginning of this rathole). To protect against this, two objects
exist: `async_action' and `ipc_action'. These objects are related, but
independent of one another; the more essential object is the
`async_action', which we will discuss first.

In short, an `async_action' represents a pending transaction (aka
asynchronous action), of which each state machine can have at most
one. The `async_action' structure is embedded in the `dhcp_smach_t'
structure, which is fine since there can be at most one pending
transaction per state machine. Typical "asynchronous transactions"
are START, EXTEND, and INFORM, since each consists of a sequence of
packets that must be done without interruption. Note that not all
DHCP operations are "asynchronous" -- for instance, a DHCPv4 RELEASE
operation is synchronous (not asynchronous) since after the RELEASE is
sent no reply is expected from the DHCP server, but DHCPv6 Release is
asynchronous, as all DHCPv6 messages are transactional. Some
operations, such as status query, are synchronous and do not affect
the system state, and thus do not require sequencing.

When the agent realizes it must perform an asynchronous transaction,
it calls async_start() to open the transaction. If one is already
pending, then the new transaction must fail (the details of failure
depend on how the transaction was initiated, which is described in
more detail later when the `ipc_action' object is discussed). If
there is no pending asynchronous transaction, the operation succeeds.

When the transaction is complete, either async_finish() or
async_cancel() must be called to complete or cancel the asynchronous
action on that state machine. If the transaction is unable to
complete within a certain amount of time (more on this later), a timer
should be used to cancel the operation.

The notion of asynchronous transactions is complicated by the fact
that they may originate from both inside and outside of the agent.
For instance, a user initiates an asynchronous START transaction when
he performs an `ifconfig hme0 dhcp start', but the agent will
internally need to perform asynchronous EXTEND transactions to extend
the lease before it expires. Note that user-initiated actions always
have priority over internal actions: the former will cancel the
latter, if necessary.

This leads us into the `ipc_action' object.
An `ipc_action' represents the IPC-related pieces of an asynchronous
transaction that was started as a result of a user request, as well as
the `BUSY' state of the administrative interface. Only IPC-generated
asynchronous transactions have a valid `ipc_action' object. Note that
since there can be at most one asynchronous action per state machine,
there can also be at most one `ipc_action' per state machine (this
means it can also conveniently be embedded inside the `dhcp_smach_t'
structure).

One of the main purposes of the `ipc_action' object is to time out
user events. When the user specifies a timeout value as an argument to
ifconfig, he is specifying an `ipc_action' timeout; in other words,
how long he is willing to wait for the command to complete. When this
time expires, the ipc_action is terminated, as well as the
asynchronous operation.

The API provided for the `ipc_action' object is quite similar to the
one for the `async_action' object: when an IPC request comes in for an
operation requiring asynchronous operation, ipc_action_start() is
called. When the request completes, ipc_action_finish() is called.
If the user times out before the request completes, then
ipc_action_timeout() is called.
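
The "at most one pending transaction per state machine" rule, together
with the priority of user requests over internal ones, can be modelled
with a small embedded structure. The sketch below is a simplified
illustration and does not reproduce the agent's async.c or its
ipc_action code: the names (toy_smach_t, toy_async_start(),
toy_async_finish()) are hypothetical, timeouts are omitted, and the
policy of letting a user request displace a pending internal one is an
assumption drawn from the description above.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_NONE, TOY_START, TOY_EXTEND, TOY_INFORM } toy_cmd_t;

/* Hypothetical pending-transaction record. */
typedef struct {
    toy_cmd_t cmd;          /* TOY_NONE when nothing is pending */
    bool      from_user;    /* true if opened by an IPC (user) request */
} toy_async_t;

/* Hypothetical state machine with the record embedded in it. */
typedef struct {
    toy_async_t async;      /* room for exactly one transaction */
} toy_smach_t;

bool
toy_async_pending(const toy_smach_t *sm)
{
    return (sm->async.cmd != TOY_NONE);
}

/* Close the transaction; this sketch uses it for finish and cancel alike. */
void
toy_async_finish(toy_smach_t *sm)
{
    sm->async.cmd = TOY_NONE;
    sm->async.from_user = false;
}

/*
 * Try to open a transaction.  Fails if one is already pending, except
 * that a user request cancels a pending internal transaction.
 */
bool
toy_async_start(toy_smach_t *sm, toy_cmd_t cmd, bool from_user)
{
    assert(cmd != TOY_NONE);
    if (toy_async_pending(sm)) {
        if (!from_user || sm->async.from_user)
            return (false);
        toy_async_finish(sm);   /* user request displaces internal one */
    }
    sm->async.cmd = cmd;
    sm->async.from_user = from_user;
    return (true);
}
```

With this model, a second internal EXTEND is refused while one is
pending, a user START replaces a pending internal EXTEND, and a second
user request is refused until the first one finishes.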

Packet Management
-----------------

Another complicated area is packet management: building, manipulating,
sending and receiving packets. These operations are all encapsulated
behind a dozen or so interfaces (see packet.h) that abstract the
unimportant details away from the rest of the agent code. In order to
send a DHCP packet, code first calls init_pkt(), which returns a
dhcp_pkt_t initialized suitably for transmission. Note that currently
init_pkt() returns a dhcp_pkt_t that is actually allocated as part of
the `dhcp_smach_t', but this may change in the future. After calling
init_pkt(), the add_pkt_opt*() functions are used to add options to
the DHCP packet. Finally, send_pkt() and send_pkt_v6() can be used to
transmit the packet to a given IP address.

The send_pkt() function is actually quite complicated; for one, it
must internally use either DLPI or sockets depending on the machine
state; for another, it handles the details of packet timeout and
retransmission. The last argument to send_pkt() is a pointer to a
"stop function." If this argument is passed as NULL, then the packet
will only be sent once (it won't be retransmitted). Otherwise, the
stop function will be called back before each retransmission.
The callback may alter dsm_send_timeout if necessary to place a cap on
the next timeout; this is done for DHCPv6 in stop_init_reboot() in
order to implement the CNF_MAX_RD constraint.

The return value from the stop function indicates whether to continue
retransmission or not, which allows the send_pkt() caller to control
the retransmission policy without having to deal with the
retransmission mechanism. See request.c for an example of this in
action.

The recv_pkt() function is simpler but still complicated by the fact
that one may want to receive several different types of packets at
once and in different ways (DLPI or sockets). The caller registers an
event handler on the file descriptor, and then calls recv_pkt() to
read in the packet along with meta information about the message (the
sender and interface identifier).

For IPv6, packet reception is done with a single socket, using
IPV6_PKTINFO to determine the actual destination address and receiving
interface. Packets are then matched against the state machines on the
given interface through the transaction ID.

The same facility exists for inbound IPv4 packets, but because there's
no IP_PKTINFO processing on output yet in Solaris, and because IPv4
still relies on DLPI, DHCP packets are handled on a per-LIF (when
bound) and per-PIF (when unbound) basis. Eventually, when IP_PKTINFO
is available for IPv4, the per-LIF sockets can go away.
If it ever becomes possible to send and receive IP packets without
having an IP address configured on an interface, then the DLPI streams
can go as well.

Time
----

The notion of time is an exceptionally subtle area. You will notice
five ways that time is represented in the source: as lease_t's,
uint32_t's, time_t's, hrtime_t's, and monosec_t's. Each of these
types serves a slightly different function.

The `lease_t' type is the simplest to understand; it is the unit of
time in the CD_{LEASE,T1,T2}_TIME options in a DHCP packet, as defined
by RFC2131. This is defined as a positive number of seconds (relative
to some fixed point in time) or the value `-1' (DHCP_PERM) which
represents infinity (i.e., a permanent lease). The lease_t should be
used either when dealing with actual DHCP packets that are sent on the
wire or for variables which follow the exact definition given in the
RFC.

The `uint32_t' type is also used to represent a relative time in
seconds. However, here the value `-1' is not special and of course
this type is not tied to any definition given in RFC2131. Use this
for representing "offsets" from another point in time that are not
DHCP lease times.

The `time_t' type is the natural Unix type for representing time since
the epoch. Unfortunately, it is affected by stime(2) or adjtime(2)
and since the DHCP client is used during system installation (and thus
when time is typically being configured), the time_t cannot be used in
general to represent an absolute time since the epoch. For instance,
if a time_t were used to keep track of when a lease began, and then a
minute later stime(2) was called to adjust the system clock forward a
year, then the lease would appear to have expired a year ago even
though it has only been a minute. For this reason, time_t's should
only be used either when wall time must be displayed (such as in a
DHCP_STATUS ipc transaction) or when a time meaningful across reboots
must be obtained (such as when caching an ACK packet at system
shutdown).

The `hrtime_t' type returned from gethrtime() works around the
limitations of the time_t in that it is not affected by stime(2) or
adjtime(2), with the disadvantage that it represents time from some
arbitrary time in the past and in nanoseconds. The timer queue code
deals with hrtime_t's directly since that particular piece of code is
meant to be fairly independent of the rest of the DHCP client.

However, dealing with nanoseconds is error-prone when all the other
time types are in seconds. As a result, yet another time type, the
`monosec_t', was created to represent a monotonically increasing time
in seconds, and is really no more than (hrtime_t / NANOSEC). Note
that this unit is typically used where time_t's would've traditionally
been used. The function monosec() in util.c returns the current
monosec, and monosec_to_time() can convert a given monosec to wall
time, using the system's current notion of time.

One additional limitation of the `hrtime_t' and `monosec_t' types is
that they are unaware of the passage of time across checkpoint/resume
events (e.g., those generated by sys-suspend(1M)). For example, if
gethrtime() returns time T, and then the machine is suspended for 2
hours, and then gethrtime() is called again, the time returned is not
T + (2 * 60 * 60 * NANOSEC), but rather approximately still T.

To work around this (and other checkpoint/resume related problems),
when a system is resumed, the DHCP client makes the pessimistic
assumption that all finite leases have expired while the machine was
suspended and must be obtained again. This is known as "refreshing"
the leases, and is handled by refresh_smachs().
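
As a rough illustration of the monosec idea described in this section,
the following sketch derives a seconds counter from a monotonic clock
and converts it back to wall time. It is not the code in util.c: the
names are hypothetical (toy_hrtime(), toy_monosec(),
toy_monosec_to_time()), and POSIX clock_gettime(CLOCK_MONOTONIC) is
used as a portable stand-in for gethrtime(). The conversion formula is
an assumption about how such a mapping could be done.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define TOY_NANOSEC 1000000000LL

typedef int64_t toy_monosec_t;  /* monotonically increasing seconds */

/* Monotonic nanoseconds since an arbitrary point; stand-in for gethrtime(). */
int64_t
toy_hrtime(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void) rc;
    return ((int64_t)ts.tv_sec * TOY_NANOSEC + ts.tv_nsec);
}

/* The current monotonic time, truncated to whole seconds. */
toy_monosec_t
toy_monosec(void)
{
    return (toy_hrtime() / TOY_NANOSEC);
}

/*
 * Map a monotonic timestamp onto the wall clock by measuring how far
 * it lies from "now" and applying that distance to time(2).  The
 * result reflects the system's current notion of wall time, so it can
 * change if the clock is reset between calls.
 */
time_t
toy_monosec_to_time(toy_monosec_t when)
{
    return (time(NULL) - (time_t)(toy_monosec() - when));
}
```

Timestamps are stored and compared as toy_monosec_t values, which are
immune to clock adjustments, and are converted with
toy_monosec_to_time() only at the moment a wall-clock value has to be
displayed.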

It might appear that a more intelligent approach would be to
record the time(2) when the system is suspended, compare that against
the time(2) when the system is resumed, and use the delta between them
to decide which leases have expired. Sadly, this cannot be done since
through at least Solaris 10, it is not possible for userland programs
to be notified of system suspend events.

Configuration
-------------

For the most part, the DHCP client only *retrieves* configuration data
from the DHCP server, leaving the configuration to scripts (such as
boot scripts), which themselves use dhcpinfo(1) to retrieve the data
from the DHCP client. This is desirable because it keeps the mechanism
of retrieving the configuration data decoupled from the policy of using
the data.

However, unless used in "inform" mode, the DHCP client *does*
configure each IP interface enough to allow it to communicate with
other hosts. Specifically, the DHCP client configures the interface's
IP address, netmask, and broadcast address using the information
provided by the server. Further, for IPv4 logical interface 0
("hme0"), any provided default routes are also configured.

For IPv6, only the IP addresses are set.
The netmask (prefix) is then
set automatically by in.ndpd, and routes are discovered in the usual
way by router discovery or routing protocols. DHCPv6 doesn't set
routes.

Since logical interfaces cannot be specified as output interfaces in
the kernel forwarding table, and in most cases, logical interfaces
share a default route with their associated physical interface, the
DHCP client does not automatically add or remove default routes when
IPv4 leases are acquired or expired on logical interfaces.

Event Scripting
---------------

The DHCP client supports user program invocations on DHCP events. The
supported events are BOUND, EXTEND, EXPIRE, DROP, RELEASE, and INFORM
for DHCPv4, and BUILD6, EXTEND6, EXPIRE6, DROP6, LOSS6, RELEASE6, and
INFORM6 for DHCPv6. The user program runs asynchronous to the DHCP
client so that the main event loop stays active to process other
events, including events triggered by the user program (for example,
when it invokes dhcpinfo).

The user program execution is part of the transaction of a DHCP command.
For example, if the user program is not enabled, the transaction of the
DHCP command START is considered over when an ACK is received and the
interface is configured successfully.
If the user program is enabled,
it is invoked after the interface is configured successfully, and the
transaction is considered over only when the user program exits. The
event scripting implementation makes use of the asynchronous operations
discussed in the "Transactions" section.

An upper bound of 58 seconds is imposed on how long the user program
can run. If the user program does not exit after 55 seconds, the signal
SIGTERM is sent to it. If it still does not exit after an additional 3
seconds, the signal SIGKILL is sent to it. Since the event handler is
a wrapper around poll(), the DHCP client cannot directly observe the
completion of the user program. Instead, the DHCP client creates a
child "helper" process to synchronously monitor the user program (this
process is also used to send the aforementioned signals to the process,
if necessary). The DHCP client and the helper process share a pipe
which is included in the set of poll descriptors monitored by the DHCP
client's event handler. When the user program exits, the helper process
passes the user program exit status to the DHCP client through the pipe,
informing the DHCP client that the user program has finished. When the
DHCP client is asked to shut down, it will wait for any running instances
of the user program to complete.
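
The helper-process arrangement described above can be sketched roughly
as follows. This is an illustration under stated assumptions, not the
agent's actual code: the names script_start(), script_finish(),
helper_main() and on_alarm() are hypothetical, the time limits are
passed in as parameters rather than fixed at 55 and 3 seconds, and a
single blocking poll() call stands in for the agent's event handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t script_pid;		/* user program, as seen by the helper */
static unsigned int grace_secs;		/* gap between SIGTERM and SIGKILL */
static volatile sig_atomic_t term_sent;

/* Helper's SIGALRM handler: SIGTERM first, SIGKILL after the grace period. */
static void
on_alarm(int sig)
{
	(void) sig;
	if (!term_sent) {
		term_sent = 1;
		(void) kill(script_pid, SIGTERM);
		(void) alarm(grace_secs);
	} else {
		(void) kill(script_pid, SIGKILL);
	}
}

/* Helper process body: run the program, wait synchronously, report status. */
static void
helper_main(char *const argv[], unsigned int limit, int wfd)
{
	struct sigaction sa;
	int status = -1;

	(void) memset(&sa, 0, sizeof (sa));
	sa.sa_handler = on_alarm;
	sa.sa_flags = SA_RESTART;	/* let waitpid() resume after SIGALRM */
	(void) sigemptyset(&sa.sa_mask);
	(void) sigaction(SIGALRM, &sa, NULL);

	if ((script_pid = fork()) == 0) {
		(void) close(wfd);	/* the program must not hold the pipe */
		(void) execv(argv[0], argv);
		_exit(127);
	}
	if (script_pid > 0) {
		(void) alarm(limit);
		(void) waitpid(script_pid, &status, 0);
		(void) alarm(0);
	}
	_exit(write(wfd, &status, sizeof (status)) == sizeof (status) ? 0 : 1);
}

/* Agent side: start the helper; returns the pipe's read end, or -1. */
static int
script_start(char *const argv[], unsigned int limit, unsigned int grace,
    pid_t *helperp)
{
	int fds[2];

	if (pipe(fds) == -1)
		return (-1);
	grace_secs = grace;
	if ((*helperp = fork()) == -1) {
		(void) close(fds[0]);
		(void) close(fds[1]);
		return (-1);
	}
	if (*helperp == 0) {
		(void) close(fds[0]);
		helper_main(argv, limit, fds[1]);	/* does not return */
	}
	(void) close(fds[1]);
	return (fds[0]);
}

/* Agent side: the pipe is just another descriptor handed to poll(). */
static int
script_finish(int rfd, pid_t helper, int timeout_ms, int *statusp)
{
	struct pollfd pfd;
	int ok;

	pfd.fd = rfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	ok = poll(&pfd, 1, timeout_ms) == 1 &&
	    read(rfd, statusp, sizeof (*statusp)) == sizeof (*statusp);
	(void) close(rfd);
	(void) waitpid(helper, NULL, 0);	/* reap helper; sketch blocks here */
	return (ok ? 0 : -1);
}
```

In an event-driven program the descriptor returned by script_start()
would be registered alongside the other poll descriptors instead of
being waited on directly, so the main loop keeps servicing other
events while the user program runs.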