Name              Date         Size      #Lines    LOC
Makefile          02-Oct-2023  1.9 KiB       68     27
README            26-Feb-2022  24 KiB       476    406
README.v6         26-Feb-2022  55.4 KiB   1,233    988
adopt.c           07-Feb-2018  9.6 KiB      396    202
agent.c           10-Mar-2023  45.7 KiB   1,681    954
agent.h           27-Dec-2022  5.4 KiB      150     56
async.c           27-Dec-2022  2.8 KiB      106     45
async.h           27-Dec-2022  1.7 KiB       61     21
bound.c           18-Nov-2016  33 KiB     1,188    716
class_id.c        02-Oct-2023  4.5 KiB      204    105
class_id.h        27-Dec-2022  1.2 KiB       47     10
common.h          27-Dec-2022  1.7 KiB       68     26
defaults.c        18-Nov-2016  7.9 KiB      293    173
defaults.h        18-Nov-2016  2.4 KiB       76     31
dhcpagent.dfl     26-Feb-2022  8.5 KiB      189    174
dhcpagent.xcl     14-Jun-2005  1.4 KiB       56     34
inform.c          19-Aug-2010  3.7 KiB      126     56
init_reboot.c     18-Nov-2016  7.1 KiB      268    146
interface.c       17-Nov-2009  46.9 KiB   1,801  1,128
interface.h       13-Oct-2009  7.7 KiB      203    100
ipc_action.c      18-Nov-2016  7.2 KiB      277    134
ipc_action.h      27-Dec-2022  2 KiB         68     29
packet.c          18-Nov-2016  41.3 KiB   1,615    965
packet.h          18-Nov-2016  5 KiB        148     65
release.c         19-Aug-2010  7.9 KiB      288    166
renew.c           18-Nov-2016  15.2 KiB     559    283
request.c         18-Nov-2016  33 KiB     1,223    689
script_handler.c  30-Apr-2009  9.2 KiB      393    195
script_handler.h  30-Apr-2009  2.4 KiB       92     32
select.c          18-Nov-2016  6.9 KiB      250    121
states.c          18-Nov-2016  40.5 KiB   1,634    969
states.h          18-Nov-2016  11.4 KiB     345    150
util.c            26-Feb-2022  30.8 KiB   1,216    639
util.h            18-Nov-2016  2.6 KiB       89     44

README

CDDL HEADER START

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]

CDDL HEADER END

Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

Architectural Overview for the DHCP agent
Peter Memishian

INTRODUCTION
============

The Solaris DHCP agent (dhcpagent) is a DHCP client implementation
compliant with RFCs 2131, 3315, and others.  The major forces shaping
its design were:

	* Must be capable of managing multiple network interfaces.
	* Must consume little CPU, since it will always be running.
	* Must have a small memory footprint, since it will always be
	  running.
	* Must not rely on any shared libraries outside of /lib, since
	  it must run before all filesystems have been mounted.

When a DHCP agent implementation is only required to control a single
interface on a machine, the problem is expressed well as a simple
state-machine, as shown in RFC2131.  However, when a DHCP agent is
responsible for managing more than one interface at a time, the
problem becomes much more complicated.

This can be resolved using threads or with an event-driven model.
Given that DHCP's behavior can be expressed concisely as a state
machine, the event-driven model is the closest match.

While tried-and-true, that model is subtle and easy to get wrong.
Indeed, much of the agent's code is there to manage the complexity of
programming in an asynchronous event-driven paradigm.

THE BASICS
==========

The DHCP agent consists of roughly 30 source files, most with a
companion header file.  While the largest source file is around 1700
lines, most are much shorter.  The source files can largely be broken
up into three groups:

	* Source files that, along with their companion header files,
	  define an abstract "object" that is used by other parts of
	  the system.  Examples include "packet.c", which along with
	  "packet.h" provides a Packet object for use by the rest of
	  the agent; and "async.c", which along with "async.h" defines
	  an interface for managing asynchronous transactions within
	  the agent.

	* Source files that implement a given state of the agent; for
	  instance, there is a "request.c" which comprises all of
	  the procedural "work" which must be done while in the
	  REQUESTING state of the agent.  By encapsulating states in
	  files, it becomes easier to debug errors in the
	  client/server protocol and adapt the agent to new
	  constraints, since all the relevant code is in one place.

	* Source files that, along with their companion header files,
	  encapsulate a given task or related set of tasks.  The
	  difference between this and the first group is that the
	  interfaces exported from these files do not operate on
	  an "object", but rather perform a specific task.  Examples
	  include "defaults.c", which provides a useful interface
	  to /etc/default/dhcpagent file operations.

OVERVIEW
========

Here we discuss the essential objects and subtle aspects of the
DHCP agent implementation.  Note that there is of course much more
that is not discussed here, but after this overview you should be able
to fend for yourself in the source code.

For details on the DHCPv6 aspects of the design, and how this relates
to the implementation present in previous releases of Solaris, see the
README.v6 file.

Event Handlers and Timer Queues
-------------------------------

The most important object in the agent is the event handler, whose
interface is in libinetutil.h and whose implementation is in
libinetutil.  The event handler is essentially an object-oriented
wrapper around poll(2): other components of the agent can register to
be called back when specific events on file descriptors happen -- for
instance, to wait for requests to arrive on its IPC socket, the agent
registers a callback function (accept_event()) that will be called
back whenever a new connection arrives on the file descriptor
associated with the IPC socket.  When the agent initially begins in
main(), it registers a number of events with the event handler, and
then calls iu_handle_events(), which proceeds to wait for events to
happen -- this function does not return until the agent is shut down
via a signal.

When the registered events occur, the callback functions are called
back, which in turn might lead to additional callbacks being
registered -- this is the classic event-driven model.  (As an aside,
note that programming in an event-driven model means that callbacks
cannot block, or else the agent will become unresponsive.)
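
The registration-and-callback pattern described above can be sketched
in a few lines of C.  This is a minimal illustration only, not the
libinetutil implementation: every name prefixed with `sk_' is
hypothetical, the registration table is a fixed-size array, and a
single pass of the loop stands in for iu_handle_events().

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define	SK_MAX_EVENTS	8

/* Callback type: receives the ready descriptor and the caller's context. */
typedef void sk_event_cb_t(int fd, void *arg);

/* Hypothetical, simplified event handler (not the libinetutil layout). */
typedef struct {
	struct pollfd	fds[SK_MAX_EVENTS];
	sk_event_cb_t	*cbs[SK_MAX_EVENTS];
	void		*args[SK_MAX_EVENTS];
	nfds_t		nfds;
} sk_eh_t;

/* Ask to be called back when `events' occur on `fd'; -1 if table is full. */
static int
sk_register_event(sk_eh_t *eh, int fd, short events, sk_event_cb_t *cb,
    void *arg)
{
	if (eh->nfds == SK_MAX_EVENTS)
		return (-1);
	eh->fds[eh->nfds].fd = fd;
	eh->fds[eh->nfds].events = events;
	eh->fds[eh->nfds].revents = 0;
	eh->cbs[eh->nfds] = cb;
	eh->args[eh->nfds] = arg;
	eh->nfds++;
	return (0);
}

/* One pass of the loop: wait in poll(2), then run the ready callbacks. */
static int
sk_handle_events_once(sk_eh_t *eh, int timeout_ms)
{
	int	ran = 0;
	nfds_t	i;

	if (poll(eh->fds, eh->nfds, timeout_ms) <= 0)
		return (0);
	for (i = 0; i < eh->nfds; i++) {
		if (eh->fds[i].revents & eh->fds[i].events) {
			eh->cbs[i](eh->fds[i].fd, eh->args[i]);
			ran++;
		}
	}
	return (ran);
}

/* Example callback: consume one byte and count it via the context. */
static void
sk_count_byte(int fd, void *arg)
{
	char c;

	if (read(fd, &c, 1) == 1)
		(*(int *)arg)++;
}

/* Demo: a byte written to a pipe leads to exactly one callback. */
int
sk_event_demo(void)
{
	sk_eh_t	eh = { .nfds = 0 };
	int	p[2], seen = 0;

	if (pipe(p) != 0)
		return (-1);
	if (sk_register_event(&eh, p[0], POLLIN, sk_count_byte, &seen) != 0 ||
	    write(p[1], "x", 1) != 1)
		seen = -1;
	else
		(void) sk_handle_events_once(&eh, 1000);
	(void) close(p[0]);
	(void) close(p[1]);
	return (seen);
}
```

In sk_event_demo() the callback never blocks: it reads only after
poll(2) has reported the descriptor readable, and it reaches its
counter through the opaque `arg' pointer supplied at registration.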

A special kind of "event" is a timeout.  Since there are many timers
which must be maintained for each DHCP-controlled interface (such as a
lease expiration timer, time-to-first-renewal (t1) timer, and so
forth), an object-oriented abstraction to timers called a "timer
queue" is provided, whose interface is in libinetutil.h with a
corresponding implementation in libinetutil.  The timer queue allows
callback functions to be "scheduled" for callback after a certain
amount of time has passed.

The event handler and timer queue objects work hand-in-hand: the event
handler is passed a pointer to a timer queue in iu_handle_events() --
from there, it can use the iu_earliest_timer() routine to find the
timer which will next fire, and use this to set its timeout value in
its call to poll(2).  If poll(2) returns due to a timeout, the event
handler calls iu_expire_timers() to expire all timers that expired
(note that more than one may have expired if, for example, multiple
timers were set to expire at the same time).
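
The cooperation between the two objects can be sketched as follows.
This is an illustrative model only: the `sk_' names are hypothetical,
the current time is passed in as a millisecond count rather than read
with gethrtime(), and the timers live in a small fixed array, which is
an assumption made for brevity rather than a description of
libinetutil.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SK_MAX_TIMERS	8
#define	SK_INFTIM	(-1)	/* poll(2) timeout meaning "wait forever" */

typedef void sk_timer_cb_t(void *arg);

typedef struct {
	int64_t		expire_ms;	/* absolute expiry time */
	sk_timer_cb_t	*cb;
	void		*arg;
	int		active;
} sk_timer_t;

typedef struct {
	sk_timer_t	timers[SK_MAX_TIMERS];
} sk_tq_t;

/* Schedule cb(arg) to run delay_ms after now_ms; returns a slot or -1. */
static int
sk_schedule_timer(sk_tq_t *tq, int64_t now_ms, int64_t delay_ms,
    sk_timer_cb_t *cb, void *arg)
{
	int i;

	for (i = 0; i < SK_MAX_TIMERS; i++) {
		if (!tq->timers[i].active) {
			tq->timers[i].expire_ms = now_ms + delay_ms;
			tq->timers[i].cb = cb;
			tq->timers[i].arg = arg;
			tq->timers[i].active = 1;
			return (i);
		}
	}
	return (-1);
}

/* Milliseconds until the next timer fires: the value to hand to poll(2). */
static int
sk_earliest_timer(const sk_tq_t *tq, int64_t now_ms)
{
	int64_t	best = 0;
	int	i, found = 0;

	for (i = 0; i < SK_MAX_TIMERS; i++) {
		const sk_timer_t *t = &tq->timers[i];

		if (t->active && (!found || t->expire_ms < best)) {
			best = t->expire_ms;
			found = 1;
		}
	}
	if (!found)
		return (SK_INFTIM);
	return (best > now_ms ? (int)(best - now_ms) : 0);
}

/* Run every timer that is due at now_ms; returns how many fired. */
static int
sk_expire_timers(sk_tq_t *tq, int64_t now_ms)
{
	int i, fired = 0;

	for (i = 0; i < SK_MAX_TIMERS; i++) {
		sk_timer_t *t = &tq->timers[i];

		if (t->active && t->expire_ms <= now_ms) {
			t->active = 0;	/* clear first: cb may reschedule */
			t->cb(t->arg);
			fired++;
		}
	}
	return (fired);
}

static void
sk_bump(void *arg)
{
	(*(int *)arg)++;
}

/* Demo: two timers due at the same instant both fire in one pass. */
int
sk_timer_demo(void)
{
	sk_tq_t	tq;
	int	n = 0;

	(void) memset(&tq, 0, sizeof (tq));
	(void) sk_schedule_timer(&tq, 0, 100, sk_bump, &n);
	(void) sk_schedule_timer(&tq, 0, 100, sk_bump, &n);
	(void) sk_schedule_timer(&tq, 0, 250, sk_bump, &n);
	return (sk_expire_timers(&tq, sk_earliest_timer(&tq, 0)));
}
```

An event loop built on this would pass the result of
sk_earliest_timer() as the poll(2) timeout and call sk_expire_timers()
whenever poll(2) returns zero.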

Although it is possible to instantiate more than one timer queue or
event handler object, it doesn't make a lot of sense -- these objects
are really "singletons".  Accordingly, the agent has two global
variables, `eh' and `tq', which store pointers to the global event
handler and timer queue.

Network Interfaces
------------------

For each network interface managed by the agent, there is a set of
associated state that describes both its general properties (such as
the maximum MTU) and its connections to DHCP-related state (the
protocol state machines).  This state is stored in a pair of
structures called `dhcp_pif_t' (the IP physical interface layer or
PIF) and `dhcp_lif_t' (the IP logical interface layer or LIF).  Each
dhcp_pif_t represents a single physical interface, such as "hme0," for
a given IP protocol version (4 or 6), and has a list of dhcp_lif_t
structures representing the logical interfaces (such as "hme0:1") in
use by the agent.

This split is important because of differences between IPv4 and IPv6.
For IPv4, each DHCP state machine manages a single IP address and
associated configuration data.  This corresponds to a single logical
interface, which must be specified by the user.  For IPv6, however,
each DHCP state machine manages a group of addresses, and is
associated with a DUID value rather than with just an interface.

Thus, DHCPv6 behaves more like in.ndpd in its creation of "ADDRCONF"
interfaces.  The agent automatically plumbs logical interfaces when
needed and removes them when the addresses expire.

The state for a given session is stored separately in `dhcp_smach_t'.
This state machine then points to the main LIF used for I/O, and to a
list of `dhcp_lease_t' structures representing individual leases, and
each of those points to a list of LIFs corresponding to the individual
addresses being managed.
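
The relationships among these structures can be pictured with the
following sketch.  The `sk_' types are hypothetical and heavily
simplified stand-ins, not the definitions in interface.h; in
particular, the PIF's own list of LIFs is omitted and each LIF simply
points back to its PIF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for dhcp_pif_t: one per interface and version. */
typedef struct sk_pif {
	const char	*name;		/* physical interface, e.g. "hme0" */
	int		isv6;		/* nonzero for the IPv6 instance */
} sk_pif_t;

/* Hypothetical stand-in for dhcp_lif_t. */
typedef struct sk_lif {
	const char	*name;		/* logical interface, e.g. "hme0:1" */
	sk_pif_t	*pif;		/* the PIF this LIF belongs to */
	struct sk_lif	*next;		/* next LIF on the same lease */
} sk_lif_t;

/* Hypothetical stand-in for dhcp_lease_t. */
typedef struct sk_lease {
	sk_lif_t	*lifs;		/* one LIF per managed address */
	struct sk_lease	*next;
} sk_lease_t;

/* Hypothetical stand-in for dhcp_smach_t. */
typedef struct sk_smach {
	sk_lif_t	*main_lif;	/* LIF used for I/O */
	sk_lease_t	*leases;	/* leases held by this session */
} sk_smach_t;

/* Count the addresses (LIFs) managed by one state machine. */
static unsigned int
sk_smach_naddrs(const sk_smach_t *sm)
{
	const sk_lease_t	*lease;
	const sk_lif_t		*lif;
	unsigned int		n = 0;

	for (lease = sm->leases; lease != NULL; lease = lease->next) {
		for (lif = lease->lifs; lif != NULL; lif = lif->next)
			n++;
	}
	return (n);
}

/* Demo: one DHCPv6 session on hme0 holding a lease with two addresses. */
unsigned int
sk_iface_demo(void)
{
	sk_pif_t	pif = { "hme0", 1 };
	sk_lif_t	main_lif = { "hme0", &pif, NULL };
	sk_lif_t	addr2 = { "hme0:2", &pif, NULL };
	sk_lif_t	addr1 = { "hme0:1", &pif, &addr2 };
	sk_lease_t	lease = { &addr1, NULL };
	sk_smach_t	sm = { &main_lif, &lease };

	return (sk_smach_naddrs(&sm));
}
```

For IPv4 the same shape would degenerate to one lease with one LIF per
state machine.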

One point that was brushed over in the preceding discussion of event
handlers and timer queues was context.  Recall that the event-driven
nature of the agent requires that functions cannot block, lest they
starve out others and impact the observed responsiveness of the agent.
As an example, consider the process of extending a lease: the agent
must send a REQUEST packet and wait for an ACK or NAK packet in
response.  This is done by sending a REQUEST and then returning to the
event handler that waits for an ACK or NAK packet to arrive on the
file descriptor associated with the interface.  Note, however, that
when the ACK or NAK does arrive, and the callback function is called
back, it must know which state machine this packet is for (it must get
back its context).  This could be handled through an ad-hoc mapping of
file descriptors to state machines, but a cleaner approach is to have
the event handler's register function (iu_register_event()) take in an
opaque context pointer, which will then be passed back to the
callback.  In the agent, the context pointer used depends on the
nature of the event: events on LIFs use the dhcp_lif_t pointer, events
on the state machine use dhcp_smach_t, and so on.

Note that there is nothing that guarantees the pointer passed into
iu_register_event() or iu_schedule_timer() will still be valid when
the callback is called back (for instance, the memory may have been
freed in the meantime).  To solve this problem, all of the data
structures used in this way are reference counted.  For more details
on how the reference count scheme is implemented, see the closing
comments in interface.h regarding memory management.

Transactions
------------

Many operations performed via DHCP must be performed in groups -- for
instance, acquiring a lease requires several steps: sending a
DISCOVER, collecting OFFERs, selecting an OFFER, sending a REQUEST,
and receiving an ACK, assuming everything goes well.  Note however
that due to the event-driven model the agent operates in, these
operations are not inherently "grouped" -- instead, the agent sends a
DISCOVER, goes back into the main event loop, waits for events
(perhaps even requests on the IPC channel to begin acquiring a lease
on another state machine), eventually checks to see if an acceptable
OFFER has come in, and so forth.  To some degree, the notion of the
state machine's current state (SELECTING, REQUESTING, etc.) helps
control the potential chaos of the event-driven model (for instance,
if while the agent is waiting for an OFFER on a given state machine,
an IPC event comes in requesting that the leases be RELEASED, the
agent knows to send back an error since the state machine must be in
at least the BOUND state before a RELEASE can be performed).

However, states are not enough -- for instance, suppose that the agent
begins trying to renew a lease.  This is done by sending a REQUEST
packet and waiting for an ACK or NAK, which might never come.  If,
while waiting for the ACK or NAK, the user sends a request to renew
the lease as well, then if the agent were to send another REQUEST,
things could get quite complicated (and this is only the beginning of
this rathole).  To protect against this, two objects exist:
`async_action' and `ipc_action'.  These objects are related, but
independent of one another; the more essential object is the
`async_action', which we will discuss first.

In short, an `async_action' represents a pending transaction (aka
asynchronous action), of which each state machine can have at most
one.  The `async_action' structure is embedded in the `dhcp_smach_t'
structure, which is fine since there can be at most one pending
transaction per state machine.  Typical "asynchronous transactions"
are START, EXTEND, and INFORM, since each consists of a sequence of
packets that must be done without interruption.  Note that not all
DHCP operations are "asynchronous" -- for instance, a DHCPv4 RELEASE
operation is synchronous (not asynchronous) since after the RELEASE is
sent no reply is expected from the DHCP server, but DHCPv6 Release is
asynchronous, as all DHCPv6 messages are transactional.  Some
operations, such as status query, are synchronous and do not affect
the system state, and thus do not require sequencing.

When the agent realizes it must perform an asynchronous transaction,
it calls async_start() to open the transaction.  If one is already
pending, then the new transaction must fail (the details of failure
depend on how the transaction was initiated, which is described in
more detail later when the `ipc_action' object is discussed).  If
there is no pending asynchronous transaction, the operation succeeds.

When the transaction is complete, either async_finish() or
async_cancel() must be called to complete or cancel the asynchronous
action on that state machine.  If the transaction is unable to
complete within a certain amount of time (more on this later), a timer
should be used to cancel the operation.

The notion of asynchronous transactions is complicated by the fact
that they may originate from both inside and outside of the agent.
For instance, a user initiates an asynchronous START transaction when
he performs an `ifconfig hme0 dhcp start', but the agent will
internally need to perform asynchronous EXTEND transactions to extend
the lease before it expires.  Note that user-initiated actions always
have priority over internal actions: the former will cancel the
latter, if necessary.
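
The "at most one pending transaction" rule and the priority of
user-initiated actions can be modeled as below.  This is a sketch
under stated assumptions: the `sk_' names and the two-field structure
are hypothetical, and the real async.c also deals with timers and
cancellation details that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SK_NONE, SK_START, SK_EXTEND, SK_INFORM } sk_cmd_t;

/* Hypothetical stand-in for the async_action object. */
typedef struct {
	sk_cmd_t	cmd;	/* SK_NONE when no transaction is pending */
	bool		user;	/* true if started by a user request */
} sk_async_t;

/* Hypothetical state machine with the action embedded in it. */
typedef struct {
	sk_async_t	async;	/* at most one per state machine */
} sk_smach_t;

static bool
sk_async_pending(const sk_smach_t *sm)
{
	return (sm->async.cmd != SK_NONE);
}

/* Close the transaction, whether it completed or was cancelled. */
static void
sk_async_finish(sk_smach_t *sm)
{
	sm->async.cmd = SK_NONE;
	sm->async.user = false;
}

/*
 * Open a transaction.  Fails if one is already pending, except that a
 * user-initiated action cancels a pending internal one.
 */
static bool
sk_async_start(sk_smach_t *sm, sk_cmd_t cmd, bool user)
{
	if (sk_async_pending(sm)) {
		if (!user || sm->async.user)
			return (false);
		sk_async_finish(sm);	/* cancel the internal action */
	}
	sm->async.cmd = cmd;
	sm->async.user = user;
	return (true);
}

/* Demo: returns how many of five start attempts succeed. */
int
sk_async_demo(void)
{
	sk_smach_t	sm = { { SK_NONE, false } };
	int		ok = 0;

	ok += sk_async_start(&sm, SK_EXTEND, false);	/* succeeds */
	ok += sk_async_start(&sm, SK_EXTEND, false);	/* busy: fails */
	ok += sk_async_start(&sm, SK_START, true);	/* preempts */
	ok += sk_async_start(&sm, SK_INFORM, true);	/* busy: fails */
	sk_async_finish(&sm);
	ok += sk_async_start(&sm, SK_EXTEND, false);	/* succeeds */
	return (ok);
}
```

Embedding the action in the state machine, as the sketch does, makes
the one-per-machine limit structural rather than something each caller
has to check.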

This leads us into the `ipc_action' object.  An `ipc_action'
represents the IPC-related pieces of an asynchronous transaction that
was started as a result of a user request, as well as the `BUSY' state
of the administrative interface.  Only IPC-generated asynchronous
transactions have a valid `ipc_action' object.  Note that since there
can be at most one asynchronous action per state machine, there can
also be at most one `ipc_action' per state machine (this means it can
also conveniently be embedded inside the `dhcp_smach_t' structure).

One of the main purposes of the `ipc_action' object is to time out
user events.  When the user specifies a timeout value as an argument to
ifconfig, he is specifying an `ipc_action' timeout; in other words,
how long he is willing to wait for the command to complete.  When this
time expires, the ipc_action is terminated, as well as the
asynchronous operation.

The API provided for the `ipc_action' object is quite similar to the
one for the `async_action' object: when an IPC request comes in for an
operation requiring asynchronous operation, ipc_action_start() is
called.  When the request completes, ipc_action_finish() is called.
If the user times out before the request completes, then
ipc_action_timeout() is called.

Packet Management
-----------------

Another complicated area is packet management: building, manipulating,
sending and receiving packets.  These operations are all encapsulated
behind a dozen or so interfaces (see packet.h) that abstract the
unimportant details away from the rest of the agent code.  In order to
send a DHCP packet, code first calls init_pkt(), which returns a
dhcp_pkt_t initialized suitably for transmission.  Note that currently
init_pkt() returns a dhcp_pkt_t that is actually allocated as part of
the `dhcp_smach_t', but this may change in the future.  After calling
init_pkt(), the add_pkt_opt*() functions are used to add options to
the DHCP packet.  Finally, send_pkt() and send_pkt_v6() can be used to
transmit the packet to a given IP address.

The send_pkt() function handles the details of packet timeout and
retransmission.  The last argument to send_pkt() is a pointer to a
"stop function."  If this argument is passed as NULL, then the packet
will only be sent once (it won't be retransmitted).  Otherwise, the
stop function will be called back before each retransmission.  The
callback may alter dsm_send_timeout if necessary to place a cap on the
next timeout; this is done for DHCPv6 in stop_init_reboot() in order
to implement the CNF_MAX_RD constraint.

The return value from this function indicates whether to continue
retransmission or not, which allows the send_pkt() caller to control
the retransmission policy without making it have to deal with the
retransmission mechanism.  See request.c for an example of this in
action.
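
The division of labor between mechanism and policy can be sketched as
follows.  This is a simplified model, not the code in packet.c: the
`sk_' names are hypothetical, each loop iteration stands in for one
expiry of a retransmission timer (the agent would return to its event
loop rather than spin), and the doubling of the timeout up to a cap is
an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	SK_MAX_TIMEOUT_MS	64000U	/* illustrative cap */

/* Hypothetical per-session send state. */
typedef struct {
	unsigned int	sent;		/* transmissions so far */
	unsigned int	timeout_ms;	/* wait before next retransmit */
} sk_smach_t;

/* Stop function: return true to give up retransmitting. */
typedef bool sk_stop_func_t(sk_smach_t *sm, unsigned int n_sent);

static void
sk_transmit(sk_smach_t *sm)
{
	sm->sent++;	/* a real agent would write to a socket here */
}

/*
 * Mechanism: send once, then keep retransmitting with a growing
 * timeout until the caller-supplied policy says stop.  Returns the
 * total number of transmissions.
 */
static unsigned int
sk_send_pkt(sk_smach_t *sm, unsigned int first_timeout_ms,
    sk_stop_func_t *stop)
{
	sm->sent = 0;
	sm->timeout_ms = first_timeout_ms;
	sk_transmit(sm);
	if (stop == NULL)
		return (sm->sent);	/* send once, never retransmit */

	/* Each iteration stands for one expiry of the retransmit timer. */
	while (!stop(sm, sm->sent)) {
		sk_transmit(sm);
		sm->timeout_ms *= 2;
		if (sm->timeout_ms > SK_MAX_TIMEOUT_MS)
			sm->timeout_ms = SK_MAX_TIMEOUT_MS;
	}
	return (sm->sent);
}

/* Policy: give up once four packets have gone out. */
static bool
sk_stop_after_four(sk_smach_t *sm, unsigned int n_sent)
{
	(void) sm;	/* a policy could also cap sm->timeout_ms here */
	return (n_sent >= 4);
}

/* Demo: returns the number of transmissions under the policy above. */
unsigned int
sk_send_demo(void)
{
	sk_smach_t sm;

	return (sk_send_pkt(&sm, 4000, sk_stop_after_four));
}
```

The caller decides when to stop by supplying a different stop
function; the loop itself never needs to change.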

The recv_pkt() function is simpler but still complicated by the fact
that one may want to receive several different types of packets at
once.  The caller registers an event handler on the file descriptor,
and then calls recv_pkt() to read in the packet along with meta
information about the message (the sender and interface identifier).

For IPv6, packet reception is done with a single socket, using
IPV6_PKTINFO to determine the actual destination address and receiving
interface.  Packets are then matched against the state machines on the
given interface through the transaction ID.

For IPv4, due to oddities in the DHCP specification (discussed in
PSARC/2007/571), a special IP_DHCPINIT_IF socket option must be used
to allow unicast DHCP traffic to be received on an interface during
lease acquisition.  Since the IP_DHCPINIT_IF socket option can only
enable one interface at a time, one socket must be used per interface.

Time
----

The notion of time is an exceptionally subtle area.  You will notice
five ways that time is represented in the source: as lease_t's,
uint32_t's, time_t's, hrtime_t's, and monosec_t's.  Each of these
types serves a slightly different function.

The `lease_t' type is the simplest to understand; it is the unit of
time in the CD_{LEASE,T1,T2}_TIME options in a DHCP packet, as defined
by RFC2131.  This is defined as a positive number of seconds (relative
to some fixed point in time) or the value `-1' (DHCP_PERM) which
represents infinity (i.e., a permanent lease).  The lease_t should be
used either when dealing with actual DHCP packets that are sent on the
wire or for variables which follow the exact definition given in the
RFC.
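
The special treatment of the permanent-lease value can be illustrated
with a short sketch.  The `sk_' names are hypothetical, and the
unsigned 32-bit representation is an assumption based on the width of
the lease-time option on the wire; it is not copied from the agent's
headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for lease_t: seconds, as carried on the wire. */
typedef uint32_t sk_lease_t;

/* Hypothetical stand-in for DHCP_PERM: all ones, i.e. `-1'. */
#define	SK_DHCP_PERM	((sk_lease_t)-1)

/*
 * Compute the absolute expiry of a lease that began at `start'
 * (in seconds).  Returns false for a permanent lease, which never
 * expires and so needs no timer; *expiry is left untouched then.
 */
bool
sk_lease_expiry(int64_t start, sk_lease_t lease, int64_t *expiry)
{
	if (lease == SK_DHCP_PERM)
		return (false);
	/* Widen before adding so that large leases cannot wrap. */
	*expiry = start + (int64_t)lease;
	return (true);
}
```

Checking for the permanent value before doing any arithmetic is the
point: treated as an ordinary number, it would look like a lease of
roughly 136 years rather than an infinite one.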

The `uint32_t' type is also used to represent a relative time in
seconds.  However, here the value `-1' is not special and of course
this type is not tied to any definition given in RFC2131.  Use this
for representing "offsets" from another point in time that are not
DHCP lease times.

The `time_t' type is the natural Unix type for representing time since
the epoch.  Unfortunately, it is affected by stime(2) or adjtime(2)
and since the DHCP client is used during system installation (and thus
when time is typically being configured), the time_t cannot be used in
general to represent an absolute time since the epoch.  For instance,
if a time_t were used to keep track of when a lease began, and then a
minute later stime(2) was called to adjust the system clock forward a
year, then the lease would appear to have expired a year ago even
though it has only been a minute.  For this reason, time_t's should
only be used either when wall time must be displayed (such as in a
DHCP_STATUS IPC transaction) or when a time meaningful across reboots
must be obtained (such as when caching an ACK packet at system
shutdown).

The `hrtime_t' type returned from gethrtime() works around the
limitations of the time_t in that it is not affected by stime(2) or
adjtime(2), with the disadvantage that it represents time from some
arbitrary time in the past and in nanoseconds.  The timer queue code
deals with hrtime_t's directly since that particular piece of code is
meant to be fairly independent of the rest of the DHCP client.

However, dealing with nanoseconds is error-prone when all the other
time types are in seconds.  As a result, yet another time type, the
`monosec_t' was created to represent a monotonically increasing time
in seconds, and is really no more than (hrtime_t / NANOSEC).  Note
that this unit is typically used where time_t's would've traditionally
been used.  The function monosec() in util.c returns the current
monosec, and monosec_to_time() can convert a given monosec to wall
time, using the system's current notion of time.
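
The relationship among these types can be sketched as follows.  The
`sk_' names are hypothetical, and because gethrtime() is specific to
Solaris, the sketch substitutes clock_gettime(CLOCK_MONOTONIC) as a
portable stand-in; the conversion to wall time follows the description
above rather than the actual code in util.c.

```c
#define	_POSIX_C_SOURCE	200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define	SK_NANOSEC	1000000000LL

typedef int64_t sk_hrtime_t;	/* nanoseconds since an arbitrary point */
typedef int64_t sk_monosec_t;	/* seconds since that same point */

/* Stand-in for gethrtime(): unaffected by changes to the wall clock. */
static sk_hrtime_t
sk_gethrtime(void)
{
	struct timespec ts;

	(void) clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((sk_hrtime_t)ts.tv_sec * SK_NANOSEC + ts.tv_nsec);
}

/* Current monotonic time in whole seconds. */
static sk_monosec_t
sk_monosec(void)
{
	return (sk_gethrtime() / SK_NANOSEC);
}

/*
 * Convert a monotonic timestamp to wall time using the system's
 * current notion of time: wall "now" minus the monotonic seconds
 * elapsed since the timestamp.
 */
static time_t
sk_monosec_to_time(sk_monosec_t abs_monosec)
{
	return (time(NULL) - (time_t)(sk_monosec() - abs_monosec));
}

/* Example use: wall-clock expiry of a lease begun at `start'. */
time_t
sk_expiry_wall_time(sk_monosec_t start, uint32_t lease_secs)
{
	return (sk_monosec_to_time(start + (sk_monosec_t)lease_secs));
}
```

Because the conversion is done at display time, a later change to the
system clock shifts what is displayed without disturbing the
monotonic values the timers depend on.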

One additional limitation of the `hrtime_t' and `monosec_t' types is
that they are unaware of the passage of time across checkpoint/resume
events (e.g., those generated by sys-suspend(8)).  For example, if
gethrtime() returns time T, and then the machine is suspended for 2
hours, and then gethrtime() is called again, the time returned is not
T + (2 * 60 * 60 * NANOSEC), but rather approximately still T.

To work around this (and other checkpoint/resume related problems),
when a system is resumed, the DHCP client makes the pessimistic
assumption that all finite leases have expired while the machine was
suspended and must be obtained again.  This is known as "refreshing"
the leases, and is handled by refresh_smachs().

It might appear that a more intelligent approach would be to
record the time(2) when the system is suspended, compare that against
the time(2) when the system is resumed, and use the delta between them
to decide which leases have expired.  Sadly, this cannot be done since
through at least Solaris 10, it is not possible for userland programs
to be notified of system suspend events.

Configuration
-------------

For the most part, the DHCP client only *retrieves* configuration data
from the DHCP server, leaving the configuration to scripts (such as
boot scripts), which themselves use dhcpinfo(1) to retrieve the data
from the DHCP client.  This is desirable because it keeps the mechanism
of retrieving the configuration data decoupled from the policy of using
the data.

However, unless used in "inform" mode, the DHCP client *does*
configure each IP interface enough to allow it to communicate with
other hosts.  Specifically, the DHCP client configures the interface's
IP address, netmask, and broadcast address using the information
provided by the server.  Further, for IPv4 logical interface 0
("hme0"), any provided default routes are also configured.

For IPv6, only the IP addresses are set.  The netmask (prefix) is then
set automatically by in.ndpd, and routes are discovered in the usual
way by router discovery or routing protocols.  DHCPv6 doesn't set
routes.

Since logical interfaces cannot be specified as output interfaces in
the kernel forwarding table, and in most cases, logical interfaces
share a default route with their associated physical interface, the
DHCP client does not automatically add or remove default routes when
IPv4 leases are acquired or expired on logical interfaces.

Event Scripting
---------------

The DHCP client supports user program invocations on DHCP events.  The
supported events are BOUND, EXTEND, EXPIRE, DROP, RELEASE, and INFORM
for DHCPv4, and BOUND6, EXTEND6, EXPIRE6, DROP6, LOSS6, RELEASE6, and
INFORM6 for DHCPv6.  The user program runs asynchronously to the DHCP
client so that the main event loop stays active to process other
events, including events triggered by the user program (for example,
when it invokes dhcpinfo).

The user program execution is part of the transaction of a DHCP command.
For example, if the user program is not enabled, the transaction of the
DHCP command START is considered over when an ACK is received and the
interface is configured successfully.  If the user program is enabled,
it is invoked after the interface is configured successfully, and the
transaction is considered over only when the user program exits.  The
event scripting implementation makes use of the asynchronous operations
discussed in the "Transactions" section.

An upper bound of 58 seconds is imposed on how long the user program
can run.  If the user program does not exit after 55 seconds, the signal
SIGTERM is sent to it.  If it still does not exit after an additional 3
seconds, the signal SIGKILL is sent to it.  Since the event handler is
a wrapper around poll(), the DHCP client cannot directly observe the
completion of the user program.  Instead, the DHCP client creates a
child "helper" process to synchronously monitor the user program (this
process is also used to send the aforementioned signals to the process,
if necessary).  The DHCP client and the helper process share a pipe
which is included in the set of poll descriptors monitored by the DHCP
client's event handler.  When the user program exits, the helper process
passes the user program exit status to the DHCP client through the pipe,
informing the DHCP client that the user program has finished.  When the
DHCP client is asked to shut down, it will wait for any running instances
of the user program to complete.
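
The helper-process arrangement can be sketched as below.  This is an
illustration under stated assumptions, not the code in
script_handler.c: the `sk_' names are hypothetical, the user program
is started through /bin/sh for brevity, the timeouts are parameters
rather than the fixed 55 and 3 seconds, and the caller simply waits in
poll(2) where the agent would return to its event loop.

```c
#define	_POSIX_C_SOURCE	200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t	sk_script_pid;	/* user program's pid */
static volatile sig_atomic_t	sk_escalated;	/* SIGTERM already sent? */
static unsigned int		sk_grace_secs;	/* SIGTERM-to-SIGKILL gap */

/* SIGALRM handler: first expiry sends SIGTERM, the next sends SIGKILL. */
static void
sk_on_alarm(int sig)
{
	(void) sig;
	if (!sk_escalated) {
		sk_escalated = 1;
		(void) kill((pid_t)sk_script_pid, SIGTERM);
		(void) alarm(sk_grace_secs);
	} else {
		(void) kill((pid_t)sk_script_pid, SIGKILL);
	}
}

/* Helper process: run `cmd', police its run time, report status on wfd. */
static void
sk_helper(int wfd, const char *cmd, unsigned int term_secs,
    unsigned int grace_secs)
{
	struct sigaction	sa;
	int			status = 0;
	pid_t			pid = fork();

	if (pid == 0) {
		(void) execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);
	}
	if (pid < 0)
		_exit(1);	/* the agent side then sees EOF */
	sk_script_pid = pid;
	sk_grace_secs = grace_secs;
	(void) memset(&sa, 0, sizeof (sa));
	sa.sa_handler = sk_on_alarm;
	(void) sigemptyset(&sa.sa_mask);
	(void) sigaction(SIGALRM, &sa, NULL);
	(void) alarm(term_secs);
	while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
		continue;	/* interrupted by our own alarm; retry */
	(void) alarm(0);
	if (write(wfd, &status, sizeof (status)) != sizeof (status))
		_exit(1);
	_exit(0);
}

/* Agent side: start the helper, then learn of completion via the pipe. */
int
sk_run_script(const char *cmd, unsigned int term_secs,
    unsigned int grace_secs)
{
	struct pollfd	pfd;
	int		p[2], status = -1;
	pid_t		helper;

	if (pipe(p) != 0)
		return (-1);
	if ((helper = fork()) == 0) {
		(void) close(p[0]);
		sk_helper(p[1], cmd, term_secs, grace_secs); /* no return */
	}
	(void) close(p[1]);
	if (helper > 0) {
		pfd.fd = p[0];
		pfd.events = POLLIN;
		/* The real agent would be back in its event loop here. */
		while (poll(&pfd, 1, -1) == -1 && errno == EINTR)
			continue;
		if (read(p[0], &status, sizeof (status)) != sizeof (status))
			status = -1;
		(void) waitpid(helper, NULL, 0);
	}
	(void) close(p[0]);
	return (status);
}
```

The value returned is a wait status as produced by waitpid(2), or -1
on failure, so a caller could tell a normal exit from a program that
had to be terminated.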

README.v6

CDDL HEADER START

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]

CDDL HEADER END

Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.


**  PLEASE NOTE:
**
**  This document discusses aspects of the DHCPv4 client design that have
**  since changed (e.g., DLPI is no longer used).  However, since those
**  aspects affected the DHCPv6 design, the discussion has been left for
**  historical record.


DHCPv6 Client Low-Level Design

Introduction

  This project adds DHCPv6 client-side (not server) support to
  Solaris.  Future projects may add server-side support as well as
  enhance the basic capabilities added here.  These future projects
  are not discussed in detail in this document.

  This document assumes that the reader is familiar with the following
  other documents:

  - RFC 3315: the primary description of DHCPv6
  - RFCs 2131 and 2132: IPv4 DHCP
  - RFCs 2461 and 2462: IPv6 NDP and stateless autoconfiguration
  - RFC 3484: IPv6 default address selection
  - ifconfig(8): Solaris IP interface configuration
  - in.ndpd(8): Solaris IPv6 Neighbor and Router Discovery daemon
  - dhcpagent(8): Solaris DHCP client
  - dhcpinfo(1): Solaris DHCP parameter utility
  - ndpd.conf(5): in.ndpd configuration file
  - netstat(8): Solaris network status utility
  - snoop(8): Solaris network packet capture and inspection
  - "DHCPv6 Client High-Level Design"

  Several terms from those documents (such as the DHCPv6 IA_NA and
  IAADDR options) are used without further explanation in this
  document; see the reference documents above for details.

  The overall plan is to enhance the existing Solaris dhcpagent so
  that it is able to process DHCPv6.  It would also have been possible
  to create a new, separate daemon process for this, or to integrate
  the feature into in.ndpd.  These alternatives, and the reason for
  the chosen design, are discussed in Appendix A.

  This document discusses the internal design issues involved in the
  protocol implementation, and with the associated components (such as
  in.ndpd, snoop, and the kernel's source address selection
  algorithm).  It does not discuss the details of the protocol itself,
  which are more than adequately described in the RFC, nor the
  individual lines of code, which will be in the code review.

  As a cross-reference, Appendix B has a summary of the components
  involved and the changes to each.
76
77
78Background
79
80  In order to discuss the design changes for DHCPv6, it's necessary
81  first to talk about the current IPv4-only design, and the
82  assumptions built into that design.
83
84  The main data structure used in dhcpagent is the 'struct ifslist'.
85  Each instance of this structure represents a Solaris logical IP
86  interface under DHCP's control.  It also represents the shared state
87  with the DHCP server that granted the address, the address itself,
88  and copies of the negotiated options.
89
90  There is one list in dhcpagent containing all of the IP interfaces
91  that are under DHCP control.  IP interfaces not under DHCP control
92  (for example, those that are statically addressed) are not included
93  in this list, even when plumbed on the system.  These ifslist
94  entries are chained like this:
95
96  ifsheadp -> ifslist -> ifslist -> ifslist -> NULL
97	        net0	  net0:1     net1
98
99  Each ifslist entry contains the address, mask, lease information,
100  interface name, hardware information, packets, protocol state, and
101  timers.  The name of the logical IP interface under DHCP's control
102  is also the name used in the administrative interfaces (dhcpinfo,
103  ifconfig) and when logging events.
104
105  Each entry holds open a DLPI stream and two sockets.  The DLPI
106  stream is nulled-out with a filter when not in use, but still
107  consumes system resources.  (Most significantly, it causes data
108  copies in the driver layer that end up sapping performance.)
109
110  The entry storage is managed by an insert/hold/release/remove model
111  and reference counts.  In this model, insert_ifs() allocates a new
112  ifslist entry and inserts it into the global list, with the global
113  list holding a reference.  remove_ifs() removes it from the global
114  list and drops that reference.  hold_ifs() and release_ifs() are
115  used by data structures that refer to ifslist entries, such as timer
116  entries, to make sure that the ifslist entry isn't freed until the
117  timer has been dispatched or deleted.
118
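  As an illustration of the insert/hold/release/remove model described
  above, here is a minimal sketch.  The entry_t type and the *_entry()
  function names are hypothetical stand-ins, not the actual ifslist
  code; the n_freed counter exists only to make the behavior visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ifslist-style entry; not the real layout. */
typedef struct entry {
	struct entry	*next;
	unsigned int	hold_count;
	int		removed;	/* set once unlinked from the list */
} entry_t;

static entry_t *head;		/* global list; owns one reference per entry */
static int n_freed;		/* counts frees, for demonstration only */

static void
hold_entry(entry_t *ep)
{
	ep->hold_count++;
}

/* Drop one reference; free the entry when the last one goes away. */
static void
release_entry(entry_t *ep)
{
	assert(ep->hold_count > 0);
	if (--ep->hold_count == 0) {
		free(ep);
		n_freed++;
	}
}

/* Allocate an entry and link it in; the list holds the first reference. */
static entry_t *
insert_entry(void)
{
	entry_t *ep = calloc(1, sizeof (*ep));

	if (ep == NULL)
		return (NULL);
	ep->next = head;
	head = ep;
	hold_entry(ep);
	return (ep);
}

/* Unlink from the list and drop the list's reference. */
static void
remove_entry(entry_t *ep)
{
	entry_t **epp;

	if (ep->removed)
		return;
	for (epp = &head; *epp != NULL; epp = &(*epp)->next) {
		if (*epp == ep) {
			*epp = ep->next;
			break;
		}
	}
	ep->removed = 1;
	release_entry(ep);
}
```

  A timer that refers to an entry would call hold_entry() when it is
  scheduled and release_entry() when it fires or is cancelled, so a
  removed entry stays allocated until the timer is done with it.
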
119  The design is single-threaded, so code that walks the global list
120  needn't bother taking holds on the ifslist structure.  Only
121  references that may be used at a different time (i.e., pointers
122  stored in other data structures) need to be recorded.
123
124  Packets are handled using PKT (struct dhcp; <netinet/dhcp.h>),
125  PKT_LIST (struct dhcp_list; <dhcp_impl.h>), and dhcp_pkt_t (struct
126  dhcp_pkt; "packet.h").  PKT is just the RFC 2131 DHCP packet
127  structure, and has no additional information, such as packet length.
128  PKT_LIST contains a PKT pointer, length, decoded option arrays, and
129  linkage for putting the packet in a list.  Finally, dhcp_pkt_t has a
130  PKT pointer and length values suitable for modifying the packet.
131
132  Essentially, PKT_LIST is a wrapper for received packets, and
133  dhcp_pkt_t is a wrapper for packets to be sent.
134
135  The basic PKT structure is used in dhcpagent, inetboot, in.dhcpd,
136  libdhcpagent, libdhcputil, and others.  PKT_LIST is used
137  in a similar set of places, including the kernel NFS modules.
138  dhcp_pkt_t is (as the header file implies) limited to dhcpagent.
139
140  In addition to these structures, dhcpagent maintains a set of
141  internal supporting abstractions.  Two key ones involved in this
142  project are the "async operation" and the "IPC action."  An async
143  operation encapsulates the actions needed for a given operation, so
144  that if cancellation is needed, there's a single point where the
145  associated resources can be freed.  An IPC action represents the
146  user state related to the private interface used by ifconfig.
147
148
149DHCPv6 Inherent Differences
150
151  DHCPv6 naturally has some commonality with IPv4 DHCP, but also has
152  some significant differences.
153
154  Unlike IPv4 DHCP, DHCPv6 relies on link-local IP addresses to do its
155  work.  This means that, on Solaris, the client doesn't need DLPI to
156  perform any of the I/O; regular IP sockets will do the job.  It also
157  means that, unlike IPv4 DHCP, DHCPv6 does not need to obtain a lease
158  for the address used in its messages to the server.  The system
159  provides the address automatically.
160
161  IPv4 DHCP expects some messages from the server to be broadcast.
162  DHCPv6 has no such mechanism; all messages from the server to the
163  client are unicast.  In the case where the client and server aren't
164  on the same subnet, a relay agent is used to get the unicast replies
165  back to the client's link-local address.
166
167  With IPv4 DHCP, a single address plus configuration options is
168  leased with a given client ID and a single state machine instance,
169  and the implementation binds that to a single IP logical interface
170  specified by the user.  The lease has a "Lease Time," a required
171  option, as well as two timers, called T1 (renew) and T2 (rebind),
172  which are controlled by regular options.
173
174  DHCPv6 uses a single client/server session to control the
175  acquisition of configuration options and "identity associations"
176  (IAs).  The identity associations, in turn, contain lists of
177  addresses for the client to use and the T1/T2 timer values.  Each
178  individual address has its own preferred and valid lifetime, with
179  the address being marked "deprecated" at the end of the preferred
180  interval, and removed at the end of the valid interval.
181
182  IPv4 DHCP leaves many of the retransmit decisions up to the client,
183  and some things (such as RELEASE and DECLINE) are sent just once.
184  Others (such as the REQUEST message used for renew and rebind) are
185  dealt with by heuristics.  DHCPv6 treats each message to the server
186  as a separate transaction, and resends each message using a common
187  retransmission mechanism.  DHCPv6 also has separate messages for
188  Renew, Rebind, and Confirm rather than reusing the Request
189  mechanism.
190
191  The set of options (which are used to convey configuration
192  information) for each protocol are distinct.  Notably, two of the
193  mistakes from IPv4 DHCP have been fixed: DHCPv6 doesn't carry a
194  client name, and doesn't attempt to impersonate a routing protocol
195  by setting a "default route."
196
197  Another welcome change is the lack of a netmask/prefix length with
198  DHCPv6.  Instead, the client uses the Router Advertisement prefixes
199  to set the correct interface netmask.  This reduces the number of
200  databases that need to be kept in sync.  (The equivalent mechanism
201  in IPv4 would have been the use of ICMP Address Mask Request /
202  Reply, but the BOOTP designers chose to embed it in the address
203  assignment protocol itself.)
204
205  Otherwise, DHCPv6 is similar to IPv4 DHCP.  The same overall
206  renew/rebind and lease expiry strategy is used, although the state
207  machine events must now take into account multiple IAs and the fact
208  that each can cause RENEWING or REBINDING state independently.
209
210
211DHCPv6 And Solaris
212
213  The protocol distinctions above have several important implications.
214  For the logical interfaces:
215
216    - Because Solaris uses IP logical interfaces to configure
217      addresses, we must have multiple IP logical interfaces per IA
218      with IPv6.
219
220    - Because we need to support multiple addresses (and thus multiple
221      IP logical interfaces) per IA and multiple IAs per client/server
222      session, the IP logical interface name isn't a unique name for
223      the lease.
224
225  As a result, IP logical interfaces will come and go with DHCPv6,
226  just as happens with the existing stateless address
227  autoconfiguration support in in.ndpd.  The logical interface names
228  (visible in ifconfig) have no administrative significance.
229
230  Fortunately, DHCPv6 does end up with one fixed name that can be used
231  to identify a session.  Because DHCPv6 uses link-local addresses for
232  communication with the server, the name of the IP logical interface
233  that has this link-local address (normally the same as the IP
234  physical interface) can be used as an identifier for dhcpinfo and
235  logging purposes.
236
237
238Dhcpagent Redesign Overview
239
240  The redesign starts by refactoring the IP interface representation.
241  Because we need to have multiple IP logical interfaces (LIFs) for a
242  single identity association (IA), we should not store all of the
243  DHCP state information along with the LIF information.
244
245  For DHCPv6, we will need to keep LIFs on a single IP physical
246  interface (PIF) together, so this is probably also a good time to
247  reconsider the way dhcpagent represents physical interfaces.  The
248  current design simply replicates the state (notably the DLPI stream,
249  but also the hardware address and other bits) among all of the
250  ifslist entries on the same physical interface.
251
252  The new design creates two lists of dhcp_pif_t entries, one list for
253  IPv4 and the other for IPv6.  Each dhcp_pif_t represents a PIF, with
254  a list of dhcp_lif_t entries attached, each of which represents a
255  LIF used by dhcpagent.  This structure mirrors the kernel's ill_t
256  and ipif_t interface representations.
257
258  Next, the lease-tracking needs to be refactored.  DHCPv6 is the
259  functional superset in this case, as it has two lifetimes per
260  address (LIF) and IA groupings with shared T1/T2 timers.  To
261  represent these groupings, we will use a new dhcp_lease_t structure.
262  IPv4 DHCP will have one such structure per state machine, while
263  DHCPv6 will have a list.  (Note: the initial implementation will
264  have only one lease per DHCPv6 state machine, because each state
265  machine uses a single link-local address, a single DUID+IAID pair,
266  and supports only Non-temporary Addresses [IA_NA option].  Future
267  enhancements may use multiple leases per DHCPv6 state machine or
268  support other IA types.)
269
270  For all of these new structures, we will use the same insert/hold/
271  release/remove model as with the original ifslist.
272
273  Finally, the remaining items (and the bulk of the original ifslist
274  members) are kept on a per-state-machine basis.  As this is no
275  longer just an "interface," a new dhcp_smach_t structure will hold
276  these, and the ifslist structure is gone.
277
278
279Lease Representation
280
281  For DHCPv6, we need to track multiple LIFs per lease (IA), but we
282  also need multiple LIFs per PIF.  Rather than having two sets of
283  list linkage for each LIF, we can observe that a LIF is on exactly
284  one PIF and is a member of at most one lease, and then simplify: the
285  lease structure will use a base pointer for the first LIF in the
286  lease, and a count for the number of consecutive LIFs in the PIF's
287  list of LIFs that belong to the lease.
288
289  When removing a LIF from the system, we need to decrement the count
290  of LIFs in the lease, and advance the base pointer if the LIF being
291  removed is the first one.  Inserting a LIF means just moving it into
292  this list and bumping the counter.
293
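  The base-pointer-plus-count scheme can be sketched roughly as
  follows.  The lif_t, lease_t, and pif_t types and the lease_*()
  helpers are simplified, hypothetical versions; the real dhcp_lif_t
  and dhcp_lease_t carry much more state, and the choice to link a new
  LIF right after the lease's first LIF is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real structures differ. */
typedef struct lif {
	struct lif	*next;	/* next LIF on the same PIF */
	const char	*name;
} lif_t;

typedef struct lease {
	lif_t		*first;	/* first LIF belonging to this lease */
	unsigned int	nlifs;	/* count of consecutive LIFs from 'first' */
} lease_t;

typedef struct pif {
	lif_t		*lifs;	/* all LIFs on this physical interface */
} pif_t;

/* Return nonzero if 'lif' is one of the lease's consecutive LIFs. */
static int
lease_has_lif(const lease_t *lp, const lif_t *lif)
{
	const lif_t *cur = lp->first;
	unsigned int i;

	for (i = 0; i < lp->nlifs && cur != NULL; i++, cur = cur->next) {
		if (cur == lif)
			return (1);
	}
	return (0);
}

/* Unlink 'lif' from the PIF list, fixing up the lease base and count. */
static void
lease_remove_lif(pif_t *pif, lease_t *lp, lif_t *lif)
{
	lif_t **lpp;

	if (!lease_has_lif(lp, lif))
		return;
	if (lp->first == lif)
		lp->first = lif->next;	/* advance the base pointer */
	if (--lp->nlifs == 0)
		lp->first = NULL;
	for (lpp = &pif->lifs; *lpp != NULL; lpp = &(*lpp)->next) {
		if (*lpp == lif) {
			*lpp = lif->next;
			break;
		}
	}
	lif->next = NULL;
}

/* Link 'lif' into the PIF list inside the lease's run; bump the count. */
static void
lease_insert_lif(pif_t *pif, lease_t *lp, lif_t *lif)
{
	if (lp->nlifs == 0) {
		/* New run of one at the head; no other run is split. */
		lif->next = pif->lifs;
		pif->lifs = lif;
		lp->first = lif;
	} else {
		/* Right after the base keeps the run contiguous. */
		lif->next = lp->first->next;
		lp->first->next = lif;
	}
	lp->nlifs++;
}
```

  Because each lease's LIFs stay contiguous in the PIF's list, a single
  'next' pointer per LIF serves both the PIF and the lease.
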
294  When removing a lease from a state machine, we need to dispose of
295  the LIFs referenced.  If the LIF being disposed is the main LIF for
296  a state machine, then all that we can do is canonize the LIF
297  (returning it to a default state); this represents the normal IPv4
298  DHCP operation on lease expiry.  Otherwise, the lease is the owner
299  of that LIF (it was created because of a DHCPv6 IA), and disposal
300  means unplumbing the LIF from the actual system and removing the LIF
301  entry from the PIF.
302
303
304Main Structure Linkage
305
306  For IPv4 DHCP, the new linkage is straightforward.  Using the same
307  system configuration example as in the initial design discussion:
308
309          +- lease  +- lease       +- lease
310          |  ^      |  ^           |  ^
311          |  |      |  |           |  |
312          \  smach  \  smach       \  smach
313           \ ^|      \ ^|           \ ^|
314            v|v       v|v            v|v
315            lif ----> lif -> NULL     lif -> NULL
316            net0      net0:1          net1
317            ^                         ^
318            |                         |
319  v4root -> pif --------------------> pif -> NULL
320            net0                      net1
321
322  This diagram shows three separate state machines running (with
323  backpointers omitted for clarity).  Each state machine has a single
324  "main" LIF with which it's associated (and named).  Each also has a
325  single lease structure that points back to the same LIF (count of
326  1), because IPv4 DHCP controls a single address allocation per state
327  machine.
328
329  DHCPv6 is a bit more complex.  This shows DHCPv6 running on two
330  interfaces (more or fewer interfaces are of course possible) and
331  with multiple leases on the first interface, and each lease with
332  multiple addresses (one with two addresses, the second with one).
333
334            lease ----------------> lease -> NULL   lease -> NULL
335            ^   \(2)                |(1)            ^   \ (1)
336            |    \                  |               |    \
337            smach \                 |               smach \
338            ^ |    \                |               ^ |    \
339            | v     v               v               | v     v
340            lif --> lif --> lif --> lif --> NULL    lif --> lif -> NULL
341            net0    net0:1  net0:4  net0:2          net1    net1:5
342            ^                                       ^
343            |                                       |
344  v6root -> pif ----------------------------------> pif -> NULL
345            net0                                    net1
346
347  Note that there's intentionally no ordering based on name in the
348  list of LIFs.  Instead, the contiguous LIF structures in that list
349  represent the addresses in each lease.  The logical interfaces
350  themselves are allocated and numbered by the system kernel, so they
351  may not be sequential, and there may be gaps in the list if other
352  entities (such as in.ndpd) are also configuring interfaces.
353
354  Note also that with IPv4 DHCP, the lease points to the LIF that's
355  also the main LIF for the state machine, because that's the IP
356  interface that dhcpagent controls.  With DHCPv6, the lease (one per
357  IA structure) points to a separate set of LIFs that are created just
358  for the leased addresses (one per IA address in an IAADDR option).
359  The state machine alone points to the main LIF.
360
361
362Packet Structure Extensions
363
364  Obviously, we need some DHCPv6 packet data structures and
365  definitions.  A new <netinet/dhcp6.h> file will be introduced with
366  the necessary #defines and structures.  The key structure there will
367  be:
368
369	struct dhcpv6_message {
370		uint8_t		d6m_msg_type;
371		uint8_t		d6m_transid_ho;
372		uint16_t	d6m_transid_lo;
373	};
374	typedef	struct dhcpv6_message	dhcpv6_message_t;
375
376  This defines the usual (non-relay) DHCPv6 packet header, and is
377  roughly equivalent to PKT for IPv4.
378
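  The transaction ID is a 24-bit value split across the two transid
  members.  As a sketch of how it might be read and written, the
  set_transid() and get_transid() helpers below are hypothetical names
  used only for illustration; they assume the ID is kept in network
  byte order in the structure, as it appears on the wire.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

struct dhcpv6_message {
	uint8_t		d6m_msg_type;
	uint8_t		d6m_transid_ho;
	uint16_t	d6m_transid_lo;
};
typedef	struct dhcpv6_message	dhcpv6_message_t;

/* Hypothetical helper: store a 24-bit transaction ID in wire order. */
static void
set_transid(dhcpv6_message_t *msg, uint32_t id)
{
	msg->d6m_transid_ho = (uint8_t)(id >> 16);
	msg->d6m_transid_lo = htons((uint16_t)(id & 0xffff));
}

/* Hypothetical helper: read the 24-bit transaction ID back. */
static uint32_t
get_transid(const dhcpv6_message_t *msg)
{
	return (((uint32_t)msg->d6m_transid_ho << 16) |
	    ntohs(msg->d6m_transid_lo));
}
```
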
379  Extending dhcp_pkt_t for DHCPv6 is straightforward, as it's used
380  only within dhcpagent.  This structure will be amended to use a
381  union for v4/v6 and include a boolean to flag which version is in
382  use.
383
384  For the PKT_LIST structure, things are more complex.  This defines
385  both a queuing mechanism for received packets (typically OFFERs) and
386  a set of packet decoding structures.  The decoding structures are
387  highly specific to IPv4 DHCP -- they have no means to handle nested
388  or repeated options (as used heavily in DHCPv6) and make use of the
389  DHCP_OPT structure which is specific to IPv4 DHCP -- and are
390  somewhat expensive in storage, due to the use of arrays indexed by
391  option code number.
392
393  Worse, this structure is used throughout the system, so changes to
394  it need to be made carefully.  (For example, the existing 'pkt'
395  member can't just be turned into a union.)
396
397  For an initial prototype, since discarded, I created a new
398  dhcp_plist_t structure to represent packet lists as used inside
399  dhcpagent and made dhcp_pkt_t valid for use on input and output.
400  The result is unsatisfying, though, as it results in code that
401  manipulates far too many data structures in common cases; it's a sea
402  of pointers to pointers.
403
404  The better answer is to use PKT_LIST for both IPv4 and IPv6, adding
405  the few new bits of metadata required to the end (receiving ifIndex,
406  packet source/destination addresses), and staying within the overall
407  existing design.
408
409  For option parsing, dhcpv6_find_option() and dhcpv6_pkt_option()
410  functions will be added to libdhcputil.  The former function will
411  walk a DHCPv6 option list, and provide safe (bounds-checked) access
412  to the options inside.  The function can be called recursively, so
413  that option nesting can be handled fairly simply by nested loops,
414  and can be called repeatedly to return each instance of a given
415  option code number.  The latter function is just a convenience
416  wrapper on dhcpv6_find_option() that starts with a PKT_LIST pointer
417  and iterates over the top-level options with a given code number.
418
419  There are two special considerations for the use of these library
420  interfaces: there's no "pad" option for DHCPv6 or alignment
421  requirements on option headers or contents, and nested options
422  always follow a structure that has type-dependent length.  This
423  means that code that handles options must all be written to deal
424  with unaligned data, and suboption code must index the pointer past
425  the type-dependent part.
426
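  To illustrate the kind of bounds-checked, alignment-safe walk
  described above, here is a minimal sketch.  The find_option()
  function and its signature are hypothetical and are not the
  libdhcputil interface; the sketch assumes only the RFC 3315 option
  layout of a 16-bit code followed by a 16-bit length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define	OPT_HDR_LEN	4	/* 16-bit code + 16-bit length */

/*
 * Hypothetical walker.  Searches the option buffer [buf, buf + buflen)
 * for the next option with the given code, starting after 'prev' (or
 * at the beginning if prev is NULL).  Returns a pointer to the option
 * header and sets *lenp to the data length, or returns NULL if there
 * is no further match.
 */
static const uint8_t *
find_option(const uint8_t *buf, size_t buflen, const uint8_t *prev,
    uint16_t code, size_t *lenp)
{
	size_t off = 0;
	int skip = (prev != NULL);	/* still looking for 'prev'? */

	while (buflen - off >= OPT_HDR_LEN) {
		uint16_t ocode, olen;

		/* Headers may be unaligned, so copy the fields out. */
		(void) memcpy(&ocode, buf + off, sizeof (ocode));
		(void) memcpy(&olen, buf + off + 2, sizeof (olen));
		ocode = ntohs(ocode);
		olen = ntohs(olen);
		if (olen > buflen - off - OPT_HDR_LEN)
			break;		/* truncated option; stop */
		if (skip) {
			if (buf + off == prev)
				skip = 0;
		} else if (ocode == code) {
			*lenp = olen;
			return (buf + off);
		}
		off += OPT_HDR_LEN + olen;
	}
	return (NULL);
}
```

  For a nested option, the caller skips the parent's header and its
  type-dependent part before walking the suboptions.  For an IA_NA,
  whose fixed part (IAID, T1, T2) is 12 bytes, that would be
  find_option(ia + OPT_HDR_LEN + 12, len - 12, NULL, code, &sublen).
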
427
428Packet Construction
429
430  Unlike DHCPv4, DHCPv6 places the transaction timer value in an
431  option.  The existing code sets the current time value in
432  send_pkt_internal(), which allows it to be updated in a
433  straightforward way when doing retransmits.
434
435  To make this work in a simple manner for DHCPv6, I added a
436  remove_pkt_opt() function.  The update logic just does a remove and
437  re-adds the option.  We could also just assume the presence of the
438  option, find it, and modify in place, but the remove feature seems
439  more general.
440
441  DHCPv6 uses nesting options.  To make this work, two new utility
442  functions are needed.  First, an add_pkt_subopt() function will take
443  a pointer to an existing option and add an embedded option within
444  it.  The packet length and existing option length are updated.  If
445  that existing option isn't a top-level option, though, this means
446  that the caller must update the lengths of all of the enclosing
447  options up to the top level.  To do this, update_v6opt_len() will be
448  added.  This is used in the special case of adding a Status Code
449  option to an IAADDR option within an IA_NA top-level option.
450
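  The length bookkeeping for nested options can be sketched as below.
  The pkt_t type and the append_opt(), add_subopt(), and grow_opt_len()
  helpers are hypothetical simplifications, not the signatures of
  add_pkt_subopt() or update_v6opt_len().  The sketch assumes that the
  parent option is the last one in the packet, so that appending at
  the tail places the new option inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	OPT_HDR_LEN	4	/* 16-bit code + 16-bit length */

/* Hypothetical output packet: a flat buffer plus the length in use. */
typedef struct pkt {
	uint8_t	buf[256];
	size_t	len;
} pkt_t;

/* Byte-wise accessors, since option headers may be unaligned. */
static void
put16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static uint16_t
get16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/* Append an option at the end; returns its offset, or -1 if it won't fit. */
static int
append_opt(pkt_t *pkt, uint16_t code, const void *data, uint16_t dlen)
{
	size_t off = pkt->len;

	if (sizeof (pkt->buf) - off < (size_t)OPT_HDR_LEN + dlen)
		return (-1);
	put16(pkt->buf + off, code);
	put16(pkt->buf + off + 2, dlen);
	if (dlen != 0)
		(void) memcpy(pkt->buf + off + OPT_HDR_LEN, data, dlen);
	pkt->len += OPT_HDR_LEN + dlen;
	return ((int)off);
}

/* Grow the length field of the enclosing option at 'off'. */
static void
grow_opt_len(pkt_t *pkt, int off, uint16_t adjust)
{
	put16(pkt->buf + off + 2,
	    (uint16_t)(get16(pkt->buf + off + 2) + adjust));
}

/*
 * Add a suboption inside the option at 'parent'.  Only the parent's
 * length is fixed up; the caller adjusts any outer options.
 */
static int
add_subopt(pkt_t *pkt, int parent, uint16_t code, const void *data,
    uint16_t dlen)
{
	int off = append_opt(pkt, code, data, dlen);

	if (off >= 0)
		grow_opt_len(pkt, parent, (uint16_t)(OPT_HDR_LEN + dlen));
	return (off);
}
```

  For a Status Code inside an IAADDR inside an IA_NA, the caller would
  use add_subopt() with the IAADDR as parent, and then call
  grow_opt_len() on the IA_NA with the same size adjustment.
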
451
452Sockets and I/O Handling
453
454  DHCPv6 doesn't need or use either a DLPI or a broadcast IP socket.
455  Instead, a single unicast-bound IP socket on a link-local address
456  would be the most that is needed.  This is roughly equivalent to
457  if_sock_ip_fd in the existing design, but that existing socket is
458  bound only after DHCP reaches BOUND state -- that is, when it
459  switches away from DLPI.  We need something different.
460
461  This, along with the excess of open file descriptors in an otherwise
462  idle daemon and the potentially serious performance problems in
463  leaving DLPI open at all times, argues for a larger redesign of the
464  I/O logic in dhcpagent.
465
466  The first thing that we can do is eliminate the need for the
467  per-ifslist if_sock_fd.  This is used primarily for issuing ioctls
468  to configure interfaces -- a task that would work as well with any
469  open socket -- and is also registered to receive any ACK/NAK packets
470  that may arrive via broadcast.  Both of these can be eliminated by
471  creating a pair of global sockets (IPv4 and IPv6), bound and
472  configured for ACK/NAK reception.  The only functional difference is
473  that the list of running state machines must be scanned on reception
474  to find the correct transaction ID, but the existing design
475  effectively already goes to this effort because the kernel
476  replicates received datagrams among all matching sockets, and each
477  ifslist entry has a socket open.
478
479  (The existing code for if_sock_fd makes oblique reference to unknown
480  problems in the system that may prevent binding from working in some
481  cases.  The reference dates back some seven years to the original
482  DHCP implementation.  I've observed no such problems in extensive
483  testing and if any do show up, they will be dealt with by fixing the
484  underlying bugs.)
485
486  This leads to an important simplification: it's no longer necessary
487  to register, unregister, and re-register for packet reception while
488  changing state -- register_acknak() and unregister_acknak() are
489  gone.  Instead, we always receive, and we dispatch the packets as
490  they arrive.  As a result, when receiving a DHCPv4 ACK or DHCPv6
491  Reply when in BOUND state, we know it's a duplicate, and we can
492  discard.
493
494  The next part is in minimizing DLPI usage.  A DLPI stream is needed
495  at most for each IPv4 PIF, and it's not needed when all of the
496  DHCP instances on that PIF are bound.  In fact, the current
497  implementation deals with this in configure_bound() by setting a
498  "blackhole" packet filter.  The stream is left open.
499
500  To simplify this, we will open at most one DLPI stream on a PIF, and
501  use reference counts from the state machines to determine when the
502  stream must be open and when it can be closed.  This mechanism will
503  be centralized in a set_smach_state() function that changes the
504  state and opens/closes the DLPI stream when needed.
505
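  A rough sketch of the reference-counted open/close logic follows.
  The types, the set_state() name, and in particular the
  state_needs_dlpi() rule (DLPI needed only in SELECTING and
  REQUESTING) are assumptions made for illustration; the actual
  set_smach_state() decides this from the full DHCP state set.

```c
#include <assert.h>

typedef enum { INIT, SELECTING, REQUESTING, BOUND } state_t;

/* Hypothetical, simplified structures. */
typedef struct pif {
	unsigned int	dlpi_count;	/* state machines needing DLPI */
	int		dlpi_open;	/* stand-in for an open stream */
} pif_t;

typedef struct smach {
	pif_t	*pif;
	state_t	state;
	int	using_dlpi;
} smach_t;

/* Assumption for this sketch: DLPI is needed only while negotiating. */
static int
state_needs_dlpi(state_t s)
{
	return (s == SELECTING || s == REQUESTING);
}

/* Change state, taking or dropping this machine's DLPI reference. */
static void
set_state(smach_t *sm, state_t new_state)
{
	int need = state_needs_dlpi(new_state);

	if (need && !sm->using_dlpi) {
		if (sm->pif->dlpi_count++ == 0)
			sm->pif->dlpi_open = 1;	/* open the stream here */
		sm->using_dlpi = 1;
	} else if (!need && sm->using_dlpi) {
		if (--sm->pif->dlpi_count == 0)
			sm->pif->dlpi_open = 0;	/* close the stream here */
		sm->using_dlpi = 0;
	}
	sm->state = new_state;
}
```
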
506  This leads to another simplification.  The I/O logic in the existing
507  dhcpagent makes use of the protocol state to select between DLPI and
508  sockets.  Now that we keep track of this in a simpler manner, we no
509  longer need to switch out on state when sending a packet; just
510  test the dsm_using_dlpi flag instead.
511
512  Still another simplification is in the handling of DHCPv4 INFORM.
513  The current code has separate logic in it for getting the interface
514  state and address information.  This is no longer necessary, as the
515  LIF mechanism keeps track of the interface state.  And since we have
516  separate lease structures, and INFORM doesn't acquire a lease, we no
517  longer have to be careful about canonizing the interface on
518  shutdown.
519
520  Although the default is to send all client messages to a well-known
521  multicast address for servers and relays, DHCPv6 also has a
522  mechanism that allows the client to send unicast messages to the
523  server.  The operation of this mechanism is slightly complex.
524  First, the server sends the client a unicast address via an option.
525  We may use this address as the destination (rather than the
526  well-known multicast address for local DHCPv6 servers and relays)
527  only if we have a viable local source address.  This means using
528  SIOCGDSTINFO each time we try to send unicast.  Next, the server may
529  send back a special status code: UseMulticast.  If this is received,
530  and if we were actually using unicast in our messages to the server,
531  then we need to forget the unicast address, switch back to
532  multicast, and resend our last message.
533
534  Note that it's important to avoid the temptation to resend the last
535  message every time UseMulticast is seen, and do it only once on
536  switching back to multicast: otherwise, a potential feedback loop is
537  created.
538
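  The resend-only-once rule amounts to a check like the following.
  The smach_t fields and the handle_use_multicast() name are
  hypothetical, and the 'resends' counter merely stands in for actually
  retransmitting the last message.

```c
#include <assert.h>

/* Hypothetical, simplified state machine fields. */
typedef struct smach {
	int	have_unicast;	/* server gave us a unicast address */
	int	resends;	/* messages resent (stand-in for real I/O) */
} smach_t;

/* Handle a UseMulticast status; returns nonzero if we resent. */
static int
handle_use_multicast(smach_t *sm)
{
	if (!sm->have_unicast)
		return (0);	/* already multicasting: don't resend */
	sm->have_unicast = 0;	/* forget the unicast address */
	sm->resends++;		/* resend once, now via multicast */
	return (1);
}
```
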
539  Because IP_PKTINFO (PSARC 2006/466) has integrated, we could go a
540  step further by removing the need for any per-LIF sockets and just
541  use the global sockets for all but DLPI.  However, in order to
542  facilitate a Solaris 10 backport, this will be done separately as CR
543  6509317.
544
545  In the case of DHCPv6, we already have IPV6_PKTINFO, so we will pave
546  the way for IPv4 by beginning to use this now, and thus have just
547  a single socket (bound to "::") for all of DHCPv6.  Doing this
548  requires switching from the old BSD4.2 -lsocket -lnsl to the
549  standards-compliant -lxnet in order to use ancillary data.
550
551  It may also be possible to remove the need for DLPI for IPv4, and
552  incidentally simplify the code a fair amount, by adding a kernel
553  option to allow transmission and reception of UDP packets over
554  interfaces that are plumbed but not marked IFF_UP.  This is left for
555  future work.
556
557
558The State Machine
559
560  Several parts of the existing state machine need additions to handle
561  DHCPv6, which is a superset of DHCPv4.
562
563  First, there are the RENEWING and REBINDING states.  For IPv4 DHCP,
564  these states map one-to-one with a single address and single lease
565  that's undergoing renewal.  It's a simple progression (on timeout)
566  from BOUND, to RENEWING, to REBINDING and finally back to SELECTING
567  to start over.  Each retransmit is done by simply rescheduling the
568  T1 or T2 timer.
569
570  For DHCPv6, things are somewhat more complex.  At any one time,
571  there may be multiple IAs (leases) that are effectively in renewing
572  or rebinding state, based on the T1/T2 timers for each IA, and many
573  addresses that have expired.
574
575  However, because all of the leases are related to a single server,
576  and that server either responds to our requests or doesn't, we can
577  simplify the states to be nearly identical to IPv4 DHCP.
578
579  The revised definition for use with DHCPv6 is:
580
581    - Transition from BOUND to RENEWING state when the first T1 timer
582      (of any lease on the state machine) expires.  At this point, as
583      an optimization, we should begin attempting to renew any IAs
584      that are within REN_TIMEOUT (10 seconds) of reaching T1 as well.
585      We may as well avoid sending an excess of packets.
586
587    - When a T1 lease timer expires and we're in RENEWING or REBINDING
588      state, just ignore it, because the transaction is already in
589      progress.
590
591    - At each retransmit timeout, we should check to see if there are
592      more IAs that need to join in because they've passed point T1 as
593      well, and, if so, add them.  This check isn't necessary at this
594      time, because only a single IA_NA is possible with the initial
595      design.
596
597    - When we reach T2 on any IA and we're in BOUND or RENEWING state,
598      enter REBINDING state.  At this point, we have a choice.  For
599      those other IAs that are past T1 but not yet at T2, we could
600      ignore them (sending only those that have passed point T2),
601      continue to send separate Renew messages for them, or just
602      include them in the Rebind message.  This isn't an issue that
603      must be dealt with for this project, but the plan is to include
604      them in the Rebind message.
605
606    - When a T2 lease timer expires and we're in REBINDING state, just
607      ignore it, as with the corresponding T1 timer.
608
609    - As addresses reach the end of their preferred lifetimes, set the
610      IFF_DEPRECATED flag.  As they reach the end of the valid
611      lifetime, remove them from the system.  When an IA (lease)
612      becomes empty, just remove it.  When there are no more leases
613      left, return to SELECTING state to start over.
614
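  The transitions listed above can be summarized in a small sketch.
  The event names and the next_state() and ia_joins_renew() functions
  are hypothetical; only the state names and the 10-second REN_TIMEOUT
  come from the text.

```c
#include <assert.h>

#define	REN_TIMEOUT	10	/* seconds */

typedef enum { SELECTING, BOUND, RENEWING, REBINDING } state_t;
typedef enum { EV_T1, EV_T2, EV_NO_LEASES } event_t;

/* Next state for the whole state machine, given a lease timer event. */
static state_t
next_state(state_t cur, event_t ev)
{
	switch (ev) {
	case EV_T1:
		/* The first T1 starts renewal; later ones are ignored. */
		return (cur == BOUND ? RENEWING : cur);
	case EV_T2:
		/* T2 on any IA moves us to rebinding, unless already there. */
		return (cur == BOUND || cur == RENEWING ? REBINDING : cur);
	case EV_NO_LEASES:
		/* The last lease is gone; start over. */
		return (SELECTING);
	}
	return (cur);
}

/* Should an IA whose T1 expires at 't1' be included in a Renew sent now? */
static int
ia_joins_renew(long now, long t1)
{
	return (t1 - now <= REN_TIMEOUT);
}
```
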
615  Note that the RFC treats the IAs as separate entities when
616  discussing the renew/rebind T1/T2 timers, but treats them as a unit
617  when doing the initial negotiation.  This is, to say the least,
618  confusing, especially so given that there's no reason to expect that
619  after having failed to elicit any responses at all from the server
620  on one IA, the server will suddenly start responding when we attempt
621  to renew some other IA.  We rationalize this behavior by using a
622  single renew/rebind state for the entire state machine (and thus
623  client/server pair).
624
625  There's a subtle timing difference here between DHCPv4 and DHCPv6.
626  For DHCPv4, the client just sends packets more and more frequently
627  (shorter timeouts) as the next state gets nearer.  DHCPv6 treats
628  each as a transaction, using the same retransmit logic as for other
629  messages.  The DHCPv6 method is a cleaner design, so we will change
630  the DHCPv4 implementation to do the same, and compute the new timer
631  values as part of stop_extending().
632
633  Note that it would be possible to start the SELECTING state earlier
634  than waiting for the last lease to expire, and thus avoid a loss of
635  connectivity.  However, at this point, there are other servers on
636  the network that have seen us attempting to Rebind for quite some
637  time, and they have not responded.  The likelihood that there's a
638  server that will ignore Rebind but then suddenly spring into action
639  on a Solicit message seems low enough that the optimization won't be
640  done now.  (Starting SELECTING state earlier may be done in the
641  future, if it's found to be useful.)
642
643
644Persistent State
645
646  IPv4 DHCP has only minimal need for persistent state, beyond the
647  configuration parameters.  The state is stored when "ifconfig dhcp
648  drop" is run or the daemon receives SIGTERM, which is typically done
649  only well after the system is booted and running.
650
651  The daemon stores this state in /etc/dhcp, because it needs to be
652  available when only the root file system has been mounted.
653
654  Moreover, dhcpagent starts very early in the boot process.  It runs
655  as part of svc:/network/physical:default, which runs well before
656  root is mounted read/write:
657
658     svc:/system/filesystem/root:default ->
659        svc:/system/metainit:default ->
660           svc:/system/identity:node ->
661              svc:/network/physical:default
662           svc:/network/iscsi_initiator:default ->
663              svc:/network/physical:default
664
665  and, of course, well before either /var or /usr is mounted.  This
666  means that any persistent state must be kept in the root file
667  system, and that if we write before shutdown, we have to cope
668  gracefully with the root file system returning EROFS on write
669  attempts.
670
671  For DHCPv6, we need to try to keep our DUID and IAID values
672  stable across reboots to fulfill the demands of RFC 3315.
673
674  The DUID is either configured or automatically generated.  When
675  configured, it comes from the /etc/default/dhcpagent file, and thus
676  does not need to be saved by the daemon.  If automatically
677  generated, there's exactly one of these created, and it will
678  eventually be needed before /usr is mounted, if /usr is mounted over
679  IPv6.  This means a new file in the root file system,
680  /etc/dhcp/duid, will be used to hold the automatically generated
681  DUID.
682
683  The determination of whether to use a configured DUID or one saved
684  in a file is made in get_smach_cid().  This function will
685  encapsulate all of the DUID parsing and generation machinery for the
686  rest of dhcpagent.
687
688  If root is not writable at the point when dhcpagent starts, and our
689  attempt fails with EROFS, we will set a timer for 60-second
690  intervals to retry the operation periodically.  In the unlikely case
691  that it just never succeeds or that we're rebooted before root
692  becomes writable, then the impact will be that the daemon will wake
693  up once a minute and, ultimately, we'll choose a different DUID on
694  next start-up, and we'll thus lose our leases across a reboot.
695
696  The IAID similarly must be kept stable if at all possible, but
697  cannot be configured by the user.  To make these values stable,
698  we will use two strategies.  First, the IAID value for a given
699  interface (if not known) will just default to the IP ifIndex value,
700  provided that there's no known saved IAID using that value.  Second,
701  we will save off the IAID we choose in a single /etc/dhcp/iaid file,
702  containing an array of entries indexed by logical interface name.
703  Keeping it in a single file allows us to scan for used and unused
704  IAID values when necessary.
705
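  The two strategies might combine as in the sketch below.  The
  iaid_entry_t layout and the choose_iaid() function are hypothetical
  (the on-disk format is not specified here), and the fallback of
  taking the lowest unused value when the ifIndex is already claimed
  is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of one saved entry. */
typedef struct iaid_entry {
	char		name[32];	/* logical interface name */
	uint32_t	iaid;
} iaid_entry_t;

/* Return nonzero if some saved entry already uses 'iaid'. */
static int
iaid_in_use(const iaid_entry_t *tab, size_t n, uint32_t iaid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].iaid == iaid)
			return (1);
	}
	return (0);
}

/* Pick the IAID for interface 'name', given the saved table. */
static uint32_t
choose_iaid(const iaid_entry_t *tab, size_t n, const char *name,
    uint32_t ifindex)
{
	size_t i;
	uint32_t iaid;

	for (i = 0; i < n; i++) {
		if (strcmp(tab[i].name, name) == 0)
			return (tab[i].iaid);	/* known: keep it stable */
	}
	if (!iaid_in_use(tab, n, ifindex))
		return (ifindex);		/* default to the ifIndex */
	/* Assumed fallback: the lowest value not already saved. */
	for (iaid = 1; iaid_in_use(tab, n, iaid); iaid++)
		;
	return (iaid);
}
```

  The caller would then save the chosen value back to the file so that
  the same IAID is found by name on the next start-up.
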
706  This mechanism depends on the interface name, and thus will need to
707  be revisited when Clearview vanity naming and NWAM are available.
708
709  Currently, the boot system (GRUB, OBP, the miniroot) does not
710  support installing over IPv6.  This could change in the future, so
711  one of the goals of the above stability plan is to support that
712  event.
713
714  When running in the miniroot on an x86 system, /etc/dhcp (and the
715  rest of the root) is mounted on a read-only ramdisk.  In this case,
716  writing to /etc/dhcp will just never work.  A possible solution
717  would be to add a new privileged command in ifconfig that forces
718  dhcpagent to write to an alternate location.  The initial install
719  process could then do "ifconfig <x> dhcp write /a" to get the needed
720  state written out to the newly-constructed system root.
721
722  This part (the new write option) won't be implemented as part of
723  this project, because it's not needed yet.
724
725
726Router Advertisements
727
728  IPv6 Router Advertisements perform two functions related to DHCPv6:
729
730    - they specify whether and how to run DHCPv6 on a given interface.
731    - they provide a list of the valid prefixes on an interface.
732
733  For the first function, in.ndpd needs to use the same DHCP control
734  interfaces that ifconfig uses, so that it can launch dhcpagent and
735  trigger DHCPv6 when necessary.  Note that it never needs to shut
736  down DHCPv6, as router advertisements can't do that.
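
  A minimal sketch of the "whether and how" decision, assuming that
  the M bit asks for stateful operation and the O bit alone asks for
  configuration parameters only.  The dhcp6_mode_t names and the
  ra_to_dhcp6_mode() helper are hypothetical; the flag values are the
  ones defined for the Router Advertisement flags octet.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits in the Router Advertisement flags octet. */
#define	RA_FLAG_MANAGED	0x80	/* M: addresses available via DHCPv6 */
#define	RA_FLAG_OTHER	0x40	/* O: other configuration via DHCPv6 */

/* Hypothetical names for what in.ndpd would ask dhcpagent to do. */
typedef enum {
	DHCP6_NONE,		/* DHCPv6 not running */
	DHCP6_STATELESS,	/* fetch configuration parameters only */
	DHCP6_STATEFUL		/* acquire addresses as well */
} dhcp6_mode_t;

/*
 * Combine the flags from a newly received advertisement with the
 * mode already in effect.  An advertisement can start DHCPv6 or move
 * it to a stronger mode, but it never shuts DHCPv6 down.
 */
dhcp6_mode_t
ra_to_dhcp6_mode(uint8_t ra_flags, dhcp6_mode_t current)
{
	dhcp6_mode_t wanted = DHCP6_NONE;

	assert(current == DHCP6_NONE || current == DHCP6_STATELESS ||
	    current == DHCP6_STATEFUL);
	if (ra_flags & RA_FLAG_MANAGED)
		wanted = DHCP6_STATEFUL;
	else if (ra_flags & RA_FLAG_OTHER)
		wanted = DHCP6_STATELESS;

	return (wanted > current ? wanted : current);
}
```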
737
738  However, launching dhcpagent presents new problems.  As a part of
739  the "Quagga SMF Modifications" project (PSARC 2006/552), in.ndpd in
740  Nevada is now privilege-aware and runs with limited privileges,
741  courtesy of SMF.  Dhcpagent, on the other hand, must run with all
742  privileges.
743
744  A simple work-around for this issue is to rip out the "privileges="
745  clause from the method_credential for in.ndpd.  I've taken this
746  direction initially, but the right longer-term answer seems to be
747  converting dhcpagent into an SMF service.  This is quite a bit more
748  complex, as it means turning the /sbin/dhcpagent command line
749  interface into a utility that manipulates the service and passes the
750  command line options via IPC extensions.
751
752  Such a design also raises the question of whether dhcpagent itself
753  ought to run with reduced privileges.  It could, but it still needs
754  the ability to grant "all" (traditional UNIX root) privileges to the
755  eventhook script, if present.  There seem to be few ways to do this,
756  though it's a good area for research.
757
758  The second function, prefix handling, is also subtle.  Unlike IPv4
759  DHCP, DHCPv6 does not give the netmask or prefix length along with
760  the leased address.  The client is on its own to determine the right
761  netmask to use.  This is where the advertised prefixes come in:
762  these must be used to finish the interface configuration.
763
764  We will have the DHCPv6 client configure each interface with an
765  all-ones (/128) netmask by default.  In.ndpd will be modified so
766  that when it detects a new IFF_DHCPRUNNING IP logical interface, it
767  checks for a known matching prefix, and sets the netmask as
768  necessary.  If no matching prefix is known, it will send a new
769  Router Solicitation message to try to find one.
770
771  When in.ndpd learns of a new prefix from a Router Advertisement, it
772  will scan all of the IFF_DHCPRUNNING IP logical interfaces on the
773  same physical interface and set the netmasks when necessary.
774  Dhcpagent, for its part, will ignore the netmask on IPv6 interfaces
775  when checking for changes that would require it to "abandon" the
776  interface.
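
  The prefix check described above might look something like the
  sketch below.  The ra_prefix_t record and netmask_for_lease() are
  hypothetical, and choosing the longest matching prefix when several
  match is an assumption.  A return value of 128 means that no prefix
  is known yet, which is the case where a new Router Solicitation
  would be sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical record of one prefix learned from an advertisement. */
typedef struct {
	struct in6_addr	pr_prefix;
	unsigned int	pr_len;		/* prefix length in bits, 0..128 */
} ra_prefix_t;

/* Return nonzero if the first `len` bits of `a` and `b` are equal. */
static int
prefix_equal(const struct in6_addr *a, const struct in6_addr *b,
    unsigned int len)
{
	unsigned int nbytes = len / 8, nbits = len % 8;
	uint8_t mask;

	if (memcmp(a->s6_addr, b->s6_addr, nbytes) != 0)
		return (0);
	if (nbits == 0)
		return (1);
	mask = (uint8_t)(0xff << (8 - nbits));
	return (((a->s6_addr[nbytes] ^ b->s6_addr[nbytes]) & mask) == 0);
}

/*
 * Return the prefix length to apply to a leased address: the longest
 * known prefix that covers it, or 128 (all-ones) if none is known.
 */
unsigned int
netmask_for_lease(const struct in6_addr *addr, const ra_prefix_t *tab,
    size_t n)
{
	unsigned int best = 0;
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		assert(tab[i].pr_len <= 128);
		if (!prefix_equal(addr, &tab[i].pr_prefix, tab[i].pr_len))
			continue;
		if (!found || tab[i].pr_len > best) {
			best = tab[i].pr_len;
			found = 1;
		}
	}
	return (found ? best : 128);
}
```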
777
778  Given the way that DHCPv6 and in.ndpd control both the horizontal
779  and the vertical in plumbing and removing logical interfaces, and
780  users do not, it might be worthwhile to consider roping off any
781  direct user changes to IPv6 logical interfaces under control of
782  in.ndpd or dhcpagent, and instead force users through a higher-level
783  interface.  This won't be done as part of this project, however.
784
785
786ARP Hardware Types
787
788  There are multiple places within the DHCPv6 client where the mapping
789  of DLPI MAC type to ARP Hardware Type is required:
790
791  - When we are constructing an automatic, stable DUID for our own
792    identity, we prefer to use a DUID-LLT if possible.  This is done
793    by finding a link-layer interface, opening it, reading the MAC
794    address and type, and translating in the make_stable_duid()
795    function in libdhcpagent.
796
797  - When we translate a user-configured DUID from
798    /etc/default/dhcpagent into a binary representation, we may have
799    to deal with a physical interface name.  In this case, we must
800    open that interface and read the MAC address and type.
801
802  - As part of the PIF data structure initialization, we need to read
803    out the MAC type so that it can be used in the BOOTP/DHCPv4
804    'htype' field.
805
806  Ideally, these would all be provided by a single libdlpi
807  implementation.  However, that project is on-going at this time and
808  has not yet integrated.  For the time being, a dlpi_to_arp()
809  translation function (taking dl_mac_type and returning an ARP
810  Hardware Type number) will be placed in libdhcputil.
811
812  This temporary function should be removed and this section of the
813  code updated when the new libdlpi from Clearview integrates.
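
  As a rough illustration of the shape of such a translation
  function, consider the sketch below.  The EX_DL_* names stand in
  for the DL_* MAC types from <sys/dlpi.h> and their enum values are
  arbitrary; the choice of which MAC types collapse to IEEE 802 is
  also an assumption.  The ARP Hardware Type numbers are the IANA
  assignments.

```c
#include <assert.h>

/* Stand-ins for DLPI MAC types; names and values are illustrative. */
typedef enum {
	EX_DL_CSMACD,
	EX_DL_TPB,
	EX_DL_TPR,
	EX_DL_ETHER,
	EX_DL_FDDI,
	EX_DL_FC,
	EX_DL_IB,
	EX_DL_OTHER
} ex_mac_type_t;

/* ARP Hardware Type numbers as assigned by IANA. */
#define	EX_ARPHRD_ETHER		1	/* Ethernet */
#define	EX_ARPHRD_IEEE802	6	/* IEEE 802 networks */
#define	EX_ARPHRD_FC		18	/* Fibre Channel */
#define	EX_ARPHRD_IB		32	/* InfiniBand */

/*
 * Translate a MAC type into an ARP Hardware Type number.  Returns 0
 * (a reserved value) when no translation is known.
 */
int
ex_dlpi_to_arp(ex_mac_type_t mac)
{
	assert((unsigned int)mac <= EX_DL_OTHER);
	switch (mac) {
	case EX_DL_ETHER:
		return (EX_ARPHRD_ETHER);
	case EX_DL_CSMACD:
	case EX_DL_TPB:
	case EX_DL_TPR:
	case EX_DL_FDDI:
		return (EX_ARPHRD_IEEE802);
	case EX_DL_FC:
		return (EX_ARPHRD_FC);
	case EX_DL_IB:
		return (EX_ARPHRD_IB);
	default:
		return (0);
	}
}
```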
814
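
  For reference, the DUID-LLT mentioned in this section carries the
  ARP Hardware Type directly in its encoding.  The sketch below shows
  the RFC 3315 layout (type 1, hardware type, seconds since
  2000-01-01 UTC modulo 2^32, link-layer address).  The function
  build_duid_llt() is hypothetical and is not the make_stable_duid()
  implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define	DUID_TYPE_LLT	1		/* link-layer address plus time */
#define	DUID_EPOCH	946684800ULL	/* 2000-01-01 00:00:00 UTC */

/*
 * Encode a DUID-LLT into buf in network byte order.  Returns the
 * encoded length, or 0 if buf is too small.
 */
size_t
build_duid_llt(uint8_t *buf, size_t buflen, uint16_t hwtype, time_t now,
    const uint8_t *lladdr, size_t lllen)
{
	size_t need = 2 + 2 + 4 + lllen;
	/* Unsigned arithmetic gives the required modulo 2^32 behavior. */
	uint32_t secs = (uint32_t)((uint64_t)now - DUID_EPOCH);

	assert(buf != NULL && lladdr != NULL);
	if (buflen < need)
		return (0);

	buf[0] = (uint8_t)(DUID_TYPE_LLT >> 8);
	buf[1] = (uint8_t)(DUID_TYPE_LLT & 0xff);
	buf[2] = (uint8_t)(hwtype >> 8);
	buf[3] = (uint8_t)(hwtype & 0xff);
	buf[4] = (uint8_t)(secs >> 24);
	buf[5] = (uint8_t)(secs >> 16);
	buf[6] = (uint8_t)(secs >> 8);
	buf[7] = (uint8_t)secs;
	memcpy(buf + 8, lladdr, lllen);
	return (need);
}
```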
815
816Field Mappings
817
818  Old (all in ifslist)	New
819  next			dhcp_smach_t.dsm_next
820  prev			dhcp_smach_t.dsm_prev
821  if_hold_count		dhcp_smach_t.dsm_hold_count
822  if_ia			dhcp_smach_t.dsm_ia
823  if_async		dhcp_smach_t.dsm_async
824  if_state		dhcp_smach_t.dsm_state
825  if_dflags		dhcp_smach_t.dsm_dflags
826  if_name		dhcp_smach_t.dsm_name (see text)
827  if_index		dhcp_pif_t.pif_index
828  if_max		dhcp_lif_t.lif_max and dhcp_pif_t.pif_max
829  if_min		(was unused; removed)
830  if_opt		(was unused; removed)
831  if_hwaddr		dhcp_pif_t.pif_hwaddr
832  if_hwlen		dhcp_pif_t.pif_hwlen
833  if_hwtype		dhcp_pif_t.pif_hwtype
834  if_cid		dhcp_smach_t.dsm_cid
835  if_cidlen		dhcp_smach_t.dsm_cidlen
836  if_prl		dhcp_smach_t.dsm_prl
837  if_prllen		dhcp_smach_t.dsm_prllen
838  if_daddr		dhcp_pif_t.pif_daddr
839  if_dlen		dhcp_pif_t.pif_dlen
840  if_saplen		dhcp_pif_t.pif_saplen
841  if_sap_before		dhcp_pif_t.pif_sap_before
842  if_dlpi_fd		dhcp_pif_t.pif_dlpi_fd
843  if_sock_fd		v4_sock_fd and v6_sock_fd (globals)
844  if_sock_ip_fd		dhcp_lif_t.lif_sock_ip_fd
845  if_timer		(see text)
846  if_t1			dhcp_lease_t.dl_t1
847  if_t2			dhcp_lease_t.dl_t2
848  if_lease		dhcp_lif_t.lif_expire
849  if_nrouters		dhcp_smach_t.dsm_nrouters
850  if_routers		dhcp_smach_t.dsm_routers
851  if_server		dhcp_smach_t.dsm_server
852  if_addr		dhcp_lif_t.lif_v6addr
853  if_netmask		dhcp_lif_t.lif_v6mask
854  if_broadcast		dhcp_lif_t.lif_v6peer
855  if_ack		dhcp_smach_t.dsm_ack
856  if_orig_ack		dhcp_smach_t.dsm_orig_ack
857  if_offer_wait		dhcp_smach_t.dsm_offer_wait
858  if_offer_timer	dhcp_smach_t.dsm_offer_timer
859  if_offer_id		dhcp_pif_t.pif_dlpi_id
860  if_acknak_id		dhcp_lif_t.lif_acknak_id
861  if_acknak_bcast_id	v4_acknak_bcast_id (global)
862  if_neg_monosec	dhcp_smach_t.dsm_neg_monosec
863  if_newstart_monosec	dhcp_smach_t.dsm_newstart_monosec
864  if_curstart_monosec	dhcp_smach_t.dsm_curstart_monosec
865  if_disc_secs		dhcp_smach_t.dsm_disc_secs
866  if_reqhost		dhcp_smach_t.dsm_reqhost
867  if_recv_pkt_list	dhcp_smach_t.dsm_recv_pkt_list
868  if_sent		dhcp_smach_t.dsm_sent
869  if_received		dhcp_smach_t.dsm_received
870  if_bad_offers		dhcp_smach_t.dsm_bad_offers
871  if_send_pkt		dhcp_smach_t.dsm_send_pkt
872  if_send_timeout	dhcp_smach_t.dsm_send_timeout
873  if_send_dest		dhcp_smach_t.dsm_send_dest
874  if_send_stop_func	dhcp_smach_t.dsm_send_stop_func
875  if_packet_sent	dhcp_smach_t.dsm_packet_sent
876  if_retrans_timer	dhcp_smach_t.dsm_retrans_timer
877  if_script_fd		dhcp_smach_t.dsm_script_fd
878  if_script_pid		dhcp_smach_t.dsm_script_pid
879  if_script_helper_pid	dhcp_smach_t.dsm_script_helper_pid
880  if_script_event	dhcp_smach_t.dsm_script_event
881  if_script_event_id	dhcp_smach_t.dsm_script_event_id
882  if_callback_msg	dhcp_smach_t.dsm_callback_msg
883  if_script_callback	dhcp_smach_t.dsm_script_callback
884
885  Notes:
886
887    - The dsm_name field currently just points to the lif_name on the
888      controlling LIF.  This may need to be named differently in the
889      future; perhaps when Zones are supported.
890
891    - The timer mechanism will be refactored.  Rather than using the
892      separate if_timer[] array to hold the timer IDs and
893      if_{t1,t2,lease} to hold the relative timer values, we will
894      gather this information into a dhcp_timer_t structure:
895
896	dt_id		timer ID value
897	dt_start	relative start time
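
  In C, the structure might be declared roughly as below.  Only the
  two field names come from the text above; the field types, the
  "no timer" marker, and the helper functions are assumptions made
  for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	EX_TIMER_NONE	(-1)	/* assumed "no timer scheduled" marker */

typedef struct {
	int		dt_id;		/* timer ID value */
	uint32_t	dt_start;	/* relative start time, in seconds */
} dhcp_timer_t;

/* Record a relative start time without scheduling anything yet. */
void
init_timer(dhcp_timer_t *dt, uint32_t start)
{
	assert(dt != NULL);
	dt->dt_id = EX_TIMER_NONE;
	dt->dt_start = start;
}

/* Return nonzero if a timer ID is currently held. */
int
timer_is_set(const dhcp_timer_t *dt)
{
	return (dt->dt_id != EX_TIMER_NONE);
}

/* Absolute firing time for a lease that began at lease_begin. */
uint64_t
timer_deadline(const dhcp_timer_t *dt, uint64_t lease_begin)
{
	return (lease_begin + dt->dt_start);
}
```

  Pairing the ID with the relative value keeps the two from drifting
  apart, which is the point of replacing the separate arrays.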
898
899  New fields not accounted for above:
900
901  dhcp_pif_t.pif_next		linkage in global list of PIFs
902  dhcp_pif_t.pif_prev		linkage in global list of PIFs
903  dhcp_pif_t.pif_lifs		pointer to list of LIFs on this PIF
904  dhcp_pif_t.pif_isv6		IPv6 flag
905  dhcp_pif_t.pif_dlpi_count	number of state machines using DLPI
906  dhcp_pif_t.pif_hold_count	reference count
907  dhcp_pif_t.pif_name		name of physical interface
908  dhcp_lif_t.lif_next		linkage in per-PIF list of LIFs
909  dhcp_lif_t.lif_prev		linkage in per-PIF list of LIFs
910  dhcp_lif_t.lif_pif		backpointer to parent PIF
911  dhcp_lif_t.lif_smachs		pointer to list of state machines
912  dhcp_lif_t.lif_lease		backpointer to lease holding LIF
913  dhcp_lif_t.lif_flags		interface flags (IFF_*)
914  dhcp_lif_t.lif_hold_count	reference count
915  dhcp_lif_t.lif_dad_wait	waiting for DAD resolution flag
916  dhcp_lif_t.lif_removed	removed from list flag
917  dhcp_lif_t.lif_plumbed	plumbed by dhcpagent flag
918  dhcp_lif_t.lif_expired	lease has expired flag
919  dhcp_lif_t.lif_declined	reason to refuse this address (string)
920  dhcp_lif_t.lif_iaid		unique and stable 32-bit identifier
921  dhcp_lif_t.lif_iaid_id	timer for delayed /etc writes
922  dhcp_lif_t.lif_preferred	preferred timer for v6; deprecate after
923  dhcp_lif_t.lif_name		name of logical interface
924  dhcp_smach_t.dsm_lif		controlling (main) LIF
925  dhcp_smach_t.dsm_leases	pointer to list of leases
926  dhcp_smach_t.dsm_lif_wait	number of LIFs waiting on DAD
927  dhcp_smach_t.dsm_lif_down	number of LIFs that have failed
928  dhcp_smach_t.dsm_using_dlpi	currently using DLPI flag
929  dhcp_smach_t.dsm_send_tcenter	v4 central timer value; v6 MRT
930  dhcp_lease_t.dl_next		linkage in per-state-machine list of leases
931  dhcp_lease_t.dl_prev		linkage in per-state-machine list of leases
932  dhcp_lease_t.dl_smach		back pointer to state machine
933  dhcp_lease_t.dl_lifs		pointer to first LIF configured by lease
934  dhcp_lease_t.dl_nlifs		number of configured consecutive LIFs
935  dhcp_lease_t.dl_hold_count	reference counter
936  dhcp_lease_t.dl_removed	removed from list flag
937  dhcp_lease_t.dl_stale		lease was not updated by Renew/Rebind
938
939
940Snoop
941
942  The snoop changes are fairly straightforward.  As snoop just decodes
943  the messages, and the message format is quite different between
944  DHCPv4 and DHCPv6, a new module will be created to handle DHCPv6
945  decoding, and will export an interpret_dhcpv6() function.
946
947  The one bit of commonality between the two protocols is the use of
948  ARP Hardware Type numbers, which are found in the underlying BOOTP
949  message format for DHCPv4 and in the DUID-LL and DUID-LLT
950  construction for DHCPv6.  To simplify this, the existing static
951  show_htype() function in snoop_dhcp.c will be renamed to arp_htype()
952  (to better reflect its functionality), updated with more modern
953  hardware types, moved to snoop_arp.c (where it belongs), and made a
954  public symbol within snoop.
955
956  While I'm there, I'll update snoop_arp.c so that when it prints an
957  ARP message in verbose mode, it uses arp_htype() to translate the
958  ar_hrd value.
959
960  The snoop updates also involve the addition of a new "dhcp6" keyword
961  for filtering.  As a part of this, CR 6487534 will be fixed.
962
963
964IPv6 Source Address Selection
965
966  One of the customer requests for DHCPv6 is to be able to predict the
967  address selection behavior in the presence of both stateful and
968  stateless addresses on the same network.
969
970  Solaris implements RFC 3484 address selection behavior.  In this
971  scheme, the first seven rules implement some basic preferences for
972  addresses, with Rule 8 being a deterministic tie breaker.
973
974  Rule 8 relies on a special function, CommonPrefixLen, defined in the
975  RFC, that compares leading bits of the address without regard to
976  configured prefix length.  As Rule 1 eliminates equal addresses,
977  this always picks a single address.
978
979  This rule, though, allows for additional checks:
980
981   Rule 8 may be superseded if the implementation has other means of
982   choosing among source addresses.  For example, if the implementation
983   somehow knows which source address will result in the "best"
984   communications performance.
985
986  We will thus split Rule 8 into three separate rules:
987
988  - First, compare on configured prefix.  The interface with the
989    longest configured prefix length that also matches the candidate
990    address will be preferred.
991
992  - Next, check the type of address.  Prefer statically configured
993    addresses above all others.  Next, those from DHCPv6.  Next,
994    stateless autoconfigured addresses.  Finally, temporary addresses.
995    (Note that Rule 7 will take care of temporary address preferences,
996    so that this rule doesn't actually need to look at them.)
997
998  - Finally, run the check-all-bits (CommonPrefixLen) tie breaker.
999
1000  The result of this is that if there's a local address in the same
1001  configured prefix, then we'll prefer that over other addresses.  If
1002  there are multiple to choose from, then we'll pick static first, then
1003  DHCPv6, then stateless.  Finally, if there are still multiples, we'll
1004  use the "closest" address, bitwise.
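
  The three-way split can be sketched as a pairwise comparison, as
  below.  All type and function names here are hypothetical, and the
  sketch reads "matches the candidate address" as "the destination
  falls within the source's configured prefix", which is an
  interpretation rather than something stated above.

```c
#include <assert.h>
#include <stdint.h>
#include <netinet/in.h>

/* Address origins, most preferred first. */
typedef enum {
	SRC_STATIC, SRC_DHCPV6, SRC_STATELESS, SRC_TEMPORARY
} src_type_t;

typedef struct {
	struct in6_addr	sc_addr;
	unsigned int	sc_plen;	/* configured prefix length */
	src_type_t	sc_type;
} src_cand_t;

/* CommonPrefixLen: the number of leading bits that a and b share. */
static unsigned int
common_prefix_len(const struct in6_addr *a, const struct in6_addr *b)
{
	for (unsigned int i = 0; i < 16; i++) {
		uint8_t diff = a->s6_addr[i] ^ b->s6_addr[i];
		unsigned int bits = i * 8;

		if (diff == 0)
			continue;
		while ((diff & 0x80) == 0) {
			diff = (uint8_t)(diff << 1);
			bits++;
		}
		return (bits);
	}
	return (128);
}

/* Configured prefix length if dst lies inside c's prefix, else 0. */
static unsigned int
matching_plen(const src_cand_t *c, const struct in6_addr *dst)
{
	assert(c->sc_plen <= 128);
	return (common_prefix_len(&c->sc_addr, dst) >= c->sc_plen ?
	    c->sc_plen : 0);
}

/* Return the preferred of two candidates under the split Rule 8. */
const src_cand_t *
rule8_prefer(const src_cand_t *a, const src_cand_t *b,
    const struct in6_addr *dst)
{
	unsigned int pa = matching_plen(a, dst);
	unsigned int pb = matching_plen(b, dst);

	if (pa != pb)				/* configured prefix */
		return (pa > pb ? a : b);
	if (a->sc_type != b->sc_type)		/* type of address */
		return (a->sc_type < b->sc_type ? a : b);
	return (common_prefix_len(&a->sc_addr, dst) >=	/* tie breaker */
	    common_prefix_len(&b->sc_addr, dst) ? a : b);
}
```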
1005
1006  This basic implementation scheme also addresses CR 6485164, so
1007  a fix for that will be included with this project.
1008
1009
1010Minor Improvements
1011
1012  Various small problems with the system encountered during
1013  development will be fixed along with this project.  Some of these
1014  are:
1015
1016  - List of ARPHRD_* types is a bit short; add some new ones.
1017
1018  - List of IPPORT_* values is similarly sparse; add others in use by
1019    snoop.
1020
1021  - dhcpmsg.h lacks PRINTFLIKE for dhcpmsg(); add it.
1022
1023  - CR 6482163 causes excessive lint errors with libxnet; will fix.
1024
1025  - libdhcpagent uses gettimeofday() for I/O timing, and this can
1026    drift on systems with NTP.  It should use a stable time source
1027    (gethrtime()) instead, and should return better error values.
1028
1029  - Controlling debug mode in the daemon shouldn't require changing
1030    the command line arguments or jumping through special hoops.  I've
1031    added undocumented ".DEBUG_LEVEL=[0-3]" and ".VERBOSE=[01]"
1032    features to /etc/default/dhcpagent.
1033
1034  - The various attributes of the IPC commands (requires privileges,
1035    creates a new session, valid with BOOTP, immediate reply) should
1036    be gathered together into one look-up table rather than scattered
1037    as hard-coded tests.
1038
1039  - Remove the event unregistration from the command dispatch loop and
1040    get rid of the ipc_action_pending() botch.  We'll get a
1041    zero-length read any time the client goes away, and that will be
1042    enough to trigger termination.  This fix removes async_pending()
1043    and async_timeout() as well, and fixes CR 6487958 as a
1044    side-effect.
1045
1046  - Throughout the dhcpagent code, there are private implementations
1047    of doubly-linked and singly-linked lists for each data type.
1048    These will all be removed and replaced with insque(3C) and
1049    remque(3C).
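
  For readers unfamiliar with insque(3C) and remque(3C), the sketch
  below shows the pattern on a linear doubly-linked list.  The node_t
  type and the list_add()/list_remove() wrappers are hypothetical;
  the one hard requirement is that the forward and backward pointers
  are the first two members of the element.

```c
#define	_XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <search.h>

/* The two link pointers must come first for insque()/remque(). */
typedef struct node {
	struct node	*n_next;
	struct node	*n_prev;
	int		n_value;
} node_t;

/*
 * Add n to the list: as the only element if the list is empty,
 * otherwise immediately after the current head.
 */
void
list_add(node_t **headp, node_t *n)
{
	assert(headp != NULL && n != NULL);
	insque(n, *headp);	/* a NULL predecessor starts a new list */
	if (*headp == NULL)
		*headp = n;
}

/* Unlink n, advancing the head pointer if n was the head. */
void
list_remove(node_t **headp, node_t *n)
{
	assert(headp != NULL && n != NULL);
	if (*headp == n)
		*headp = n->n_next;
	remque(n);
}
```

  Since every list then shares the same link layout, one pair of
  wrappers can replace the per-type list code.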
1050
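
  The IPC command look-up table mentioned in the list above could
  take a form like the following.  The command names, flag names, and
  especially the flag assignments in the table are illustrative only
  and do not reflect the daemon's actual settings.

```c
#include <assert.h>

/* Hypothetical subset of IPC commands. */
typedef enum {
	CMD_DROP, CMD_EXTEND, CMD_PING, CMD_RELEASE, CMD_START,
	CMD_STATUS, CMD_NCMDS
} ex_cmd_t;

#define	CA_PRIV		0x01	/* requires privileges */
#define	CA_SESSION	0x02	/* creates a new session */
#define	CA_BOOTP	0x04	/* valid with BOOTP */
#define	CA_IMMED	0x08	/* immediate reply */

/* One row per command; values here are examples, not real policy. */
static const unsigned int cmd_attrs[CMD_NCMDS] = {
	[CMD_DROP]	= CA_PRIV | CA_BOOTP,
	[CMD_EXTEND]	= CA_PRIV,
	[CMD_PING]	= CA_BOOTP | CA_IMMED,
	[CMD_RELEASE]	= CA_PRIV,
	[CMD_START]	= CA_PRIV | CA_SESSION | CA_BOOTP,
	[CMD_STATUS]	= CA_BOOTP | CA_IMMED
};

/* Return nonzero if the command has the given attribute. */
int
cmd_has_attr(ex_cmd_t cmd, unsigned int attr)
{
	assert((unsigned int)cmd < CMD_NCMDS);
	return ((cmd_attrs[cmd] & attr) != 0);
}
```

  The dispatch code then asks cmd_has_attr() instead of testing for
  specific commands in several places.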
1051
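
  The I/O timing item in the list above amounts to computing
  timeouts from a clock that does not jump when the time of day is
  adjusted.  The sketch below uses POSIX clock_gettime() with
  CLOCK_MONOTONIC as a portable stand-in for gethrtime(); the helper
  names are hypothetical.

```c
#define	_POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from a clock unaffected by wall-clock adjustments. */
int64_t
mono_now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void) rc;
	return ((int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec);
}

/*
 * Milliseconds left until deadline_ns, rounded up and clamped at
 * zero, in the form a poll(2) timeout argument expects.
 */
int
ms_remaining(int64_t deadline_ns, int64_t now_ns)
{
	if (now_ns >= deadline_ns)
		return (0);
	return ((int)((deadline_ns - now_ns + 999999) / 1000000));
}
```

  A caller would compute the deadline once from mono_now_ns() and
  then recompute ms_remaining() before each wait.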
1052Testing
1053
1054  The implementation was tested using the TAHI test suite for DHCPv6
1055  (www.tahi.org).  There are some peculiar aspects to this test suite,
1056  and these issues directed some of the design.  In particular:
1057
1058  - If Renew/Rebind doesn't mention one of our leases, then we need to
1059    allow the message to be retransmitted.  Real servers are unlikely
1060    to do this.
1061
1062  - We must look for a status code within IAADDR and within IA_NA, and
1063    handle the paradoxical case of "NoAddrsAvail".  That doesn't make
1064    sense, as a server with no addresses wouldn't use those options.
1065    That option makes more sense at the top level of the message.
1066
1067  - If we get "UseMulticast" when we were already using multicast,
1068    then ignore the error code.  Sending another request would cause a
1069    loop.
1070
1071  - TAHI uses "NoBinding" at the top level of the message.  This
1072    status code only makes sense within an IA, as it refers to the
1073    DUID:IAID binding, which doesn't exist outside an IA.  We must
1074    ignore such errors -- treat them as success.
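
  The status-code handling in the list above could be summarized by
  a small classifier such as the one below.  The status code numbers
  are those defined in RFC 3315; the level and action names, and the
  classify_status() function itself, are hypothetical and cover only
  the cases discussed here.

```c
#include <assert.h>

/* DHCPv6 status codes (RFC 3315). */
enum {
	ST_SUCCESS = 0,
	ST_UNSPEC_FAIL = 1,
	ST_NO_ADDRS_AVAIL = 2,
	ST_NO_BINDING = 3,
	ST_NOT_ON_LINK = 4,
	ST_USE_MULTICAST = 5
};

/* Where in the message the Status Code option was found. */
typedef enum { LVL_MESSAGE, LVL_IA_NA, LVL_IAADDR } opt_level_t;

typedef enum {
	ACT_ACCEPT,		/* treat as success */
	ACT_RETRY_MULTICAST,	/* resend the request using multicast */
	ACT_FAIL		/* treat as a real error */
} status_action_t;

status_action_t
classify_status(int code, opt_level_t level, int using_multicast)
{
	assert(code >= 0);
	if (code == ST_SUCCESS)
		return (ACT_ACCEPT);
	/* NoBinding only makes sense inside an IA; ignore it elsewhere. */
	if (code == ST_NO_BINDING && level == LVL_MESSAGE)
		return (ACT_ACCEPT);
	/* Already multicasting: resending would just loop. */
	if (code == ST_USE_MULTICAST)
		return (using_multicast ? ACT_ACCEPT : ACT_RETRY_MULTICAST);
	/* Everything else, including NoAddrsAvail at any level, fails. */
	return (ACT_FAIL);
}
```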
1075
1076
1077Interactions With Other Projects
1078
1079  Clearview UV (vanity naming) will cause link names, and thus IP
1080  interface names, to become changeable over time.  This will break
1081  the IAID stability mechanism if UV is used for arbitrary renaming,
1082  rather than as just a DR enhancement.
1083
1084  When this portion of Clearview integrates, this part of the DHCPv6
1085  design may need to be revisited.  (The solution will likely be
1086  handled at some higher layer, such as within Network Automagic.)
1087
1088  Clearview is also contributing a new libdlpi that will work for
1089  dhcpagent, and is thus removing the private dlpi_io.[ch] functions
1090  from this daemon.  When that Clearview project integrates, the
1091  DHCPv6 project will need to adjust to the new interfaces, and remove
1092  or relocate the dlpi_to_arp() function.
1093
1094
1095Futures
1096
1097  Zones currently cannot address any IP interfaces by way of DHCP.
1098  This project will not fix that problem, but the DUID/IAID could be
1099  used to help fix it in the future.
1100
1101  In particular, the DUID allows the client to obtain separate sets of
1102  addresses and configuration parameters on a single interface, just
1103  like an IPv4 Client ID, but it includes a clean mechanism for vendor
1104  extensions.  If we associate the DUID with the zone identifier or
1105  name through an extension, then we have a really simple way of
1106  allocating per-zone addresses.
1107
1108  Moreover, RFC 4361 describes a handy way of using DHCPv6 DUID/IAID
1109  values with IPv4 DHCP, which would quickly solve the problem of
1110  using DHCP for IPv4 address assignment in non-global zones as well.
1111
1112  (One potential risk with this plan is that there may be server
1113  implementations that either do not implement the RFC correctly or
1114  otherwise mishandle the DUID.  This has apparently bitten some early
1115  adopters.)
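
  To make the RFC 4361 idea concrete: the IPv4 Client Identifier
  option payload is a type octet of 255, followed by the 32-bit IAID,
  followed by the DUID.  The sketch below encodes that payload; the
  function name is hypothetical and nothing like it is part of this
  project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	CID_TYPE_RFC4361	255

/*
 * Encode the Client Identifier payload into buf.  Returns the
 * length, or 0 if it would not fit in buf or in a DHCPv4 option
 * (whose length field is a single octet).
 */
size_t
build_v4_client_id(uint8_t *buf, size_t buflen, uint32_t iaid,
    const uint8_t *duid, size_t duidlen)
{
	size_t need = 1 + 4 + duidlen;

	assert(buf != NULL && duid != NULL);
	if (buflen < need || need > 255)
		return (0);

	buf[0] = CID_TYPE_RFC4361;
	buf[1] = (uint8_t)(iaid >> 24);
	buf[2] = (uint8_t)(iaid >> 16);
	buf[3] = (uint8_t)(iaid >> 8);
	buf[4] = (uint8_t)iaid;
	memcpy(buf + 5, duid, duidlen);
	return (need);
}
```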
1116
1117  Implementing the FQDN option for DHCPv6 would, given the current
1118  libdhcputil design, require a new 'type' of entry for the inittab6
1119  file.  This is because the design does not allow for any simple
1120  means to "compose" a sequence of basic types together.  Thus,
1121  every type of option must either be a basic type, or an array of
1122  multiple instances of the same basic type.
1123
1124  If we implement FQDN in the future, it may be useful to explore some
1125  means of allowing a given option instance to be a sequence of basic
1126  types.
1127
1128  This project does not make the DNS resolver or any other subsystem
1129  use the data gathered by DHCPv6.  It just makes the data available
1130  through dhcpinfo(1).  Future projects should modify those services
1131  to use configuration data learned via DHCPv6.  (One of the reasons
1132  this is not being done now is that Network Automagic [NWAM] will
1133  likely be changing this area substantially in the very near future,
1134  and thus the effort would be largely wasted.)
1135
1136
1137Appendix A - Choice of Venue
1138
1139  There are three logical places to implement DHCPv6:
1140
1141    - in dhcpagent
1142    - in in.ndpd
1143    - in a new daemon (say, 'dhcp6agent')
1144
1145  We need to access parameters via dhcpinfo, and should provide the
1146  same set of status and control features via ifconfig as are present
1147  for IPv4.  (For the latter, if we fail to do that, it will likely
1148  confuse users.  The expense for doing it is comparatively small, and
1149  it will be useful for testing, even though it should not be needed
1150  in normal operation.)
1151
1152  If we implement somewhere other than dhcpagent, then we need to give
1153  that new daemon (in.ndpd or dhcp6agent) the same basic IPC features
1154  as dhcpagent already has.  This means either extracting those bits
1155  (async.c and ipc_action.c) into a shared library or just copying
1156  them.  Obviously, the former would be preferred, but as those bits
1157  depend on the rest of the dhcpagent infrastructure for timers and
1158  state handling, this means that the new process would have to look a
1159  lot like dhcpagent.
1160
1161  Implementing DHCPv6 as part of in.ndpd is attractive, as it
1162  eliminates the confusion that the router discovery process for
1163  determining interface netmasks can cause, along with the need to do
1164  any signaling at all to bring DHCPv6 up.  However, the need to make
1165  in.ndpd more like dhcpagent is unattractive.
1166
1167  Having a new dhcp6agent daemon seems to have little to recommend it,
1168  other than leaving the existing dhcpagent code untouched.  If we do
1169  that, then we end up with two implementations that do many similar
1170  things, and must be maintained in parallel.
1171
1172  Thus, although it leads to some complexity in reworking the data
1173  structures to fit both protocols, on balance the simplest solution
1174  is to extend dhcpagent.
1175
1176
1177Appendix B - Cross-Reference
1178
1179  in.ndpd
1180
1181    - Start dhcpagent and issue "dhcp start" command via libdhcpagent
1182    - Parse StatefulAddrConf interface option from ndpd.conf
1183    - Watch for M and O bits to trigger DHCPv6
1184    - Handle "no routers found" case and start DHCPv6
1185    - Track prefixes and set prefix length on IFF_DHCPRUNNING aliases
1186    - Send new Router Solicitation when prefix unknown
1187    - Change privileges so that dhcpagent can be launched successfully
1188
1189  libdhcputil
1190
1191    - Parse new /etc/dhcp/inittab6 file
1192    - Handle new UNUMBER24, SNUMBER64, IPV6, DUID and DOMAIN types
1193    - Add DHCPv6 option iterators (dhcpv6_find_option and
1194      dhcpv6_pkt_option)
1195    - Add dlpi_to_arp function (temporary)
1196
1197  libdhcpagent
1198
1199    - Add stable DUID and IAID creation and storage support
1200      functions and add new dhcp_stable.h include file
1201    - Support new DECLINING and RELEASING states introduced by DHCPv6.
1202    - Update implementation so that it doesn't rely on gettimeofday()
1203      for I/O timeouts
1204    - Extend the hostconf functions to support DHCPv6, using a new
1205      ".dh6" file
1206
1207  snoop
1208
1209    - Add support for DHCPv6 packet decoding (all types)
1210    - Add "dhcp6" filter keyword
1211    - Fix known bugs in DHCP filtering
1212
1213  ifconfig
1214
1215    - Remove inet-only restriction on "dhcp" keyword
1216
1217  netstat
1218
1219    - Remove strange "-I list" feature.
1220    - Add support for DHCPv6 and iterating over IPv6 interfaces.
1221
1222  ip
1223
1224    - Add extensions to IPv6 source address selection to prefer DHCPv6
1225      addresses when all else is equal
1226    - Fix known bugs in source address selection (remaining from TX
1227      integration)
1228
1229  other
1230
1231    - Add ifindex and source/destination address into PKT_LIST.
1232    - Add more ARPHRD_* and IPPORT_* values.
1233