CDDL HEADER START

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]

CDDL HEADER END

Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

ident	"%Z%%M%	%I%	%E% SMI"

DHCPv6 Client Low-Level Design

Introduction

  This project adds DHCPv6 client-side (not server) support to
  Solaris.  Future projects may add server-side support as well as
  enhance the basic capabilities added here.  These future projects
  are not discussed in detail in this document.

  This document assumes that the reader is familiar with the following
  other documents:

  - RFC 3315: the primary description of DHCPv6
  - RFCs 2131 and 2132: IPv4 DHCP
  - RFCs 2461 and 2462: IPv6 NDP and stateless autoconfiguration
  - RFC 3484: IPv6 default address selection
  - ifconfig(1M): Solaris IP interface configuration
  - in.ndpd(1M): Solaris IPv6 Neighbor and Router Discovery daemon
  - dhcpagent(1M): Solaris DHCP client
  - dhcpinfo(1): Solaris DHCP parameter utility
  - ndpd.conf(4): in.ndpd configuration file
  - netstat(1M): Solaris network status utility
  - snoop(1M): Solaris network packet capture and inspection
  - "DHCPv6 Client High-Level Design"

  Several terms from those documents (such as the DHCPv6 IA_NA and
  IAADDR options) are used without further explanation in this
  document; see the reference documents above for details.

  The overall plan is to enhance the existing Solaris dhcpagent so
  that it is able to process DHCPv6.  It would also have been possible
  to create a new, separate daemon process for this, or to integrate
  the feature into in.ndpd.  These alternatives, and the reason for
  the chosen design, are discussed in Appendix A.

  This document discusses the internal design issues involved in the
  protocol implementation and in the associated components (such as
  in.ndpd, snoop, and the kernel's source address selection
  algorithm).  It does not discuss the details of the protocol itself,
  which are more than adequately described in the RFC, nor the
  individual lines of code, which will be covered in the code review.

  As a cross-reference, Appendix B has a summary of the components
  involved and the changes to each.


Background

  In order to discuss the design changes for DHCPv6, it's necessary
  first to talk about the current IPv4-only design, and the
  assumptions built into that design.

  The main data structure used in dhcpagent is the 'struct ifslist'.
  Each instance of this structure represents a Solaris logical IP
  interface under DHCP's control.  It also represents the shared state
  with the DHCP server that granted the address, the address itself,
  and copies of the negotiated options.

  There is one list in dhcpagent containing all of the IP interfaces
  that are under DHCP control.  IP interfaces not under DHCP control
  (for example, those that are statically addressed) are not included
  in this list, even when plumbed on the system.  These ifslist
  entries are chained like this:

  ifsheadp -> ifslist -> ifslist -> ifslist -> NULL
                net0      net0:1     net1

  Each ifslist entry contains the address, mask, lease information,
  interface name, hardware information, packets, protocol state, and
  timers.  The name of the logical IP interface under DHCP's control
  is also the name used in the administrative interfaces (dhcpinfo,
  ifconfig) and when logging events.

  Each entry holds open a DLPI stream and two sockets.  The DLPI
  stream is nulled-out with a filter when not in use, but still
  consumes system resources.  (Most significantly, it causes data
  copies in the driver layer that end up sapping performance.)

  The entry storage is managed by an insert/hold/release/remove model
  and reference counts.  In this model, insert_ifs() allocates a new
  ifslist entry and inserts it into the global list, with the global
  list holding a reference.  remove_ifs() removes it from the global
  list and drops that reference.  hold_ifs() and release_ifs() are
  used by data structures that refer to ifslist entries, such as timer
  entries, to make sure that the ifslist entry isn't freed until the
  timer has been dispatched or deleted.
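
  As a rough illustration of this model (a minimal sketch, not the
  dhcpagent code: the entry layout, the *_entry() names, and the
  nfreed counter are all hypothetical), the four operations could be
  written as:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ifslist-like entry; not the real layout. */
struct entry {
	struct entry	*next;		/* global list linkage */
	unsigned int	refcnt;		/* outstanding holds */
	int		on_list;	/* nonzero while on the global list */
};

static struct entry *head;		/* global list head */
static unsigned int nfreed;		/* counts frees; illustration only */

/* Take a reference on behalf of something that stores a pointer. */
static void
hold_entry(struct entry *ep)
{
	ep->refcnt++;
}

/* Drop a reference; the entry is freed when the last one goes away. */
static void
release_entry(struct entry *ep)
{
	assert(ep->refcnt > 0);
	if (--ep->refcnt == 0) {
		free(ep);
		nfreed++;
	}
}

/* Allocate an entry and link it on the global list, which holds it. */
static struct entry *
insert_entry(void)
{
	struct entry *ep = calloc(1, sizeof (*ep));

	if (ep == NULL)
		return (NULL);
	ep->next = head;
	head = ep;
	ep->on_list = 1;
	hold_entry(ep);
	return (ep);
}

/* Unlink from the global list and drop the list's reference. */
static void
remove_entry(struct entry *ep)
{
	struct entry **epp;

	if (!ep->on_list)
		return;
	for (epp = &head; *epp != NULL; epp = &(*epp)->next) {
		if (*epp == ep) {
			*epp = ep->next;
			break;
		}
	}
	ep->on_list = 0;
	release_entry(ep);
}
```

  In this sketch, a timer that stores a pointer would call
  hold_entry() when scheduled and release_entry() when dispatched or
  cancelled, so a remove_entry() in between unlinks the entry without
  freeing it.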

  The design is single-threaded, so code that walks the global list
  needn't bother taking holds on the ifslist structure.  Only
  references that may be used at a different time (i.e., pointers
  stored in other data structures) need to be recorded.

  Packets are handled using PKT (struct dhcp; <netinet/dhcp.h>),
  PKT_LIST (struct dhcp_list; <dhcp_impl.h>), and dhcp_pkt_t (struct
  dhcp_pkt; "packet.h").  PKT is just the RFC 2131 DHCP packet
  structure, and has no additional information, such as packet length.
  PKT_LIST contains a PKT pointer, length, decoded option arrays, and
  linkage for putting the packet in a list.  Finally, dhcp_pkt_t has a
  PKT pointer and length values suitable for modifying the packet.

  Essentially, PKT_LIST is a wrapper for received packets, and
  dhcp_pkt_t is a wrapper for packets to be sent.

  The basic PKT structure is used in dhcpagent, inetboot, in.dhcpd,
  libdhcpagent, libwanboot, libdhcputil, and others.  PKT_LIST is used
  in a similar set of places, including the kernel NFS modules.
  dhcp_pkt_t is (as the header file implies) limited to dhcpagent.

  In addition to these structures, dhcpagent maintains a set of
  internal supporting abstractions.  Two key ones involved in this
  project are the "async operation" and the "IPC action."  An async
  operation encapsulates the actions needed for a given operation, so
  that if cancellation is needed, there's a single point where the
  associated resources can be freed.  An IPC action represents the
  user state related to the private interface used by ifconfig.


DHCPv6 Inherent Differences

  DHCPv6 naturally has some commonality with IPv4 DHCP, but also has
  some significant differences.

  Unlike IPv4 DHCP, DHCPv6 relies on link-local IP addresses to do its
  work.  This means that, on Solaris, the client doesn't need DLPI to
  perform any of the I/O; regular IP sockets will do the job.  It also
  means that, unlike IPv4 DHCP, DHCPv6 does not need to obtain a lease
  for the address used in its messages to the server.  The system
  provides the address automatically.

  IPv4 DHCP expects some messages from the server to be broadcast.
  DHCPv6 has no such mechanism; all messages from the server to the
  client are unicast.  In the case where the client and server aren't
  on the same subnet, a relay agent is used to get the unicast replies
  back to the client's link-local address.

  With IPv4 DHCP, a single address plus configuration options is
  leased with a given client ID and a single state machine instance,
  and the implementation binds that to a single IP logical interface
  specified by the user.  The lease has a "Lease Time," a required
  option, as well as two timers, called T1 (renew) and T2 (rebind),
  which are controlled by regular options.

  DHCPv6 uses a single client/server session to control the
  acquisition of configuration options and "identity associations"
  (IAs).  The identity associations, in turn, contain lists of
  addresses for the client to use and the T1/T2 timer values.  Each
  individual address has its own preferred and valid lifetime, with
  the address being marked "deprecated" at the end of the preferred
  interval, and removed at the end of the valid interval.

  IPv4 DHCP leaves many of the retransmit decisions up to the client,
  and some things (such as RELEASE and DECLINE) are sent just once.
  Others (such as the REQUEST message used for renew and rebind) are
  dealt with by heuristics.  DHCPv6 treats each message to the server
  as a separate transaction, and resends each message using a common
  retransmission mechanism.  DHCPv6 also has separate messages for
  Renew, Rebind, and Confirm rather than reusing the Request
  mechanism.

  The set of options (which are used to convey configuration
  information) for each protocol are distinct.  Notably, two of the
  mistakes from IPv4 DHCP have been fixed: DHCPv6 doesn't carry a
  client name, and doesn't attempt to impersonate a routing protocol
  by setting a "default route."

  Another welcome change is the lack of a netmask/prefix length with
  DHCPv6.  Instead, the client uses the Router Advertisement prefixes
  to set the correct interface netmask.  This reduces the number of
  databases that need to be kept in sync.  (The equivalent mechanism
  in IPv4 would have been the use of ICMP Address Mask Request /
  Reply, but the BOOTP designers chose to embed it in the address
  assignment protocol itself.)

  Otherwise, DHCPv6 is similar to IPv4 DHCP.  The same overall
  renew/rebind and lease expiry strategy is used, although the state
  machine events must now take into account multiple IAs and the fact
  that each can cause RENEWING or REBINDING state independently.


DHCPv6 And Solaris

  The protocol distinctions above have several important implications.
  For the logical interfaces:

    - Because Solaris uses IP logical interfaces to configure
      addresses, we must have multiple IP logical interfaces per IA
      with IPv6.

    - Because we need to support multiple addresses (and thus multiple
      IP logical interfaces) per IA and multiple IAs per client/server
      session, the IP logical interface name isn't a unique name for
      the lease.

  As a result, IP logical interfaces will come and go with DHCPv6,
  just as happens with the existing stateless address
  autoconfiguration support in in.ndpd.  The logical interface names
  (visible in ifconfig) have no administrative significance.

  Fortunately, DHCPv6 does end up with one fixed name that can be used
  to identify a session.  Because DHCPv6 uses link-local addresses for
  communication with the server, the name of the IP logical interface
  that has this link-local address (normally the same as the IP
  physical interface) can be used as an identifier for dhcpinfo and
  logging purposes.


Dhcpagent Redesign Overview

  The redesign starts by refactoring the IP interface representation.
  Because we need to have multiple IP logical interfaces (LIFs) for a
  single identity association (IA), we should not store all of the
  DHCP state information along with the LIF information.

  For DHCPv6, we will need to keep LIFs on a single IP physical
  interface (PIF) together, so this is probably also a good time to
  reconsider the way dhcpagent represents physical interfaces.  The
  current design simply replicates the state (notably the DLPI stream,
  but also the hardware address and other bits) among all of the
  ifslist entries on the same physical interface.

  The new design creates two lists of dhcp_pif_t entries, one list for
  IPv4 and the other for IPv6.  Each dhcp_pif_t represents a PIF, with
  a list of dhcp_lif_t entries attached, each of which represents a
  LIF used by dhcpagent.  This structure mirrors the kernel's ill_t
  and ipif_t interface representations.

  Next, the lease-tracking needs to be refactored.  DHCPv6 is the
  functional superset in this case, as it has two lifetimes per
  address (LIF) and IA groupings with shared T1/T2 timers.  To
  represent these groupings, we will use a new dhcp_lease_t structure.
  IPv4 DHCP will have one such structure per state machine, while
  DHCPv6 will have a list.  (Note: the initial implementation will
  have only one lease per DHCPv6 state machine, because each state
  machine uses a single link-local address, a single DUID+IAID pair,
  and supports only Non-temporary Addresses [IA_NA option].  Future
  enhancements may use multiple leases per DHCPv6 state machine or
  support other IA types.)

  For all of these new structures, we will use the same insert/hold/
  release/remove model as with the original ifslist.

  Finally, the remaining items (and the bulk of the original ifslist
  members) are kept on a per-state-machine basis.  As this is no
  longer just an "interface," a new dhcp_smach_t structure will hold
  these, and the ifslist structure is gone.


Lease Representation

  For DHCPv6, we need to track multiple LIFs per lease (IA), but we
  also need multiple LIFs per PIF.  Rather than having two sets of
  list linkage for each LIF, we can observe that a LIF is on exactly
  one PIF and is a member of at most one lease, and then simplify: the
  lease structure will use a base pointer for the first LIF in the
  lease, and a count for the number of consecutive LIFs in the PIF's
  list of LIFs that belong to the lease.

  When removing a LIF from the system, we need to decrement the count
  of LIFs in the lease, and advance the base pointer if the LIF being
  removed is the first one.  Inserting a LIF means just moving it into
  this list and bumping the counter.
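
  The removal case can be sketched as follows.  This is only an
  illustration under simplifying assumptions: 'struct lif' and
  'struct lease' are hypothetical, pared-down stand-ins for dhcp_lif_t
  and dhcp_lease_t, and lease_remove_lif() is not a function from the
  actual source.

```c
#include <assert.h>
#include <stddef.h>

struct lease;

/* Hypothetical, simplified stand-in for dhcp_lif_t. */
struct lif {
	struct lif	*next;		/* next LIF on the same PIF */
	struct lease	*lease;		/* owning lease, or NULL */
};

/* Hypothetical, simplified stand-in for dhcp_lease_t. */
struct lease {
	struct lif	*base;		/* first LIF in the lease */
	unsigned int	nlifs;		/* consecutive LIFs owned */
};

/*
 * Unlink a LIF from its PIF's list.  If it belongs to a lease, drop
 * the lease's count and advance the base pointer when the LIF being
 * removed is the first one in the lease.
 */
static void
lease_remove_lif(struct lif **pif_head, struct lif *lif)
{
	struct lif **lpp;
	struct lease *lp = lif->lease;

	for (lpp = pif_head; *lpp != NULL; lpp = &(*lpp)->next) {
		if (*lpp == lif)
			break;
	}
	if (*lpp == NULL)
		return;			/* not on this PIF */
	*lpp = lif->next;
	if (lp != NULL) {
		assert(lp->nlifs > 0);
		if (lp->base == lif)
			lp->base = (lp->nlifs > 1) ? lif->next : NULL;
		lp->nlifs--;
	}
	lif->next = NULL;
	lif->lease = NULL;
}
```

  Because the LIFs of a lease are contiguous in the PIF's list, the
  successor of the first LIF is also the new base, so no second set of
  list linkage is needed.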

  When removing a lease from a state machine, we need to dispose of
  the LIFs referenced.  If the LIF being disposed of is the main LIF
  for a state machine, then all that we can do is canonize the LIF
  (returning it to a default state); this represents the normal IPv4
  DHCP operation on lease expiry.  Otherwise, the lease is the owner
  of that LIF (it was created because of a DHCPv6 IA), and disposal
  means unplumbing the LIF from the actual system and removing the LIF
  entry from the PIF.


Main Structure Linkage

  For IPv4 DHCP, the new linkage is straightforward.  Using the same
  system configuration example as in the initial design discussion:

          +- lease  +- lease       +- lease
          |  ^      |  ^           |  ^
          |  |      |  |           |  |
          \  smach  \  smach       \  smach
           \ ^|      \ ^|           \ ^|
            v|v       v|v            v|v
            lif ----> lif -> NULL     lif -> NULL
            net0      net0:1          net1
            ^                         ^
            |                         |
  v4root -> pif --------------------> pif -> NULL
            net0                      net1

  This diagram shows three separate state machines running (with
  backpointers omitted for clarity).  Each state machine has a single
  "main" LIF with which it's associated (and named).  Each also has a
  single lease structure that points back to the same LIF (count of
  1), because IPv4 DHCP controls a single address allocation per state
  machine.

  DHCPv6 is a bit more complex.  This shows DHCPv6 running on two
  interfaces (more or fewer interfaces are of course possible) and
  with multiple leases on the first interface, and each lease with
  multiple addresses (one with two addresses, the second with one).

            lease ----------------> lease -> NULL   lease -> NULL
            ^   \(2)                |(1)            ^   \ (1)
            |    \                  |               |    \
            smach \                 |               smach \
            ^ |    \                |               ^ |    \
            | v     v               v               | v     v
            lif --> lif --> lif --> lif --> NULL    lif --> lif -> NULL
            net0    net0:1  net0:4  net0:2          net1    net1:5
            ^                                       ^
            |                                       |
  v6root -> pif ----------------------------------> pif -> NULL
            net0                                    net1

  Note that there's intentionally no ordering based on name in the
  list of LIFs.  Instead, the contiguous LIF structures in that list
  represent the addresses in each lease.  The logical interfaces
  themselves are allocated and numbered by the system kernel, so they
  may not be sequential, and there may be gaps in the list if other
  entities (such as in.ndpd) are also configuring interfaces.

  Note also that with IPv4 DHCP, the lease points to the LIF that's
  also the main LIF for the state machine, because that's the IP
  interface that dhcpagent controls.  With DHCPv6, the lease (one per
  IA structure) points to a separate set of LIFs that are created just
  for the leased addresses (one per IA address in an IAADDR option).
  The state machine alone points to the main LIF.


Packet Structure Extensions

  Obviously, we need some DHCPv6 packet data structures and
  definitions.  A new <netinet/dhcp6.h> file will be introduced with
  the necessary #defines and structures.  The key structure there will
  be:

	struct dhcpv6_message {
		uint8_t		d6m_msg_type;
		uint8_t		d6m_transid_ho;
		uint16_t	d6m_transid_lo;
	};
	typedef	struct dhcpv6_message	dhcpv6_message_t;

  This defines the usual (non-relay) DHCPv6 packet header, and is
  roughly equivalent to PKT for IPv4.
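
  The 24-bit RFC 3315 transaction ID is split across the last two
  members.  A minimal sketch of packing and unpacking it follows; it
  assumes that d6m_transid_lo is kept in network byte order, and the
  d6m_get_transid() and d6m_set_transid() helper names are
  hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

struct dhcpv6_message {
	uint8_t		d6m_msg_type;
	uint8_t		d6m_transid_ho;
	uint16_t	d6m_transid_lo;
};
typedef	struct dhcpv6_message	dhcpv6_message_t;

/* Hypothetical helper: return the 24-bit transaction ID in host order. */
static uint32_t
d6m_get_transid(const dhcpv6_message_t *d6m)
{
	return (((uint32_t)d6m->d6m_transid_ho << 16) |
	    ntohs(d6m->d6m_transid_lo));
}

/* Hypothetical helper: store the low 24 bits of xid in the header. */
static void
d6m_set_transid(dhcpv6_message_t *d6m, uint32_t xid)
{
	d6m->d6m_transid_ho = (uint8_t)(xid >> 16);
	d6m->d6m_transid_lo = htons((uint16_t)(xid & 0xffff));
}
```

  With this split, the structure is four bytes long with no padding,
  so it overlays the message type and transaction ID fields exactly as
  they appear on the wire.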

  Extending dhcp_pkt_t for DHCPv6 is straightforward, as it's used
  only within dhcpagent.  This structure will be amended to use a
  union for v4/v6 and include a boolean to flag which version is in
  use.

  For the PKT_LIST structure, things are more complex.  This defines
  both a queuing mechanism for received packets (typically OFFERs) and
  a set of packet decoding structures.  The decoding structures are
  highly specific to IPv4 DHCP -- they have no means to handle nested
  or repeated options (as used heavily in DHCPv6) and make use of the
  DHCP_OPT structure which is specific to IPv4 DHCP -- and are
  somewhat expensive in storage, due to the use of arrays indexed by
  option code number.

  Worse, this structure is used throughout the system, so changes to
  it need to be made carefully.  (For example, the existing 'pkt'
  member can't just be turned into a union.)

  For an initial prototype, since discarded, I created a new
  dhcp_plist_t structure to represent packet lists as used inside
  dhcpagent and made dhcp_pkt_t valid for use on input and output.
  The result is unsatisfying, though, as it results in code that
  manipulates far too many data structures in common cases; it's a sea
  of pointers to pointers.

  The better answer is to use PKT_LIST for both IPv4 and IPv6, adding
  the few new bits of metadata required to the end (receiving ifIndex,
  packet source/destination addresses), and staying within the overall
  existing design.

  For option parsing, dhcpv6_find_option() and dhcpv6_pkt_option()
  functions will be added to libdhcputil.  The former function will
  walk a DHCPv6 option list, and provide safe (bounds-checked) access
  to the options inside.  The function can be called recursively, so
  that option nesting can be handled fairly simply by nested loops,
  and can be called repeatedly to return each instance of a given
  option code number.  The latter function is just a convenience
  wrapper on dhcpv6_find_option() that starts with a PKT_LIST pointer
  and iterates over the top-level options with a given code number.

  There are two special considerations for the use of these library
  interfaces: there's no "pad" option for DHCPv6 or alignment
  requirements on option headers or contents, and nested options
  always follow a structure that has type-dependent length.  This
  means that code that handles options must all be written to deal
  with unaligned data, and suboption code must index the pointer past
  the type-dependent part.
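
  To illustrate both considerations, here is a small bounds-checked
  option walker.  It is a sketch only: find_opt() is a hypothetical
  helper with an invented signature, not the libdhcputil interface.
  It reads the 16-bit code and length fields a byte at a time, so it
  makes no alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	OPT_HDR_LEN	4	/* 16-bit code + 16-bit length */

/* Fetch a 16-bit network-order value from an unaligned address. */
static uint16_t
get_be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Hypothetical helper: find the next option with the given code in
 * buf[0..len).  If prev is NULL, start at the beginning; otherwise
 * prev must be a value returned earlier for the same buffer, and the
 * search resumes after it.  On success, returns a pointer to the
 * option header and stores the data length in *olen.  Returns NULL
 * when no further instance exists or the list is truncated.
 */
static const uint8_t *
find_opt(const uint8_t *buf, size_t len, const uint8_t *prev,
    uint16_t code, size_t *olen)
{
	const uint8_t *p = buf;
	const uint8_t *end = buf + len;

	if (prev != NULL)
		p = prev + OPT_HDR_LEN + get_be16(prev + 2);
	while ((size_t)(end - p) >= OPT_HDR_LEN) {
		size_t dlen = get_be16(p + 2);

		if (dlen > (size_t)(end - p) - OPT_HDR_LEN)
			break;		/* option runs off the end */
		if (get_be16(p) == code) {
			*olen = dlen;
			return (p);
		}
		p += OPT_HDR_LEN + dlen;
	}
	return (NULL);
}
```

  For nesting, the caller skips the type-dependent part before
  recursing.  For an IA_NA option (code 3), which starts with 12 bytes
  of IAID, T1, and T2, the suboptions would be searched with
  find_opt(ia + OPT_HDR_LEN + 12, olen - 12, NULL, 5, &slen), after
  first checking that olen is at least 12.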


Packet Construction

  Unlike DHCPv4, DHCPv6 places the transaction timer value in an
  option.  The existing code sets the current time value in
  send_pkt_internal(), which allows it to be updated in a
  straightforward way when doing retransmits.

  To make this work in a simple manner for DHCPv6, I added a
  remove_pkt_opt() function.  The update logic just does a remove and
  re-adds the option.  We could also just assume the presence of the
  option, find it, and modify in place, but the remove feature seems
  more general.

  DHCPv6 uses nested options.  To make this work, two new utility
  functions are needed.  First, an add_pkt_subopt() function will take
  a pointer to an existing option and add an embedded option within
  it.  The packet length and existing option length are updated.  If
  that existing option isn't a top-level option, though, this means
  that the caller must update the lengths of all of the enclosing
  options up to the top level.  To do this, update_v6opt_len() will be
  added.  This is used in the special case of adding a Status Code
  option to an IAADDR option within an IA_NA top-level option.
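
  The length bookkeeping can be sketched as below.  This is not the
  add_pkt_subopt() or update_v6opt_len() implementation; it is a
  single hypothetical helper, append_subopt(), that folds both steps
  together and assumes every enclosing option currently ends at the
  tail of the packet, as is the case while a packet is being built
  front to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	OPT_HDR_LEN	4	/* 16-bit code + 16-bit length */

/* Store a 16-bit value in network byte order at an unaligned address. */
static void
put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* Fetch a 16-bit network-order value from an unaligned address. */
static uint16_t
get_be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Hypothetical helper: append a suboption at the tail of the packet
 * and grow the length field of every enclosing option.  encl[] holds
 * pointers to the enclosing option headers.  Assumes *plen <= maxlen.
 * Returns 0 on success, or -1 if the packet or a length field would
 * overflow.
 */
static int
append_subopt(uint8_t *pkt, size_t *plen, size_t maxlen, uint8_t **encl,
    size_t nencl, uint16_t code, const void *data, uint16_t dlen)
{
	size_t need = OPT_HDR_LEN + (size_t)dlen;
	size_t i;

	if (need > maxlen - *plen)
		return (-1);
	for (i = 0; i < nencl; i++) {
		if ((size_t)get_be16(encl[i] + 2) + need > UINT16_MAX)
			return (-1);
	}
	put_be16(pkt + *plen, code);
	put_be16(pkt + *plen + 2, dlen);
	if (dlen != 0)
		(void) memcpy(pkt + *plen + OPT_HDR_LEN, data, dlen);
	*plen += need;
	for (i = 0; i < nencl; i++) {
		uint16_t olen = get_be16(encl[i] + 2);

		put_be16(encl[i] + 2, (uint16_t)(olen + need));
	}
	return (0);
}
```

  For the Status Code case described above, encl[] would hold the
  IAADDR header and the IA_NA header, so one call adjusts the packet
  length and both option lengths together.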


Sockets and I/O Handling

  DHCPv6 doesn't need or use either a DLPI or a broadcast IP socket.
  Instead, a single unicast-bound IP socket on a link-local address
  would be the most that is needed.  This is roughly equivalent to
  if_sock_ip_fd in the existing design, but that existing socket is
  bound only after DHCP reaches BOUND state -- that is, when it
  switches away from DLPI.  We need something different.

  This, along with the excess of open file descriptors in an otherwise
  idle daemon and the potentially serious performance problems in
  leaving DLPI open at all times, argues for a larger redesign of the
  I/O logic in dhcpagent.

  The first thing that we can do is eliminate the need for the
  per-ifslist if_sock_fd.  This is used primarily for issuing ioctls
  to configure interfaces -- a task that would work as well with any
  open socket -- and is also registered to receive any ACK/NAK packets
  that may arrive via broadcast.  Both of these can be eliminated by
  creating a pair of global sockets (IPv4 and IPv6), bound and
  configured for ACK/NAK reception.  The only functional difference is
  that the list of running state machines must be scanned on reception
  to find the correct transaction ID, but the existing design
  effectively already goes to this effort because the kernel
  replicates received datagrams among all matching sockets, and each
  ifslist entry has a socket open.
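
  The scan itself is simple.  The following sketch assumes a
  hypothetical, pared-down 'struct smach' with a 'waiting' flag; the
  real dhcp_smach_t carries far more state, and find_smach_by_xid() is
  an illustrative name only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down state machine entry. */
struct smach {
	struct smach	*next;		/* next running state machine */
	uint32_t	xid;		/* transaction ID of last request */
	int		waiting;	/* nonzero while a reply is expected */
};

/*
 * Return the state machine that is waiting for a reply with this
 * transaction ID, or NULL if the packet matches none of them and
 * should be discarded.
 */
static struct smach *
find_smach_by_xid(struct smach *head, uint32_t xid)
{
	struct smach *sp;

	for (sp = head; sp != NULL; sp = sp->next) {
		if (sp->waiting && sp->xid == xid)
			return (sp);
	}
	return (NULL);
}
```

  A reply whose transaction ID matches a state machine that is no
  longer waiting is treated here as a duplicate and dropped.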

  (The existing code for if_sock_fd makes oblique reference to unknown
  problems in the system that may prevent binding from working in some
  cases.  The reference dates back some seven years to the original
  DHCP implementation.  I've observed no such problems in extensive
  testing and if any do show up, they will be dealt with by fixing the
  underlying bugs.)

  This leads to an important simplification: it's no longer necessary
  to register, unregister, and re-register for packet reception while
  changing state -- register_acknak() and unregister_acknak() are
  gone.  Instead, we always receive, and we dispatch the packets as
  they arrive.  As a result, when receiving a DHCPv4 ACK or DHCPv6
  Reply when in BOUND state, we know it's a duplicate, and we can
  discard.

  The next part is in minimizing DLPI usage.  A DLPI stream is needed
  at most for each IPv4 PIF, and it's not needed when all of the
  DHCP instances on that PIF are bound.  In fact, the current
  implementation deals with this in configure_bound() by setting a
  "blackhole" packet filter.  The stream is left open.

  To simplify this, we will open at most one DLPI stream on a PIF, and
  use reference counts from the state machines to determine when the
  stream must be open and when it can be closed.  This mechanism will
  be centralized in a set_smach_state() function that changes the
  state and opens/closes the DLPI stream when needed.
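
  A sketch of that reference counting follows.  It makes loud
  simplifying assumptions: the structures are hypothetical miniatures
  of dhcp_pif_t and dhcp_smach_t, only four states are modeled, every
  state other than BOUND is assumed to need DLPI, and the stream
  open/close is represented by a flag rather than real DLPI calls.

```c
#include <assert.h>

enum smach_state { INIT, SELECTING, REQUESTING, BOUND };

/* Hypothetical miniature of a PIF: one shared DLPI stream. */
struct pif {
	unsigned int	dlpi_refs;	/* state machines needing DLPI */
	int		dlpi_open;	/* nonzero while stream is open */
	unsigned int	nopens;		/* times opened; illustration only */
};

/* Hypothetical miniature of a state machine. */
struct smach {
	struct pif	*pif;
	enum smach_state state;
	int		using_dlpi;	/* holds a reference on the stream */
};

/* Assumption for this sketch: DLPI is needed until the lease is bound. */
static int
state_needs_dlpi(enum smach_state st)
{
	return (st != BOUND);
}

/* Change state, taking or dropping the DLPI reference as required. */
static void
set_state(struct smach *sp, enum smach_state st)
{
	struct pif *pp = sp->pif;
	int need = state_needs_dlpi(st);

	if (need && !sp->using_dlpi) {
		if (pp->dlpi_refs++ == 0) {
			pp->dlpi_open = 1;	/* real code opens here */
			pp->nopens++;
		}
	} else if (!need && sp->using_dlpi) {
		assert(pp->dlpi_refs > 0);
		if (--pp->dlpi_refs == 0)
			pp->dlpi_open = 0;	/* real code closes here */
	}
	sp->using_dlpi = need;
	sp->state = st;
}
```

  With two state machines on one PIF, the stream in this sketch is
  opened once when the first leaves INIT and closed only when both
  have reached BOUND.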

  This leads to another simplification.  The I/O logic in the existing
  dhcpagent makes use of the protocol state to select between DLPI and
  sockets.  Now that we keep track of this in a simpler manner, we no
  longer need to switch on the state when sending a packet; we just
  test the dsm_using_dlpi flag instead.

  Still another simplification is in the handling of DHCPv4 INFORM.
  The current code has separate logic in it for getting the interface
  state and address information.  This is no longer necessary, as the
  LIF mechanism keeps track of the interface state.  And since we have
  separate lease structures, and INFORM doesn't acquire a lease, we no
  longer have to be careful about canonizing the interface on
  shutdown.

  Although the default is to send all client messages to a well-known
  multicast address for servers and relays, DHCPv6 also has a
  mechanism that allows the client to send unicast messages to the
  server.  The operation of this mechanism is slightly complex.
  First, the server sends the client a unicast address via an option.
  We may use this address as the destination (rather than the
  well-known multicast address for local DHCPv6 servers and relays)
  only if we have a viable local source address.  This means using
  SIOCGDSTINFO each time we try to send unicast.  Next, the server may
  send back a special status code: UseMulticast.  If this is received,
  and if we were actually using unicast in our messages to the server,
  then we need to forget the unicast address, switch back to
  multicast, and resend our last message.

  Note that it's important to avoid the temptation to resend the last
  message every time UseMulticast is seen, and do it only once on
  switching back to multicast: otherwise, a potential feedback loop is
  created.
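
  The "only once" rule amounts to a small guard.  In this sketch the
  'struct session' fields and handle_status() are hypothetical; the
  status code value 5 for UseMulticast is the one assigned by RFC
  3315.

```c
#include <assert.h>

#define	STATUS_USE_MULTICAST	5	/* RFC 3315 UseMulticast */

/* Hypothetical per-server session state. */
struct session {
	int	have_unicast;	/* server supplied a unicast address */
	int	using_unicast;	/* last message was sent to it */
};

/*
 * Process a status code from the server.  Returns nonzero only when
 * the caller should resend its last message via multicast, which
 * happens at most once per switch away from unicast.
 */
static int
handle_status(struct session *sp, int status)
{
	if (status != STATUS_USE_MULTICAST)
		return (0);
	if (!sp->using_unicast)
		return (0);	/* already on multicast: don't resend */
	sp->have_unicast = 0;	/* forget the server's unicast address */
	sp->using_unicast = 0;
	return (1);
}
```

  A second UseMulticast arriving after the switch returns zero, so a
  server that keeps sending the status code cannot drive the client
  into a resend loop.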

  Because IP_PKTINFO (PSARC 2006/466) has integrated, we could go a
  step further by removing the need for any per-LIF sockets and just
  use the global sockets for all but DLPI.  However, in order to
  facilitate a Solaris 10 backport, this will be done separately as CR
  6509317.

  In the case of DHCPv6, we already have IPV6_PKTINFO, so we will pave
  the way for IPv4 by beginning to use this now, and thus have just
  a single socket (bound to "::") for all of DHCPv6.  Doing this
  requires switching from the old BSD4.2 -lsocket -lnsl to the
  standards-compliant -lxnet in order to use ancillary data.

  It may also be possible to remove the need for DLPI for IPv4, and
  incidentally simplify the code a fair amount, by adding a kernel
  option to allow transmission and reception of UDP packets over
  interfaces that are plumbed but not marked IFF_UP.  This is left for
  future work.


The State Machine

  Several parts of the existing state machine need additions to handle
  DHCPv6, which is a superset of DHCPv4.

  First, there are the RENEWING and REBINDING states.  For IPv4 DHCP,
  these states map one-to-one with a single address and single lease
  that's undergoing renewal.  It's a simple progression (on timeout)
  from BOUND, to RENEWING, to REBINDING and finally back to SELECTING
  to start over.  Each retransmit is done by simply rescheduling the
  T1 or T2 timer.

  For DHCPv6, things are somewhat more complex.  At any one time,
  there may be multiple IAs (leases) that are effectively in renewing
  or rebinding state, based on the T1/T2 timers for each IA, and many
  addresses that have expired.

  However, because all of the leases are related to a single server,
  and that server either responds to our requests or doesn't, we can
  simplify the states to be nearly identical to IPv4 DHCP.

  The revised definition for use with DHCPv6 is:

    - Transition from BOUND to RENEWING state when the first T1 timer
      (of any lease on the state machine) expires.  At this point, as
      an optimization, we should begin attempting to renew any IAs
      that are within REN_TIMEOUT (10 seconds) of reaching T1 as well.
      We may as well avoid sending an excess of packets.

    - When a T1 lease timer expires and we're in RENEWING or REBINDING
      state, just ignore it, because the transaction is already in
      progress.

    - At each retransmit timeout, we should check to see if there are
      more IAs that need to join in because they've passed point T1 as
      well, and, if so, add them.  This check isn't necessary at this
      time, because only a single IA_NA is possible with the initial
      design.

    - When we reach T2 on any IA and we're in BOUND or RENEWING state,
      enter REBINDING state.  At this point, we have a choice.  For
      those other IAs that are past T1 but not yet at T2, we could
      ignore them (sending only those that have passed point T2),
      continue to send separate Renew messages for them, or just
      include them in the Rebind message.  This isn't an issue that
      must be dealt with for this project, but the plan is to include
      them in the Rebind message.

    - When a T2 lease timer expires and we're in REBINDING state, just
      ignore it, as with the corresponding T1 timer.

    - As addresses reach the end of their preferred lifetimes, set the
      IFF_DEPRECATED flag.  As they reach the end of the valid
      lifetime, remove them from the system.  When an IA (lease)
      becomes empty, just remove it.  When there are no more leases
      left, return to SELECTING state to start over.
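
  The state transitions in the list above reduce to a small function.
  This is a sketch only: the enum names and next_state() are
  hypothetical, and it covers just the state change, not the choice of
  which IAs to include in the Renew or Rebind message.

```c
#include <assert.h>

enum smach_state { SELECTING, BOUND, RENEWING, REBINDING };

/* Hypothetical event names for per-lease timers. */
enum lease_event {
	EV_T1,			/* T1 expired on some lease */
	EV_T2,			/* T2 expired on some lease */
	EV_LAST_LEASE_GONE	/* final lease became empty */
};

/* Return the state that follows 'cur' when event 'ev' occurs. */
static enum smach_state
next_state(enum smach_state cur, enum lease_event ev)
{
	switch (ev) {
	case EV_T1:
		/* Ignored if a Renew or Rebind is already in progress. */
		return (cur == BOUND ? RENEWING : cur);
	case EV_T2:
		/* Ignored if we're already rebinding. */
		return ((cur == BOUND || cur == RENEWING) ? REBINDING : cur);
	case EV_LAST_LEASE_GONE:
		return (SELECTING);
	}
	return (cur);
}
```

  Keeping a single state per state machine, rather than one per IA,
  is what lets late T1 and T2 expirations be ignored this cheaply.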
607
608  Note that the RFC treats the IAs as separate entities when
609  discussing the renew/rebind T1/T2 timers, but treats them as a unit
610  when doing the initial negotiation.  This is, to say the least,
611  confusing, especially so given that there's no reason to expect that
612  after having failed to elicit any responses at all from the server
613  on one IA, the server will suddenly start responding when we attempt
614  to renew some other IA.  We rationalize this behavior by using a
615  single renew/rebind state for the entire state machine (and thus
616  client/server pair).
617
618  There's a subtle timing difference here between DHCPv4 and DHCPv6.
619  For DHCPv4, the client just sends packets more and more frequently
620  (shorter timeouts) as the next state gets nearer.  DHCPv6 treats
621  each as a transaction, using the same retransmit logic as for other
622  messages.  The DHCPv6 method is a cleaner design, so we will change
623  the DHCPv4 implementation to do the same, and compute the new timer
624  values as part of stop_extending().
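
  For reference, the RFC 3315 (section 14) retransmission rule that
  the paragraph above refers to can be sketched as follows.  The
  function name and the choice to pass the random factor in as a
  parameter are assumptions made for illustration; this is not the
  code used by stop_extending().

```c
/*
 * Compute the next retransmission timeout, in seconds.
 *
 *   prev_rt  previous timeout, or 0 for the first transmission
 *   irt      initial retransmission time (10 for Renew and Rebind)
 *   mrt      maximum retransmission time (600 for Renew and Rebind),
 *            or 0 for no upper bound
 *   rnd      randomization factor RAND, chosen by the caller from the
 *            range -0.1 to +0.1
 */
double
ex_next_rt(double prev_rt, double irt, double mrt, double rnd)
{
    double rt;

    if (prev_rt <= 0.0)
        rt = irt + rnd * irt;                /* first transmission */
    else
        rt = 2.0 * prev_rt + rnd * prev_rt;  /* exponential backoff */

    if (mrt > 0.0 && rt > mrt)
        rt = mrt + rnd * mrt;                /* clamp to the maximum */

    return (rt);
}
```

  Passing rnd as 0 gives the deterministic sequence 10, 20, 40, ...
  capped at 600 for a Renew; a real caller would draw a fresh random
  value for each transmission.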
625
626  Note that it would be possible to start the SELECTING state earlier
627  than waiting for the last lease to expire, and thus avoid a loss of
628  connectivity.  However, at this point, there are other servers on
629  the network that have seen us attempting to Rebind for quite some
630  time, and they have not responded.  The likelihood that there's a
631  server that will ignore Rebind but then suddenly spring into action
632  on a Solicit message seems low enough that the optimization won't be
633  done now.  (Starting SELECTING state earlier may be done in the
634  future, if it's found to be useful.)
635
636
637Persistent State
638
639  IPv4 DHCP has only minimal need for persistent state, beyond the
640  configuration parameters.  The state is stored when "ifconfig dhcp
641  drop" is run or the daemon receives SIGTERM, which is typically done
642  only well after the system is booted and running.
643
644  The daemon stores this state in /etc/dhcp, because it needs to be
645  available when only the root file system has been mounted.
646
647  Moreover, dhcpagent starts very early in the boot process.  It runs
648  as part of svc:/network/physical:default, which runs well before
649  root is mounted read/write:
650
651     svc:/system/filesystem/root:default ->
652        svc:/system/metainit:default ->
653           svc:/system/identity:node ->
654              svc:/network/physical:default
655           svc:/network/iscsi_initiator:default ->
656              svc:/network/physical:default
657
658  and, of course, well before either /var or /usr is mounted.  This
659  means that any persistent state must be kept in the root file
660  system, and that if we write before shutdown, we have to cope
661  gracefully with the root file system returning EROFS on write
662  attempts.
663
664  For DHCPv6, we need to try to keep our DUID and IAID values stable
665  across reboots to fulfill the demands of RFC 3315.
666
667  The DUID is either configured or automatically generated.  When
668  configured, it comes from the /etc/default/dhcpagent file, and thus
669  does not need to be saved by the daemon.  If automatically
670  generated, there's exactly one of these created, and it will
671  eventually be needed before /usr is mounted, if /usr is mounted over
672  IPv6.  This means a new file in the root file system,
673  /etc/dhcp/duid, will be used to hold the automatically generated
674  DUID.
675
676  The determination of whether to use a configured DUID or one saved
677  in a file is made in get_smach_cid().  This function will
678  encapsulate all of the DUID parsing and generation machinery for the
679  rest of dhcpagent.
680
681  If root is not writable at the point when dhcpagent starts, and our
682  attempt to save the DUID fails with EROFS, we will set a timer to
683  retry the operation at 60-second intervals.  In the unlikely case
684  that it just never succeeds or that we're rebooted before root
685  becomes writable, then the impact will be that the daemon will wake
686  up once a minute and, ultimately, we'll choose a different DUID on
687  next start-up, and we'll thus lose our leases across a reboot.
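
  A minimal sketch of the write-and-retry decision described above
  follows.  The function name, return convention and EX_RETRY_SECS
  macro are hypothetical; only the EROFS handling and the 60-second
  interval come from the text.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define EX_RETRY_SECS    60

/*
 * Try to save the DUID.  Returns 0 on success, EX_RETRY_SECS if the
 * file system is read-only (the caller should retry after that many
 * seconds), or -1 on any other failure.
 */
int
ex_save_duid(const char *path, const unsigned char *duid, size_t len)
{
    int fd;
    ssize_t n;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return (errno == EROFS ? EX_RETRY_SECS : -1);

    n = write(fd, duid, len);
    (void) close(fd);

    return (n == (ssize_t)len ? 0 : -1);
}
```

  The caller would schedule a timer when EX_RETRY_SECS is returned
  and call the function again when it fires.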
688
689  The IAID similarly must be kept stable if at all possible, but
690  cannot be configured by the user.  To make these values stable,
691  we will use two strategies.  First, the IAID value for a given
692  interface (if not known) will just default to the IP ifIndex value,
693  provided that there's no known saved IAID using that value.  Second,
694  we will save off the IAID we choose in a single /etc/dhcp/iaid file,
695  containing an array of entries indexed by logical interface name.
696  Keeping it in a single file allows us to scan for used and unused
697  IAID values when necessary.
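
  The two strategies might combine roughly as shown below.  The
  in-memory table layout, the names, and the choice to search upward
  from ifIndex when the default is already taken are all assumptions
  made for this sketch; the text does not specify how an unused value
  is found or how the file is laid out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of one saved entry. */
typedef struct {
    char        name[32];    /* logical interface name */
    uint32_t    iaid;        /* IAID saved for that interface */
} ex_iaid_ent_t;

static bool
ex_iaid_used(const ex_iaid_ent_t *tab, size_t n, uint32_t iaid)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].iaid == iaid)
            return (true);
    }
    return (false);
}

uint32_t
ex_choose_iaid(const ex_iaid_ent_t *tab, size_t n, const char *lifname,
    uint32_t ifindex)
{
    uint32_t cand = ifindex;

    /* A value previously saved for this interface always wins. */
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, lifname) == 0)
            return (tab[i].iaid);
    }

    /* Otherwise default to ifIndex, skipping values already saved. */
    while (ex_iaid_used(tab, n, cand))
        cand++;

    return (cand);
}
```

  Because the table is finite, the search loop ends after at most
  n + 1 candidates.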
698
699  This mechanism depends on the interface name, and thus will need to
700  be revisited when Clearview vanity naming and NWAM are available.
701
702  Currently, the boot system (GRUB, OBP, the miniroot) does not
703  support installing over IPv6.  This could change in the future, so
704  one of the goals of the above stability plan is to support that
705  event.
706
707  When running in the miniroot on an x86 system, /etc/dhcp (and the
708  rest of the root) is mounted on a read-only ramdisk.  In this case,
709  writing to /etc/dhcp will just never work.  A possible solution
710  would be to add a new privileged command in ifconfig that forces
711  dhcpagent to write to an alternate location.  The initial install
712  process could then do "ifconfig <x> dhcp write /a" to get the needed
713  state written out to the newly-constructed system root.
714
715  This part (the new write option) won't be implemented as part of
716  this project, because it's not needed yet.
717
718
719Router Advertisements
720
721  IPv6 Router Advertisements perform two functions related to DHCPv6:
722
723    - they specify whether and how to run DHCPv6 on a given interface.
724    - they provide a list of the valid prefixes on an interface.
725
726  For the first function, in.ndpd needs to use the same DHCP control
727  interfaces that ifconfig uses, so that it can launch dhcpagent and
728  trigger DHCPv6 when necessary.  Note that it never needs to shut
729  down DHCPv6, as router advertisements can't do that.
730
731  However, launching dhcpagent presents new problems.  As a part of
732  the "Quagga SMF Modifications" project (PSARC 2006/552), in.ndpd in
733  Nevada is now privilege-aware and runs with limited privileges,
734  courtesy of SMF.  Dhcpagent, on the other hand, must run with all
735  privileges.
736
737  A simple work-around for this issue is to rip out the "privileges="
738  clause from the method_credential for in.ndpd.  I've taken this
739  direction initially, but the right longer-term answer seems to be
740  converting dhcpagent into an SMF service.  This is quite a bit more
741  complex, as it means turning the /sbin/dhcpagent command line
742  interface into a utility that manipulates the service and passes the
743  command line options via IPC extensions.
744
745  Such a design also raises the question of whether dhcpagent itself
746  ought to run with reduced privileges.  It could, but it still needs
747  the ability to grant "all" (traditional UNIX root) privileges to the
748  eventhook script, if present.  There seem to be few ways to do this,
749  though it's a good area for research.
750
751  The second function, prefix handling, is also subtle.  Unlike IPv4
752  DHCP, DHCPv6 does not give the netmask or prefix length along with
753  the leased address.  The client is on its own to determine the right
754  netmask to use.  This is where the advertised prefixes come in:
755  these must be used to finish the interface configuration.
756
757  We will have the DHCPv6 client configure each interface with an
758  all-ones (/128) netmask by default.  In.ndpd will be modified so
759  that when it detects a new IFF_DHCPRUNNING IP logical interface, it
760  checks for a known matching prefix, and sets the netmask as
761  necessary.  If no matching prefix is known, it will send a new
762  Router Solicitation message to try to find one.
763
764  When in.ndpd learns of a new prefix from a Router Advertisement, it
765  will scan all of the IFF_DHCPRUNNING IP logical interfaces on the
766  same physical interface and set the netmasks when necessary.
767  Dhcpagent, for its part, will ignore the netmask on IPv6 interfaces
768  when checking for changes that would require it to "abandon" the
769  interface.
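
  The prefix check that in.ndpd performs could look something like
  the sketch below.  The structure and function names are
  hypothetical, and preferring the longest matching prefix is an
  assumption; the text only says that a "known matching prefix" is
  used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one advertised prefix. */
typedef struct {
    uint8_t         addr[16];    /* prefix bits, network byte order */
    unsigned int    len;         /* prefix length, 0 to 128 */
} ex_prefix_t;

static int
ex_prefix_match(const uint8_t addr[16], const ex_prefix_t *p)
{
    unsigned int full = p->len / 8;    /* whole bytes to compare */
    unsigned int rem = p->len % 8;     /* remaining leading bits */

    if (p->len > 128)
        return (0);
    for (unsigned int i = 0; i < full; i++) {
        if (addr[i] != p->addr[i])
            return (0);
    }
    if (rem != 0) {
        uint8_t mask = (uint8_t)(0xff << (8 - rem));

        if ((addr[full] & mask) != (p->addr[full] & mask))
            return (0);
    }
    return (1);
}

/*
 * Return the prefix length to configure for a leased address.  The
 * /128 default means no prefix matched; the caller would then send a
 * Router Solicitation.
 */
unsigned int
ex_netmask_len(const uint8_t addr[16], const ex_prefix_t *pfx, size_t npfx)
{
    unsigned int best = 0;
    int found = 0;

    for (size_t i = 0; i < npfx; i++) {
        if (ex_prefix_match(addr, &pfx[i]) &&
            (!found || pfx[i].len > best)) {
            best = pfx[i].len;
            found = 1;
        }
    }
    return (found ? best : 128);
}
```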
770
771  Given the way that DHCPv6 and in.ndpd control both the horizontal
772  and the vertical in plumbing and removing logical interfaces, and
773  users do not, it might be worthwhile to consider roping off any
774  direct user changes to IPv6 logical interfaces under control of
775  in.ndpd or dhcpagent, and instead force users through a higher-level
776  interface.  This won't be done as part of this project, however.
777
778
779ARP Hardware Types
780
781  There are multiple places within the DHCPv6 client where the mapping
782  of DLPI MAC type to ARP Hardware Type is required:
783
784  - When we are constructing an automatic, stable DUID for our own
785    identity, we prefer to use a DUID-LLT if possible.  This is done
786    by finding a link-layer interface, opening it, reading the MAC
787    address and type, and translating in the make_stable_duid()
788    function in libdhcpagent.
789
790  - When we translate a user-configured DUID from
791    /etc/default/dhcpagent into a binary representation, we may have
792    to deal with a physical interface name.  In this case, we must
793    open that interface and read the MAC address and type.
794
795  - As part of the PIF data structure initialization, we need to read
796    out the MAC type so that it can be used in the BOOTP/DHCPv4
797    'htype' field.
798
799  Ideally, these would all be provided by a single libdlpi
800  implementation.  However, that project is on-going at this time and
801  has not yet integrated.  For the time being, a dlpi_to_arp()
802  translation function (taking dl_mac_type and returning an ARP
803  Hardware Type number) will be placed in libdhcputil.
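
  As a rough illustration of what such a translation function does,
  consider the sketch below.  The EX_* constants are stand-ins
  defined locally so that the example is self-contained; their values
  are believed to match the usual DLPI MAC types and IANA ARP
  hardware types, but should be checked against <sys/dlpi.h> and the
  ARPHRD_* definitions.  Returning 0 for an unknown type is also an
  assumption.

```c
/* Stand-ins for DLPI MAC types (assumed values). */
#define EX_DL_CSMACD        0x0     /* IEEE 802.3 CSMA/CD */
#define EX_DL_TPR           0x2     /* IEEE 802.5 Token Ring */
#define EX_DL_ETHER         0x4     /* Ethernet */
#define EX_DL_IB            0x1a    /* InfiniBand */

/* Stand-ins for ARP Hardware Type numbers (IANA registry). */
#define EX_ARPHRD_ETHER     1
#define EX_ARPHRD_IEEE802   6
#define EX_ARPHRD_IB        32

/*
 * Translate a DLPI MAC type to an ARP Hardware Type.  Returns 0 if
 * no mapping is known.
 */
unsigned int
ex_dlpi_to_arp(unsigned int dl_mac_type)
{
    switch (dl_mac_type) {
    case EX_DL_ETHER:
        return (EX_ARPHRD_ETHER);
    case EX_DL_CSMACD:
    case EX_DL_TPR:
        return (EX_ARPHRD_IEEE802);
    case EX_DL_IB:
        return (EX_ARPHRD_IB);
    default:
        return (0);
    }
}
```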
804
805  This temporary function should be removed and this section of the
806  code updated when the new libdlpi from Clearview integrates.
807
808
809Field Mappings
810
811  Old (all in ifslist)	New
812  next			dhcp_smach_t.dsm_next
813  prev			dhcp_smach_t.dsm_prev
814  if_hold_count		dhcp_smach_t.dsm_hold_count
815  if_ia			dhcp_smach_t.dsm_ia
816  if_async		dhcp_smach_t.dsm_async
817  if_state		dhcp_smach_t.dsm_state
818  if_dflags		dhcp_smach_t.dsm_dflags
819  if_name		dhcp_smach_t.dsm_name (see text)
820  if_index		dhcp_pif_t.pif_index
821  if_max		dhcp_lif_t.lif_max and dhcp_pif_t.pif_max
822  if_min		(was unused; removed)
823  if_opt		(was unused; removed)
824  if_hwaddr		dhcp_pif_t.pif_hwaddr
825  if_hwlen		dhcp_pif_t.pif_hwlen
826  if_hwtype		dhcp_pif_t.pif_hwtype
827  if_cid		dhcp_smach_t.dsm_cid
828  if_cidlen		dhcp_smach_t.dsm_cidlen
829  if_prl		dhcp_smach_t.dsm_prl
830  if_prllen		dhcp_smach_t.dsm_prllen
831  if_daddr		dhcp_pif_t.pif_daddr
832  if_dlen		dhcp_pif_t.pif_dlen
833  if_saplen		dhcp_pif_t.pif_saplen
834  if_sap_before		dhcp_pif_t.pif_sap_before
835  if_dlpi_fd		dhcp_pif_t.pif_dlpi_fd
836  if_sock_fd		v4_sock_fd and v6_sock_fd (globals)
837  if_sock_ip_fd		dhcp_lif_t.lif_sock_ip_fd
838  if_timer		(see text)
839  if_t1			dhcp_lease_t.dl_t1
840  if_t2			dhcp_lease_t.dl_t2
841  if_lease		dhcp_lif_t.lif_expire
842  if_nrouters		dhcp_smach_t.dsm_nrouters
843  if_routers		dhcp_smach_t.dsm_routers
844  if_server		dhcp_smach_t.dsm_server
845  if_addr		dhcp_lif_t.lif_v6addr
846  if_netmask		dhcp_lif_t.lif_v6mask
847  if_broadcast		dhcp_lif_t.lif_v6peer
848  if_ack		dhcp_smach_t.dsm_ack
849  if_orig_ack		dhcp_smach_t.dsm_orig_ack
850  if_offer_wait		dhcp_smach_t.dsm_offer_wait
851  if_offer_timer	dhcp_smach_t.dsm_offer_timer
852  if_offer_id		dhcp_pif_t.pif_dlpi_id
853  if_acknak_id		dhcp_lif_t.lif_acknak_id
854  if_acknak_bcast_id	v4_acknak_bcast_id (global)
855  if_neg_monosec	dhcp_smach_t.dsm_neg_monosec
856  if_newstart_monosec	dhcp_smach_t.dsm_newstart_monosec
857  if_curstart_monosec	dhcp_smach_t.dsm_curstart_monosec
858  if_disc_secs		dhcp_smach_t.dsm_disc_secs
859  if_reqhost		dhcp_smach_t.dsm_reqhost
860  if_recv_pkt_list	dhcp_smach_t.dsm_recv_pkt_list
861  if_sent		dhcp_smach_t.dsm_sent
862  if_received		dhcp_smach_t.dsm_received
863  if_bad_offers		dhcp_smach_t.dsm_bad_offers
864  if_send_pkt		dhcp_smach_t.dsm_send_pkt
865  if_send_timeout	dhcp_smach_t.dsm_send_timeout
866  if_send_dest		dhcp_smach_t.dsm_send_dest
867  if_send_stop_func	dhcp_smach_t.dsm_send_stop_func
868  if_packet_sent	dhcp_smach_t.dsm_packet_sent
869  if_retrans_timer	dhcp_smach_t.dsm_retrans_timer
870  if_script_fd		dhcp_smach_t.dsm_script_fd
871  if_script_pid		dhcp_smach_t.dsm_script_pid
872  if_script_helper_pid	dhcp_smach_t.dsm_script_helper_pid
873  if_script_event	dhcp_smach_t.dsm_script_event
874  if_script_event_id	dhcp_smach_t.dsm_script_event_id
875  if_callback_msg	dhcp_smach_t.dsm_callback_msg
876  if_script_callback	dhcp_smach_t.dsm_script_callback
877
878  Notes:
879
880    - The dsm_name field currently just points to the lif_name on the
881      controlling LIF.  This may need to be named differently in the
882      future; perhaps when Zones are supported.
883
884    - The timer mechanism will be refactored.  Rather than using the
885      separate if_timer[] array to hold the timer IDs and
886      if_{t1,t2,lease} to hold the relative timer values, we will
887      gather this information into a dhcp_timer_t structure:
888
889	dt_id		timer ID value
890	dt_start	relative start time
891
892  New fields not accounted for above:
893
894  dhcp_pif_t.pif_next		linkage in global list of PIFs
895  dhcp_pif_t.pif_prev		linkage in global list of PIFs
896  dhcp_pif_t.pif_lifs		pointer to list of LIFs on this PIF
897  dhcp_pif_t.pif_isv6		IPv6 flag
898  dhcp_pif_t.pif_dlpi_count	number of state machines using DLPI
899  dhcp_pif_t.pif_hold_count	reference count
900  dhcp_pif_t.pif_name		name of physical interface
901  dhcp_lif_t.lif_next		linkage in per-PIF list of LIFs
902  dhcp_lif_t.lif_prev		linkage in per-PIF list of LIFs
903  dhcp_lif_t.lif_pif		backpointer to parent PIF
904  dhcp_lif_t.lif_smachs		pointer to list of state machines
905  dhcp_lif_t.lif_lease		backpointer to lease holding LIF
906  dhcp_lif_t.lif_flags		interface flags (IFF_*)
907  dhcp_lif_t.lif_hold_count	reference count
908  dhcp_lif_t.lif_dad_wait	waiting for DAD resolution flag
909  dhcp_lif_t.lif_removed	removed from list flag
910  dhcp_lif_t.lif_plumbed	plumbed by dhcpagent flag
911  dhcp_lif_t.lif_expired	lease has expired flag
912  dhcp_lif_t.lif_declined	reason to refuse this address (string)
913  dhcp_lif_t.lif_iaid		unique and stable 32-bit identifier
914  dhcp_lif_t.lif_iaid_id	timer for delayed /etc writes
915  dhcp_lif_t.lif_preferred	preferred timer for v6; deprecate after
916  dhcp_lif_t.lif_name		name of logical interface
917  dhcp_smach_t.dsm_lif		controlling (main) LIF
918  dhcp_smach_t.dsm_leases	pointer to list of leases
919  dhcp_smach_t.dsm_lif_wait	number of LIFs waiting on DAD
920  dhcp_smach_t.dsm_lif_down	number of LIFs that have failed
921  dhcp_smach_t.dsm_using_dlpi	currently using DLPI flag
922  dhcp_smach_t.dsm_send_tcenter	v4 central timer value; v6 MRT
923  dhcp_lease_t.dl_next		linkage in per-state-machine list of leases
924  dhcp_lease_t.dl_prev		linkage in per-state-machine list of leases
925  dhcp_lease_t.dl_smach		back pointer to state machine
926  dhcp_lease_t.dl_lifs		pointer to first LIF configured by lease
927  dhcp_lease_t.dl_nlifs		number of configured consecutive LIFs
928  dhcp_lease_t.dl_hold_count	reference counter
929  dhcp_lease_t.dl_removed	removed from list flag
930  dhcp_lease_t.dl_stale		lease was not updated by Renew/Rebind
931
932
933Snoop
934
935  The snoop changes are fairly straightforward.  As snoop just decodes
936  the messages, and the message format is quite different between
937  DHCPv4 and DHCPv6, a new module will be created to handle DHCPv6
938  decoding, and will export an interpret_dhcpv6() function.
939
940  The one bit of commonality between the two protocols is the use of
941  ARP Hardware Type numbers, which are found in the underlying BOOTP
942  message format for DHCPv4 and in the DUID-LL and DUID-LLT
943  construction for DHCPv6.  To simplify this, the existing static
944  show_htype() function in snoop_dhcp.c will be renamed to arp_htype()
945  (to better reflect its functionality), updated with more modern
946  hardware types, moved to snoop_arp.c (where it belongs), and made a
947  public symbol within snoop.
948
949  While I'm there, I'll update snoop_arp.c so that when it prints an
950  ARP message in verbose mode, it uses arp_htype() to translate the
951  ar_hrd value.
952
953  The snoop updates also involve the addition of a new "dhcp6" keyword
954  for filtering.  As a part of this, CR 6487534 will be fixed.
955
956
957IPv6 Source Address Selection
958
959  One of the customer requests for DHCPv6 is to be able to predict the
960  address selection behavior in the presence of both stateful and
961  stateless addresses on the same network.
962
963  Solaris implements RFC 3484 address selection behavior.  In this
964  scheme, the first seven rules implement some basic preferences for
965  addresses, with Rule 8 being a deterministic tie breaker.
966
967  Rule 8 relies on a special function, CommonPrefixLen, defined in the
968  RFC, that compares leading bits of the address without regard to
969  configured prefix length.  As Rule 1 eliminates equal addresses,
970  this always picks a single address.
971
972  This rule, though, allows for additional checks:
973
974   Rule 8 may be superseded if the implementation has other means of
975   choosing among source addresses.  For example, if the implementation
976   somehow knows which source address will result in the "best"
977   communications performance.
978
979  We will thus split Rule 8 into three separate rules:
980
981  - First, compare on configured prefix.  The interface with the
982    longest configured prefix length that also matches the candidate
983    address will be preferred.
984
985  - Next, check the type of address.  Prefer statically configured
986    addresses above all others.  Next, those from DHCPv6.  Next,
987    stateless autoconfigured addresses.  Finally, temporary addresses.
988    (Note that Rule 7 will take care of temporary address preferences,
989    so that this rule doesn't actually need to look at them.)
990
991  - Finally, run the check-all-bits (CommonPrefixLen) tie breaker.
992
993  The result of this is that if there's a local address in the same
994  configured prefix, then we'll prefer that over other addresses.  If
995  there are multiple to choose from, then we'll pick static first, then
996  DHCPv6, then dynamic.  Finally, if there are still multiples, we'll
997  use the "closest" address, bitwise.
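
  The three-way split of Rule 8 can be sketched as a pairwise
  comparison.  All names here are hypothetical, and the sketch reads
  "matches the candidate address" as "the source's configured prefix
  covers the destination"; that reading is an assumption.  This is
  not the code in the ip module.

```c
#include <stdint.h>

/* Address origin; lower values are preferred. */
typedef enum { EX_STATIC, EX_DHCPV6, EX_STATELESS } ex_origin_t;

typedef struct {
    uint8_t         addr[16];
    unsigned int    plen;      /* configured prefix length */
    ex_origin_t     origin;
} ex_src_t;

/* RFC 3484 CommonPrefixLen: number of leading bits in common. */
unsigned int
ex_common_prefix_len(const uint8_t a[16], const uint8_t b[16])
{
    unsigned int bits = 0;

    for (int i = 0; i < 16; i++) {
        uint8_t x = a[i] ^ b[i];

        if (x == 0) {
            bits += 8;
            continue;
        }
        while ((x & 0x80) == 0) {
            bits++;
            x <<= 1;
        }
        break;
    }
    return (bits);
}

/* Configured prefix length if the prefix covers dst, else 0. */
static unsigned int
ex_prefix_score(const ex_src_t *s, const uint8_t dst[16])
{
    return (ex_common_prefix_len(s->addr, dst) >= s->plen ? s->plen : 0);
}

/* Return whichever of a and b is the preferred source for dst. */
const ex_src_t *
ex_better_src(const ex_src_t *a, const ex_src_t *b, const uint8_t dst[16])
{
    unsigned int sa = ex_prefix_score(a, dst);
    unsigned int sb = ex_prefix_score(b, dst);

    if (sa != sb)                      /* 1: configured prefix */
        return (sa > sb ? a : b);
    if (a->origin != b->origin)        /* 2: type of address */
        return (a->origin < b->origin ? a : b);
    sa = ex_common_prefix_len(a->addr, dst);
    sb = ex_common_prefix_len(b->addr, dst);
    return (sb > sa ? b : a);          /* 3: bitwise tie breaker */
}
```

  With this ordering, a stateless address on the destination's own
  prefix beats a static address on a different prefix, which matches
  the summary given above.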
998
999  This basic implementation scheme also addresses CR 6485164, so
1000  a fix for that will be included with this project.
1001
1002
1003Minor Improvements
1004
1005  Various small problems with the system encountered during
1006  development will be fixed along with this project.  Some of these
1007  are:
1008
1009  - List of ARPHRD_* types is a bit short; add some new ones.
1010
1011  - List of IPPORT_* values is similarly sparse; add others in use by
1012    snoop.
1013
1014  - dhcpmsg.h lacks PRINTFLIKE for dhcpmsg(); add it.
1015
1016  - CR 6482163 causes excessive lint errors with libxnet; will fix.
1017
1018  - libdhcpagent uses gettimeofday() for I/O timing, and this can
1019    drift on systems with NTP.  It should use a stable time source
1020    (gethrtime()) instead, and should return better error values.
1021
1022  - Controlling debug mode in the daemon shouldn't require changing
1023    the command line arguments or jumping through special hoops.  I've
1024    added undocumented ".DEBUG_LEVEL=[0-3]" and ".VERBOSE=[01]"
1025    features to /etc/default/dhcpagent.
1026
1027  - The various attributes of the IPC commands (requires privileges,
1028    creates a new session, valid with BOOTP, immediate reply) should
1029    be gathered together into one look-up table rather than scattered
1030    as hard-coded tests.
1031
1032  - Remove the event unregistration from the command dispatch loop and
1033    get rid of the ipc_action_pending() botch.  We'll get a
1034    zero-length read any time the client goes away, and that will be
1035    enough to trigger termination.  This fix removes async_pending()
1036    and async_timeout() as well, and fixes CR 6487958 as a
1037    side-effect.
1038
1039  - Throughout the dhcpagent code, there are private implementations
1040    of doubly-linked and singly-linked lists for each data type.
1041    These will all be removed and replaced with insque(3C) and
1042    remque(3C).
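
  For the last item, the following sketch shows how insque(3C) and
  remque(3C) are used for a linear (non-circular) list.  The ex_node_t
  type and helper names are hypothetical.  The only requirement the
  functions place on the structure is that its first two members are
  the forward and backward pointers, in that order.

```c
#include <search.h>
#include <stddef.h>

typedef struct ex_node {
    struct ex_node  *next;    /* forward link; must be first */
    struct ex_node  *prev;    /* backward link; must be second */
    int             value;
} ex_node_t;

/* Insert elem after pred; a NULL pred starts a new linear list. */
void
ex_list_insert(ex_node_t *elem, ex_node_t *pred)
{
    insque(elem, pred);
}

/* Unlink elem from whatever list it is on. */
void
ex_list_remove(ex_node_t *elem)
{
    remque(elem);
}

/* Count the nodes reachable from head by following forward links. */
size_t
ex_list_length(const ex_node_t *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return (n);
}
```

  Note that remque() does not tell the caller when the head itself
  has been removed, so code that keeps a separate head pointer still
  has to update it by hand.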
1043
1044
1045Testing
1046
1047  The implementation was tested using the TAHI test suite for DHCPv6
1048  (www.tahi.org).  There are some peculiar aspects to this test suite,
1049  and these issues directed some of the design.  In particular:
1050
1051  - If Renew/Rebind doesn't mention one of our leases, then we need to
1052    allow the message to be retransmitted.  Real servers are unlikely
1053    to do this.
1054
1055  - We must look for a status code within IAADDR and within IA_NA, and
1056    handle the paradoxical case of "NoAddrsAvail."  That doesn't make
1057    sense, as a server with no addresses wouldn't use those options.
1058    That option makes more sense at the top level of the message.
1059
1060  - If we get "UseMulticast" when we were already using multicast,
1061    then ignore the error code.  Sending another request would cause a
1062    loop.
1063
1064  - TAHI uses "NoBinding" at the top level of the message.  This
1065    status code only makes sense within an IA, as it refers to the
1066    DUID:IAID binding, which doesn't exist outside an IA.  We must
1067    ignore such errors -- treat them as success.
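
  The top-level status code handling implied by the last two items
  can be sketched as below.  The numeric values are the status codes
  defined in RFC 3315; the enum and function names are hypothetical,
  and treating every other non-zero code as a failure is a
  simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status codes from RFC 3315, section 24.4. */
enum {
    EX_ST_SUCCESS       = 0,
    EX_ST_UNSPECFAIL    = 1,
    EX_ST_NOADDRSAVAIL  = 2,
    EX_ST_NOBINDING     = 3,
    EX_ST_NOTONLINK     = 4,
    EX_ST_USEMULTICAST  = 5
};

/*
 * Decide whether a status code found at the top level of a reply
 * should be treated as success.  sent_multicast says whether the
 * request being answered was itself sent by multicast.
 */
bool
ex_top_level_ok(uint16_t status, bool sent_multicast)
{
    switch (status) {
    case EX_ST_SUCCESS:
        return (true);
    case EX_ST_NOBINDING:
        /* Only meaningful inside an IA; ignore at top level. */
        return (true);
    case EX_ST_USEMULTICAST:
        /* Resending by multicast again would just loop. */
        return (sent_multicast);
    default:
        return (false);
    }
}
```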
1068
1069
1070Interactions With Other Projects
1071
1072  Clearview UV (vanity naming) will cause link names, and thus IP
1073  interface names, to become changeable over time.  This will break
1074  the IAID stability mechanism if UV is used for arbitrary renaming,
1075  rather than as just a DR enhancement.
1076
1077  When this portion of Clearview integrates, this part of the DHCPv6
1078  design may need to be revisited.  (The solution will likely be
1079  handled at some higher layer, such as within Network Automagic.)
1080
1081  Clearview is also contributing a new libdlpi that will work for
1082  dhcpagent, and is thus removing the private dlpi_io.[ch] functions
1083  from this daemon.  When that Clearview project integrates, the
1084  DHCPv6 project will need to adjust to the new interfaces, and remove
1085  or relocate the dlpi_to_arp() function.
1086
1087
1088Futures
1089
1090  Zones currently cannot address any IP interfaces by way of DHCP.
1091  This project will not fix that problem, but the DUID/IAID could be
1092  used to help fix it in the future.
1093
1094  In particular, the DUID allows the client to obtain separate sets of
1095  addresses and configuration parameters on a single interface, just
1096  like an IPv4 Client ID, but it includes a clean mechanism for vendor
1097  extensions.  If we associate the DUID with the zone identifier or
1098  name through an extension, then we have a really simple way of
1099  allocating per-zone addresses.
1100
1101  Moreover, RFC 4361 describes a handy way of using DHCPv6 DUID/IAID
1102  values with IPv4 DHCP, which would quickly solve the problem of
1103  using DHCP for IPv4 address assignment in non-global zones as well.
1104
1105  (One potential risk with this plan is that there may be server
1106  implementations that either do not implement the RFC correctly or
1107  otherwise mishandle the DUID.  This has apparently bitten some early
1108  adopters.)
1109
1110  Implementing the FQDN option for DHCPv6 would, given the current
1111  libdhcputil design, require a new 'type' of entry for the inittab6
1112  file.  This is because the design does not allow for any simple
1113  means to ``compose'' a sequence of basic types together.  Thus,
1114  every type of option must either be a basic type, or an array of
1115  multiple instances of the same basic type.
1116
1117  If we implement FQDN in the future, it may be useful to explore some
1118  means of allowing a given option instance to be a sequence of basic
1119  types.
1120
1121  This project does not make the DNS resolver or any other subsystem
1122  use the data gathered by DHCPv6.  It just makes the data available
1123  through dhcpinfo(1).  Future projects should modify those services
1124  to use configuration data learned via DHCPv6.  (One of the reasons
1125  this is not being done now is that Network Automagic [NWAM] will
1126  likely be changing this area substantially in the very near future,
1127  and thus the effort would be largely wasted.)
1128
1129
1130Appendix A - Choice of Venue
1131
1132  There are three logical places to implement DHCPv6:
1133
1134    - in dhcpagent
1135    - in in.ndpd
1136    - in a new daemon (say, 'dhcp6agent')
1137
1138  We need to access parameters via dhcpinfo, and should provide the
1139  same set of status and control features via ifconfig as are present
1140  for IPv4.  (For the latter, if we fail to do that, it will likely
1141  confuse users.  The expense for doing it is comparatively small, and
1142  it will be useful for testing, even though it should not be needed
1143  in normal operation.)
1144
1145  If we implement somewhere other than dhcpagent, then we need to give
1146  that new daemon (in.ndpd or dhcp6agent) the same basic IPC features
1147  as dhcpagent already has.  This means either extracting those bits
1148  (async.c and ipc_action.c) into a shared library or just copying
1149  them.  Obviously, the former would be preferred, but as those bits
1150  depend on the rest of the dhcpagent infrastructure for timers and
1151  state handling, this means that the new process would have to look a
1152  lot like dhcpagent.
1153
1154  Implementing DHCPv6 as part of in.ndpd is attractive, as it
1155  eliminates the confusion that the router discovery process for
1156  determining interface netmasks can cause, along with the need to do
1157  any signaling at all to bring DHCPv6 up.  However, the need to make
1158  in.ndpd more like dhcpagent is unattractive.
1159
1160  Having a new dhcp6agent daemon seems to have little to recommend it,
1161  other than leaving the existing dhcpagent code untouched.  If we do
1162  that, then we end up with two implementations that do many similar
1163  things, and must be maintained in parallel.
1164
1165  Thus, although it leads to some complexity in reworking the data
1166  structures to fit both protocols, on balance the simplest solution
1167  is to extend dhcpagent.
1168
1169
1170Appendix B - Cross-Reference
1171
1172  in.ndpd
1173
1174    - Start dhcpagent and issue "dhcp start" command via libdhcpagent
1175    - Parse StatefulAddrConf interface option from ndpd.conf
1176    - Watch for M and O bits to trigger DHCPv6
1177    - Handle "no routers found" case and start DHCPv6
1178    - Track prefixes and set prefix length on IFF_DHCPRUNNING aliases
1179    - Send new Router Solicitation when prefix unknown
1180    - Change privileges so that dhcpagent can be launched successfully
1181
1182  libdhcputil
1183
1184    - Parse new /etc/dhcp/inittab6 file
1185    - Handle new UNUMBER24, SNUMBER64, IPV6, DUID and DOMAIN types
1186    - Add DHCPv6 option iterators (dhcpv6_find_option and
1187      dhcpv6_pkt_option)
1188    - Add dlpi_to_arp function (temporary)
1189
1190  libdhcpagent
1191
1192    - Add stable DUID and IAID creation and storage support
1193      functions and add new dhcp_stable.h include file
1194    - Support new DECLINING and RELEASING states introduced by DHCPv6.
1195    - Update implementation so that it doesn't rely on gettimeofday()
1196      for I/O timeouts
1197    - Extend the hostconf functions to support DHCPv6, using a new
1198      ".dh6" file
1199
1200  snoop
1201
1202    - Add support for DHCPv6 packet decoding (all types)
1203    - Add "dhcp6" filter keyword
1204    - Fix known bugs in DHCP filtering
1205
1206  ifconfig
1207
1208    - Remove inet-only restriction on "dhcp" keyword
1209
1210  netstat
1211
1212    - Remove strange "-I list" feature.
1213    - Add support for DHCPv6 and iterating over IPv6 interfaces.
1214
1215  ip
1216
1217    - Add extensions to IPv6 source address selection to prefer DHCPv6
1218      addresses when all else is equal
1219    - Fix known bugs in source address selection (remaining from TX
1220      integration)
1221
1222  other
1223
1224    - Add ifindex and source/destination address into PKT_LIST.
1225    - Add more ARPHRD_* and IPPORT_* values.
1226