.\" Copyright (c) 1996-1999 Whistle Communications, Inc.
.\" All rights reserved.
.\"
.\" Subject to the following obligations and disclaimer of warranty, use and
.\" redistribution of this software, in source or object code forms, with or
.\" without modifications are expressly permitted by Whistle Communications;
.\" provided, however, that:
.\" 1. Any and all reproductions of the source or object code must include the
.\"    copyright notice above and the following disclaimer of warranties; and
.\" 2. No rights are granted, in any manner or form, to use Whistle
.\"    Communications, Inc. trademarks, including the mark "WHISTLE
.\"    COMMUNICATIONS" on advertising, endorsements, or otherwise except as
.\"    such appears in the above copyright notice or in the software.
.\"
.\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND
.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO
.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE,
.\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
.\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY
.\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS
.\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE.
.\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES
.\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING
.\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
.\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
.\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY
.\" OF SUCH DAMAGE.
.\"
.\" Authors: Julian Elischer <julian@FreeBSD.org>
.\"          Archie Cobbs <archie@FreeBSD.org>
.\"
.\" $FreeBSD$
.\" $Whistle: netgraph.4,v 1.7 1999/01/28 23:54:52 julian Exp $
.\"
.Dd January 19, 1999
.Dt NETGRAPH 4
.Os
.Sh NAME
.Nm netgraph
.Nd graph based kernel networking subsystem
.Sh DESCRIPTION
The
.Nm
system provides a uniform and modular system for the implementation
of kernel objects which perform various networking functions. The objects,
known as
.Em nodes ,
can be arranged into arbitrarily complicated graphs. Nodes have
.Em hooks
which are used to connect two nodes together, forming the edges in the graph.
Nodes communicate along the edges to process data, implement protocols, etc.
.Pp
The aim of
.Nm
is to supplement rather than replace the existing kernel networking
infrastructure.
It provides:
.Pp
.Bl -bullet -compact -offset 2n
.It
A flexible way of combining protocol and link level drivers
.It
A modular way to implement new protocols
.It
A common framework for kernel entities to inter-communicate
.It
A reasonably fast, kernel-based implementation
.El
.Ss Nodes and Types
The most fundamental concept in
.Nm
is that of a
.Em node .
All nodes implement a number of predefined methods which allow them
to interact with other nodes in a well defined manner.
.Pp
Each node has a
.Em type ,
which is a static property of the node determined at node creation time.
A node's type is described by a unique
.Tn ASCII
type name.
The type implies what the node does and how it may be connected
to other nodes.
.Pp
In object-oriented language, types are classes and nodes are instances
of their respective class. All node types are subclasses of the generic node
type, and hence inherit certain common functionality and capabilities
(e.g., the ability to have an
.Tn ASCII
name).
.Pp
Nodes may be assigned a globally unique
.Tn ASCII
name which can be
used to refer to the node.
The name must not contain the characters
.Dq .\&
or
.Dq \&:
and is limited to
.Dv "NG_NODELEN + 1"
characters (including NUL byte).
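.Pp
The naming rule above can be checked mechanically.
The following userland sketch is illustrative only:
.Fn ng_name_ok
is a hypothetical helper and not part of the
.Nm
API, the value given to
.Dv NG_NODELEN
is an assumption (the real value is defined in
.Pa sys/netgraph/ng_message.h ) ,
and rejecting the empty string is likewise an assumption.
.Bd -literal -offset 4n
#include <stddef.h>
#include <string.h>

#define NG_NODELEN 15   /* assumed value; see ng_message.h */

/*
 * Hypothetical helper: return 1 if name could serve as a node name,
 * i.e. it is non-empty, fits in NG_NODELEN characters plus the NUL
 * byte, and contains neither of the address separators; else 0.
 */
static int
ng_name_ok(const char *name)
{
    size_t len;

    if (name == NULL)
        return (0);
    len = strlen(name);
    if (len == 0 || len > NG_NODELEN)
        return (0);
    /* strpbrk() returns NULL when none of the listed bytes occur. */
    return (strpbrk(name, ".:") == NULL);
}
.Ed
.Pp
The same test, with the limit swapped for
.Dv NG_HOOKLEN ,
would apply to hook names, which follow the same character rule.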
.Pp
Each node instance has a unique
.Em ID number
which is expressed as a 32-bit hex value.
This value may be used to refer to a node when there is no
.Tn ASCII
name assigned to it.
.Ss Hooks
Nodes are connected to other nodes by connecting a pair of
.Em hooks ,
one from each node. Data flows bidirectionally between nodes along
connected pairs of hooks.
A node may have as many hooks as it
needs, and may assign whatever meaning it wants to a hook.
.Pp
Hooks have these properties:
.Pp
.Bl -bullet -compact -offset 2n
.It
A hook has an
.Tn ASCII
name which is unique among all hooks
on that node (other hooks on other nodes may have the same name).
The name must not contain a
.Dq .\&
or a
.Dq \&:
and is
limited to
.Dv "NG_HOOKLEN + 1"
characters (including NUL byte).
.It
A hook is always connected to another hook.
That is, hooks are
created at the time they are connected, and breaking an edge by
removing either hook destroys both hooks.
.It
A hook can be set into a state where incoming packets are always queued
by the input queueing system, rather than being delivered directly.
This is used when the two joined nodes need to be decoupled, e.g. if they are
running at different processor priority (spl) levels.
.It
A hook may supply overriding receive data and receive message functions
which should be used for data and messages received through that hook
in preference to the general node-wide methods.
.El
.Pp
A node may decide to assign special meaning to some hooks.
For example, connecting to the hook named
.Dq debug
might trigger
the node to start sending debugging information to that hook.
.Ss Data Flow
Two types of information flow between nodes: data messages and
control messages.
Data messages are passed in mbuf chains along the edges
in the graph, one edge at a time.
The first mbuf in a chain must have the
.Dv M_PKTHDR
flag set. Each node decides how to handle data coming in on its hooks.
.Pp
Control messages are type-specific C structures sent from one node
directly to some arbitrary other node.
Control messages have a common
header format, followed by type-specific data, and are binary structures
for efficiency.
However, node types also may support conversion of the
type specific data between binary and
.Tn ASCII
for debugging and human interface purposes (see the
.Dv NGM_ASCII2BINARY
and
.Dv NGM_BINARY2ASCII
generic control messages below).
Nodes are not required to support these conversions.
.Pp
There are three ways to address a control message.
If there is a sequence of edges connecting the two nodes, the message
may be
.Dq source routed
by specifying the corresponding sequence
of
.Tn ASCII
hook names as the destination address for the message (relative
addressing).
If the destination is adjacent to the source, then the source
node may simply specify (as a pointer in the code) the hook across which the
message should be sent.
Otherwise, the recipient node's global
.Tn ASCII
name
(or equivalent ID based name) is used as the destination address
for the message (absolute addressing).
The two types of
.Tn ASCII
addressing
may be combined, by specifying an absolute start node and a sequence
of hooks. Only the
.Tn ASCII
addressing modes are available to control programs outside the kernel,
as use of direct pointers is of course limited to kernel modules.
.Pp
Messages often represent commands that are followed by a reply message
in the reverse direction.
To facilitate this, the recipient of a
control message is supplied with a
.Dq return address
that is suitable for addressing a reply.
.Pp
Each control message contains a 32-bit value called a
.Em typecookie
indicating the type of the message, i.e., how to interpret it.
Typically each type defines a unique typecookie for the messages
that it understands.
However, a node may choose to recognize and
implement more than one type of message.
.Pp
If a message is delivered to an address that implies that it arrived
at that node through a particular hook (as opposed to having been directly
addressed using its ID or global name), then that hook is identified to the
receiving node.
This allows a message to be rerouted or passed on, should
a node decide that this is required, in much the same way that data packets
are passed around between nodes. The base system defines a set of standard
messages for flow control and link management purposes, which are usually
passed around in this manner.
Flow control messages usually travel
in the opposite direction to the data to which they pertain.
.Ss Netgraph is (usually) Functional
In order to minimize latency, most
.Nm
operations are functional.
That is, data and control messages are delivered by making function
calls rather than by using queues and mailboxes.
For example, if node
A wishes to send a data mbuf to neighboring node B, it calls the
generic
.Nm
data delivery function.
This function in turn locates
node B and calls B's
.Dq receive data
method.
There are exceptions to this.
.Pp
Each node has an input queue, and some operations can be considered to
be
.Dq writers
in that they alter the state of the node.
Obviously in an SMP
world it would be bad if the state of a node were changed while another
data packet were transiting the node.
For this purpose, the input queue implements a
.Em reader/writer
semantic so that when there is a writer in the node, all other requests
are queued, and while there are readers, a writer and any requests
following it are queued.
In the case where there is no reason to queue the
data, the input method is called directly, as mentioned above.
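.Pp
The queueing rule just described can be modelled in a few lines.
This is a simplified userland sketch of the decision only, assuming that a
node tracks a reader count, a writer flag and the number of requests already
queued; the structure and function names are hypothetical and do not
correspond to the kernel implementation.
.Bd -literal -offset 4n
#include <stdbool.h>

/* Hypothetical model of the state behind a node's input queue. */
struct model_queue {
    unsigned readers;   /* requests currently running as readers */
    bool     writer;    /* a writer currently owns the node */
    unsigned pending;   /* requests already waiting in the queue */
};

/*
 * Return true if a new request has to be queued, false if it may be
 * delivered by a direct function call.
 */
static bool
model_must_queue(const struct model_queue *q, bool is_writer)
{
    if (q->writer)
        return (true);      /* a writer excludes everything else */
    if (q->pending > 0)
        return (true);      /* stay in order behind queued requests */
    if (is_writer && q->readers > 0)
        return (true);      /* a writer waits for readers to leave */
    return (false);
}
.Ed
.Pp
In this model an idle node accepts any request directly, and a node busy
with readers still accepts further readers directly until a writer is queued.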
.Pp
A node may declare that all requests should be considered as writers,
or that requests coming in over a particular hook should be considered to
be a writer, or even that packets leaving or entering across a particular
hook should always be queued, rather than delivered directly (often useful
for interrupt routines that want to get back to the hardware quickly).
By default, all control message packets are considered to be writers
unless specifically declared to be a reader in their definition (see
.Dv NGM_READONLY
in
.Pa ng_message.h ) .
.Pp
While this mode of operation
results in good performance, it has a few implications for node
developers:
.Pp
.Bl -bullet -compact -offset 2n
.It
Whenever a node delivers a data or control message, the node
may need to allow for the possibility of receiving a returning
message before the original delivery function call returns.
.It
Netgraph nodes and support routines generally run at
.Fn splnet .
However, some nodes may want to send data and control messages
from a different priority level.
Netgraph supplies a mechanism which
utilizes the NETISR system to move message and data delivery to
.Fn splnet .
Nodes that run at other priorities (e.g. interfaces) can be directly
linked to other nodes so that the combination runs at the other priority;
however, any interaction with nodes running at
.Fn splnet
MUST be achieved via the
queueing functions (which use the
.Fn netisr
feature of the kernel).
Note that messages are always received at
.Fn splnet .
.It
It's possible for an infinite loop to occur if the graph contains cycles.
.El
.Pp
So far, these issues have not proven problematic in practice.
.Ss Interaction With Other Parts of the Kernel
A node may have a hidden interaction with other components of the
kernel outside of the
.Nm
subsystem, such as device hardware,
kernel protocol stacks, etc.  In fact, one of the benefits of
.Nm
is the ability to join disparate kernel networking entities together in a
consistent communication framework.
.Pp
An example is the node type
.Em socket
which is both a netgraph node and a
.Xr socket 2
.Bx
socket in the protocol family
.Dv PF_NETGRAPH .
Socket nodes allow user processes to participate in
.Nm .
Other nodes communicate with socket nodes using the usual methods, and the
node hides the fact that it is also passing information to and from a
cooperating user process.
.Pp
Another example is a device driver that presents
a node interface to the hardware.
.Ss Node Methods
Nodes are notified of the following actions via function calls
to the following node methods (all at
.Fn splnet )
and may accept or reject that action (by returning the appropriate
error code):
.Bl -tag -width xxx
.It Creation of a new node
The constructor for the type is called. If creation of a new node is
allowed, the constructor must call the generic node creation
function (in object-oriented terms, the superclass constructor)
and then allocate any special resources it needs. For nodes that
correspond to hardware, this is typically done during the device
attach routine. Often a global
.Tn ASCII
name corresponding to the
device name is assigned here as well.
.It Creation of a new hook
The hook is created and tentatively
linked to the node, and the node is told about the name that will be
used to describe this hook. The node sets up any special data structures
it needs, or may reject the connection, based on the name of the hook.
.It Successful connection of two hooks
After both ends have accepted their
hooks, and the links have been made, the nodes get a chance to
find out who their peer is across the link and can then decide to reject
the connection. Tear-down is automatic. This is also the time at which
a node may decide whether to set a particular hook (or its peer) into
.Em queueing
mode.
.It Destruction of a hook
The node is notified of a broken connection. The node may consider some hooks
to be critical to operation and others to be expendable: the disconnection
of one hook may be an acceptable event while for another it
may cause a total shutdown of the node.
.It Shutdown of a node
This method allows a node to clean up
and to ensure that any actions that need to be performed
at this time are taken. The method is called by the generic (i.e., superclass)
node destructor which will get rid of the generic components of the node.
Some nodes (usually associated with a piece of hardware) may be
.Em persistent
in that a shutdown breaks all edges and resets the node,
but doesn't remove it. In this case the shutdown method should not
free its resources, but rather, clean up and then clear the
.Em NG_INVALID
flag to signal the generic code that the shutdown is aborted. In
the case where the shutdown is started by the node itself due to hardware
removal or unloading (via
.Fn ng_rmnode_self ) ,
it should set the
.Em NG_REALLY_DIE
flag to signal to its own shutdown method that it is not to persist.
.El
.Ss Sending and Receiving Data
Two other methods are also supported by all nodes:
.Bl -tag -width xxx
.It Receive data message
A
.Em Netgraph queueable request item ,
usually referred to as an
.Em item ,
is received by the function.
The item contains a pointer to an mbuf and metadata about the packet.
.Pp
The node is notified on which hook the item arrived,
and can use this information in its processing decision.
The receiving node must always
.Fn NG_FREE_M
the mbuf chain on completion or error, or pass it on to another node
(or kernel module) which will then be responsible for freeing it.
Similarly the
.Em item
must be freed if it is not to be passed on to another node, by using the
.Fn NG_FREE_ITEM
macro. If the item still holds references to mbufs or metadata at the time of
freeing then they will also be appropriately freed.
Therefore, if there is any chance that the mbuf or metadata will be
changed or freed separately from the item, it is very important
that these fields be retrieved using the
.Fn NGI_GET_M
and
.Fn NGI_GET_META
macros, which also remove the reference within the item; otherwise
multiple frees of the same object will occur.
.Pp
If it is only required to examine the contents of the mbufs or the
metadata, then it is possible to use the
.Fn NGI_M
and
.Fn NGI_META
macros to both read and rewrite these fields.
.Pp
In addition to the mbuf chain itself there may also be a pointer to a
structure describing meta-data about the message
(e.g. priority information). This pointer may be
.Dv NULL
if there is no additional information. The format for this information is
described in
.Pa sys/netgraph/netgraph.h .
The memory for meta-data must be allocated via
.Fn malloc
with type
.Dv M_NETGRAPH_META .
As with the data itself, it is the receiver's responsibility to
.Fn free
the meta-data. If the mbuf chain is freed the meta-data must
be freed at the same time. If the meta-data is freed but the
real data is passed on, then a
.Dv NULL
pointer must be substituted. It is also the duty of the receiver to free
the request item itself, or to use it to pass the message on further.
.Pp
The receiving node may decide to defer the data by queueing it in the
.Nm
NETISR system (see below). It achieves this by setting the
.Dv HK_QUEUE
flag in the flags word of the hook on which that data will arrive.
The infrastructure will respect that bit and queue the data for delivery at
a later time, rather than deliver it directly. A node may decide to set
the bit on the
.Em peer
node, so that its own output packets are queued. This is used
by device drivers running at different processor priorities to transfer
packet delivery to the
.Fn splnet
level at which the bulk of
.Nm
runs.
.Pp
The structure and use of meta-data is still experimental, but is
presently used in frame-relay to indicate that management packets
should be queued for transmission
at a higher priority than data packets. This is required for
conformance with Frame Relay standards.
.Pp
The node may elect to nominate a different receive data function
for data received on a particular hook, to simplify coding. It uses
the
.Fn NG_HOOK_SET_RCVDATA hook fn
macro to do this. The function receives the same arguments as the
node-wide method; the only difference is that it receives
all (and only) packets from that hook.
.It Receive control message
470.It Receive control message
471This method is called when a control message is addressed to the node.
472As with the received data, an
473.Em item
474is received, with a pointer to the control message.
475The message can be examined using the
476.Fn NGI_MSG
477macro, or completely extracted from the item using the
478.Fn NGI_GET_MSG
479which also removes the reference within the item.
480If the Item still holds a reference to the message when it is freed
481(using the
482.Fn NG_FREE_ITEM
483macro), then the message will also be freed appropriately. If the
484reference has been removed the node must free the message itself using the
485.Fn NG_FREE_MSG
486macro.
487A return address is always supplied, giving the address of the node
488that originated the message so a reply message can be sent anytime later.
489The return address is retrieved from the
490.Em item
491using the
492.Fn NGI_RETADDR
493macro and is of type
494.Em ng_ID_t .
495All control messages and replies are
496allocated with
497.Fn malloc
498type
499.Dv M_NETGRAPH_MSG ,
500however it is more usual to use the
501.Fn NG_MKMESSAGE
502and
503.Fn NG_MKRESPONSE
504macros to allocate and fill out a message.
505Messages must be freed using the
506.Fn NG_FREE_MSG
507macro.
508.Pp
509If the message was delivered via a specific hook, that hook will
510also be made known, which allows the use of such things as flow-control
511messages, and status change messages, where the node may want to forward
512the message out another hook to that on which it arrived.
513.Pp
514The node may elect to nominate a different receive message function
515for messages received on a particular hook, to simplify coding. It uses
516the
517.Fn NG_HOOK_SET_RCVMSG hook fn
518macro to do this. The function receives the same arguments in every way
519other than it will receive all (and only) messages from that hook.
520.El
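.Pp
The ownership rule for items described above (peek with
.Fn NGI_M ,
take with
.Fn NGI_GET_M )
can be illustrated with a userland model.
Everything in this sketch is a hypothetical stand-in: the
.Vt model_item
structure, the
.Fn MODEL_M
macro and the two functions only imitate the behaviour described in the
text and are not the kernel macros themselves.
.Bd -literal -offset 4n
#include <stdlib.h>

/* Hypothetical stand-in for an item that owns one data buffer. */
struct model_item {
    void *m;    /* plays the role of the mbuf chain */
};

/* Peek at the data; the item keeps its reference. */
#define MODEL_M(i)  ((i)->m)

/*
 * Take the data out of the item.  The reference inside the item is
 * cleared, so freeing the item later cannot free the data again.
 */
static void *
model_get_m(struct model_item *i)
{
    void *m = i->m;

    i->m = NULL;
    return (m);
}

/* Free the item together with anything it still references. */
static void
model_free_item(struct model_item *i)
{
    free(i->m);     /* free(NULL) is a no-op */
    free(i);
}
.Ed
.Pp
A receiver that needs to free or replace the data on its own would call
.Fn model_get_m
first; a receiver that only inspects the data would use
.Fn MODEL_M
and let
.Fn model_free_item
release everything.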
.Pp
Much use has been made of reference counts, so that nodes released
from all references are automatically freed, and this behaviour
has been tested and debugged to present a consistent and trustworthy
framework for the
.Dq type module
writer to use.
.Ss Addressing
The
.Nm
framework provides an unambiguous and simple to use method of specifically
addressing any single node in the graph. The naming of a node is
independent of its type, in that another node, or external component
need not know anything about the node's type in order to address it so as
to send it a generic message type. Node and hook names should be
chosen so as to make addresses meaningful.
.Pp
Addresses are either absolute or relative. An absolute address begins
with a node name (or ID), followed by a colon, followed by a sequence of hook
names separated by periods. This addresses the node reached by starting
at the named node and following the specified sequence of hooks.
A relative address includes only the sequence of hook names, implicitly
starting hook traversal at the local node.
.Pp
There are a couple of special possibilities for the node name.
The name
.Dq .\&
(referred to as
.Dq \&.: )
always refers to the local node.
Also, nodes that have no global name may be addressed by their ID numbers,
by enclosing the hex representation of the ID number within square brackets.
Here are some examples of valid netgraph addresses:
.Bd -literal -offset 4n -compact

  .:
  [3f]:
  foo:
  .:hook1
  foo:hook1.hook2
  [d80]:hook1
.Ed
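.Pp
The address syntax above is simple enough to split by hand.
The following userland sketch separates an address into its node part and
its hook path and decodes the bracketed ID form.
The structure, its field sizes and the function name are hypothetical, and
the sketch does not validate individual hook names.
.Bd -literal -offset 4n
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of a netgraph address. */
struct model_addr {
    char          node[32]; /* node part; empty if relative */
    unsigned long id;       /* node ID, valid if has_id is set */
    int           has_id;
    char          path[64]; /* hook names joined by periods */
};

/* Split addr into node and path; return 0 on success, -1 on error. */
static int
model_parse_addr(const char *addr, struct model_addr *out)
{
    const char *colon = strchr(addr, ':');
    const char *path = addr;
    size_t n;

    memset(out, 0, sizeof(*out));
    if (colon != NULL) {
        n = (size_t)(colon - addr);
        if (n == 0 || n >= sizeof(out->node))
            return (-1);
        memcpy(out->node, addr, n);
        path = colon + 1;
        if (out->node[0] == '[') {
            char *end;

            /* The ID is written in hex between brackets. */
            out->id = strtoul(out->node + 1, &end, 16);
            if (end == out->node + 1 || strcmp(end, "]") != 0)
                return (-1);
            out->has_id = 1;
        }
    }
    if (strlen(path) >= sizeof(out->path))
        return (-1);
    strcpy(out->path, path);
    return (0);
}
.Ed
.Pp
With this model,
.Dq foo:hook1.hook2
yields node
.Dq foo
and path
.Dq hook1.hook2 ,
while an address without a colon is treated as relative and leaves the
node part empty.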
.Pp
Consider the following set of nodes, which might be created for a site with
a single physical frame relay line having two active logical DLCI channels,
with RFC-1490 frames on DLCI 16 and PPP frames over DLCI 20:
.Pp
.Bd -literal
[type SYNC ]                  [type FRAME]                 [type RFC1490]
[ "Frame1" ](uplink)<-->(data)[<un-named>](dlci16)<-->(mux)[<un-named>  ]
[    A     ]                  [    B     ](dlci20)<---+    [     C      ]
                                                      |
                                                      |      [ type PPP ]
                                                      +>(mux)[<un-named>]
                                                             [    D     ]
.Ed
.Pp
One could always send a control message to node C from anywhere
by using the name
.Em "Frame1:uplink.dlci16" .
In this case, node C would also be notified that the message
reached it via its hook
.Dq mux .
Similarly,
.Em "Frame1:uplink.dlci20"
could reliably be used to reach node D, and node A could refer
to node B as
.Em ".:uplink" ,
or simply
.Em "uplink" .
Conversely, B can refer to A as
.Em "data" .
The address
.Em "mux.data"
could be used by both nodes C and D to address a message to node A.
.Pp
Note that this is only for
.Em control messages .
In each of these cases, where a relative addressing mode is
used, the recipient is notified of the hook on which the
message arrived, as well as
the originating node.
This allows the option of hop-by-hop distribution of messages and
state information.
Data messages are
.Em only
routed one hop at a time, by specifying the departing
hook, with each node making
the next routing decision. So when B receives a frame on hook
.Dq data
it decodes the frame relay header to determine the DLCI,
and then forwards the unwrapped frame to either C or D.
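.Pp
The routing decision made by node B can be sketched as follows.
This is a userland illustration, not the code of any real node type: it
assumes the common two-octet frame relay address field (upper six DLCI bits
in the first octet, lower four in the second), and both function names are
hypothetical.
The hook names are those of the diagram above.
.Bd -literal -offset 4n
#include <stddef.h>
#include <stdint.h>

/*
 * Extract the 10-bit DLCI from a two-octet address field.
 * Returns -1 if the header is too short.
 */
static int
model_dlci(const uint8_t *hdr, size_t len)
{
    if (len < 2)
        return (-1);
    return (((hdr[0] & 0xfc) << 2) | ((hdr[1] & 0xf0) >> 4));
}

/*
 * Choose the departing hook on node B for a frame that arrived on
 * hook "data".  NULL means no hook is connected for that DLCI.
 */
static const char *
model_pick_hook(const uint8_t *hdr, size_t len)
{
    switch (model_dlci(hdr, len)) {
    case 16:
        return ("dlci16");  /* towards node C */
    case 20:
        return ("dlci20");  /* towards node D */
    default:
        return (NULL);
    }
}
.Ed
.Pp
Each node along the path repeats this kind of local decision, which is why
data needs no end-to-end address.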
.Pp
In a similar way, flow control messages may be routed in the reverse
direction to outgoing data. For example, a
.Dq buffer nearly full
message from
.Em "Frame1:"
would be passed to node
.Em B
which might decide to send similar messages to both nodes
.Em C
and
.Em D .
The nodes would use
.Em "Direct hook pointer"
addressing to route the messages. The message may have travelled from
.Em "Frame1:"
to
.Em B
as a synchronous reply, saving time and cycles.
.Pp
A similar graph might be used to represent multi-link PPP running
over an ISDN line:
.Pp
.Bd -literal
[ type BRI ](B1)<--->(link1)[ type MPP  ]
[  "ISDN1" ](B2)<--->(link2)[ (no name) ]
[          ](D) <-+
                  |
 +----------------+
 |
 +->(switch)[ type Q.921 ](term1)<---->(datalink)[ type Q.931 ]
            [ (no name)  ]                       [ (no name)  ]
.Ed
.Ss Netgraph Structures
Structures are defined in
.Pa sys/netgraph/netgraph.h
(for kernel structures only of interest to nodes)
and
.Pa sys/netgraph/ng_message.h
(for message definitions also of interest to user programs).
.Pp
The two basic object types that are of interest to node authors are
.Em nodes
and
.Em hooks .
These two objects have the following
properties that are also of interest to the node writers.
.Bl -tag -width xxx
.It struct ng_node
Node authors should always use the following typedef to declare
their pointers, and should never actually declare the structure.
.Pp
.Dl typedef struct ng_node *node_p;
.Pp
The following properties are associated with a node, and can be
accessed in the following manner:
.Bl -bullet -compact -offset 2n
.Pp
.It
Validity
.Pp
A driver or interrupt routine may want to check whether
the node is still valid. It is assumed that the caller holds a reference
on the node so it will not have been freed; however, it may have been
disabled or otherwise shut down. Using the
.Fn NG_NODE_IS_VALID "node"
macro will return this state. Eventually it should be almost impossible
for code to run in an invalid node but at this time that work has not been
completed.
.Pp
.It
node ID
.Pp
Of type
.Em ng_ID_t ,
this property can be retrieved using the macro
.Fn NG_NODE_ID "node" .
.Pp
.It
node name
.Pp
Optional globally unique name, null terminated string. If there
is a value in here, it is the name of the node.
.Pp
.Dl if (NG_NODE_NAME(node)[0]) ...
.Pp
.Dl if (strncmp(NG_NODE_NAME(node), \(dqfred\(dq, NG_NODELEN)) ...
.Pp
.It
A node dependent opaque cookie
.Pp
You may place anything of type
.Em pointer
here.
Use the macros
.Fn NG_NODE_SET_PRIVATE node value
and
.Fn NG_NODE_PRIVATE "node"
to set and retrieve this property.
.Pp
.It
number of hooks
.Pp
Use
.Fn NG_NODE_NUMHOOKS "node"
to retrieve this value.
.Pp
.It
hooks
.Pp
The node may have a number of hooks.
A traversal method is provided to allow all the hooks to be
tested for some condition:
.Fn NG_NODE_FOREACH_HOOK node fn arg rethook ,
where
.Fa fn
is a function that will be called for each hook
with the form
.Fn fn hook arg
and that returns 0 to terminate the search. If the search is terminated, then
.Em rethook
will be set to the hook at which the search was terminated.
.El
.It struct ng_hook
Node authors should always use the following typedef to declare
their hook pointers.
.Pp
.Dl typedef struct ng_hook *hook_p;
.Pp
The following properties are associated with a hook, and can be
accessed in the following manner:
.Bl -bullet -compact -offset 2n
.Pp
.It
A node dependent opaque cookie.
.Pp
You may place anything of type
.Em pointer
here.
Use the macros
.Fn NG_HOOK_SET_PRIVATE hook value
and
.Fn NG_HOOK_PRIVATE "hook"
to set and retrieve this property.
.Pp
.It
An associated node.
.Pp
You may use the macro
.Fn NG_HOOK_NODE "hook"
to find the associated node.
.Pp
.It
A peer hook
.Pp
The other hook in this connected pair. Of type hook_p. You can
use
.Fn NG_HOOK_PEER "hook"
to find the peer.
.Pp
.It
references
.Pp
.Fn NG_HOOK_REF "hook"
and
.Fn NG_HOOK_UNREF "hook"
increment and decrement the hook reference count accordingly.
After decrement you should always assume the hook has been freed
unless you have another reference still valid.
.Pp
.It
Override receive functions.
.Pp
The
.Fn NG_HOOK_SET_RCVDATA hook fn
and
.Fn NG_HOOK_SET_RCVMSG hook fn
macros can be used to set override methods that will be used in preference
to the generic receive data and receive message functions. To unset these
use the macros to set them to NULL. They will only be used for data and
messages received on the hook on which they are set.
.El
.Pp
The maintenance of the names, reference counts, and linked list
of hooks for each node is handled automatically by the
.Nm
subsystem.
Typically a node's private info contains a back-pointer to the node or hook
structure, which counts as a new reference that must be included
in the reference count for the node. When the node constructor is called
there is already a reference for this calculated in, so that
when the node is destroyed, it should remember to do a
.Fn NG_NODE_UNREF
on the node.
.Pp
From a hook you can obtain the corresponding node, and from
a node, it is possible to traverse all the active hooks.
.Pp
A current example of how to define a node can always be seen in
.Pa sys/netgraph/ng_sample.c
and should be used as a starting point for new node writers.
.El
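.Pp
The callback convention described for
.Fn NG_NODE_FOREACH_HOOK
(the function returns 0 to stop, and the hook at which the search stopped
is reported back) can be modelled in userland as follows.
The types and functions here are hypothetical stand-ins operating on a
plain array; the real macro walks the node's own hook list.
.Bd -literal -offset 4n
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a hook; only the name is modelled. */
struct model_hook {
    const char *name;
};

typedef int model_hook_fn(struct model_hook *hook, void *arg);

/*
 * Call fn for each hook in turn.  A return value of 0 from fn ends
 * the search; the hook at which it ended is returned, or NULL if
 * fn never returned 0.
 */
static struct model_hook *
model_foreach_hook(struct model_hook *hooks, size_t n,
    model_hook_fn *fn, void *arg)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (fn(&hooks[i], arg) == 0)
            return (&hooks[i]);
    return (NULL);
}

/* Example callback: returns 0 (stop) when the hook name matches. */
static int
model_name_differs(struct model_hook *hook, void *arg)
{
    return (strcmp(hook->name, (const char *)arg) != 0);
}
.Ed
.Pp
Looking up a hook by name then amounts to passing
.Fn model_name_differs
and the wanted name to
.Fn model_foreach_hook .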
.Ss Netgraph Message Structure
Control messages have the following structure:
.Bd -literal
#define NG_CMDSTRLEN    15      /* Max command string (16 with null) */

struct ng_mesg {
  struct ng_msghdr {
    u_char      version;        /* Must equal NG_VERSION */
    u_char      spare;          /* Pad to 2 bytes */
    u_short     arglen;         /* Length of cmd/resp data */
    u_long      flags;          /* Message status flags */
    u_long      token;          /* Reply should have the same token */
    u_long      typecookie;     /* Node type understanding this message */
    u_long      cmd;            /* Command identifier */
    u_char      cmdstr[NG_CMDSTRLEN+1]; /* Cmd string (for debug) */
  } header;
  char  data[0];                /* Start of cmd/resp data */
};

#define NG_ABI_VERSION  5               /* Netgraph kernel ABI version */
#define NG_VERSION      4               /* Netgraph message version */
#define NGF_ORIG        0x0000          /* Command */
#define NGF_RESP        0x0001          /* Response */
.Ed
.Pp
Control messages have the fixed header shown above, followed by a
variable length data section which depends on the type cookie
and the command. Each field is explained below:
.Bl -tag -width xxx
.It Dv version
Indicates the version of the netgraph message protocol itself. The current version is
.Dv NG_VERSION .
.It Dv arglen
This is the length of any extra arguments, which begin at
.Dv data .
.It Dv flags
Indicates whether this is a command or a response control message.
.It Dv token
The
.Dv token
is a means by which a sender can match a reply message to the
corresponding command message; the reply always has the same token.
.Pp
.It Dv typecookie
The corresponding node type's unique 32-bit value.
If a node doesn't recognize the type cookie it must reject the message
by returning
.Er EINVAL .
.Pp
Each type should have an include file that defines the commands,
argument format, and cookie for its own messages.
The typecookie
ensures that the same header file was included by both sender and
receiver; when an incompatible change in the header file is made,
the typecookie
.Em must
be changed.
The de facto method for generating unique type cookies is to take the
seconds from the epoch at the time the header file is written
(i.e., the output of
.Dv "date -u +'%s'" ) .
.Pp
There is a predefined typecookie
.Dv NGM_GENERIC_COOKIE
for the
.Dq generic
node type, and
a corresponding set of generic messages which all nodes understand.
The handling of these messages is automatic.
883The handling of these messages is automatic.
884.It Dv command
885The identifier for the message command. This is type specific,
886and is defined in the same header file as the typecookie.
887.It Dv cmdstr
888Room for a short human readable version of
889.Dq command
890(for debugging purposes only).
891.El
892.Pp
893Some modules may choose to implement messages from more than one
894of the header files and thus recognize more than one type cookie.
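.Pp
The header rules above can be tried out in user space with a small,
self-contained sketch.
The structure below is a local stand-in that uses fixed-width integers,
which is an assumption made for portability.
The names
.Fn msg_alloc ,
.Fn msg_check ,
.Fn msg_reply_matches
and the cookie value
.Dv EX_COOKIE
are hypothetical and are not part of
.Nm .
.Bd -literal
```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NG_CMDSTRLEN 15
#define NG_VERSION   4
#define NGF_ORIG     0x0000
#define NGF_RESP     0x0001
#define EX_COOKIE    946684800u	/* hypothetical: a time in seconds */

/* Local stand-in for struct ng_mesg; field widths are assumed. */
struct ex_mesg {
	struct {
		uint8_t  version;
		uint8_t  spare;
		uint16_t arglen;
		uint32_t flags;
		uint32_t token;
		uint32_t typecookie;
		uint32_t cmd;
		char     cmdstr[NG_CMDSTRLEN + 1];
	} header;
	char data[];		/* arglen bytes of cmd/resp data */
};

/* Allocate a zeroed command message with room for arglen data bytes. */
struct ex_mesg *
msg_alloc(uint32_t cookie, uint32_t cmd, uint32_t token, uint16_t arglen)
{
	struct ex_mesg *m = calloc(1, sizeof(*m) + arglen);

	if (m == NULL)
		return (NULL);
	m->header.version = NG_VERSION;
	m->header.arglen = arglen;
	m->header.flags = NGF_ORIG;
	m->header.token = token;
	m->header.typecookie = cookie;
	m->header.cmd = cmd;
	return (m);
}

/* Reject a message whose type cookie this sketch does not recognize. */
int
msg_check(const struct ex_mesg *m)
{
	assert(m != NULL);
	return (m->header.typecookie == EX_COOKIE ? 0 : EINVAL);
}

/* A reply matches a command when it is a response with the same token. */
int
msg_reply_matches(const struct ex_mesg *cmd, const struct ex_mesg *resp)
{
	assert(cmd != NULL && resp != NULL);
	return ((resp->header.flags & NGF_RESP) != 0 &&
	    resp->header.token == cmd->header.token);
}
```
.Ed
.Pp
A sender would keep the token of each outstanding command and use a test
such as
.Fn msg_reply_matches
to pair incoming responses with it.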
.Ss Control Message ASCII Form
Control messages are in binary format for efficiency.
However, for
debugging and human interface purposes, and if the node type supports
it, control messages may be converted to and from an equivalent
.Tn ASCII
form.
The
.Tn ASCII
form is similar to the binary form, with two exceptions:
.Pp
.Bl -bullet -compact
.It
The
.Dv cmdstr
header field must contain the
.Tn ASCII
name of the command, corresponding to the
.Dv cmd
header field.
.It
The arguments field, which begins at
.Dv data ,
contains a NUL-terminated
.Tn ASCII
string version of the message arguments.
.El
.Pp
In general, the arguments field of a control message can be any
arbitrary C data type.
Netgraph includes parsing routines to support
some pre-defined datatypes in
.Tn ASCII
with this simple syntax:
.Pp
.Bl -bullet -compact
.It
Integer types are represented by base 8, 10, or 16 numbers.
.It
Strings are enclosed in double quotes and respect the normal
C language backslash escapes.
.It
IP addresses are written in the usual dotted-decimal form.
.It
Arrays are enclosed in square brackets, with the elements listed
consecutively starting at index zero.
An element may have an optional
index and equals sign preceding it.
Whenever an element
does not have an explicit index, the index is implicitly the previous
element's index plus one.
.It
Structures are enclosed in curly braces, and each field is specified
in the form
.Dq fieldname=value .
.It
Any array element or structure field whose value is equal to its
.Dq default value
may be omitted.
For integer types, the default value
is usually zero; for string types, the empty string.
.It
Array elements and structure fields may be specified in any order.
.El
.Pp
Each node type may define its own arbitrary types by providing
the necessary routines to parse and unparse.
.Tn ASCII
forms defined
for a specific node type are documented in the documentation for
that node type.
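.Pp
The array rules above (implicit index equal to the previous index plus one,
optional
.Dq index=value
elements, omitted elements taking the default of zero) can be illustrated
with a toy parser for integer arrays.
This is a sketch only and is not the parser used by
.Nm ;
the function name
.Fn parse_int_array
is hypothetical, and the use of C prefix conventions
.Pq Li 0 No and Li 0x
to select base 8 or 16 is an assumption.
.Bd -literal
```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Skip white space and return the first non-space position. */
static const char *
skip_ws(const char *s)
{
	while (isspace((unsigned char)*s))
		s++;
	return (s);
}

/*
 * Parse "[ v v idx=v ... ]" into out[0..n-1].  Elements that are not
 * mentioned keep the default value zero.  Returns 0 on success and
 * -1 on a syntax error or an out-of-range index.
 */
int
parse_int_array(const char *s, long *out, size_t n)
{
	size_t next = 0;	/* index used when none is given */
	char *end;

	assert(s != NULL && out != NULL);
	memset(out, 0, n * sizeof(*out));
	s = skip_ws(s);
	if (*s++ != '[')
		return (-1);
	for (;;) {
		long val;

		s = skip_ws(s);
		if (*s == ']')
			return (0);
		val = strtol(s, &end, 0);	/* base 8, 10 or 16 */
		if (end == s)
			return (-1);
		s = skip_ws(end);
		if (*s == '=') {		/* "index=value" form */
			if (val < 0)
				return (-1);
			next = (size_t)val;
			s = skip_ws(s + 1);
			val = strtol(s, &end, 0);
			if (end == s)
				return (-1);
			s = end;
		}
		if (next >= n)
			return (-1);
		out[next++] = val;
	}
}
```
.Ed
.Pp
With an eight-element array, the input
.Dq "[ 1 2 5=0x10 7 ]"
sets elements 0, 1, 5 and 6 and leaves the others at zero.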
.Ss Generic Control Messages
There are a number of standard predefined messages that will work
for any node, as they are supported directly by the framework itself.
These are defined in
.Pa sys/netgraph/ng_message.h
along with the basic layout of messages and other similar information.
.Bl -tag -width xxx
.It Dv NGM_CONNECT
Connect to another node, using the supplied hook names on either end.
.It Dv NGM_MKPEER
Construct a node of the given type and then connect to it using the
supplied hook names.
.It Dv NGM_SHUTDOWN
The target node should disconnect from all its neighbors and shut down.
Persistent nodes such as those representing physical hardware
might not disappear from the node namespace, but only reset themselves.
The node must disconnect all of its hooks.
This may result in neighbors shutting themselves down, and possibly a
cascading shutdown of the entire connected graph.
.It Dv NGM_NAME
Assign a name to a node.
Nodes can exist without having a name, and this
is the default for nodes created using the
.Dv NGM_MKPEER
method.
Such nodes can only be addressed relatively or by their ID number.
.It Dv NGM_RMHOOK
Ask the node to break a hook connection to one of its neighbors.
Both nodes will have their
.Dq disconnect
method invoked.
Either node may elect to totally shut down as a result.
.It Dv NGM_NODEINFO
Asks the target node to describe itself.
The four returned fields
are the node name (if named), the node type, the node ID and the
number of hooks attached.
The ID is an internal number unique to that node.
.It Dv NGM_LISTHOOKS
This returns the information given by
.Dv NGM_NODEINFO ,
but in addition
includes an array of fields describing each link, and the description for
the node at the far end of that link.
.It Dv NGM_LISTNAMES
This returns an array of node descriptions (as for
.Dv NGM_NODEINFO ")"
where each entry of the array describes a named node.
All named nodes will be described.
.It Dv NGM_LISTNODES
This is the same as
.Dv NGM_LISTNAMES
except that all nodes are listed regardless of whether they have a name or not.
.It Dv NGM_LISTTYPES
This returns a list of all currently installed netgraph types.
.It Dv NGM_TEXT_STATUS
The node may return a text formatted status message.
The status information is determined entirely by the node type.
It is the only
.Dq generic
message
that requires any support within the node itself and as such the node may
elect to not support this message.
The text response must be less than
.Dv NG_TEXTRESPONSE
bytes in length (presently 1024).
This can be used to return general
status information in human-readable form.
.It Dv NGM_BINARY2ASCII
This message converts a binary control message to its
.Tn ASCII
form.
The entire control message to be converted is contained within the
arguments field of the
.Dv NGM_BINARY2ASCII
message itself.
If successful, the reply will contain the same control
message in
.Tn ASCII
form.
A node will typically only know how to translate messages that it
itself understands, so the target node of the
.Dv NGM_BINARY2ASCII
is often the same node that would actually receive that message.
.It Dv NGM_ASCII2BINARY
The opposite of
.Dv NGM_BINARY2ASCII .
The entire control message to be converted, in
.Tn ASCII
form, is contained
in the arguments section of the
.Dv NGM_ASCII2BINARY
message and need only have the
.Dv flags ,
.Dv cmdstr ,
and
.Dv arglen
header fields filled in, plus the NUL-terminated string version of
the arguments in the arguments field.
If successful, the reply
contains the binary version of the control message.
.El
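.Pp
As an illustration of the size limit on
.Dv NGM_TEXT_STATUS
replies, the following user-space sketch formats a status string and
refuses to return one that would not fit.
The structure
.Vt struct ex_stats
and the function
.Fn ex_text_status
are hypothetical; the value of
.Dv NG_TEXTRESPONSE
is copied from the description above rather than from a header file.
.Bd -literal
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NG_TEXTRESPONSE 1024	/* limit quoted in the text above */

/* Hypothetical per-node counters; not a real netgraph structure. */
struct ex_stats {
	unsigned long frames_in;
	unsigned long frames_out;
};

/*
 * Format a status string into buf, which must hold NG_TEXTRESPONSE
 * bytes.  Returns the string length, or -1 if it would not fit.
 */
int
ex_text_status(const struct ex_stats *st, char *buf)
{
	int len;

	assert(st != NULL && buf != NULL);
	len = snprintf(buf, NG_TEXTRESPONSE, "in=%lu out=%lu",
	    st->frames_in, st->frames_out);
	if (len < 0 || len >= NG_TEXTRESPONSE)
		return (-1);
	return (len);
}
```
.Ed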
.Ss Flow Control Messages
In addition to the control messages that affect nodes with respect to the
graph, there are also a number of
.Em flow-control
messages defined.
At present these are
.Em not
handled automatically by the system, so
a node that will be used in a graph utilizing flow control,
and that is likely to be in the path of these messages,
must handle them itself.
The default action of a node that does not understand these messages should
be to pass them on to the next node.
Helper functions to assist with this may be added eventually.
These messages are also defined in
.Pa sys/netgraph/ng_message.h
and have a separate cookie
.Dv NG_FLOW_COOKIE
to help identify them.
They will not be covered in depth here.
.Ss Metadata
Data moving through the
.Nm
system can be accompanied by meta-data that describes some
aspect of that data.
The form of the meta-data is a fixed header,
which contains enough information for most uses, and can optionally
be supplemented by trailing
.Em option
structures, which contain a
.Em cookie
(see the section on control messages), an identifier, a length and optional
data.
If a node does not recognize the cookie associated with an option,
it should ignore that option.
.Pp
Meta-data might include such things as priority, discard eligibility,
or special processing requirements.
It might also mark a packet for
debug status, etc.
The use of meta-data is still experimental.
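.Pp
The rule that an option with an unrecognized cookie is ignored can be
sketched as a walk over a buffer of trailing options.
The layout of
.Vt struct ex_opt_hdr
(cookie, identifier, then a length that covers header and data) and the names
.Dv EX_MY_COOKIE
and
.Fn ex_count_known_options
are assumptions made for this example; they do not describe the real
meta-data structures.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed option header: cookie, identifier, total length in bytes. */
struct ex_opt_hdr {
	uint32_t cookie;
	uint16_t type;
	uint16_t len;		/* header plus data */
};

#define EX_MY_COOKIE 12345u	/* hypothetical cookie for this node */

/*
 * Count the options in buf[0..used) that carry EX_MY_COOKIE.
 * Options with any other cookie are skipped, not rejected.
 * Returns -1 if an option length is malformed.
 */
int
ex_count_known_options(const unsigned char *buf, size_t used)
{
	size_t off = 0;
	int known = 0;

	assert(buf != NULL);
	while (off + sizeof(struct ex_opt_hdr) <= used) {
		struct ex_opt_hdr h;
		size_t len;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&h, buf + off, sizeof(h));
		len = h.len;
		if (len < sizeof(h) || len > used - off)
			return (-1);
		if (h.cookie == EX_MY_COOKIE)
			known++;
		off += len;
	}
	return (off == used ? known : -1);
}
```
.Ed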
.Sh INITIALIZATION
The base
.Nm
code may either be statically compiled
into the kernel or else loaded dynamically as a KLD via
.Xr kldload 8 .
In the former case, include
.Pp
.Dl options NETGRAPH
.Pp
in your kernel configuration file.
You may also include selected
node types in the kernel compilation, for example:
.Bd -literal -offset indent
options NETGRAPH
options NETGRAPH_SOCKET
options NETGRAPH_ECHO
.Ed
.Pp
Once the
.Nm
subsystem is loaded, individual node types may be loaded at any time
as KLD modules via
.Xr kldload 8 .
Moreover,
.Nm
knows how to automatically do this; when a request to create a new
node of unknown type
.Em type
is made,
.Nm
will attempt to load the KLD module
.Pa ng_type.ko .
.Pp
Types can also be installed at boot time, as certain device drivers
may want to export each instance of the device as a netgraph node.
.Pp
In general, new types can be installed at any time from within the
kernel by calling
.Fn ng_newtype ,
supplying a pointer to the type's
.Dv struct ng_type
structure.
.Pp
The
.Fn NETGRAPH_INIT
macro automates this process by using a linker set.
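.Pp
The naming convention used for automatic loading, where a type name maps
to a module file name, can be sketched in user space as follows.
The function
.Fn ex_type_to_kld
is hypothetical and only demonstrates the mapping; it does not load anything.
.Bd -literal
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the KLD file name for a node type, e.g. "echo" gives
 * "ng_echo.ko".  Returns 0 on success, -1 if buf is too small.
 */
int
ex_type_to_kld(const char *type, char *buf, size_t buflen)
{
	int len;

	assert(type != NULL && buf != NULL);
	len = snprintf(buf, buflen, "ng_%s.ko", type);
	return ((len < 0 || (size_t)len >= buflen) ? -1 : 0);
}
```
.Ed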
.Sh EXISTING NODE TYPES
Several node types currently exist.
Each is fully documented in its own man page:
.Bl -tag -width xxx
.It SOCKET
The socket type implements two new sockets in the new protocol domain
.Dv PF_NETGRAPH .
The new socket protocols are
.Dv NG_DATA
and
.Dv NG_CONTROL ,
both of type
.Dv SOCK_DGRAM .
Typically one of each is associated with a socket node.
When both sockets have closed, the node will shut down.
The
.Dv NG_DATA
socket is used for sending and receiving data, while the
.Dv NG_CONTROL
socket is used for sending and receiving control messages.
Data and control messages are passed using the
.Xr sendto 2
and
.Xr recvfrom 2
calls, using a
.Dv struct sockaddr_ng
socket address.
.It HOLE
Responds only to generic messages and is a
.Dq black hole
for data.
Useful for testing.
Always accepts new hooks.
.It ECHO
Responds only to generic messages and always echoes data back through the
hook from which it arrived.
Returns any non-generic messages as their
own response.
Useful for testing.
Always accepts new hooks.
.It TEE
This node is useful for
.Dq snooping .
It has 4 hooks:
.Dv left ,
.Dv right ,
.Dv left2right ,
and
.Dv right2left .
Data entering from the right is passed to the left and duplicated on
.Dv right2left ,
and data entering from the left is passed to the right and
duplicated on
.Dv left2right .
Data entering from
.Dv left2right
is sent to the right and data from
.Dv right2left
to the left.
.It RFC1490 MUX
Encapsulates/de-encapsulates frames encoded according to RFC 1490.
Has a hook for the encapsulated packets
.Pq Dq downstream
and one hook
for each protocol (i.e., IP, PPP, etc.).
.It FRAME RELAY MUX
Encapsulates/de-encapsulates Frame Relay frames.
Has a hook for the encapsulated packets
.Pq Dq downstream
and one hook
for each DLCI.
.It FRAME RELAY LMI
Automatically handles frame relay
.Dq LMI
(link management interface) operations and packets.
Automatically probes and detects which of several LMI standards
is in use at the exchange.
.It TTY
This node is also a line discipline.
It simply converts between mbuf
frames and sequential serial data, allowing a tty to appear as a netgraph
node.
It has a programmable
.Dq hotkey
character.
.It ASYNC
This node encapsulates and de-encapsulates asynchronous frames
according to RFC 1662.
This is used in conjunction with the TTY node
type for supporting PPP links over asynchronous serial lines.
.It INTERFACE
This node is also a system networking interface.
It has hooks representing
each protocol family (IP, AppleTalk, IPX, etc.) and appears in the output of
.Xr ifconfig 8 .
The interfaces are named
.Em ng0 ,
.Em ng1 ,
etc.
.It ONE2MANY
This node implements a simple round-robin multiplexer.
It can be used,
for example, to make several LAN ports act together to get a higher speed
link between two machines.
.It Various PPP-related nodes
There is a full multilink PPP implementation that runs in Netgraph.
The
.Em Mpd
port can use these modules to make a very low-latency, high-capacity
PPP system.
It also supports
.Em PPTP
VPNs using the
.Em PPTP
node.
.It PPPOE
A server and client side implementation of PPPoE.
Used in conjunction with
either
.Xr ppp 8
or the
.Em mpd
port.
.It BRIDGE
This node, together with the Ethernet nodes, allows a very flexible
bridging system to be implemented.
.It KSOCKET
This intriguing node looks like a socket to the system but diverts
all data to and from the netgraph system for further processing.
This allows
such things as UDP tunnels to be almost trivially implemented from the
command line.
.El
.Pp
Refer to the
.Sx SEE ALSO
section at the end of this man page for more node types.
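.Pp
The forwarding rules described for the TEE node type can be written down
as a small routing function.
This is a user-space sketch of the rules as stated in the list above,
not code from the tee node itself;
.Vt enum tee_hook
and
.Fn tee_route
are hypothetical names.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>

enum tee_hook { LEFT, RIGHT, LEFT2RIGHT, RIGHT2LEFT };

/*
 * Fill out[] with the hooks that receive a copy of data arriving on
 * the hook in, and return how many there are (1 or 2).
 */
int
tee_route(enum tee_hook in, enum tee_hook out[2])
{
	assert(out != NULL);
	switch (in) {
	case RIGHT:		/* forward left, copy to right2left */
		out[0] = LEFT;
		out[1] = RIGHT2LEFT;
		return (2);
	case LEFT:		/* forward right, copy to left2right */
		out[0] = RIGHT;
		out[1] = LEFT2RIGHT;
		return (2);
	case LEFT2RIGHT:	/* injected data continues to the right */
		out[0] = RIGHT;
		return (1);
	case RIGHT2LEFT:	/* injected data continues to the left */
		out[0] = LEFT;
		return (1);
	}
	return (0);
}
```
.Ed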
.Sh NOTES
Whether a named node exists can be checked by trying to send a control message
to it (e.g.,
.Dv NGM_NODEINFO ) .
If it does not exist,
.Er ENOENT
will be returned.
.Pp
All data messages are mbuf chains with the
.Dv M_PKTHDR
flag set.
.Pp
Nodes are responsible for freeing what they allocate.
There are four exceptions:
.Bl -tag -width xxxx
.It 1
Mbufs sent across a data link are never to be freed by the sender.
In the
case of error, they should be considered freed.
.It 2
Any meta-data information traveling with the data has the same restriction.
It might be freed by any node the data passes through, and a
.Dv NULL
passed onwards, but the caller will never free it.
Two macros,
.Fn NG_FREE_META "meta"
and
.Fn NG_FREE_M "m" ,
should be used if possible to free meta-data and data respectively (see
.Pa netgraph.h ) .
.It 3
Messages sent using
.Fn ng_send_message
are freed by the recipient.
As in the case above, the addresses
associated with the message are freed by whatever allocated them so the
recipient should copy them if it wants to keep that information.
.It 4
Both control messages and data are delivered and queued with
a netgraph
.Em item .
The item must be freed using
.Fn NG_FREE_ITEM "item"
or passed on to another node.
.El
.Sh FILES
.Bl -tag -width xxxxx -compact
.It Pa /sys/netgraph/netgraph.h
Definitions for use solely within the kernel by
.Nm
nodes.
.It Pa /sys/netgraph/ng_message.h
Definitions needed by any file that needs to deal with
.Nm
messages.
.It Pa /sys/netgraph/ng_socket.h
Definitions needed to use
.Nm
socket type nodes.
.It Pa /sys/netgraph/ng_{type}.h
Definitions needed to use
.Nm
{type}
nodes, including the type cookie definition.
.It Pa /boot/kernel/netgraph.ko
Netgraph subsystem loadable KLD module.
.It Pa /boot/kernel/ng_{type}.ko
Loadable KLD module for node type {type}.
.It Pa /sys/netgraph/ng_sample.c
Skeleton netgraph node.
Use this as a starting point for new node types.
.El
.Sh USER MODE SUPPORT
There is a library for supporting user-mode programs that wish
to interact with the netgraph system.
See
.Xr netgraph 3
for details.
.Pp
Two user-mode support programs,
.Xr ngctl 8
and
.Xr nghook 8 ,
are available to assist manual configuration and debugging.
.Pp
There are a few useful techniques for debugging new node types.
First, implementing new node types in user-mode
makes debugging easier.
The
.Em tee
node type is also useful for debugging, especially in conjunction with
.Xr ngctl 8
and
.Xr nghook 8 .
.Pp
Also look in
.Pa /usr/share/examples/netgraph
for solutions to several
common networking problems, solved using
.Nm .
.Sh SEE ALSO
.Xr socket 2 ,
.Xr netgraph 3 ,
.Xr ng_async 4 ,
.Xr ng_bpf 4 ,
.Xr ng_bridge 4 ,
.Xr ng_cisco 4 ,
.Xr ng_echo 4 ,
.Xr ng_ether 4 ,
.Xr ng_frame_relay 4 ,
.Xr ng_hole 4 ,
.Xr ng_iface 4 ,
.Xr ng_ksocket 4 ,
.Xr ng_lmi 4 ,
.Xr ng_mppc 4 ,
.Xr ng_ppp 4 ,
.Xr ng_pppoe 4 ,
.Xr ng_pptpgre 4 ,
.Xr ng_rfc1490 4 ,
.Xr ng_socket 4 ,
.Xr ng_tee 4 ,
.Xr ng_tty 4 ,
.Xr ng_UI 4 ,
.Xr ng_vjc 4 ,
.Xr ngctl 8 ,
.Xr nghook 8
.Sh HISTORY
The
.Nm
system was designed and first implemented at Whistle Communications, Inc.\&
in a version of
.Fx 2.2
customized for the Whistle InterJet.
It made its debut in the main tree in
.Fx 3.4 .
.Sh AUTHORS
.An -nosplit
.An Julian Elischer Aq julian@FreeBSD.org ,
with contributions by
.An Archie Cobbs Aq archie@FreeBSD.org .