.\" Copyright (c) 1996-1999 Whistle Communications, Inc.
.\" All rights reserved.
.\"
.\" Subject to the following obligations and disclaimer of warranty, use and
.\" redistribution of this software, in source or object code forms, with or
.\" without modifications are expressly permitted by Whistle Communications;
.\" provided, however, that:
.\" 1. Any and all reproductions of the source or object code must include the
.\"    copyright notice above and the following disclaimer of warranties; and
.\" 2. No rights are granted, in any manner or form, to use Whistle
.\"    Communications, Inc. trademarks, including the mark "WHISTLE
.\"    COMMUNICATIONS" on advertising, endorsements, or otherwise except as
.\"    such appears in the above copyright notice or in the software.
.\"
.\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND
.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO
.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE,
.\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
.\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY
.\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS
.\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE.
.\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES
.\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING
.\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
.\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
.\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY
.\" OF SUCH DAMAGE.
.\"
.\" Authors: Julian Elischer <julian@FreeBSD.org>
.\"          Archie Cobbs <archie@FreeBSD.org>
.\"
.\" $FreeBSD$
.\" $Whistle: netgraph.4,v 1.7 1999/01/28 23:54:52 julian Exp $
.\"
.Dd January 19, 1999
.Dt NETGRAPH 4
.Os
.Sh NAME
.Nm netgraph
.Nd graph based kernel networking subsystem
.Sh DESCRIPTION
The
.Nm
system provides a uniform and modular system for the implementation
of kernel objects which perform various networking functions. The objects,
known as
.Em nodes ,
can be arranged into arbitrarily complicated graphs. Nodes have
.Em hooks
which are used to connect two nodes together, forming the edges in the graph.
Nodes communicate along the edges to process data, implement protocols, etc.
.Pp
The aim of
.Nm
is to supplement rather than replace the existing kernel networking
infrastructure.
It provides:
.Pp
.Bl -bullet -compact -offset 2n
.It
A flexible way of combining protocol and link level drivers
.It
A modular way to implement new protocols
.It
A common framework for kernel entities to inter-communicate
.It
A reasonably fast, kernel-based implementation
.El
.Ss Nodes and Types
The most fundamental concept in
.Nm
is that of a
.Em node .
All nodes implement a number of predefined methods which allow them
to interact with other nodes in a well defined manner.
.Pp
Each node has a
.Em type ,
which is a static property of the node determined at node creation time.
A node's type is described by a unique
.Tn ASCII
type name.
The type implies what the node does and how it may be connected
to other nodes.
.Pp
In object-oriented language, types are classes and nodes are instances
of their respective class. All node types are subclasses of the generic node
type, and hence inherit certain common functionality and capabilities
(e.g., the ability to have an
.Tn ASCII
name).
.Pp
Nodes may be assigned a globally unique
.Tn ASCII
name which can be
used to refer to the node.
The name must not contain the characters
.Dq .\&
or
.Dq \&:
and is limited to
.Dv "NG_NODELEN + 1"
characters (including NUL byte).
.Pp
Each node instance has a unique
.Em ID number
which is expressed as a 32-bit hex value.
This value may be used to refer to a node when there is no
.Tn ASCII
name assigned to it.
.Ss Hooks
Nodes are connected to other nodes by connecting a pair of
.Em hooks ,
one from each node. Data flows bidirectionally between nodes along
connected pairs of hooks.
A node may have as many hooks as it
needs, and may assign whatever meaning it wants to a hook.
.Pp
Hooks have these properties:
.Pp
.Bl -bullet -compact -offset 2n
.It
A hook has an
.Tn ASCII
name which is unique among all hooks
on that node (other hooks on other nodes may have the same name).
The name must not contain a
.Dq .\&
or a
.Dq \&:
and is
limited to
.Dv "NG_HOOKLEN + 1"
characters (including NUL byte).
.It
A hook is always connected to another hook.
That is, hooks are
created at the time they are connected, and breaking an edge by
removing either hook destroys both hooks.
.It
A hook can be set into a state where incoming packets are always queued
by the input queueing system, rather than being delivered directly.
This is used when the two joined nodes need to be decoupled, e.g., if they are
running at different processor priority levels (spl).
.It
A hook may supply over-riding receive data and receive message functions
which should be used for data and messages received through that hook
in preference to the general node-wide methods.
.El
.Pp
A node may decide to assign special meaning to some hooks.
For example, connecting to the hook named
.Dq debug
might trigger
the node to start sending debugging information to that hook.
.Ss Data Flow
Two types of information flow between nodes: data messages and
control messages.
Data messages are passed in mbuf chains along the edges
in the graph, one edge at a time.
The first mbuf in a chain must have the
.Dv M_PKTHDR
flag set. Each node decides how to handle data coming in on its hooks.
.Pp
Control messages are type-specific C structures sent from one node
directly to some arbitrary other node.
Control messages have a common
header format, followed by type-specific data, and are binary structures
for efficiency.
However, node types also may support conversion of the
type-specific data between binary and
.Tn ASCII
for debugging and human interface purposes (see the
.Dv NGM_ASCII2BINARY
and
.Dv NGM_BINARY2ASCII
generic control messages below).
Nodes are not required to support these conversions.
.Pp
There are three ways to address a control message.
If there is a sequence of edges connecting the two nodes, the message
may be
.Dq source routed
by specifying the corresponding sequence
of
.Tn ASCII
hook names as the destination address for the message (relative
addressing).
If the destination is adjacent to the source, then the source
node may simply specify (as a pointer in the code) the hook across which the
message should be sent.
Otherwise, the recipient node's global
.Tn ASCII
name
(or equivalent ID based name) is used as the destination address
for the message (absolute addressing).
The two types of
.Tn ASCII
addressing
may be combined, by specifying an absolute start node and a sequence
of hooks. Only the
.Tn ASCII
addressing modes are available to control programs outside the kernel,
as use of direct pointers is, of course, limited to kernel modules.
.Pp
Messages often represent commands that are followed by a reply message
in the reverse direction.
To facilitate this, the recipient of a
control message is supplied with a
.Dq return address
that is suitable for addressing a reply.
.Pp
Each control message contains a 32 bit value called a
.Em typecookie
indicating the type of the message, i.e., how to interpret it.
Typically each type defines a unique typecookie for the messages
that it understands.
However, a node may choose to recognize and
implement more than one type of message.
.Pp
If a message is delivered to an address that implies that it arrived
at that node through a particular hook (as opposed to having been directly
addressed using its ID or global name), then that hook is identified to the
receiving node.
This allows a message to be rerouted or passed on, should
a node decide that this is required, in much the same way that data packets
are passed around between nodes. A set of standard
messages for flow control and link management purposes is
defined by the base system; these are usually
passed around in this manner.
Flow control messages would usually travel
in the opposite direction to the data to which they pertain.
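.Pp
As a rough illustration of the two
.Tn ASCII
addressing forms, the following userland C sketch splits an address string
into its optional start node and its hook path.
It assumes the colon and period syntax given under
.Sx Addressing ;
the names
.Fn ng_addr_split
and
.Vt "struct ng_addr_parts"
are hypothetical and are not part of
.Nm .
.Bd -literal -offset 4n
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; not a netgraph API. */
struct ng_addr_parts {
	int	absolute;	/* 1 if a "node:" prefix was present */
	char	node[64];	/* node name or "[id]"; empty if relative */
	char	path[256];	/* hook names separated by periods */
};

/*
 * Split addr at the first colon.  Returns 0 on success, or -1 if the
 * node part is empty or a component does not fit its buffer.
 */
static int
ng_addr_split(const char *addr, struct ng_addr_parts *out)
{
	const char *colon;
	size_t nlen;

	assert(addr != NULL && out != NULL);
	colon = strchr(addr, ':');
	if (colon == NULL) {
		/* Relative: the whole string is a hook path. */
		if (strlen(addr) >= sizeof(out->path))
			return (-1);
		out->absolute = 0;
		out->node[0] = 0;
		strcpy(out->path, addr);
		return (0);
	}
	/* Absolute: the text before the colon names the start node. */
	nlen = (size_t)(colon - addr);
	if (nlen == 0 || nlen >= sizeof(out->node) ||
	    strlen(colon + 1) >= sizeof(out->path))
		return (-1);
	out->absolute = 1;
	memcpy(out->node, addr, nlen);
	out->node[nlen] = 0;
	strcpy(out->path, colon + 1);
	return (0);
}
.Ed
.Pp
Under these assumptions, the combined address
.Dq foo:hook1.hook2
yields the start node
.Dq foo
and the hook path
.Dq hook1.hook2 ,
while a bare
.Dq hook1
is treated as relative to the sending node.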
.Ss Netgraph is (usually) Functional
In order to minimize latency, most
.Nm
operations are functional.
That is, data and control messages are delivered by making function
calls rather than by using queues and mailboxes.
For example, if node
A wishes to send a data mbuf to neighboring node B, it calls the
generic
.Nm
data delivery function.
This function in turn locates
node B and calls B's
.Dq receive data
method.
There are exceptions to this.
.Pp
Each node has an input queue, and some operations can be considered to
be
.Dq writers
in that they alter the state of the node.
Obviously in an SMP
world it would be bad if the state of a node were changed while another
data packet were transiting the node.
For this purpose, the input queue implements a
.Em reader/writer
semantic so that when there is a writer in the node, all other requests
are queued, and while there are readers, a writer and any following
packets are queued.
In the case where there is no reason to queue the
data, the input method is called directly, as mentioned above.
.Pp
A node may declare that all requests should be considered as writers,
or that requests coming in over a particular hook should be considered to
be writers, or even that packets leaving or entering across a particular
hook should always be queued, rather than delivered directly (often useful
for interrupt routines that want to get back to the hardware quickly).
By default, all control message packets are considered to be writers
unless specifically declared to be a reader in their definition.
(See
.Dv NGM_READONLY
in
.Pa ng_message.h . )
.Pp
While this mode of operation
results in good performance, it has a few implications for node
developers:
.Pp
.Bl -bullet -compact -offset 2n
.It
Whenever a node delivers a data or control message, the node
may need to allow for the possibility of receiving a returning
message before the original delivery function call returns.
.It
Netgraph nodes and support routines generally run at
.Fn splnet .
However, some nodes may want to send data and control messages
from a different priority level.
Netgraph supplies a mechanism which
utilizes the NETISR system to move message and data delivery to
.Fn splnet .
Nodes that run at other priorities (e.g. interfaces) can be directly
linked to other nodes so that the combination runs at the other priority;
however, any interaction with nodes running at
.Fn splnet
MUST be achieved via the
queueing functions (which use the
.Fn netisr
feature of the kernel).
Note that messages are always received at
.Fn splnet .
.It
It's possible for an infinite loop to occur if the graph contains cycles.
.El
.Pp
So far, these issues have not proven problematic in practice.
.Ss Interaction With Other Parts of the Kernel
A node may have a hidden interaction with other components of the
kernel outside of the
.Nm
subsystem, such as device hardware,
kernel protocol stacks, etc. In fact, one of the benefits of
.Nm
is the ability to join disparate kernel networking entities together in a
consistent communication framework.
.Pp
An example is the node type
.Em socket
which is both a netgraph node and a
.Xr socket 2
.Bx
socket in the protocol family
.Dv PF_NETGRAPH .
Socket nodes allow user processes to participate in
.Nm .
Other nodes communicate with socket nodes using the usual methods, and the
node hides the fact that it is also passing information to and from a
cooperating user process.
.Pp
Another example is a device driver that presents
a node interface to the hardware.
.Ss Node Methods
Nodes are notified of the following actions via function calls
to the following node methods (all at
.Fn splnet )
and may accept or reject that action (by returning the appropriate
error code):
.Bl -tag -width xxx
.It Creation of a new node
The constructor for the type is called. If creation of a new node is
allowed, the constructor must call the generic node creation
function (in object-oriented terms, the superclass constructor)
and then allocate any special resources it needs. For nodes that
correspond to hardware, this is typically done during the device
attach routine. Often a global
.Tn ASCII
name corresponding to the
device name is assigned here as well.
.It Creation of a new hook
The hook is created and tentatively
linked to the node, and the node is told about the name that will be
used to describe this hook. The node sets up any special data structures
it needs, or may reject the connection, based on the name of the hook.
.It Successful connection of two hooks
After both ends have accepted their
hooks, and the links have been made, the nodes get a chance to
find out who their peer is across the link and can then decide to reject
the connection. Tear-down is automatic. This is also the time at which
a node may decide whether to set a particular hook (or its peer) into
.Em queueing
mode.
.It Destruction of a hook
The node is notified of a broken connection. The node may consider some hooks
to be critical to operation and others to be expendable: the disconnection
of one hook may be an acceptable event while for another it
may effect a total shutdown for the node.
.It Shutdown of a node
This method allows a node to clean up
and to ensure that any actions that need to be performed
at this time are taken. The method is called by the generic (i.e., superclass)
node destructor which will get rid of the generic components of the node.
Some nodes (usually associated with a piece of hardware) may be
.Em persistent
in that a shutdown breaks all edges and resets the node,
but doesn't remove it. In this case the shutdown method should not
free its resources, but rather, clean up and then clear the
.Em NG_INVALID
flag to signal the generic code that the shutdown is aborted. In
the case where the shutdown is started by the node itself due to hardware
removal or unloading (via
.Fn ng_rmnode_self ) ,
it should set the
.Em NG_REALLY_DIE
flag to signal to its own shutdown method that it is not to persist.
.El
.Ss Sending and Receiving Data
Two other methods are also supported by all nodes:
.Bl -tag -width xxx
.It Receive data message
A
.Em Netgraph queueable request item ,
usually referred to as an
.Em item ,
is received by the function.
The item contains a pointer to an mbuf and metadata about the packet.
.Pp
The node is notified on which hook the item arrived,
and can use this information in its processing decision.
The receiving node must always
.Fn NG_FREE_M
the mbuf chain on completion or error, or pass it on to another node
(or kernel module) which will then be responsible for freeing it.
Similarly the
.Em item
must be freed if it is not to be passed on to another node, by using the
.Fn NG_FREE_ITEM
macro. If the item still holds references to mbufs or metadata at the time of
freeing then they will also be appropriately freed.
Therefore, if there is any chance that the mbuf or metadata will be
changed or freed separately from the item, it is very important
that these fields be retrieved using the
.Fn NGI_GET_M
and
.Fn NGI_GET_META
macros, which also remove the reference within the item (otherwise multiple
frees of the same object will occur).
.Pp
If it is only required to examine the contents of the mbufs or the
metadata, then it is possible to use the
.Fn NGI_M
and
.Fn NGI_META
macros to both read and rewrite these fields.
.Pp
In addition to the mbuf chain itself there may also be a pointer to a
structure describing meta-data about the message
(e.g. priority information). This pointer may be
.Dv NULL
if there is no additional information. The format for this information is
described in
.Pa sys/netgraph/netgraph.h .
The memory for meta-data must be allocated via
.Fn malloc
with type
.Dv M_NETGRAPH_META .
As with the data itself, it is the receiver's responsibility to
.Fn free
the meta-data. If the mbuf chain is freed the meta-data must
be freed at the same time. If the meta-data is freed but the
real data is passed on, then a
.Dv NULL
pointer must be substituted. It is also the duty of the receiver to free
the request item itself, or to use it to pass the message on further.
.Pp
The receiving node may decide to defer the data by queueing it in the
.Nm
NETISR system (see below). It achieves this by setting the
.Dv HK_QUEUE
flag in the flags word of the hook on which that data will arrive.
The infrastructure will respect that bit and queue the data for delivery at
a later time, rather than deliver it directly. A node may decide to set
the bit on the
.Em peer
node, so that its own output packets are queued.
This is used
by device drivers running at different processor priorities to transfer
packet delivery to the
.Fn splnet
level at which the bulk of
.Nm
runs.
.Pp
The structure and use of meta-data is still experimental, but is
presently used in frame-relay to indicate that management packets
should be queued for transmission
at a higher priority than data packets. This is required for
conformance with Frame Relay standards.
.Pp
The node may elect to nominate a different receive data function
for data received on a particular hook, to simplify coding. It uses
the
.Fn NG_HOOK_SET_RCVDATA hook fn
macro to do this. The function receives the same arguments as the
node-wide method, but is called for all (and only) packets arriving on that hook.
.It Receive control message
This method is called when a control message is addressed to the node.
As with the received data, an
.Em item
is received, with a pointer to the control message.
The message can be examined using the
.Fn NGI_MSG
macro, or completely extracted from the item using the
.Fn NGI_GET_MSG
macro, which also removes the reference within the item.
If the item still holds a reference to the message when it is freed
(using the
.Fn NG_FREE_ITEM
macro), then the message will also be freed appropriately. If the
reference has been removed the node must free the message itself using the
.Fn NG_FREE_MSG
macro.
A return address is always supplied, giving the address of the node
that originated the message so a reply message can be sent anytime later.
The return address is retrieved from the
.Em item
using the
.Fn NGI_RETADDR
macro and is of type
.Em ng_ID_t .
All control messages and replies are
allocated with
.Fn malloc
type
.Dv M_NETGRAPH_MSG ,
however it is more usual to use the
.Fn NG_MKMESSAGE
and
.Fn NG_MKRESPONSE
macros to allocate and fill out a message.
Messages must be freed using the
.Fn NG_FREE_MSG
macro.
.Pp
If the message was delivered via a specific hook, that hook will
also be made known, which allows the use of such things as flow-control
messages, and status change messages, where the node may want to forward
the message out of a hook other than the one on which it arrived.
.Pp
The node may elect to nominate a different receive message function
for messages received on a particular hook, to simplify coding. It uses
the
.Fn NG_HOOK_SET_RCVMSG hook fn
macro to do this. The function receives the same arguments as the
node-wide method, but is called for all (and only) messages arriving on that hook.
.El
.Pp
Much use has been made of reference counts, so that nodes that have
lost all references are automatically freed, and this behaviour
has been tested and debugged to present a consistent and trustworthy
framework for the
.Dq type module
writer to use.
.Ss Addressing
The
.Nm
framework provides an unambiguous and simple to use method of specifically
addressing any single node in the graph. The naming of a node is
independent of its type, in that another node, or external component
need not know anything about the node's type in order to address it so as
to send it a generic message type. Node and hook names should be
chosen so as to make addresses meaningful.
.Pp
Addresses are either absolute or relative. An absolute address begins
with a node name (or ID), followed by a colon, followed by a sequence of hook
names separated by periods. This addresses the node reached by starting
at the named node and following the specified sequence of hooks.
A relative address includes only the sequence of hook names, implicitly
starting hook traversal at the local node.
.Pp
There are a couple of special possibilities for the node name.
The name
.Dq .\&
(referred to as
.Dq \&.: )
always refers to the local node.
Also, nodes that have no global name may be addressed by their ID numbers,
by enclosing the hex representation of the ID number within square brackets.
Here are some examples of valid netgraph addresses:
.Bd -literal -offset 4n -compact

  .:
  [3f]:
  foo:
  .:hook1
  foo:hook1.hook2
  [d80]:hook1
.Ed
.Pp
Consider the following set of nodes that might be created for a site with
a single physical frame relay line having two active logical DLCI channels,
with RFC-1490 frames on DLCI 16 and PPP frames over DLCI 20:
.Pp
.Bd -literal
[type SYNC ]                  [type FRAME]                 [type RFC1490]
[ "Frame1" ](uplink)<-->(data)[<un-named>](dlci16)<-->(mux)[<un-named>  ]
[    A     ]                  [    B     ](dlci20)<---+    [     C      ]
                                                      |
                                                      |      [ type PPP ]
                                                      +>(mux)[<un-named>]
                                                             [    D     ]
.Ed
.Pp
One could always send a control message to node C from anywhere
by using the name
.Em "Frame1:uplink.dlci16" .
In this case, node C would also be notified that the message
reached it via its hook
.Dq mux .
Similarly,
.Em "Frame1:uplink.dlci20"
could reliably be used to reach node D, and node A could refer
to node B as
.Em ".:uplink" ,
or simply
.Em "uplink" .
Conversely, B can refer to A as
.Em "data" .
The address
.Em "mux.data"
could be used by both nodes C and D to address a message to node A.
.Pp
Note that this is only for
.Em control messages .
In each of these cases, where a relative addressing mode is
used, the recipient is notified of the hook on which the
message arrived, as well as
the originating node.
This allows the option of hop-by-hop distribution of messages and
state information.
Data messages are
.Em only
routed one hop at a time, by specifying the departing
hook, with each node making
the next routing decision.
So when B receives a frame on hook
.Dq data
it decodes the frame relay header to determine the DLCI,
and then forwards the unwrapped frame to either C or D.
.Pp
In a similar way, flow control messages may be routed in the reverse
direction to outgoing data. For example, a
.Dq buffer nearly full
message from
.Em "Frame1:"
would be passed to node
.Em B
which might decide to send similar messages to both nodes
.Em C
and
.Em D .
The nodes would use
.Em "Direct hook pointer"
addressing to route the messages. The message may have travelled from
.Em "Frame1:"
to
.Em B
as a synchronous reply, saving time and cycles.
.Pp
A similar graph might be used to represent multi-link PPP running
over an ISDN line:
.Pp
.Bd -literal
[ type BRI ](B1)<--->(link1)[ type MPP  ]
[  "ISDN1" ](B2)<--->(link2)[ (no name) ]
[          ](D) <-+
                  |
 +----------------+
 |
 +->(switch)[ type Q.921 ](term1)<---->(datalink)[ type Q.931 ]
            [ (no name)  ]                       [ (no name)  ]
.Ed
.Ss Netgraph Structures
Structures are defined in
.Pa sys/netgraph/netgraph.h
(for kernel structures only of interest to nodes)
and
.Pa sys/netgraph/ng_message.h
(for message definitions also of interest to user programs).
.Pp
The two basic object types that are of interest to node authors are
.Em nodes
and
.Em hooks .
These two objects have the following
properties that are also of interest to the node writers.
.Bl -tag -width xxx
.It struct ng_node
Node authors should always use the following typedef to declare
their pointers, and should never actually declare the structure.
.Pp
typedef struct ng_node *node_p;
.Pp
The following properties are associated with a node, and can be
accessed in the following manner:
.Bl -bullet -compact -offset 2n
.Pp
.It
Validity
.Pp
A driver or interrupt routine may want to check whether
the node is still valid.
It is assumed that the caller holds a reference
on the node so it will not have been freed, however it may have been
disabled or otherwise shut down. Using the
.Fn NG_NODE_IS_VALID "node"
macro will return this state. Eventually it should be almost impossible
for code to run in an invalid node but at this time that work has not been
completed.
.Pp
.It
node ID
.Pp
Of type
.Em ng_ID_t ,
this property can be retrieved using the macro
.Fn NG_NODE_ID "node" .
.Pp
.It
node name
.Pp
Optional globally unique name, null terminated string. If there
is a value in here, it is the name of the node.
.Bd -literal -offset 4n
if (NG_NODE_NAME(node)[0]) ...
if (strncmp(NG_NODE_NAME(node), "fred", NG_NODELEN)) ...
.Ed
.Pp
.It
A node dependent opaque cookie
.Pp
You may place anything of type
.Em pointer
here.
Use the macros
.Fn NG_NODE_SET_PRIVATE node value
and
.Fn NG_NODE_PRIVATE "node"
to set and retrieve this property.
.Pp
.It
number of hooks
.Pp
Use
.Fn NG_NODE_NUMHOOKS "node"
to retrieve this value.
.Pp
.It
hooks
.Pp
The node may have a number of hooks.
A traversal method is provided to allow all the hooks to be
tested for some condition.
Use
.Fn NG_NODE_FOREACH_HOOK node fn arg rethook ,
where
.Fa fn
is a function that will be called for each hook
with the form
.Fn fn hook arg
and returning 0 to terminate the search. If the search is terminated, then
.Em rethook
will be set to the hook at which the search was terminated.
.El
.It struct ng_hook
Node authors should always use the following typedef to declare
their hook pointers.
.Pp
typedef struct ng_hook *hook_p;
.Pp
The following properties are associated with a hook, and can be
accessed in the following manner:
.Bl -bullet -compact -offset 2n
.Pp
.It
A node dependent opaque cookie.
.Pp
You may place anything of type
.Em pointer
here.
Use the macros
.Fn NG_HOOK_SET_PRIVATE hook value
and
.Fn NG_HOOK_PRIVATE "hook"
to set and retrieve this property.
.Pp
.It
An associated node.
.Pp
You may use the macro
.Fn NG_HOOK_NODE "hook"
to find the associated node.
.Pp
.It
A peer hook
.Pp
The other hook in this connected pair. Of type hook_p. You can
use
.Fn NG_HOOK_PEER "hook"
to find the peer.
.Pp
.It
references
.Pp
.Fn NG_HOOK_REF "hook"
and
.Fn NG_HOOK_UNREF "hook"
increment and decrement the hook reference count accordingly.
After decrement you should always assume the hook has been freed
unless you have another reference still valid.
.Pp
.It
Over-ride receive functions.
.Pp
The
.Fn NG_HOOK_SET_RCVDATA hook fn
and
.Fn NG_HOOK_SET_RCVMSG hook fn
macros can be used to set over-ride methods that will be used in preference
to the generic receive data and receive message functions. To unset these
use the macros to set them to NULL. They will only be used for data and
messages received on the hook on which they are set.
.El
.Pp
The maintenance of the names, reference counts, and linked list
of hooks for each node is handled automatically by the
.Nm
subsystem.
Typically a node's private info contains a back-pointer to the node or hook
structure, which counts as a new reference that must be included
in the reference count for the node. When the node constructor is called
there is already a reference for this calculated in, so that
when the node is destroyed, it should remember to do a
.Fn NG_NODE_UNREF
on the node.
.Pp
From a hook you can obtain the corresponding node, and from
a node, it is possible to traverse all the active hooks.
.Pp
A current example of how to define a node can always be seen in
.Pa sys/netgraph/ng_sample.c
and should be used as a starting point for new node writers.
.El
.Ss Netgraph Message Structure
Control messages have the following structure:
.Bd -literal
#define NG_CMDSTRLEN    15      /* Max command string (16 with null) */

struct ng_mesg {
  struct ng_msghdr {
    u_char      version;        /* Must equal NG_VERSION */
    u_char      spare;          /* Pad to 2 bytes */
    u_short     arglen;         /* Length of cmd/resp data */
    u_long      flags;          /* Message status flags */
    u_long      token;          /* Reply should have the same token */
    u_long      typecookie;     /* Node type understanding this message */
    u_long      cmd;            /* Command identifier */
    u_char      cmdstr[NG_CMDSTRLEN+1]; /* Cmd string (for debug) */
  } header;
  char  data[0];                /* Start of cmd/resp data */
};

#define NG_ABI_VERSION  5               /* Netgraph kernel ABI version */
#define NG_VERSION      4               /* Netgraph message version */
#define NGF_ORIG        0x0000          /* Command */
#define NGF_RESP        0x0001          /* Response */
.Ed
.Pp
Control messages have the fixed header shown above, followed by a
variable length data section which depends on the type cookie
and the command. Each field is explained below:
.Bl -tag -width xxx
.It Dv version
Indicates the version of the netgraph message protocol itself. The current version is
.Dv NG_VERSION .
.It Dv arglen
This is the length of any extra arguments, which begin at
.Dv data .
.It Dv flags
Indicates whether this is a command or a response control message.
.It Dv token
The
.Dv token
is a means by which a sender can match a reply message to the
corresponding command message; the reply always has the same token.
.Pp
.It Dv typecookie
The corresponding node type's unique 32-bit value.
If a node doesn't recognize the type cookie it must reject the message
by returning
.Er EINVAL .
.Pp
Each type should have an include file that defines the commands,
argument format, and cookie for its own messages.
The typecookie
ensures that the same header file was included by both sender and
receiver; when an incompatible change in the header file is made,
the typecookie
.Em must
be changed.
The de facto method for generating unique type cookies is to take the
seconds from the epoch at the time the header file is written
(i.e., the output of
.Dv "date -u +'%s'" ) .
.Pp
There is a predefined typecookie
.Dv NGM_GENERIC_COOKIE
for the
.Dq generic
node type, and
a corresponding set of generic messages which all nodes understand.
The handling of these messages is automatic.
.It Dv cmd
The identifier for the message command. This is type specific,
and is defined in the same header file as the typecookie.
.It Dv cmdstr
Room for a short human readable version of
.Dv cmd
(for debugging purposes only).
.El
.Pp
Some modules may choose to implement messages from more than one
of the header files and thus recognize more than one type cookie.
.Ss Control Message ASCII Form
Control messages are in binary format for efficiency. However, for
debugging and human interface purposes, and if the node type supports
it, control messages may be converted to and from an equivalent
.Tn ASCII
form. The
.Tn ASCII
form is similar to the binary form, with two exceptions:
.Pp
.Bl -tag -compact -width xxx
.It o
The
.Dv cmdstr
header field must contain the
.Tn ASCII
name of the command, corresponding to the
.Dv cmd
header field.
.It o
The
.Dv args
field contains a NUL-terminated
.Tn ASCII
string version of the message arguments.
.El
.Pp
In general, the arguments field of a control message can be any
arbitrary C data type.
Netgraph includes parsing routines to support
some pre-defined datatypes in
.Tn ASCII
with this simple syntax:
.Pp
.Bl -tag -compact -width xxx
.It o
Integer types are represented by base 8, 10, or 16 numbers.
.It o
Strings are enclosed in double quotes and respect the normal
C language backslash escapes.
.It o
IP addresses are written in the usual dotted-decimal form.
.It o
Arrays are enclosed in square brackets, with the elements listed
consecutively starting at index zero. An element may have an optional
index and equals sign preceding it. Whenever an element
does not have an explicit index, the index is implicitly the previous
element's index plus one.
.It o
Structures are enclosed in curly braces, and each field is specified
in the form
.Dq fieldname=value .
.It o
Any array element or structure field whose value is equal to its
.Dq default value
may be omitted. For integer types, the default value
is usually zero; for string types, the empty string.
.It o
Array elements and structure fields may be specified in any order.
.El
.Pp
Each node type may define its own arbitrary types by providing
the necessary routines to parse and unparse.
.Tn ASCII
forms defined
for a specific node type are documented in the documentation for
that node type.
.Ss Generic Control Messages
There are a number of standard predefined messages that will work
for any node, as they are supported directly by the framework itself.
These are defined in
.Pa ng_message.h
along with the basic layout of messages and other similar information.
.Bl -tag -width xxx
.It Dv NGM_CONNECT
Connect to another node, using the supplied hook names on either end.
.It Dv NGM_MKPEER
Construct a node of the given type and then connect to it using the
supplied hook names.
.It Dv NGM_SHUTDOWN
The target node should disconnect from all its neighbours and shut down.
Persistent nodes such as those representing physical hardware
might not disappear from the node namespace, but only reset themselves.
The node must disconnect all of its hooks.
This may result in neighbors shutting themselves down, and possibly a
cascading shutdown of the entire connected graph.
.It Dv NGM_NAME
Assign a name to a node. Nodes can exist without having a name, and this
is the default for nodes created using the
.Dv NGM_MKPEER
method. Such nodes can only be addressed relatively or by their ID number.
.It Dv NGM_RMHOOK
Ask the node to break a hook connection to one of its neighbours.
Both nodes will have their
.Dq disconnect
method invoked.
Either node may elect to totally shut down as a result.
.It Dv NGM_NODEINFO
Asks the target node to describe itself. The four returned fields
are the node name (if named), the node type, the node ID and the
number of hooks attached. The ID is an internal number unique to that node.
.It Dv NGM_LISTHOOKS
This returns the information given by
.Dv NGM_NODEINFO ,
but in addition
includes an array of fields describing each link, and the description for
the node at the far end of that link.
.It Dv NGM_LISTNAMES
This returns an array of node descriptions (as for
.Dv NGM_NODEINFO ")"
where each entry of the array describes a named node.
All named nodes will be described.
.It Dv NGM_LISTNODES
This is the same as
.Dv NGM_LISTNAMES
except that all nodes are listed regardless of whether they have a name or not.
.It Dv NGM_LISTTYPES
This returns a list of all currently installed netgraph types.
.It Dv NGM_TEXT_STATUS
The node may return a text formatted status message.
The status information is determined entirely by the node type.
It is the only
.Dq generic
message
that requires any support within the node itself and as such the node may
elect to not support this message.
The text response must be less than
.Dv NG_TEXTRESPONSE
bytes in length (presently 1024). This can be used to return general
status information in human readable form.
.It Dv NGM_BINARY2ASCII
This message converts a binary control message to its
.Tn ASCII
form.
The entire control message to be converted is contained within the
arguments field of the
.Dv NGM_BINARY2ASCII
message itself. If successful, the reply will contain the same control
message in
.Tn ASCII
form.
A node will typically only know how to translate messages that it
itself understands, so the target node of the
.Dv NGM_BINARY2ASCII
is often the same node that would actually receive that message.
.It Dv NGM_ASCII2BINARY
The opposite of
.Dv NGM_BINARY2ASCII .
The entire control message to be converted, in
.Tn ASCII
form, is contained
in the arguments section of the
.Dv NGM_ASCII2BINARY
and need only have the
.Dv flags ,
.Dv cmdstr ,
and
.Dv arglen
header fields filled in, plus the NUL-terminated string version of
the arguments in the arguments field. If successful, the reply
contains the binary version of the control message.
.El
.Ss Flow Control Messages
In addition to the control messages that affect nodes with respect to the
graph, there are also a number of
.Em flow control
messages defined. At present these are
.Em NOT
handled automatically by the system, so
nodes need to handle them if they are going to be used in a graph utilising
flow control, and will be in the likely path of these messages.
The default action of a node that doesn't understand these messages should
be to pass them on to the next node.
Hopefully some helper functions will assist in this eventually.
These messages are also defined in
.Pa sys/netgraph/ng_message.h
and have a separate cookie
.Dv NG_FLOW_COOKIE
to help identify them.
They will not be covered in depth here.
.Ss Metadata
Data moving through the
.Nm
system can be accompanied by meta-data that describes some
aspect of that data.
The form of the meta-data is a fixed header,
which contains enough information for most uses, and can optionally
be supplemented by trailing
.Em option
structures, which contain a
.Em cookie
(see the section on control messages), an identifier, a length and optional
data. If a node does not recognize the cookie associated with an option,
it should ignore that option.
.Pp
Meta-data might include such things as priority, discard eligibility,
or special processing requirements.
It might also mark a packet for
debug status, etc.
The use of meta-data is still experimental.
.Sh INITIALIZATION
The base
.Nm
code may either be statically compiled
into the kernel or else loaded dynamically as a KLD via
.Xr kldload 8 .
In the former case, include
.Pp
.Dl options NETGRAPH
.Pp
in your kernel configuration file.
You may also include selected
node types in the kernel compilation, for example:
.Bd -literal -offset indent
options NETGRAPH
options NETGRAPH_SOCKET
options NETGRAPH_ECHO
.Ed
.Pp
Once the
.Nm
subsystem is loaded, individual node types may be loaded at any time
as KLD modules via
.Xr kldload 8 .
Moreover,
.Nm
knows how to automatically do this; when a request to create a new
node of unknown type
.Em type
is made,
.Nm
will attempt to load the KLD module
.Pa ng_type.ko .
.Pp
Types can also be installed at boot time, as certain device drivers
may want to export each instance of the device as a netgraph node.
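.Pp
As a minimal sketch of the manual loading steps described above (assuming a
system on which
.Nm
is not already compiled into the kernel, and using the echo node type
purely as an illustration), the base module and one node type might be
loaded and then checked as follows:
.Bd -literal -offset indent
# Load the base netgraph subsystem (netgraph.ko)
kldload netgraph
# Load one node type explicitly; ng_echo is only an example here
kldload ng_echo
# List the loaded kernel modules to confirm both are present
kldstat
.Ed
.Pp
The explicit second step is often unnecessary, since
.Nm
attempts to load a missing node type on demand.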
.Pp
In general, new types can be installed at any time from within the
kernel by calling
.Fn ng_newtype ,
supplying a pointer to the type's
.Dv struct ng_type
structure.
.Pp
The
.Fn NETGRAPH_INIT
macro automates this process by using a linker set.
.Sh EXISTING NODE TYPES
Several node types currently exist.
Each is fully documented in its own man page:
.Bl -tag -width xxx
.It SOCKET
The socket type implements two new sockets in the new protocol domain
.Dv PF_NETGRAPH .
The new socket protocols are
.Dv NG_DATA
and
.Dv NG_CONTROL ,
both of type
.Dv SOCK_DGRAM .
Typically one of each is associated with a socket node.
When both sockets have closed, the node will shut down.
The
.Dv NG_DATA
socket is used for sending and receiving data, while the
.Dv NG_CONTROL
socket is used for sending and receiving control messages.
Data and control messages are passed using the
.Xr sendto 2
and
.Xr recvfrom 2
calls, using a
.Dv struct sockaddr_ng
socket address.
.Pp
.It HOLE
Responds only to generic messages and is a
.Dq black hole
for data. Useful for testing. Always accepts new hooks.
.Pp
.It ECHO
Responds only to generic messages and always echoes data back through the
hook from which it arrived. Returns any non-generic messages as their
own response. Useful for testing. Always accepts new hooks.
.Pp
.It TEE
This node is useful for
.Dq snooping .
It has 4 hooks:
.Dv left ,
.Dv right ,
.Dv left2right ,
and
.Dv right2left .
Data entering from the right is passed to the left and duplicated on
.Dv right2left ,
and data entering from the left is passed to the right and
duplicated on
.Dv left2right .
Data entering from
.Dv left2right
is sent to the right and data from
.Dv right2left
to the left.
.Pp
.It RFC1490 MUX
Encapsulates/de-encapsulates frames encoded according to RFC 1490.
Has a hook for the encapsulated packets
.Pq Dq downstream
and one hook
for each protocol (i.e., IP, PPP, etc.).
.Pp
.It FRAME RELAY MUX
Encapsulates/de-encapsulates Frame Relay frames.
Has a hook for the encapsulated packets
.Pq Dq downstream
and one hook
for each DLCI.
.Pp
.It FRAME RELAY LMI
Automatically handles frame relay
.Dq LMI
(link management interface) operations and packets.
Automatically probes and detects which of several LMI standards
is in use at the exchange.
.Pp
.It TTY
This node is also a line discipline. It simply converts between mbuf
frames and sequential serial data, allowing a tty to appear as a netgraph
node. It has a programmable
.Dq hotkey
character.
.Pp
.It ASYNC
This node encapsulates and de-encapsulates asynchronous frames
according to RFC 1662. This is used in conjunction with the TTY node
type for supporting PPP links over asynchronous serial lines.
.Pp
.It INTERFACE
This node is also a system networking interface. It has hooks representing
each protocol family (IP, AppleTalk, IPX, etc.) and appears in the output of
.Xr ifconfig 8 .
The interfaces are named
.Em ng0 ,
.Em ng1 ,
etc.
.It ONE2MANY
This node implements a simple round-robin multiplexer. It can be used
for example to make several LAN ports act together to get a higher speed
link between two machines.
.It Various PPP related nodes.
There is a full multilink PPP implementation that runs in Netgraph.
The
.Em Mpd
port can use these modules to make a very low latency high
capacity PPP system. It also supports
.Em PPTP
VPNs using the
.Em PPTP
node.
.It PPPOE
A server and client side implementation of PPPoE.
Used in conjunction with
either
.Xr ppp 8
or the
.Em mpd port .
.It BRIDGE
This node, together with the Ethernet nodes, allows a very flexible
bridging system to be implemented.
.It KSOCKET
This intriguing node looks like a socket to the system but diverts
all data to and from the netgraph system for further processing. This allows
such things as UDP tunnels to be almost trivially implemented from the
command line.
.El
.Pp
Refer to the
.Sx SEE ALSO
section at the end of this man page for more node types.
.Sh NOTES
Whether a named node exists can be checked by trying to send a control message
to it (e.g.,
.Dv NGM_NODEINFO ) .
If it does not exist,
.Er ENOENT
will be returned.
.Pp
All data messages are mbuf chains with the
.Dv M_PKTHDR
flag set.
.Pp
Nodes are responsible for freeing what they allocate.
There are four exceptions:
.Bl -tag -width xxxx
.It 1
Mbufs sent across a data link are never to be freed by the sender. In the
case of error, they should be considered freed.
.It 2
Any meta-data information traveling with the data has the same restriction.
It might be freed by any node the data passes through, and a
.Dv NULL
passed onwards, but the caller will never free it.
Two macros
.Fn NG_FREE_META "meta"
and
.Fn NG_FREE_M "m"
should be used if possible to free data and meta-data (see
.Pa netgraph.h ) .
.It 3
Messages sent using
.Fn ng_send_message
are freed by the recipient. As in the case above, the addresses
associated with the message are freed by whatever allocated them so the
recipient should copy them if it wants to keep that information.
.It 4
Both control messages and data are delivered and queued with
a netgraph
.Em item .
The item must be freed using
.Fn NG_FREE_ITEM "item"
or passed on to another node.
.El
.Sh FILES
.Bl -tag -width xxxxx -compact
.It Pa /sys/netgraph/netgraph.h
Definitions for use solely within the kernel by
.Nm
nodes.
.It Pa /sys/netgraph/ng_message.h
Definitions needed by any file that needs to deal with
.Nm
messages.
.It Pa /sys/netgraph/ng_socket.h
Definitions needed to use
.Nm
socket type nodes.
.It Pa /sys/netgraph/ng_{type}.h
Definitions needed to use
.Nm
{type}
nodes, including the type cookie definition.
.It Pa /boot/kernel/netgraph.ko
Netgraph subsystem loadable KLD module.
.It Pa /boot/kernel/ng_{type}.ko
Loadable KLD module for node type {type}.
.It Pa /sys/netgraph/ng_sample.c
Skeleton netgraph node.
Use this as a starting point for new node types.
.El
.Sh USER MODE SUPPORT
There is a library for supporting user-mode programs that wish
to interact with the netgraph system. See
.Xr netgraph 3
for details.
.Pp
Two user-mode support programs,
.Xr ngctl 8
and
.Xr nghook 8 ,
are available to assist manual configuration and debugging.
.Pp
There are a few useful techniques for debugging new node types.
First, implementing new node types in user-mode
makes debugging easier.
The
.Em tee
node type is also useful for debugging, especially in conjunction with
.Xr ngctl 8
and
.Xr nghook 8 .
.Pp
Also look in
.Pa /usr/share/examples/netgraph
for solutions to several
common networking problems, solved using
.Nm .
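.Pp
A brief debugging session along these lines is sketched below.
It is illustrative only: it assumes the socket and tee node types are
available, and the node name
.Dq mytee
is hypothetical, standing in for a tee node that has already been created
and named.
Consult
.Xr ngctl 8
and
.Xr nghook 8
for the authoritative command syntax.
.Bd -literal -offset indent
# List the node types currently installed
ngctl types
# List the nodes currently present in the system
ngctl list
# Show one node and its hooks ("mytee" is a hypothetical name)
ngctl show mytee:
# Attach to the tee's left2right hook and print packets in readable form
nghook -a mytee: left2right
.Ed
.Pp
Because a tee node duplicates traffic onto
.Dv left2right
and
.Dv right2left ,
attaching to one of those hooks lets the data be observed without
otherwise disturbing the flow between
.Dv left
and
.Dv right .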
.Sh SEE ALSO
.Xr socket 2 ,
.Xr netgraph 3 ,
.Xr ng_async 4 ,
.Xr ng_bpf 4 ,
.Xr ng_bridge 4 ,
.Xr ng_cisco 4 ,
.Xr ng_echo 4 ,
.Xr ng_ether 4 ,
.Xr ng_frame_relay 4 ,
.Xr ng_hole 4 ,
.Xr ng_iface 4 ,
.Xr ng_ksocket 4 ,
.Xr ng_lmi 4 ,
.Xr ng_mppc 4 ,
.Xr ng_ppp 4 ,
.Xr ng_pppoe 4 ,
.Xr ng_pptpgre 4 ,
.Xr ng_rfc1490 4 ,
.Xr ng_socket 4 ,
.Xr ng_tee 4 ,
.Xr ng_tty 4 ,
.Xr ng_UI 4 ,
.Xr ng_vjc 4 ,
.Xr ngctl 8 ,
.Xr nghook 8
.Sh HISTORY
The
.Nm
system was designed and first implemented at Whistle Communications, Inc.\&
in a version of
.Fx 2.2
customized for the Whistle InterJet.
It made its debut in the main tree in
.Fx 3.4 .
.Sh AUTHORS
.An -nosplit
.An Julian Elischer Aq julian@FreeBSD.org ,
with contributions by
.An Archie Cobbs Aq archie@FreeBSD.org .