.\" Copyright (c) 1996-1999 Whistle Communications, Inc.
.\" All rights reserved.
.\"
.\" Subject to the following obligations and disclaimer of warranty, use and
.\" redistribution of this software, in source or object code forms, with or
.\" without modifications are expressly permitted by Whistle Communications;
.\" provided, however, that:
.\" 1. Any and all reproductions of the source or object code must include the
.\"    copyright notice above and the following disclaimer of warranties; and
.\" 2. No rights are granted, in any manner or form, to use Whistle
.\"    Communications, Inc. trademarks, including the mark "WHISTLE
.\"    COMMUNICATIONS" on advertising, endorsements, or otherwise except as
.\"    such appears in the above copyright notice or in the software.
.\"
.\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND
.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO
.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE,
.\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
.\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY
.\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS
.\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE.
.\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES
.\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING
.\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
.\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
.\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY
.\" OF SUCH DAMAGE.
.\"
.\" Authors: Julian Elischer <julian@FreeBSD.org>
.\"          Archie Cobbs <archie@FreeBSD.org>
.\"
.\" $Whistle: netgraph.4,v 1.7 1999/01/28 23:54:52 julian Exp $
.\" $FreeBSD$
.\"
.Dd November 25, 2013
.Dt NETGRAPH 4
.Os
.Sh NAME
.Nm netgraph
.Nd "graph based kernel networking subsystem"
.Sh DESCRIPTION
The
.Nm
system provides a uniform and modular system for the implementation
of kernel objects which perform various networking functions.
The objects, known as
.Em nodes ,
can be arranged into arbitrarily complicated graphs.
Nodes have
.Em hooks
which are used to connect two nodes together, forming the edges in the graph.
Nodes communicate along the edges to process data, implement protocols, etc.
.Pp
The aim of
.Nm
is to supplement rather than replace the existing kernel networking
infrastructure.
It provides:
.Pp
.Bl -bullet -compact
.It
A flexible way of combining protocol and link level drivers.
.It
A modular way to implement new protocols.
.It
A common framework for kernel entities to inter-communicate.
.It
A reasonably fast, kernel-based implementation.
.El
.Ss Nodes and Types
The most fundamental concept in
.Nm
is that of a
.Em node .
All nodes implement a number of predefined methods which allow them
to interact with other nodes in a well defined manner.
.Pp
Each node has a
.Em type ,
which is a static property of the node determined at node creation time.
A node's type is described by a unique
.Tn ASCII
type name.
The type implies what the node does and how it may be connected
to other nodes.
.Pp
In object-oriented language, types are classes, and nodes are instances
of their respective class.
All node types are subclasses of the generic node
type, and hence inherit certain common functionality and capabilities
(e.g., the ability to have an
.Tn ASCII
name).
.Pp
Nodes may be assigned a globally unique
.Tn ASCII
name which can be
used to refer to the node.
The name must not contain the characters
.Ql .\&
or
.Ql \&: ,
and is limited to
.Dv NG_NODESIZ
characters (including the terminating
.Dv NUL
character).
.Pp
Each node instance has a unique
.Em ID number
which is expressed as a 32-bit hexadecimal value.
This value may be used to refer to a node when there is no
.Tn ASCII
name assigned to it.
.Ss Hooks
Nodes are connected to other nodes by connecting a pair of
.Em hooks ,
one from each node.
Data flows bidirectionally between nodes along
connected pairs of hooks.
A node may have as many hooks as it
needs, and may assign whatever meaning it wants to a hook.
.Pp
Hooks have these properties:
.Bl -bullet
.It
A hook has an
.Tn ASCII
name which is unique among all hooks
on that node (other hooks on other nodes may have the same name).
The name must not contain the characters
.Ql .\&
or
.Ql \&: ,
and is
limited to
.Dv NG_HOOKSIZ
characters (including the terminating
.Dv NUL
character).
.It
A hook is always connected to another hook.
That is, hooks are
created at the time they are connected, and breaking an edge by
removing either hook destroys both hooks.
.It
A hook can be set into a state where incoming packets are always queued
by the input queueing system, rather than being delivered directly.
This can be used when the data is sent from an interrupt handler,
and processing must be quick so as not to block other interrupts.
.It
A hook may supply overriding receive data and receive message functions,
which should be used for data and messages received through that hook
in preference to the general node-wide methods.
.El
.Pp
A node may decide to assign special meaning to some hooks.
For example, connecting to the hook named
.Va debug
might trigger
the node to start sending debugging information to that hook.
.Ss Data Flow
Two types of information flow between nodes: data messages and
control messages.
Data messages are passed in
.Vt mbuf chains
along the edges
in the graph, one edge at a time.
The first
.Vt mbuf
in a chain must have the
.Dv M_PKTHDR
flag set.
Each node decides how to handle data received through one of its hooks.
.Pp
Along with data, nodes can also receive control messages.
There are generic and type-specific control messages.
Control messages have a common
header format, followed by type-specific data, and are binary structures
for efficiency.
However, node types may also support conversion of the
type-specific data between binary and
.Tn ASCII
formats,
for debugging and human interface purposes (see the
.Dv NGM_ASCII2BINARY
and
.Dv NGM_BINARY2ASCII
generic control messages below).
Nodes are not required to support these conversions.
.Pp
There are three ways to address a control message.
If there is a sequence of edges connecting the two nodes, the message
may be
.Dq source routed
by specifying the corresponding sequence
of
.Tn ASCII
hook names as the destination address for the message (relative
addressing).
If the destination is adjacent to the source, then the source
node may simply specify (as a pointer in the code) the hook across which the
message should be sent.
Otherwise, the recipient node's global
.Tn ASCII
name
(or equivalent ID-based name) is used as the destination address
for the message (absolute addressing).
The two types of
.Tn ASCII
addressing
may be combined, by specifying an absolute start node and a sequence
of hooks.
Only the
.Tn ASCII
addressing modes are available to control programs outside the kernel;
use of direct pointers is limited to kernel modules.
.Pp
Messages often represent commands that are followed by a reply message
in the reverse direction.
To facilitate this, the recipient of a
control message is supplied with a
.Dq return address
that is suitable for addressing a reply.
.Pp
Each control message contains a 32-bit value, called a
.Dq typecookie ,
indicating the type of the message, i.e.\& how to interpret it.
Typically each type defines a unique typecookie for the messages
that it understands.
However, a node may choose to recognize and
implement more than one type of message.
.Pp
If a message is delivered to an address that implies that it arrived
at that node through a particular hook (as opposed to having been directly
addressed using its ID or global name) then that hook is identified to the
receiving node.
This allows a message to be re-routed or passed on, should
a node decide that this is required, in much the same way that data packets
are passed around between nodes.
The base system defines a set of standard
messages for flow control and link management purposes
that are usually
passed around in this manner.
Flow control messages usually travel
in the opposite direction to the data to which they pertain.
.Ss Netgraph is (Usually) Functional
In order to minimize latency, most
.Nm
operations are functional.
That is, data and control messages are delivered by making function
calls rather than by using queues and mailboxes.
For example, if node
A wishes to send a data
.Vt mbuf
to neighboring node B, it calls the
generic
.Nm
data delivery function.
This function in turn locates
node B and calls B's
.Dq receive data
method.
There are exceptions to this.
.Pp
Each node has an input queue, and some operations can be considered to
be
.Em writers
in that they alter the state of the node.
Obviously, in an SMP
world it would be bad if the state of a node were changed while another
data packet were transiting the node.
For this purpose, the input queue implements a
.Em reader/writer
semantic so that when there is a writer in the node, all other requests
are queued, and while there are readers, a writer and any following
packets are queued.
In the case where there is no reason to queue the
data, the input method is called directly, as mentioned above.
.Pp
A node may declare that all requests should be considered as writers,
or that requests coming in over a particular hook should be considered to
be writers, or even that packets leaving or entering across a particular
hook should always be queued, rather than delivered directly (often useful
for interrupt routines that want to get back to the hardware quickly).
By default, all control message packets are considered to be writers
unless specifically declared to be a reader in their definition.
(See
.Dv NGM_READONLY
in
.In netgraph/ng_message.h . )
.Pp
While this mode of operation
results in good performance, it has a few implications for node
developers:
.Bl -bullet
.It
Whenever a node delivers a data or control message, the node
may need to allow for the possibility of receiving a returning
message before the original delivery function call returns.
.It
.Nm Netgraph
provides internal synchronization between nodes.
Data always enters a
.Dq graph
at an
.Em edge node .
An
.Em edge node
is a node that interfaces between
.Nm
and some other part of the system.
Examples of
.Dq edge nodes
include device drivers, the
.Vt socket , ether , tty ,
and
.Vt ksocket
node types.
In these
.Em edge nodes ,
the calling thread directly executes code in the node, and from that code
calls upon the
.Nm
framework to deliver data across some edge
in the graph.
From an execution point of view, the calling thread will execute the
.Nm
framework methods, and if it can acquire a lock to do so,
the input methods of the next node.
This continues until either the data is discarded or queued for some
device or system entity, or the thread is unable to acquire a lock on
the next node.
In that case, the data is queued for the node, and execution rewinds
back to the original calling entity.
The queued data will be picked up and processed by either the current
holder of the lock when they have completed their operations, or by
a special
.Nm
thread that is activated when there are such items
queued.
.It
It is possible for an infinite loop to occur if the graph contains cycles.
.El
.Pp
So far, these issues have not proven problematic in practice.
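.Pp
The reader/writer admission rule described above can be modelled in a
few lines of user-space C.
This is only an illustrative sketch and not the kernel implementation:
the names
.Vt toy_queue
and
.Fn toy_admit
are hypothetical, and the release path that drains the queue when an
item leaves the node is omitted.
.Bd -literal -offset indent
/* Toy model of the input queue admission rule; not kernel code. */
enum verdict { DELIVER_DIRECT, ENQUEUE };

struct toy_queue {
    int readers;    /* readers currently inside the node */
    int writer;     /* 1 while a writer is inside the node */
    int pending;    /* items waiting on the input queue */
};

/* Decide whether a new item may call the node method directly. */
static enum verdict
toy_admit(struct toy_queue *q, int is_writer)
{
    /* An active writer, or anything already queued, blocks all. */
    if (q->writer || q->pending > 0) {
        q->pending++;
        return (ENQUEUE);
    }
    if (is_writer) {
        /* A writer must wait until all readers have left. */
        if (q->readers > 0) {
            q->pending++;
            return (ENQUEUE);
        }
        q->writer = 1;
    } else {
        q->readers++;
    }
    return (DELIVER_DIRECT);
}
.Ed
.Pp
In this model, a reader arriving at an idle node is delivered directly,
a writer arriving while that reader is still inside is queued,
and every later item is queued behind the writer to preserve ordering.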
.Ss Interaction with Other Parts of the Kernel
A node may have a hidden interaction with other components of the
kernel outside of the
.Nm
subsystem, such as device hardware,
kernel protocol stacks, etc.
In fact, one of the benefits of
.Nm
is the ability to join disparate kernel networking entities together in a
consistent communication framework.
.Pp
An example is the
.Vt socket
node type which is both a
.Nm
node and a
.Xr socket 2
in the protocol family
.Dv PF_NETGRAPH .
Socket nodes allow user processes to participate in
.Nm .
Other nodes communicate with socket nodes using the usual methods, and the
node hides the fact that it is also passing information to and from a
cooperating user process.
.Pp
Another example is a device driver that presents
a node interface to the hardware.
.Ss Node Methods
Nodes are notified of the following actions via function calls
to the following node methods,
and may accept or reject that action (by returning the appropriate
error code):
.Bl -tag -width 2n
.It Creation of a new node
The constructor for the type is called.
If creation of a new node is allowed, the constructor method may allocate any
special resources it needs.
For nodes that correspond to hardware, this is typically done during the
device attach routine.
Often a global
.Tn ASCII
name corresponding to the
device name is assigned here as well.
.It Creation of a new hook
The hook is created and tentatively
linked to the node, and the node is told about the name that will be
used to describe this hook.
The node sets up any special data structures
it needs, or may reject the connection, based on the name of the hook.
.It Successful connection of two hooks
After both ends have accepted their
hooks, and the links have been made, the nodes get a chance to
find out who their peer is across the link, and can then decide to reject
the connection.
Tear-down is automatic.
This is also the time at which
a node may decide whether to set a particular hook (or its peer) into
the
.Em queueing
mode.
.It Destruction of a hook
The node is notified of a broken connection.
The node may consider some hooks
to be critical to operation and others to be expendable: the disconnection
of one hook may be an acceptable event while for another it
may effect a total shutdown for the node.
.It Preshutdown of a node
This method is called before real shutdown, which is discussed below.
While in this method, the node is fully operational and can send a
.Dq goodbye
message to its peers, or it can exclude itself from the chain and reconnect
its peers together, like the
.Xr ng_tee 4
node type does.
.It Shutdown of a node
This method allows a node to clean up
and to ensure that any actions that need to be performed
at this time are taken.
The method is called by the generic (i.e., superclass)
node destructor which will get rid of the generic components of the node.
Some nodes (usually associated with a piece of hardware) may be
.Em persistent
in that a shutdown breaks all edges and resets the node,
but does not remove it.
In this case, the shutdown method should not
free its resources, but rather, clean up and then call the
.Fn NG_NODE_REVIVE
macro to signal the generic code that the shutdown is aborted.
In the case where the shutdown is started by the node itself due to hardware
removal or unloading (via
.Fn ng_rmnode_self ) ,
it should set the
.Dv NGF_REALLY_DIE
flag to signal to its own shutdown method that it is not to persist.
.El
.Ss Sending and Receiving Data
Two other methods are also supported by all nodes:
.Bl -tag -width 2n
.It Receive data message
A
.Nm
.Em queueable request item ,
usually referred to as an
.Em item ,
is received by this function.
The item contains a pointer to an
.Vt mbuf .
.Pp
The node is notified on which hook the item has arrived,
and can use this information in its processing decision.
The receiving node must always
.Fn NG_FREE_M
the
.Vt mbuf chain
on completion or error, or pass it on to another node
(or kernel module) which will then be responsible for freeing it.
Similarly, the
.Em item
must be freed if it is not to be passed on to another node, by using the
.Fn NG_FREE_ITEM
macro.
If the item still holds references to
.Vt mbufs
at the time of
freeing then they will also be appropriately freed.
Therefore, if there is any chance that the
.Vt mbuf
will be
changed or freed separately from the item, it is very important
that it be retrieved using the
.Fn NGI_GET_M
macro that also removes the reference within the item.
(Otherwise, multiple frees of the same object will occur.)
.Pp
If it is only required to examine the contents of the
.Vt mbufs ,
then it is possible to use the
.Fn NGI_M
macro to both read and rewrite the
.Vt mbuf
pointer inside the item.
.Pp
If a developer needs to pass any meta information along with the
.Vt mbuf chain ,
the
.Xr mbuf_tags 9
framework should be used.
.Bf -symbolic
Note that the old
.Nm
specific meta-data format is now obsolete.
.Ef
.Pp
The receiving node may decide to defer the data by queueing it in the
.Nm
NETISR system (see below).
It achieves this by setting the
.Dv HK_QUEUE
flag in the flags word of the hook on which that data will arrive.
The infrastructure will respect that bit and queue the data for delivery at
a later time, rather than deliver it directly.
A node may decide to set
the bit on the
.Em peer
node, so that its own output packets are queued.
.Pp
The node may elect to nominate a different receive data function
for data received on a particular hook, to simplify coding.
It uses the
.Fn NG_HOOK_SET_RCVDATA hook fn
macro to do this.
The function receives the same arguments in every way
other than it will receive all (and only) packets from that hook.
.It Receive control message
This method is called when a control message is addressed to the node.
As with the received data, an
.Em item
is received, with a pointer to the control message.
The message can be examined using the
.Fn NGI_MSG
macro, or completely extracted from the item using the
.Fn NGI_GET_MSG
macro, which also removes the reference within the item.
If the item still holds a reference to the message when it is freed
(using the
.Fn NG_FREE_ITEM
macro), then the message will also be freed appropriately.
If the
reference has been removed, the node must free the message itself using the
.Fn NG_FREE_MSG
macro.
A return address is always supplied, giving the address of the node
that originated the message so a reply message can be sent anytime later.
The return address is retrieved from the
.Em item
using the
.Fn NGI_RETADDR
macro and is of type
.Vt ng_ID_t .
All control messages and replies are
allocated with the
.Xr malloc 9
type
.Dv M_NETGRAPH_MSG ;
however, it is more convenient to use the
.Fn NG_MKMESSAGE
and
.Fn NG_MKRESPONSE
macros to allocate and fill out a message.
Messages must be freed using the
.Fn NG_FREE_MSG
macro.
.Pp
If the message was delivered via a specific hook, that hook will
also be made known, which allows the use of such things as flow-control
messages, and status change messages, where the node may want to forward
the message out another hook to that on which it arrived.
.Pp
The node may elect to nominate a different receive message function
for messages received on a particular hook, to simplify coding.
It uses the
.Fn NG_HOOK_SET_RCVMSG hook fn
macro to do this.
The function receives the same arguments in every way
other than it will receive all (and only) messages from that hook.
.El
.Pp
Much use has been made of reference counts, so that nodes being
freed of all references are automatically freed, and this behaviour
has been tested and debugged to present a consistent and trustworthy
framework for the
.Dq type module
writer to use.
.Ss Addressing
The
.Nm
framework provides an unambiguous and simple to use method of specifically
addressing any single node in the graph.
The naming of a node is
independent of its type, in that another node, or external component
need not know anything about the node's type in order to address it so as
to send it a generic message type.
Node and hook names should be
chosen so as to make addresses meaningful.
.Pp
Addresses are either absolute or relative.
An absolute address begins
with a node name or ID, followed by a colon, followed by a sequence of hook
names separated by periods.
This addresses the node reached by starting
at the named node and following the specified sequence of hooks.
A relative address includes only the sequence of hook names, implicitly
starting hook traversal at the local node.
.Pp
There are a couple of special possibilities for the node name.
The name
.Ql .\&
(referred to as
.Ql .: )
always refers to the local node.
Also, nodes that have no global name may be addressed by their ID numbers,
by enclosing the hexadecimal representation of the ID number within
square brackets.
Here are some examples of valid
.Nm
addresses:
.Bd -literal -offset indent
\&.:
[3f]:
foo:
\&.:hook1
foo:hook1.hook2
[d80]:hook1
.Ed
.Pp
The following set of nodes might be created for a site with
a single physical frame relay line having two active logical DLCI channels,
with RFC 1490 frames on DLCI 16 and PPP frames over DLCI 20:
.Bd -literal
[type SYNC ]                  [type FRAME]                 [type RFC1490]
[ "Frame1" ](uplink)<-->(data)[<un-named>](dlci16)<-->(mux)[<un-named>  ]
[    A     ]                  [    B     ](dlci20)<---+    [     C      ]
                                                      |
                                                      |      [ type PPP ]
                                                      +>(mux)[<un-named>]
                                                             [    D     ]
.Ed
.Pp
One could always send a control message to node C from anywhere
by using the name
.Dq Li Frame1:uplink.dlci16 .
In this case, node C would also be notified that the message
reached it via its hook
.Va mux .
Similarly,
.Dq Li Frame1:uplink.dlci20
could reliably be used to reach node D, and node A could refer
to node B as
.Dq Li .:uplink ,
or simply
.Dq Li uplink .
Conversely, B can refer to A as
.Dq Li data .
The address
.Dq Li mux.data
could be used by both nodes C and D to address a message to node A.
.Pp
Note that this is only for
.Em control messages .
In each of these cases, where a relative addressing mode is
used, the recipient is notified of the hook on which the
message arrived, as well as
the originating node.
This allows the option of hop-by-hop distribution of messages and
state information.
Data messages are
.Em only
routed one hop at a time, by specifying the departing
hook, with each node making
the next routing decision.
So when B receives a frame on hook
.Va data ,
it decodes the frame relay header to determine the DLCI,
and then forwards the unwrapped frame to either C or D.
.Pp
In a similar way, flow control messages may be routed in the reverse
direction to outgoing data.
For example, a
.Dq "buffer nearly full"
message from
.Dq Li Frame1:
would be passed to node B
which might decide to send similar messages to both nodes
C and D.
The nodes would use
.Em "direct hook pointer"
addressing to route the messages.
The message may have travelled from
.Dq Li Frame1:
to B
as a synchronous reply, saving time and cycles.
.Ss Netgraph Structures
Structures are defined in
.In netgraph/netgraph.h
(for kernel structures only of interest to nodes)
and
.In netgraph/ng_message.h
(for message definitions also of interest to user programs).
.Pp
The two basic object types that are of interest to node authors are
.Em nodes
and
.Em hooks .
These two objects have the following
properties that are also of interest to the node writers.
.Bl -tag -width 2n
.It Vt "struct ng_node"
Node authors should always use the following
.Ic typedef
to declare
their pointers, and should never actually declare the structure.
.Pp
.Fd "typedef struct ng_node *node_p;"
.Pp
The following properties are associated with a node, and can be
accessed in the following manner:
.Bl -tag -width 2n
.It Validity
A driver or interrupt routine may want to check whether
the node is still valid.
It is assumed that the caller holds a reference
on the node so it will not have been freed, however it may have been
disabled or otherwise shut down.
Using the
.Fn NG_NODE_IS_VALID node
macro will return this state.
Eventually it should be almost impossible
for code to run in an invalid node but at this time that work has not been
completed.
.It Node ID Pq Vt ng_ID_t
This property can be retrieved using the macro
.Fn NG_NODE_ID node .
.It Node name
Optional globally unique name,
.Dv NUL
terminated string.
If there
is a value in here, it is the name of the node.
.Bd -literal -offset indent
if (NG_NODE_NAME(node)[0] != '\e0') ...

if (strcmp(NG_NODE_NAME(node), "fred") == 0) ...
.Ed
.It A node dependent opaque cookie
Anything of the pointer type can be placed here.
The macros
.Fn NG_NODE_SET_PRIVATE node value
and
.Fn NG_NODE_PRIVATE node
set and retrieve this property, respectively.
.It Number of hooks
The
.Fn NG_NODE_NUMHOOKS node
macro is used
to retrieve this value.
.It Hooks
The node may have a number of hooks.
A traversal method is provided to allow all the hooks to be
tested for some condition:
.Fn NG_NODE_FOREACH_HOOK node fn arg rethook ,
where
.Fa fn
is a function that will be called for each hook
with the form
.Fn fn hook arg
and returning 0 to terminate the search.
If the search is terminated, then
.Fa rethook
will be set to the hook at which the search was terminated.
.El
.It Vt "struct ng_hook"
Node authors should always use the following
.Ic typedef
to declare
their hook pointers.
.Pp
.Fd "typedef struct ng_hook *hook_p;"
.Pp
The following properties are associated with a hook, and can be
accessed in the following manner:
.Bl -tag -width 2n
.It A hook dependent opaque cookie
Anything of the pointer type can be placed here.
The macros
.Fn NG_HOOK_SET_PRIVATE hook value
and
.Fn NG_HOOK_PRIVATE hook
set and retrieve this property, respectively.
.It \&An associated node
The macro
.Fn NG_HOOK_NODE hook
finds the associated node.
.It A peer hook Pq Vt hook_p
The other hook in this connected pair.
The
.Fn NG_HOOK_PEER hook
macro finds the peer.
.It References
The
.Fn NG_HOOK_REF hook
and
.Fn NG_HOOK_UNREF hook
macros
increment and decrement the hook reference count accordingly.
After decrement you should always assume the hook has been freed
unless you have another reference still valid.
.It Override receive functions
The
.Fn NG_HOOK_SET_RCVDATA hook fn
and
.Fn NG_HOOK_SET_RCVMSG hook fn
macros can be used to set override methods that will be used in preference
to the generic receive data and receive message functions.
To unset these, use the macros to set them to
.Dv NULL .
They will only be used for data and
messages received on the hook on which they are set.
.El
.Pp
The maintenance of the names, reference counts, and linked list
of hooks for each node is handled automatically by the
.Nm
subsystem.
Typically a node's private info contains a back-pointer to the node or hook
structure, which counts as a new reference that must be included
in the reference count for the node.
When the node constructor is called,
there is already a reference for this calculated in, so that
when the node is destroyed, it should remember to do a
.Fn NG_NODE_UNREF
on the node.
.Pp
From a hook you can obtain the corresponding node, and from
a node, it is possible to traverse all the active hooks.
.Pp
A current example of how to define a node can always be seen in
.Pa src/sys/netgraph/ng_sample.c
and should be used as a starting point for new node writers.
.El
.Ss Netgraph Message Structure
Control messages have the following structure:
.Bd -literal
#define NG_CMDSTRSIZ    32      /* Max command string (including null) */

struct ng_mesg {
  struct ng_msghdr {
    u_char      version;        /* Must equal NG_VERSION */
    u_char      spare;          /* Pad to 4 bytes */
    uint16_t    spare2;
    uint32_t    arglen;         /* Length of cmd/resp data */
    uint32_t    cmd;            /* Command identifier */
    uint32_t    flags;          /* Message status flags */
    uint32_t    token;          /* Reply should have the same token */
    uint32_t    typecookie;     /* Node type understanding this message */
    u_char      cmdstr[NG_CMDSTRSIZ];  /* cmd string + \e0 */
  } header;
  char  data[];                 /* placeholder for actual data */
};

#define NG_ABI_VERSION  12              /* Netgraph kernel ABI version */
#define NG_VERSION      8               /* Netgraph message version */
#define NGF_ORIG        0x00000000      /* The msg is the original request */
#define NGF_RESP        0x00000001      /* The message is a response */
.Ed
.Pp
Control messages have the fixed header shown above, followed by a
variable length data section which depends on the type cookie
and the command.
Each field is explained below:
.Bl -tag -width indent
.It Va version
Indicates the version of the
.Nm
message protocol itself.
The current version is
.Dv NG_VERSION .
.It Va arglen
This is the length of any extra arguments, which begin at
.Va data .
.It Va flags
Indicates whether this is a command or a response control message.
.It Va token
The
.Va token
is a means by which a sender can match a reply message to the
corresponding command message; the reply always has the same token.
.It Va typecookie
The corresponding node type's unique 32-bit value.
If a node does not recognize the type cookie it must reject the message
by returning
.Er EINVAL .
.Pp
Each type should have an include file that defines the commands,
argument format, and cookie for its own messages.
The typecookie
ensures that the same header file was included by both sender and
receiver; when an incompatible change in the header file is made,
the typecookie
.Em must
be changed.
The de-facto method for generating unique type cookies is to take the
seconds from the Epoch at the time the header file is written
(i.e., the output of
.Dq Nm date Fl u Li +%s ) .
.Pp
There is a predefined typecookie
.Dv NGM_GENERIC_COOKIE
for the
.Vt generic
node type, and
a corresponding set of generic messages which all nodes understand.
The handling of these messages is automatic.
.It Va cmd
The identifier for the message command.
This is type specific,
and is defined in the same header file as the typecookie.
.It Va cmdstr
Room for a short human readable version of
.Va command
(for debugging purposes only).
.El
.Pp
Some modules may choose to implement messages from more than one
of the header files and thus recognize more than one type cookie.
.Ss Control Message ASCII Form
Control messages are in binary format for efficiency.
However, for
debugging and human interface purposes, and if the node type supports
it, control messages may be converted to and from an equivalent
.Tn ASCII
form.
The
.Tn ASCII
form is similar to the binary form, with two exceptions:
.Bl -enum
.It
The
.Va cmdstr
header field must contain the
.Tn ASCII
name of the command, corresponding to the
.Va cmd
header field.
.It
The arguments field contains a
.Dv NUL Ns
-terminated
.Tn ASCII
string version of the message arguments.
.El
.Pp
In general, the arguments field of a control message can be any
arbitrary C data type.
.Nm Netgraph
includes parsing routines to support
some pre-defined datatypes in
.Tn ASCII
with this simple syntax:
.Bl -bullet
.It
Integer types are represented by base 8, 10, or 16 numbers.
.It
Strings are enclosed in double quotes and respect the normal
C language backslash escapes.
.It
IP addresses are written in the usual dotted-decimal form.
.It
Arrays are enclosed in square brackets, with the elements listed
consecutively starting at index zero.
An element may have an optional index and equals sign
.Pq Ql =
preceding it.
Whenever an element
does not have an explicit index, the index is implicitly the previous
element's index plus one.
.It
Structures are enclosed in curly braces, and each field is specified
in the form
.Ar fieldname Ns = Ns Ar value .
.It
Any array element or structure field whose value is equal to its
.Dq default value
may be omitted.
For integer types, the default value
is usually zero; for string types, the empty string.
.It
Array elements and structure fields may be specified in any order.
.El
.Pp
Each node type may define its own arbitrary types by providing
the necessary routines to parse and unparse.
.Tn ASCII
forms defined
for a specific node type are documented in the corresponding man page.
.Ss Generic Control Messages
There are a number of standard predefined messages that will work
for any node, as they are supported directly by the framework itself.
These are defined in
.In netgraph/ng_message.h
along with the basic layout of messages and other similar information.
.Bl -tag -width indent
.It Dv NGM_CONNECT
Connect to another node, using the supplied hook names on either end.
.It Dv NGM_MKPEER
Construct a node of the given type and then connect to it using the
supplied hook names.
.It Dv NGM_SHUTDOWN
The target node should disconnect from all its neighbours and shut down.
Persistent nodes such as those representing physical hardware
might not disappear from the node namespace, but only reset themselves.
The node must disconnect all of its hooks.
This may result in neighbours shutting themselves down, and possibly a
cascading shutdown of the entire connected graph.
.It Dv NGM_NAME
Assign a name to a node.
Nodes can exist without having a name, and this
is the default for nodes created using the
.Dv NGM_MKPEER
method.
Such nodes can only be addressed relatively or by their ID number.
.It Dv NGM_RMHOOK
Ask the node to break a hook connection to one of its neighbours.
Both nodes will have their
.Dq disconnect
method invoked.
Either node may elect to totally shut down as a result.
.It Dv NGM_NODEINFO
Asks the target node to describe itself.
The four returned fields
are the node name (if named), the node type, the node ID and the
number of hooks attached.
The ID is an internal number unique to that node.
.It Dv NGM_LISTHOOKS
This returns the information given by
.Dv NGM_NODEINFO ,
but in addition
includes an array of fields describing each link, and the description for
the node at the far end of that link.
.It Dv NGM_LISTNAMES
This returns an array of node descriptions (as for
.Dv NGM_NODEINFO )
where each entry of the array describes a named node.
All named nodes will be described.
.It Dv NGM_LISTNODES
This is the same as
.Dv NGM_LISTNAMES
except that all nodes are listed regardless of whether they have a name or not.
.It Dv NGM_LISTTYPES
This returns a list of all currently installed
.Nm
types.
.It Dv NGM_TEXT_STATUS
The node may return a text formatted status message.
The status information is determined entirely by the node type.
It is the only
.Dq generic
message
that requires any support within the node itself and as such the node may
elect to not support this message.
The text response must be less than
.Dv NG_TEXTRESPONSE
bytes in length (presently 1024).
This can be used to return general
status information in human readable form.
.It Dv NGM_BINARY2ASCII
This message converts a binary control message to its
.Tn ASCII
form.
The entire control message to be converted is contained within the
arguments field of the
.Dv NGM_BINARY2ASCII
message itself.
If successful, the reply will contain the same control
message in
.Tn ASCII
form.
A node will typically only know how to translate messages that it
itself understands, so the target node of the
.Dv NGM_BINARY2ASCII
message is often the same node that would actually receive that message.
.It Dv NGM_ASCII2BINARY
The opposite of
.Dv NGM_BINARY2ASCII .
The entire control message to be converted, in
.Tn ASCII
form, is contained
in the arguments section of the
.Dv NGM_ASCII2BINARY
message and need only have the
.Va flags , cmdstr ,
and
.Va arglen
header fields filled in, plus the
.Dv NUL Ns
-terminated string version of
the arguments in the arguments field.
If successful, the reply
contains the binary version of the control message.
.El
.Ss Flow Control Messages
In addition to the control messages that affect nodes with respect to the
graph, there are also a number of
.Em flow control
messages defined.
At present these are
.Em not
handled automatically by the system, so
nodes need to handle them if they are going to be used in a graph utilising
flow control, and will be in the likely path of these messages.
The default action of a node that does not understand these messages should
be to pass them on to the next node.
Helper functions to assist with this may be added in the future.
These messages are also defined in
.In netgraph/ng_message.h
and have a separate cookie
.Dv NG_FLOW_COOKIE
to help identify them.
They will not be covered in depth here.
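.Pp
The header layout and the token rule described under
Netgraph Message Structure can be sketched in plain C as follows.
This is a minimal illustration under stated assumptions, not the kernel's
or the library's implementation: the structure is re-declared locally so
the fragment compiles without the netgraph headers, and
example_mkmsg() and example_is_reply_to() are hypothetical helper names
introduced only for this example.
Real code should include netgraph/ng_message.h instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of the layout shown in this page (assumption: same field order). */
#define NG_CMDSTRSIZ	32
#define NG_VERSION	8
#define NGF_ORIG	0x00000000
#define NGF_RESP	0x00000001

struct ng_mesg {
	struct ng_msghdr {
		unsigned char	version;
		unsigned char	spare;
		uint16_t	spare2;
		uint32_t	arglen;
		uint32_t	cmd;
		uint32_t	flags;
		uint32_t	token;
		uint32_t	typecookie;
		unsigned char	cmdstr[NG_CMDSTRSIZ];
	} header;
	char	data[];
};

/*
 * Hypothetical helper: allocate a request with arglen bytes of
 * argument data appended to the fixed header.  The caller frees it.
 */
struct ng_mesg *
example_mkmsg(uint32_t cookie, uint32_t cmd, const char *cmdstr,
    uint32_t token, const void *arg, uint32_t arglen)
{
	struct ng_mesg *m;

	assert(cmdstr != NULL);
	m = calloc(1, sizeof(*m) + arglen);	/* zeroed, so cmdstr stays terminated */
	if (m == NULL)
		return (NULL);
	m->header.version = NG_VERSION;
	m->header.arglen = arglen;
	m->header.cmd = cmd;
	m->header.flags = NGF_ORIG;
	m->header.token = token;
	m->header.typecookie = cookie;
	strncpy((char *)m->header.cmdstr, cmdstr, NG_CMDSTRSIZ - 1);
	if (arglen > 0)
		memcpy(m->data, arg, arglen);
	return (m);
}

/*
 * Hypothetical helper: a message answers req if it is flagged as a
 * response and carries the same token.
 */
int
example_is_reply_to(const struct ng_mesg *resp, const struct ng_mesg *req)
{
	return ((resp->header.flags & NGF_RESP) != 0 &&
	    resp->header.token == req->header.token);
}
```

A caller would pick a fresh token for each request, keep the request (or
just its token) until a message arrives, and hand both to
example_is_reply_to() to decide whether the incoming message is the
awaited reply.
The cookie and command values passed in are whatever the target node
type's header file defines; none are assumed here.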
.Sh INITIALIZATION
The base
.Nm
code may either be statically compiled
into the kernel or else loaded dynamically as a KLD via
.Xr kldload 8 .
In the former case, include
.Pp
.D1 Cd "options NETGRAPH"
.Pp
in your kernel configuration file.
You may also include selected
node types in the kernel compilation, for example:
.Pp
.D1 Cd "options NETGRAPH"
.D1 Cd "options NETGRAPH_SOCKET"
.D1 Cd "options NETGRAPH_ECHO"
.Pp
Once the
.Nm
subsystem is loaded, individual node types may be loaded at any time
as KLD modules via
.Xr kldload 8 .
Moreover,
.Nm
knows how to automatically do this; when a request to create a new
node of unknown type
.Ar type
is made,
.Nm
will attempt to load the KLD module
.Pa ng_ Ns Ao Ar type Ac Ns Pa .ko .
.Pp
Types can also be installed at boot time, as certain device drivers
may want to export each instance of the device as a
.Nm
node.
.Pp
In general, new types can be installed at any time from within the
kernel by calling
.Fn ng_newtype ,
supplying a pointer to the type's
.Vt "struct ng_type"
structure.
.Pp
The
.Fn NETGRAPH_INIT
macro automates this process by using a linker set.
.Sh EXISTING NODE TYPES
Several node types currently exist.
Each is fully documented in its own man page:
.Bl -tag -width indent
.It SOCKET
The socket type implements two new sockets in the new protocol domain
.Dv PF_NETGRAPH .
The new socket protocols are
.Dv NG_DATA
and
.Dv NG_CONTROL ,
both of type
.Dv SOCK_DGRAM .
Typically one of each is associated with a socket node.
When both sockets have closed, the node will shut down.
The
.Dv NG_DATA
socket is used for sending and receiving data, while the
.Dv NG_CONTROL
socket is used for sending and receiving control messages.
Data and control messages are passed using the
.Xr sendto 2
and
.Xr recvfrom 2
system calls, using a
.Vt "struct sockaddr_ng"
socket address.
.It HOLE
Responds only to generic messages and is a
.Dq black hole
for data.
Useful for testing.
Always accepts new hooks.
.It ECHO
Responds only to generic messages and always echoes data back through the
hook from which it arrived.
Returns any non-generic messages as their own response.
Useful for testing.
Always accepts new hooks.
.It TEE
This node is useful for
.Dq snooping .
It has four hooks:
.Va left , right , left2right ,
and
.Va right2left .
Data entering from the
.Va right
is passed to the
.Va left
and duplicated on
.Va right2left ,
and data entering from the
.Va left
is passed to the
.Va right
and duplicated on
.Va left2right .
Data entering from
.Va left2right
is sent to the
.Va right
and data from
.Va right2left
to
.Va left .
.It RFC1490 MUX
Encapsulates/de-encapsulates frames encoded according to RFC 1490.
Has a hook for the encapsulated packets
.Pq Va downstream
and one hook
for each protocol (e.g., IP, PPP).
.It FRAME RELAY MUX
Encapsulates/de-encapsulates Frame Relay frames.
Has a hook for the encapsulated packets
.Pq Va downstream
and one hook
for each DLCI.
.It FRAME RELAY LMI
Automatically handles frame relay
.Dq LMI
(link management interface) operations and packets.
Automatically probes and detects which of several LMI standards
is in use at the exchange.
.It TTY
This node is also a line discipline.
It simply converts between
.Vt mbuf
frames and sequential serial data, allowing a TTY to appear as a
.Nm
node.
It has a programmable
.Dq hotkey
character.
.It ASYNC
This node encapsulates and de-encapsulates asynchronous frames
according to RFC 1662.
This is used in conjunction with the TTY node
type for supporting PPP links over asynchronous serial lines.
.It ETHERNET
This node is attached to every Ethernet interface in the system.
It allows capturing raw Ethernet frames from the network, as well as
sending frames out of the interface.
.It INTERFACE
This node is also a system networking interface.
It has hooks representing each protocol family (IP, IPv6)
and appears in the output of
.Xr ifconfig 8 .
The interfaces are named
.Dq Li ng0 ,
.Dq Li ng1 ,
etc.
.It ONE2MANY
This node implements a simple round-robin multiplexer.
It can be used
for example to make several LAN ports act together to get a higher speed
link between two machines.
.It Various PPP related nodes
There is a full multilink PPP implementation that runs in
.Nm .
The
.Pa net/mpd5
port can use these modules to make a very low latency high
capacity PPP system.
It also supports
.Tn PPTP
VPNs using the PPTP node.
.It PPPOE
A server and client side implementation of PPPoE.
Used in conjunction with
either
.Xr ppp 8
or the
.Pa net/mpd5
port.
.It BRIDGE
This node, together with the Ethernet nodes, allows a very flexible
bridging system to be implemented.
.It KSOCKET
This intriguing node looks like a socket to the system but diverts
all data to and from the
.Nm
system for further processing.
This allows
such things as UDP tunnels to be almost trivially implemented from the
command line.
.El
.Pp
Refer to the
.Sx SEE ALSO
section of this man page for more node types.
.Sh NOTES
Whether a named node exists can be checked by trying to send a control message
to it (e.g.,
.Dv NGM_NODEINFO ) .
If it does not exist,
.Er ENOENT
will be returned.
.Pp
All data messages are
.Vt mbuf chains
with the
.Dv M_PKTHDR
flag set.
.Pp
Nodes are responsible for freeing what they allocate.
There are three exceptions:
.Bl -enum
.It
.Vt Mbufs
sent across a data link are never to be freed by the sender.
In the
case of error, they should be considered freed.
.It
Messages sent using one of the
.Fn NG_SEND_MSG_*
family of macros are freed by the recipient.
As in the case above, the addresses
associated with the message are freed by whatever allocated them so the
recipient should copy them if it wants to keep that information.
.It
Both control messages and data are delivered and queued with a
.Nm
.Em item .
The item must be freed using
.Fn NG_FREE_ITEM item
or passed on to another node.
.El
.Sh FILES
.Bl -tag -width indent
.It In netgraph/netgraph.h
Definitions for use solely within the kernel by
.Nm
nodes.
.It In netgraph/ng_message.h
Definitions needed by any file that needs to deal with
.Nm
messages.
.It In netgraph/ng_socket.h
Definitions needed to use
.Nm
.Vt socket
type nodes.
.It In netgraph/ng_ Ns Ao Ar type Ac Ns Pa .h
Definitions needed to use
.Nm
.Ar type
nodes, including the type cookie definition.
.It Pa /boot/kernel/netgraph.ko
The
.Nm
subsystem loadable KLD module.
.It Pa /boot/kernel/ng_ Ns Ao Ar type Ac Ns Pa .ko
Loadable KLD module for node type
.Ar type .
.It Pa src/sys/netgraph/ng_sample.c
Skeleton
.Nm
node.
Use this as a starting point for new node types.
.El
.Sh USER MODE SUPPORT
There is a library for supporting user-mode programs that wish
to interact with the
.Nm
system.
See
.Xr netgraph 3
for details.
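.Pp
User-mode programs that compose ASCII-form arguments by hand have to
follow the array rules from Control Message ASCII Form: elements start
at index zero, an element may carry an explicit index followed by an
equals sign, an element without one takes the previous index plus one,
and omitted integer elements default to zero.
The fragment below is a simplified sketch of those rules for an array of
integers only.
It is not the parser netgraph itself uses, and
example_parse_int_array() is a hypothetical name chosen for this
illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not netgraph's own parser: fill out[0..n-1]
 * from text such as "[ 1 2 5=0x10 7 ]".
 * Returns 0 on success and -1 on a syntax error or an index >= n.
 */
int
example_parse_int_array(const char *s, long *out, size_t n)
{
	size_t next = 0;	/* index given to the next unlabelled element */
	char *end;
	long v;

	assert(s != NULL && out != NULL);
	memset(out, 0, n * sizeof(*out));	/* omitted elements default to zero */

	while (isspace((unsigned char)*s))
		s++;
	if (*s++ != '[')
		return (-1);

	for (;;) {
		while (isspace((unsigned char)*s))
			s++;
		if (*s == ']')
			return (0);

		v = strtol(s, &end, 0);		/* base 0 accepts base 8, 10 and 16 */
		if (end == s)
			return (-1);		/* not a number, or missing ']' */
		s = end;

		while (isspace((unsigned char)*s))
			s++;
		if (*s == '=') {
			/* The number just read was an explicit index. */
			if (v < 0)
				return (-1);
			next = (size_t)v;
			s++;
			v = strtol(s, &end, 0);
			if (end == s)
				return (-1);
			s = end;
		}

		if (next >= n)
			return (-1);
		out[next++] = v;
	}
}
```

Because an explicit index simply resets the running position, elements
may appear in any order, which matches the last rule in that list.
Strings, structures and node-specific types are deliberately left out of
this sketch.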
.Pp
Two user-mode support programs,
.Xr ngctl 8
and
.Xr nghook 8 ,
are available to assist manual configuration and debugging.
.Pp
There are a few useful techniques for debugging new node types.
First, implementing a new node type in user mode before moving it
into the kernel makes debugging easier.
The
.Vt tee
node type is also useful for debugging, especially in conjunction with
.Xr ngctl 8
and
.Xr nghook 8 .
.Pp
Also look in
.Pa /usr/share/examples/netgraph
for solutions to several
common networking problems, solved using
.Nm .
.Sh SEE ALSO
.Xr socket 2 ,
.Xr netgraph 3 ,
.Xr ng_async 4 ,
.Xr ng_atm 4 ,
.Xr ng_atmllc 4 ,
.Xr ng_bluetooth 4 ,
.Xr ng_bpf 4 ,
.Xr ng_bridge 4 ,
.Xr ng_bt3c 4 ,
.Xr ng_btsocket 4 ,
.Xr ng_car 4 ,
.Xr ng_cisco 4 ,
.Xr ng_device 4 ,
.Xr ng_echo 4 ,
.Xr ng_eiface 4 ,
.Xr ng_etf 4 ,
.Xr ng_ether 4 ,
.Xr ng_frame_relay 4 ,
.Xr ng_gif 4 ,
.Xr ng_gif_demux 4 ,
.Xr ng_h4 4 ,
.Xr ng_hci 4 ,
.Xr ng_hole 4 ,
.Xr ng_hub 4 ,
.Xr ng_iface 4 ,
.Xr ng_ip_input 4 ,
.Xr ng_ipfw 4 ,
.Xr ng_ksocket 4 ,
.Xr ng_l2cap 4 ,
.Xr ng_l2tp 4 ,
.Xr ng_lmi 4 ,
.Xr ng_mppc 4 ,
.Xr ng_nat 4 ,
.Xr ng_netflow 4 ,
.Xr ng_one2many 4 ,
.Xr ng_patch 4 ,
.Xr ng_ppp 4 ,
.Xr ng_pppoe 4 ,
.Xr ng_pptpgre 4 ,
.Xr ng_rfc1490 4 ,
.Xr ng_socket 4 ,
.Xr ng_split 4 ,
.Xr ng_sppp 4 ,
.Xr ng_sscfu 4 ,
.Xr ng_sscop 4 ,
.Xr ng_tee 4 ,
.Xr ng_tty 4 ,
.Xr ng_ubt 4 ,
.Xr ng_UI 4 ,
.Xr ng_uni 4 ,
.Xr ng_vjc 4 ,
.Xr ng_vlan 4 ,
.Xr ngctl 8 ,
.Xr nghook 8
.Sh HISTORY
The
.Nm
system was designed and first implemented at Whistle Communications, Inc.\&
in a version of
.Fx 2.2
customized for the Whistle InterJet.
It made its debut in the main tree in
.Fx 3.4 .
.Sh AUTHORS
.An -nosplit
.An Julian Elischer Aq julian@FreeBSD.org ,
with contributions by
.An Archie Cobbs Aq archie@FreeBSD.org .