1.\"
2.\" This file and its contents are supplied under the terms of the
3.\" Common Development and Distribution License ("CDDL"), version 1.0.
4.\" You may only use this file in accordance with the terms of version
5.\" 1.0 of the CDDL.
6.\"
7.\" A full copy of the text of the CDDL should have accompanied this
8.\" source.  A copy of the CDDL is also available via the Internet at
9.\" http://www.illumos.org/license/CDDL.
10.\"
11.\"
12.\" Copyright 2016 Joyent, Inc.
13.\"
14.Dd March 26, 2017
15.Dt MAC 9E
16.Os
17.Sh NAME
18.Nm mac ,
19.Nm GLDv3
20.Nd MAC networking device driver overview
21.Sh SYNOPSIS
22.In sys/mac_provider.h
23.In sys/mac_ether.h
24.Sh INTERFACE LEVEL
25illumos DDI specific
26.Sh DESCRIPTION
27The
28.Sy MAC
29framework provides a means for implementing high-performance networking
30device drivers.
31It is the successor to the GLD interfaces and is sometimes referred to as the
32GLDv3.
The remainder of this manual introduces the aspects of writing device drivers
that leverage the MAC framework.
While both the GLDv3 and the MAC framework refer to the same thing, this manual
page uses the term
37.Em MAC framework
38to refer to the device driver interface.
39.Pp
40MAC device drivers are character devices.
41They define the standard
42.Xr _init 9E ,
43.Xr _fini 9E ,
44and
45.Xr _info 9E
46entry points to initialize the module, as well as
47.Xr dev_ops 9S
48and
49.Xr cb_ops 9S
50structures.
51.Pp
52The main interface with MAC is through a series of callbacks defined in
53a
54.Xr mac_callbacks 9S
55structure.
56These callbacks control all the aspects of the device.
They range from sending data and getting and setting properties to controlling
MAC address filters and managing promiscuous mode.
59.Pp
60The MAC framework takes care of many aspects of the device driver's
61management.
62A device that uses the MAC framework does not have to worry about creating
63device nodes or implementing
64.Xr open 9E
65or
66.Xr close 9E
67routines.
68In addition, all of the work to interact with
69.Xr dlpi 7P
70is taken care of automatically and transparently.
71.Ss Initializing MAC Support
72For a device to be used in the framework, it must register with the
73framework and take specific actions during
74.Xr _init 9E ,
75.Xr attach 9E ,
76.Xr detach 9E ,
77and
78.Xr _fini 9E .
79.Pp
80All device drivers have to define a
81.Xr dev_ops 9S
82structure which is pointed to by a
83.Xr modldrv 9S
84structure and the corresponding NULL-terminated
85.Xr modlinkage 9S
86structure.
87The
88.Xr dev_ops 9S
89structure should have a
90.Xr cb_ops 9S
91structure defined for it; however, it does not need to implement any of
92the standard
93.Xr cb_ops 9S
94entry points.
95.Pp
96Normally, in a driver's
97.Xr _init 9E
98entry point, it passes its
99.Sy modlinkage
100structure directly to
101.Xr mod_install 9F .
102To properly register with MAC, the driver must call
103.Xr mac_init_ops 9F
104before it calls
105.Xr mod_install 9F .
106If for some reason the
107.Xr mod_install 9F
function fails, then the driver must undo that registration by calling
109.Xr mac_fini_ops 9F .
110.Pp
111Conversely, in the driver's
112.Xr _fini 9E
113routine, it should call
114.Xr mac_fini_ops 9F
115after it successfully calls
116.Xr mod_remove 9F .
117For an example of how to use the
118.Xr mac_init_ops 9F
119and
120.Xr mac_fini_ops 9F
121functions, see the examples section in
122.Xr mac_init_ops 9F .
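.Pp
The ordering described above can be modelled in ordinary user-level C.
The following is a minimal sketch and not driver code: every
.Sy model_
name is a hypothetical stand-in for the kernel routine it is named after, and
the stand-ins do nothing except record the order in which they were called.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <string.h>

/* Call log: one letter is appended per stand-in that runs. */
static char model_log[16];
/* Values that the fake mod_install()/mod_remove() will return. */
static int model_install_rv;
static int model_remove_rv;

static void
model_note(const char *step)
{
	assert(strlen(model_log) + strlen(step) < sizeof (model_log));
	strcat(model_log, step);
}

/* Hypothetical stand-ins for the real kernel routines. */
static void model_mac_init_ops(void) { model_note("I"); }
static void model_mac_fini_ops(void) { model_note("F"); }
static int model_mod_install(void) { model_note("m"); return (model_install_rv); }
static int model_mod_remove(void) { model_note("r"); return (model_remove_rv); }

/* Shape of _init(): MAC setup first, undone if installation fails. */
static int
model_init(void)
{
	int ret;

	model_mac_init_ops();
	if ((ret = model_mod_install()) != 0)
		model_mac_fini_ops();
	return (ret);
}

/* Shape of _fini(): MAC teardown only after a successful removal. */
static int
model_fini(void)
{
	int ret;

	if ((ret = model_mod_remove()) == 0)
		model_mac_fini_ops();
	return (ret);
}
```
.Ed
.Pp
In this model a failed installation leaves the log reading
.Sy ImF ,
while a failed removal leaves it reading only
.Sy r ,
because the MAC state must stay intact while the module remains loaded.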
123.Ss Registering with MAC
124Every instance of a device should register separately with MAC.
125To register with MAC, a driver must allocate a
126.Xr mac_register 9S
127structure, fill it in, and then call
128.Xr mac_register 9F .
129The
130.Sy mac_register_t
131structure contains information about the device and all of the required
132function pointers that will be used as callbacks by the framework.
133.Pp
134These steps should all be taken during a device's
135.Xr attach 9E
136entry point.
137It is recommended that the driver perform this sequence of steps after the
138device has finished its initialization of the chipset and interrupts, though
139interrupts should not be enabled at that point.
140After it calls
141.Xr mac_register 9F
142it will start receiving callbacks from the MAC framework.
143.Pp
144To allocate the registration structure, the driver should call
145.Xr mac_alloc 9F .
146Device drivers should generally always pass the symbol
147.Sy MAC_VERSION
148as the argument to
149.Xr mac_alloc 9F .
150Upon successful completion, the driver will receive a
151.Sy mac_register_t
152structure which it should fill in.
153The structure and its members are documented in
154.Xr mac_register 9S .
155.Pp
156The
157.Xr mac_callbacks 9S
158structure is not allocated as a part of the
159.Xr mac_register 9S
160structure.
161In general, device drivers declare this statically.
162See the
163.Sx MAC Callbacks
164section for more information on how to fill it out.
165.Pp
166Once the structure has been filled in, the driver should call
167.Xr mac_register 9F
168to register itself with MAC.
The handle that
.Xr mac_register 9F
returns should be stored in the driver's soft state.
171It will be used in various other support functions and callbacks.
172.Pp
173If the call is successful, then the device driver
174should enable interrupts and finish any other initialization required.
175If the call to
176.Xr mac_register 9F
177failed, then it should unwind its initialization and should return
178.Sy DDI_FAILURE
179from its
180.Xr attach 9E
181routine.
182.Ss MAC Callbacks
183The MAC framework interacts with a device driver through a series of
184callbacks.
185These callbacks are described in their individual manual pages and the
186collection of callbacks is indicated in the
187.Xr mac_callbacks 9S
188manual page.
189This section does not focus on the specific functions, but rather on
190interactions between them and the rest of the device driver framework.
191.Pp
192A device driver should make no assumptions about when the various
193callbacks will be called and whether or not they will be called
194simultaneously.
195For example, a device driver may be asked to transmit data through a call to its
196.Xr mc_tx 9E
197entry point while it is being asked to get a device property through a
198call to its
199.Xr mc_getprop 9E
200entry point.
201As such, while some calls may be serialized to the device, such as setting
202properties, the device driver should always presume that all of its data needs
203to be protected with locks.
While the driver is holding locks, it is safe for it to call the following MAC
205routines:
206.Bl -bullet -offset indent -compact
207.It
208.Xr mac_hcksum_get 9F
209.It
210.Xr mac_hcksum_set 9F
211.It
212.Xr mac_lso_get 9F
213.It
214.Xr mac_maxsdu_update 9F
215.It
216.Xr mac_prop_info_set_default_link_flowctrl 9F
217.It
218.Xr mac_prop_info_set_default_str 9F
219.It
220.Xr mac_prop_info_set_default_uint8 9F
221.It
222.Xr mac_prop_info_set_default_uint32 9F
223.It
224.Xr mac_prop_info_set_default_uint64 9F
225.It
226.Xr mac_prop_info_set_perm 9F
227.It
228.Xr mac_prop_info_set_range_uint32 9F
229.El
230.Pp
231Any other MAC related routines should not be called with locks held,
232such as
233.Xr mac_link_update 9F
234or
235.Xr mac_rx 9F .
236Other routines in the DDI may be called while locks are held; however,
237device driver writers should be careful about calling blocking routines
238while locks are held or in interrupt context, though it is generally
239legal to do so.
240.Ss Receiving Data
241A device driver will often receive data through the means of an
242interrupt.
243When that interrupt occurs, the device driver will receive one or more frames
244with optional metadata.
245Often each frame has a corresponding descriptor which has information about
246whether or not there were errors or whether or not the device successfully
247checksummed the packet.
248.Pp
249During a single interrupt, a device driver should process a fixed number
250of frames.
251For each frame the device driver should:
252.Bl -enum -offset indent
253.It
254First check whether or not the frame has errors.
255If errors were detected, then the frame should not be sent to the operating
256system.
257It is recommended that devices keep kstats (see
258.Xr kstat_create 9F
259for more information) and bump the counter whenever such an error is
260detected.
261If the device distinguishes between the types of errors, then separate kstats
262for each class of error are recommended.
263See the
264.Sx STATISTICS
265section for more information on the various error cases that should be
266considered.
267.It
268Once the frame has been determined to be valid, the device driver should
269transform the frame into a
270.Xr mblk 9S .
271See the section
272.Sx MBLKS AND DMA
273for more information on how to transform and prepare a message block.
274.It
275If the device supports hardware checksumming (see the
276.Sx CAPABILITIES
277section for more information on checksumming), then the device driver
278should set the corresponding checksumming information with a call to
279.Xr mac_hcksum_set 9F .
280.It
281It should then append this new message block to the
282.Em end
283of the message block chain, linking it to the
284.Sy b_next
285pointer.
286It is vitally important that all the frames be chained in the order that they
287were received.
288If the device driver mistakenly reorders frames, then it may cause performance
289impacts in the TCP stack and potentially impact application correctness.
290.El
291.Pp
292Once all the frames have been processed and assembled, the device driver
293should deliver them to the rest of the operating system by calling
294.Xr mac_rx 9F .
The device driver should try to give as many mblk_t structures to the
system at once as possible.
297It
298.Em should not
299call
300.Xr mac_rx 9F
301once for every assembled mblk_t.
302.Pp
303The device driver must not hold any locks across the call to
304.Xr mac_rx 9F .
305When this function is called, received data will be pushed through the
306networking stack and some replies may be generated and given to the
307driver to send out.
308.Pp
309It is not the device driver's responsibility to determine whether or not
310the system can keep up with a driver's delivery rate of frames.
311The rest of the networking stack will handle issues related to keeping up
312appropriately and ensure that kernel memory is not exhausted by packets
313that are not being processed.
314.Pp
315Finally, the device driver should make sure that any other housekeeping
316activities required for the ring are taken care of such that more data
317can be received.
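.Pp
The receive steps above can be sketched as a small user-level model.
This is an illustration under stated assumptions, not driver code:
.Sy model_mblk_t
is a simplified stand-in for an mblk_t that keeps only the
.Sy b_next
link,
.Sy model_desc_t
is a hypothetical receive descriptor, and
.Sy model_deliver
stands in for the single call to
.Xr mac_rx 9F .
Checksum metadata is omitted.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for mblk_t; only the b_next link is modelled. */
typedef struct model_mblk {
	struct model_mblk *b_next;
	int seq;			/* arrival order, for inspection */
} model_mblk_t;

/* Hypothetical descriptor; has_error mirrors a hardware error bit. */
typedef struct model_desc {
	int has_error;
	model_mblk_t *frame;
} model_desc_t;

static unsigned model_rx_errors;	/* a driver would use a kstat */
static unsigned model_deliveries;	/* count of mac_rx()-like calls */
static model_mblk_t *model_delivered;	/* last chain handed upstream */

/* Stand-in for mac_rx(): remembers the chain it was given. */
static void
model_deliver(model_mblk_t *chain)
{
	model_deliveries++;
	model_delivered = chain;
}

/*
 * Process at most limit descriptors, append each good frame to the
 * tail of the chain so arrival order is kept, then deliver once.
 */
static void
model_rx_intr(model_desc_t *ring, size_t ndesc, size_t limit)
{
	model_mblk_t *head = NULL;
	model_mblk_t **tailp = &head;
	size_t i;

	for (i = 0; i < ndesc && i < limit; i++) {
		model_mblk_t *mp = ring[i].frame;

		if (ring[i].has_error) {
			model_rx_errors++;	/* count it, do not deliver */
			continue;
		}
		assert(mp != NULL);
		mp->b_next = NULL;
		*tailp = mp;
		tailp = &mp->b_next;
	}

	/* A real driver would have dropped its locks before this point. */
	if (head != NULL)
		model_deliver(head);
}
```
.Ed
.Pp
Keeping a pointer to the tail link makes each append constant time and
preserves arrival order without walking the chain.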
318.Ss Transmitting Data and Back Pressure
319A device driver will be asked to transmit a message block chain by
having its
321.Xr mc_tx 9E
322entry point called.
323While the driver is processing the message blocks, it may run out of resources.
324For example, a transmit descriptor ring may become full.
325At that point, the device driver should return the remaining unprocessed frames.
326The act of returning frames indicates that the device has asserted flow control.
327Once this has been done, no additional calls will be made to the
328driver's transmit entry point and the back pressure will be propagated
329throughout the rest of the networking stack.
330.Pp
331At some point in the future when resources have become available again,
332for example after an interrupt indicating that some portion of the
333transmit ring has been sent, then the device driver must notify the
334system that it can continue transmission.
335To do this, the driver should call
336.Xr mac_tx_update 9F .
337After that point, the driver will receive calls to its
338.Xr mc_tx 9E
339entry point again.
340As mentioned in the section on callbacks, the device driver should avoid holding
341any particular locks across the call to
342.Xr mac_tx_update 9F .
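.Pp
The back pressure protocol above can be sketched as a user-level model.
The types and names here are hypothetical:
.Sy model_txring_t
represents a descriptor ring by a simple count of free slots,
.Sy model_tx
has the shape of an
.Xr mc_tx 9E
entry point, and the
.Sy updates
counter stands in for calls to
.Xr mac_tx_update 9F .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for mblk_t; only the b_next link is modelled. */
typedef struct model_mblk {
	struct model_mblk *b_next;
} model_mblk_t;

typedef struct model_txring {
	size_t free_desc;	/* descriptors currently available */
	int blocked;		/* set once frames were handed back */
	unsigned updates;	/* count of mac_tx_update()-like calls */
} model_txring_t;

/*
 * Consume frames while descriptors remain.  Returns the unprocessed
 * remainder of the chain, or NULL if every frame was accepted.
 */
static model_mblk_t *
model_tx(model_txring_t *ring, model_mblk_t *chain)
{
	while (chain != NULL) {
		model_mblk_t *next;

		if (ring->free_desc == 0) {
			ring->blocked = 1;	/* flow control asserted */
			return (chain);
		}
		next = chain->b_next;
		chain->b_next = NULL;
		ring->free_desc--;	/* frame is now owned by hardware */
		chain = next;
	}
	return (NULL);
}

/* Called when the hardware reports that nfreed descriptors completed. */
static void
model_tx_reclaim(model_txring_t *ring, size_t nfreed)
{
	assert(nfreed > 0);
	ring->free_desc += nfreed;
	if (ring->blocked) {
		ring->blocked = 0;
		ring->updates++;	/* notify only if we had pushed back */
	}
}
```
.Ed
.Pp
Notifying the framework only when frames were previously handed back is a
design choice of this sketch that avoids needless notifications.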
343.Ss Interrupt Coalescing
344For devices operating at higher data rates, interrupt coalescing is an
345important part of a well functioning device and may impact the
346performance of the device.
347Not all devices support interrupt coalescing.
348If interrupt coalescing is supported on the device, it is recommended that
349device driver writers provide private properties for their device to control the
350interrupt coalescing rate.
351This will make it much easier to perform experiments and observe the impact of
352different interrupt rates on the rest of the system.
353.Ss MAC Address Filter Management
354The MAC framework will attempt to use as many MAC address filters as a
355device has.
356To program a multicast address filter, the driver's
357.Xr mc_multicst 9E
358entry point will be called.
359If the device driver runs out of filters, it should not take any special action
360and just return the appropriate error as documented in the corresponding manual
361pages for the entry points.
362The framework will ensure that the device is placed in promiscuous mode
363if it needs to.
364.Ss Link Updates
365It is the responsibility of the device driver to keep track of the
366data link's state.
367Many devices provide a means of receiving an interrupt when the state of the
368link changes.
369When such a change happens, the driver should update its internal data
370structures and then call
371.Xr mac_link_update 9F
372to inform the MAC layer that this has occurred.
373If the device driver does not properly inform the system about link changes,
374then various features like link aggregations and other mechanisms that leverage
375the link state will not work correctly.
376.Ss Link Speed and Auto-negotiation
377Many networking devices support more than one possible speed that they
378can operate at.
379The selection of a speed is often performed through
380.Em auto-negotiation ,
381though some devices allow the user to control what speeds are advertised
382and used.
383.Pp
384Logically, there are two different sets of things that the device driver
385needs to keep track of while it's operating:
386.Bl -enum
387.It
388The supported speeds in hardware.
389.It
390The enabled speeds from the user.
391.El
392.Pp
393By default, when a link first comes up, the device driver should
394generally configure the link to support the common set of speeds and
395perform auto-negotiation.
396.Pp
397A user can control what speeds a device advertises via auto-negotiation
398and whether or not it performs auto-negotiation at all by using a series
399of properties that have
400.Sy _EN_
401in the name.
402These are read/write properties and there is one for each speed supported in the
403operating system.
404For a full list of them, see the
405.Sx PROPERTIES
406section.
407.Pp
408In addition to these properties, there is a corresponding set of
409properties with
410.Sy _ADV_
411in the name.
412These are similar to the
413.Sy _EN_
414family of properties, but they are read-only and indicate what the
415device has actually negotiated.
416While they are generally similar to the
417.Sy _EN_
418family of properties, they may change depending on power settings.
419See the
420.Sy Ethernet Link Properties
421section in
422.Xr dladm 1M
423for more information.
424.Pp
425It's worth discussing how these different values get used throughout the
426different entry points.
427The first entry point to consider is the
428.Xr mc_propinfo 9E
429entry point.
430For a given speed, the driver should consult whether or not the hardware
431supports this speed.
If it does, it should fill in the default value that the hardware takes and
indicate whether or not the property is writable.
435This holds for both the
436.Sy _EN_
437and
438.Sy _ADV_
439family of properties.
440.Pp
441The next entry point is
442.Xr mc_getprop 9E .
443Here, the device should first consult whether the given speed is
444supported.
445If it is not, then the driver should return
446.Er ENOTSUP .
If it is, then it should return the current value of the property.
448.Pp
The last property-related entry point is
.Xr mc_setprop 9E .
452Here, the same logic applies.
453Before the driver considers whether or not the property is writable, it should
454first check whether or not it's a supported property.
455If it's not, then it should return
456.Er ENOTSUP .
Otherwise, it should proceed to check whether the property is writable.
If it is and the new value is valid, then it should update the property and
restart the link's negotiation.
460.Pp
461Finally, there is the
462.Xr mc_getstat 9E
463entry point.
464Several of the statistics that are queried relate to auto-negotiation and
465hardware capabilities.
466When a statistic relates to the hardware supporting a given speed, the
467.Sy _EN_
468properties should be ignored.
469The only thing that should be consulted is what the hardware itself supports.
470Otherwise, the statistics should look at what is currently being advertised by
471the device.
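.Pp
The get and set logic for a single
.Sy _EN_
property can be sketched as a user-level model.
The
.Sy model_speed_t
structure and both functions are hypothetical.
Returning
.Er ENOTSUP
for a property that is supported but not writable, and
.Er EINVAL
for a value other than 0 or 1, are assumptions of this sketch; consult
.Xr mc_setprop 9E
for the errors a real driver should return.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-speed state kept by a driver. */
typedef struct model_speed {
	int hw_supported;	/* hardware can run at this speed */
	int writable;		/* advertisement can be toggled */
	uint8_t en_value;	/* current _EN_ setting, 0 or 1 */
	unsigned renegotiations;	/* link restarts triggered so far */
} model_speed_t;

/* Get: unsupported speeds fail, otherwise report the current value. */
static int
model_getprop_en(const model_speed_t *sp, uint8_t *valp)
{
	assert(valp != NULL);
	if (!sp->hw_supported)
		return (ENOTSUP);
	*valp = sp->en_value;
	return (0);
}

/* Set: check support first, then writability, then the value itself. */
static int
model_setprop_en(model_speed_t *sp, uint8_t val)
{
	if (!sp->hw_supported)
		return (ENOTSUP);
	if (!sp->writable)
		return (ENOTSUP);	/* assumed error for read-only */
	if (val > 1)
		return (EINVAL);	/* assumed error for a bad value */
	sp->en_value = val;
	sp->renegotiations++;	/* stands in for restarting the link */
	return (0);
}
```
.Ed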
472.Ss Unregistering from MAC
473During a driver's
474.Xr detach 9E
475routine, it should unregister the device instance from MAC by calling
476.Xr mac_unregister 9F
with the handle that it received when it registered.
478If the call to
479.Xr mac_unregister 9F
480failed, then the device is likely still in use and the driver should
481fail the call to
482.Xr detach 9E .
483.Ss Interacting with Devices
484Administrators always interact with devices through the
485.Xr dladm 1M
486command line interface.
487The state of devices such as whether the link is considered
488.Sy up
489or
490.Sy down ,
491various link properties such as the
492.Sy MTU ,
493.Sy auto-negotiation
494state,
495and
496.Sy flow control
497state,
498are all exposed.
499It is also the preferred way that these properties are set and configured.
500.Pp
501While device tunables may be presented in a
502.Xr driver.conf 4
503file, it is recommended instead to expose such things through
504.Xr dladm 1M
505private properties, whether explicitly documented or not.
506.Sh CAPABILITIES
Capabilities in the MAC framework are optional hardware features that a
device driver may advertise.
The two current capabilities that the system supports are large send offload
(LSO), often also known as TCP segmentation offload, and the ability for
hardware to calculate and verify the checksums present in IPv4, IPv6, and
protocol headers such as TCP and UDP.
514.Pp
515The MAC framework will query a device for support of a capability
516through the
517.Xr mc_getcapab 9E
518function.
519Each capability has its own constant and may have corresponding data that goes
520along with it and a specific structure that the device is required to fill in.
521Note, the set of capabilities changes over time and there are also private
522capabilities in the system.
523Several of the capabilities are used in the implementation of the MAC framework.
524Others, like
525.Sy MAC_CAPAB_RINGS ,
represent features that have not been stabilized, and thus neither API nor
binary compatibility is guaranteed for them.
528It is important that the device driver handles unknown capabilities correctly.
529For more information, see
530.Xr mc_getcapab 9E .
531.Pp
532The following capabilities are
533stable and defined in the system:
534.Ss MAC_CAPAB_HCKSUM
535The
536.Sy MAC_CAPAB_HCKSUM
537capability indicates to the system that the device driver supports some
538amount of checksumming.
539The specific data for this capability is a pointer to a
540.Sy uint32_t .
541To indicate no support for any kind of checksumming, the driver should
542either set this value to zero or simply return that it doesn't support
543the capability.
544.Pp
545Note, the values that the driver declares in this capability indicate
546what it can do when it transmits data.
547If the driver can only verify checksums when receiving data, then it should not
548indicate that it supports this capability.
549The following set of flags may be combined through a bitwise inclusive OR:
550.Bl -tag -width Ds
551.It Sy HCKSUM_INET_PARTIAL
552This indicates that the hardware can calculate a partial checksum for
553both IPv4 and IPv6; however, it requires the pseudo-header checksum be
554calculated for it.
555The pseudo-header checksum will be available for the mblk_t when calling
556.Xr mac_hcksum_get 9F .
557Note this does not imply that the hardware is capable of calculating the
558IPv4 header checksum.
559That should be indicated with the
.Sy HCKSUM_IPHDRCKSUM
flag.
561.It Sy HCKSUM_INET_FULL_V4
562This indicates that the hardware will fully calculate the L4 checksum
563for outgoing IPv4 packets and does not require a pseudo-header checksum.
564Note this does not imply that the hardware is capable of calculating the
565IPv4 header checksum.
566That should be indicated with the
.Sy HCKSUM_IPHDRCKSUM
flag.
568.It Sy HCKSUM_INET_FULL_V6
569This indicates that the hardware will fully calculate the L4 checksum
570for outgoing IPv6 packets and does not require a pseudo-header checksum.
571.It Sy HCKSUM_IPHDRCKSUM
572This indicates that the hardware supports calculating the checksum for
573the IPv4 header itself.
574.El
575.Pp
576When in a driver's transmit function, the driver will be processing a
577single frame.
578It should call
579.Xr mac_hcksum_get 9F
580to see what checksum flags are set on it.
581Note that the flags that are set on it are different from the ones described
582above and are documented in its manual page.
583These flags indicate how the driver is expected to program the hardware and what
584checksumming is required.
585Not all frames will require hardware checksumming or will ask the hardware to
586checksum it.
587.Pp
588If a driver supports offloading the receive checksum and verification,
589it should check to see what the hardware indicated was verified.
590The driver should then call
591.Xr mac_hcksum_set 9F .
592The flags used are different from the ones above and are discussed in
593detail in the
594.Xr mac_hcksum_set 9F
595manual page.
596If there is no checksum information available or the driver does not support
597checksumming, then it should simply not call
598.Xr mac_hcksum_set 9F .
599.Pp
600Note that the checksum flags should be set on the first
601mblk_t that makes up a given message.
602In other words, if multiple mblk_t structures are linked together by the
603.Sy b_cont
member to describe a single frame, then
.Xr mac_hcksum_set 9F
should only be called on the first mblk_t of that set.
606However, each distinct message should have the checksum bits set on it, if
607applicable.
608In other words, each mblk_t that is linked together by the
609.Sy b_next
610pointer may have checksum flags set.
611.Pp
612It is recommended that device drivers provide a private property or
613.Xr driver.conf 4
614property to control whether or not checksumming is enabled for both rx
615and tx; however, the default disposition is recommended to be enabled
616for both.
617This way if hardware bugs are found in the checksumming implementation, they can
618be disabled without requiring software updates.
619The transmit property should be checked when determining how to reply to
620.Xr mc_getcapab 9E
621and the receive property should be checked in the context of the receive
622function.
623.Ss MAC_CAPAB_LSO
624The
625.Sy MAC_CAPAB_LSO
626capability indicates that the driver supports various forms of large
627send offload (LSO).
628The private data is a pointer to a
629.Sy mac_capab_lso_t
630structure.
631At the moment, LSO support is limited to TCP inside of IPv4.
632This structure has the following members which are used to indicate
633various types of LSO support.
634.Bd -literal -offset indent
t_uscalar_t		lso_flags;
lso_basic_tcp_ipv4_t	lso_basic_tcp_ipv4;
637.Ed
638.Pp
639The
640.Sy lso_flags
641member is used to indicate which members are valid and should be
642considered.
643Each flag represents a different form of LSO.
644The member should be set to the bitwise inclusive OR of the following values:
645.Bl -tag -width Dv -offset indent
646.It Sy LSO_TX_BASIC_TCP_IPV4
647This indicates hardware support for performing TCP segmentation
648offloading over IPv4.
649When this flag is set, the
650.Sy lso_basic_tcp_ipv4
651member must be filled in.
652.El
653.Pp
654The
655.Sy lso_basic_tcp_ipv4
656member is a structure with the following members:
657.Bd -literal -offset indent
t_uscalar_t	lso_max;
659.Ed
660.Bd -filled -offset indent
661The
662.Sy lso_max
663member should be set to the maximum size of the TCP data
664payload that can be offloaded to the hardware.
665.Ed
666.Pp
As with checksumming, it is recommended that driver writers provide a
means for disabling the support of LSO even if it is enabled by default.
This allows problems with LSO to be worked around without requiring a
driver update.
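.Pp
The capability handling described in this section can be sketched as a
user-level model.
Everything prefixed with
.Sy model_
or
.Sy MODEL_
is hypothetical, and the numeric values are placeholders rather than the
values of the real constants.
The sketch shows three points from the text: the reply depends on the
driver's enable tunables, each capability fills in its own data type, and an
unknown capability is reported as unsupported.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder identifiers; real values come from the system headers. */
enum model_capab { MODEL_CAPAB_HCKSUM = 1, MODEL_CAPAB_LSO = 2 };
#define	MODEL_HCKSUM_INET_FULL_V4	0x1u
#define	MODEL_HCKSUM_IPHDRCKSUM		0x2u
#define	MODEL_LSO_TX_BASIC_TCP_IPV4	0x1u

/* Mirrors the layout of the LSO capability data described above. */
typedef struct model_lso {
	uint32_t lso_flags;
	struct {
		uint32_t lso_max;
	} lso_basic_tcp_ipv4;
} model_lso_t;

/* Hypothetical soft state holding the tunables and hardware limit. */
typedef struct model_softc {
	int tx_cksum_enabled;
	int lso_enabled;
	uint32_t hw_lso_max;
} model_softc_t;

/* Returns 1 and fills cap_data if supported; returns 0 otherwise. */
static int
model_getcapab(const model_softc_t *sc, int cap, void *cap_data)
{
	assert(cap_data != NULL);

	switch (cap) {
	case MODEL_CAPAB_HCKSUM: {
		uint32_t *flags = cap_data;

		if (!sc->tx_cksum_enabled)
			return (0);
		*flags = MODEL_HCKSUM_INET_FULL_V4 |
		    MODEL_HCKSUM_IPHDRCKSUM;
		return (1);
	}
	case MODEL_CAPAB_LSO: {
		model_lso_t *lso = cap_data;

		if (!sc->lso_enabled)
			return (0);
		lso->lso_flags = MODEL_LSO_TX_BASIC_TCP_IPV4;
		lso->lso_basic_tcp_ipv4.lso_max = sc->hw_lso_max;
		return (1);
	}
	default:
		/* Unknown or private capability: not supported. */
		return (0);
	}
}
```
.Ed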
671.Sh PROPERTIES
672Properties in the MAC framework represent aspects of a link.
673These include things like the link's current state and MTU.
674Many of the properties in the system are focused around auto-negotiation and
675controlling what link speeds are advertised.
676Information about properties is covered by three different device entry points.
677The
678.Xr mc_propinfo 9E
679entry point obtains metadata about the property.
680The
681.Xr mc_getprop 9E
682entry point obtains the property.
683The
684.Xr mc_setprop 9E
685entry point updates the property to a new value.
686.Pp
687Many of the properties listed below are read-only.
688Each property indicates whether it's read-only or it's read/write.
689However, driver writers may not implement the ability to set all writable
690properties.
691Many of these depend on the card itself.
692In particular, all properties that relate to auto-negotiation and are read/write
693may not be updated if the hardware in question does not support toggling what
694link speeds are auto-negotiated.
695While copper Ethernet often does not have this restriction, it often exists with
696various fiber standards and phys.
697.Pp
698The following properties are the subset of MAC framework properties that
699driver writers should be aware of and handle.
700While other properties exist in the system, driver writers should always return
701an error when a property not listed below is encountered.
702See
703.Xr mc_getprop 9E
704and
705.Xr mc_setprop 9E
706for more information on how to handle them.
707.Bl -hang -width Ds
708.It Sy MAC_PROP_DUPLEX
709.Bd -filled -compact
710Type:
711.Sy link_duplex_t |
712Permissions:
713.Sy Read-Only
714.Ed
715.Pp
716The
717.Sy MAC_PROP_DUPLEX
property is used to indicate the duplex mode of the link.
A full-duplex link may have traffic flowing in both directions at the same time.
720The
721.Sy link_duplex_t
722is an enumeration which may be set to any of the following values:
723.Bl -tag -width Ds
724.It Sy LINK_DUPLEX_UNKNOWN
725The current state of the link is unknown.
726This may be because the link has not negotiated to a specific speed or it is
727down.
728.It Sy LINK_DUPLEX_HALF
729The link is running at half duplex.
730Communication may travel in only one direction on the link at a given time.
731.It Sy LINK_DUPLEX_FULL
732The link is running at full duplex.
733Communication may travel in both directions on the link simultaneously.
734.El
735.It Sy MAC_PROP_SPEED
736.Bd -filled -compact
737Type:
738.Sy uint64_t |
739Permissions:
740.Sy Read-Only
741.Ed
742.Pp
743The
744.Sy MAC_PROP_SPEED
745property stores the current link speed in bits per second.
A link that is running at 100 Mbit/s would store the value 100000000ULL.
747A link that is running at 40 Gbit/s would store the value 40000000000ULL.
748.It Sy MAC_PROP_STATUS
749.Bd -filled -compact
750Type:
751.Sy link_state_t |
752Permissions:
753.Sy Read-Only
754.Ed
755.Pp
756The
757.Sy MAC_PROP_STATUS
758property is used to indicate the current state of the link.
759It indicates whether the link is up or down.
760The
761.Sy link_state_t
762is an enumeration which may be set to any of the following values:
763.Bl -tag -width Ds
764.It Sy LINK_STATE_UNKNOWN
765The current state of the link is unknown.
766This may be because the driver's
767.Xr mc_start 9E
entry point has not been called, so it has not attempted to start the link.
769.It Sy LINK_STATE_DOWN
770The link is down.
771This may be because of a negotiation problem, a cable problem, or some other
772device specific issue.
773.It Sy LINK_STATE_UP
774The link is up.
775If auto-negotiation is in use, it should have completed.
776Traffic should be able to flow over the link, barring other issues.
777.El
778.It Sy MAC_PROP_AUTONEG
779.Bd -filled -compact
780Type:
781.Sy uint8_t |
782Permissions:
783.Sy Read/Write
784.Ed
785.Pp
786The
787.Sy MAC_PROP_AUTONEG
788property indicates whether or not the device is currently configured to
789perform auto-negotiation.
790A value of
791.Sy 0
792indicates that auto-negotiation is disabled.
793A
794.Sy non-zero
795value indicates that auto-negotiation is enabled.
796Devices should generally default to enabling auto-negotiation.
797.Pp
798When getting this property, the device driver should return the current
799state.
800When setting this property, if the device supports operating in the requested
801mode, then the device driver should reset the link to negotiate to the new speed
802after updating any internal registers.
803.It Sy MAC_PROP_MTU
804.Bd -filled -compact
805Type:
806.Sy uint32_t |
807Permissions:
808.Sy Read/Write
809.Ed
810.Pp
811The
812.Sy MAC_PROP_MTU
813property determines the maximum transmission unit (MTU).
814This indicates the maximum size packet that the device can transmit, ignoring
815its own headers.
816For an Ethernet device, this would exclude the size of the Ethernet header and
817any VLAN headers that would be placed.
It is up to the driver to ensure that any MTU value that it accepts, once its
margin and header sizes are added, does not exceed its maximum frame size.
820.Pp
By default, drivers for Ethernet should initialize this value to
823.Sy 1500 .
824When getting this property, the driver should return its current
825recorded MTU.
826When setting this property, the driver should first validate that it is within
827the device's valid range and then it must call
828.Xr mac_maxsdu_update 9F .
829Note that the call may fail.
830If the call completes successfully, the driver should update the hardware with
831the new value of the MTU and perform any other work needed to handle it.
832.Pp
833If the device does not support changing the MTU after the device's
834.Xr mc_start 9E
835entry point has been called, then driver writers should return
836.Er EBUSY .
837.It Sy MAC_PROP_FLOWCTRL
838.Bd -filled -compact
839Type:
840.Sy link_flowctrl_t |
841Permissions:
842.Sy Read/Write
843.Ed
844.Pp
845The
846.Sy MAC_PROP_FLOWCTRL
847property manages the configuration of pause frames as part of Ethernet
848flow control.
849Note, this only describes what this device will advertise.
850What is actually enabled may be different and is subject to the rules of
851auto-negotiation.
852The
853.Sy link_flowctrl_t
854is an enumeration that may be set to one of the following values:
855.Bl -tag -width Ds
856.It Sy LINK_FLOWCTRL_NONE
857Flow control is disabled.
858No pause frames should be generated or honored.
859.It Sy LINK_FLOWCTRL_RX
860The device can receive pause frames; however, it should not generate
861them.
862.It Sy LINK_FLOWCTRL_TX
863The device can generate pause frames; however, it does not support
864receiving them.
865.It Sy LINK_FLOWCTRL_BI
866The device supports both sending and receiving pause frames.
867.El
868.Pp
869When getting this property, the device driver should return the way that
870it has configured the device, not what the device has actually
871negotiated.
872When setting the property, it should update the hardware and allow the link to
873potentially perform auto-negotiation again.
874.El
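.Pp
The order of operations described above for setting
.Sy MAC_PROP_MTU
can be sketched as follows.
This is a minimal userland sketch, not driver code: the
.Vt example_dev_t
structure and its members are hypothetical, the choice of
.Er EINVAL
for an out-of-range value is an assumption, and
.Fn example_maxsdu_update
is only a stand-in for
.Xr mac_maxsdu_update 9F .

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-instance driver state; not part of the MAC framework. */
typedef struct example_dev {
	uint32_t ed_mtu;	/* currently recorded MTU */
	uint32_t ed_min_mtu;	/* smallest MTU the hardware accepts */
	uint32_t ed_max_mtu;	/* largest MTU the hardware accepts */
	bool ed_started;	/* true between mc_start() and mc_stop() */
	bool ed_mtu_live;	/* true if the MTU may change while started */
	int ed_fw_err;		/* error that the stand-in below returns */
} example_dev_t;

/* Userland stand-in for mac_maxsdu_update(9F), which may fail. */
static int
example_maxsdu_update(example_dev_t *dev, uint32_t sdu)
{
	(void) sdu;
	return (dev->ed_fw_err);
}

/* Body of a MAC_PROP_MTU set request, in the order given in the text. */
int
example_set_mtu(example_dev_t *dev, uint32_t new_mtu)
{
	int ret;

	/* Validate against the device's range first (EINVAL is assumed). */
	if (new_mtu < dev->ed_min_mtu || new_mtu > dev->ed_max_mtu)
		return (EINVAL);
	/* Some devices cannot change the MTU once started. */
	if (dev->ed_started && !dev->ed_mtu_live)
		return (EBUSY);
	/* Tell the framework; only continue if it accepts the value. */
	if ((ret = example_maxsdu_update(dev, new_mtu)) != 0)
		return (ret);

	/* A real driver would reprogram the hardware here. */
	dev->ed_mtu = new_mtu;
	return (0);
}
```

Note that the recorded MTU is only changed after every check has passed, so a
failed request leaves the driver's state untouched.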
875.Pp
876The remaining properties are all about various auto-negotiation link
877speeds.
878They fall into two different buckets: properties with
879.Sy _ADV_
880in the name and properties with
881.Sy _EN_
882in the name.
883For any given supported speed, there is one of each.
884The
885.Sy _EN_
886set of properties are read/write properties that control what should be
887advertised by the device.
888When these are retrieved, they should return the current value of the property.
889When they are set, they should change how the hardware advertises the specific
890speed and trigger any kind of link reset and auto-negotiation, if enabled, to
891occur.
892.Pp
893The
894.Sy _ADV_
895set of properties are read-only properties.
896They are meant to reflect what has actually been negotiated.
897These may be different from the
898.Sy _EN_
899family of properties, especially when different power management
900settings are at play.
901.Pp
902See the
903.Sx Link Speed and Auto-negotiation
904section for more information.
905.Pp
906The properties are ordered in increasing link speed:
907.Bl -hang -width Ds
908.It Sy MAC_PROP_ADV_10HDX_CAP
909.Bd -filled -compact
910Type:
911.Sy uint8_t |
912Permissions:
913.Sy Read-Only
914.Ed
915.Pp
916The
917.Sy MAC_PROP_ADV_10HDX_CAP
918property describes whether or not 10 Mbit/s half-duplex support is
919advertised.
920.It Sy MAC_PROP_EN_10HDX_CAP
921.Bd -filled -compact
922Type:
923.Sy uint8_t |
924Permissions:
925.Sy Read/Write
926.Ed
927.Pp
928The
929.Sy MAC_PROP_EN_10HDX_CAP
930property describes whether or not 10 Mbit/s half-duplex support is
931enabled.
932.It Sy MAC_PROP_ADV_10FDX_CAP
933.Bd -filled -compact
934Type:
935.Sy uint8_t |
936Permissions:
937.Sy Read-Only
938.Ed
939.Pp
940The
941.Sy MAC_PROP_ADV_10FDX_CAP
942property describes whether or not 10 Mbit/s full-duplex support is
943advertised.
944.It Sy MAC_PROP_EN_10FDX_CAP
945.Bd -filled -compact
946Type:
947.Sy uint8_t |
948Permissions:
949.Sy Read/Write
950.Ed
951.Pp
952The
953.Sy MAC_PROP_EN_10FDX_CAP
954property describes whether or not 10 Mbit/s full-duplex support is
955enabled.
956.It Sy MAC_PROP_ADV_100HDX_CAP
957.Bd -filled -compact
958Type:
959.Sy uint8_t |
960Permissions:
961.Sy Read-Only
962.Ed
963.Pp
964The
965.Sy MAC_PROP_ADV_100HDX_CAP
966property describes whether or not 100 Mbit/s half-duplex support is
967advertised.
968.It Sy MAC_PROP_EN_100HDX_CAP
969.Bd -filled -compact
970Type:
971.Sy uint8_t |
972Permissions:
973.Sy Read/Write
974.Ed
975.Pp
976The
977.Sy MAC_PROP_EN_100HDX_CAP
978property describes whether or not 100 Mbit/s half-duplex support is
979enabled.
980.It Sy MAC_PROP_ADV_100FDX_CAP
981.Bd -filled -compact
982Type:
983.Sy uint8_t |
984Permissions:
985.Sy Read-Only
986.Ed
987.Pp
988The
989.Sy MAC_PROP_ADV_100FDX_CAP
990property describes whether or not 100 Mbit/s full-duplex support is
991advertised.
992.It Sy MAC_PROP_EN_100FDX_CAP
993.Bd -filled -compact
994Type:
995.Sy uint8_t |
996Permissions:
997.Sy Read/Write
998.Ed
999.Pp
1000The
1001.Sy MAC_PROP_EN_100FDX_CAP
1002property describes whether or not 100 Mbit/s full-duplex support is
1003enabled.
1004.It Sy MAC_PROP_ADV_100T4_CAP
1005.Bd -filled -compact
1006Type:
1007.Sy uint8_t |
1008Permissions:
1009.Sy Read-Only
1010.Ed
1011.Pp
1012The
1013.Sy MAC_PROP_ADV_100T4_CAP
1014property describes whether or not 100 Mbit/s Ethernet using the
1015100BASE-T4 standard is
1016advertised.
1017.It Sy MAC_PROP_EN_100T4_CAP
1018.Bd -filled -compact
1019Type:
1020.Sy uint8_t |
1021Permissions:
1022.Sy Read/Write
1023.Ed
1024.Pp
The
.Sy MAC_PROP_EN_100T4_CAP
property describes whether or not 100 Mbit/s Ethernet using the
100BASE-T4 standard is
enabled.
1030.It Sy MAC_PROP_ADV_1000HDX_CAP
1031.Bd -filled -compact
1032Type:
1033.Sy uint8_t |
1034Permissions:
1035.Sy Read-Only
1036.Ed
1037.Pp
1038The
1039.Sy MAC_PROP_ADV_1000HDX_CAP
1040property describes whether or not 1 Gbit/s half-duplex support is
1041advertised.
1042.It Sy MAC_PROP_EN_1000HDX_CAP
1043.Bd -filled -compact
1044Type:
1045.Sy uint8_t |
1046Permissions:
1047.Sy Read/Write
1048.Ed
1049.Pp
1050The
1051.Sy MAC_PROP_EN_1000HDX_CAP
1052property describes whether or not 1 Gbit/s half-duplex support is
1053enabled.
1054.It Sy MAC_PROP_ADV_1000FDX_CAP
1055.Bd -filled -compact
1056Type:
1057.Sy uint8_t |
1058Permissions:
1059.Sy Read-Only
1060.Ed
1061.Pp
1062The
1063.Sy MAC_PROP_ADV_1000FDX_CAP
1064property describes whether or not 1 Gbit/s full-duplex support is
1065advertised.
1066.It Sy MAC_PROP_EN_1000FDX_CAP
1067.Bd -filled -compact
1068Type:
1069.Sy uint8_t |
1070Permissions:
1071.Sy Read/Write
1072.Ed
1073.Pp
1074The
1075.Sy MAC_PROP_EN_1000FDX_CAP
1076property describes whether or not 1 Gbit/s full-duplex support is
1077enabled.
1078.It Sy MAC_PROP_ADV_2500FDX_CAP
1079.Bd -filled -compact
1080Type:
1081.Sy uint8_t |
1082Permissions:
1083.Sy Read-Only
1084.Ed
1085.Pp
1086The
1087.Sy MAC_PROP_ADV_2500FDX_CAP
1088property describes whether or not 2.5 Gbit/s full-duplex support is
1089advertised.
1090.It Sy MAC_PROP_EN_2500FDX_CAP
1091.Bd -filled -compact
1092Type:
1093.Sy uint8_t |
1094Permissions:
1095.Sy Read/Write
1096.Ed
1097.Pp
1098The
1099.Sy MAC_PROP_EN_2500FDX_CAP
1100property describes whether or not 2.5 Gbit/s full-duplex support is
1101enabled.
1102.It Sy MAC_PROP_ADV_5000FDX_CAP
1103.Bd -filled -compact
1104Type:
1105.Sy uint8_t |
1106Permissions:
1107.Sy Read-Only
1108.Ed
1109.Pp
1110The
1111.Sy MAC_PROP_ADV_5000FDX_CAP
1112property describes whether or not 5.0 Gbit/s full-duplex support is
1113advertised.
1114.It Sy MAC_PROP_EN_5000FDX_CAP
1115.Bd -filled -compact
1116Type:
1117.Sy uint8_t |
1118Permissions:
1119.Sy Read/Write
1120.Ed
1121.Pp
1122The
1123.Sy MAC_PROP_EN_5000FDX_CAP
1124property describes whether or not 5.0 Gbit/s full-duplex support is
1125enabled.
1126.It Sy MAC_PROP_ADV_10GFDX_CAP
1127.Bd -filled -compact
1128Type:
1129.Sy uint8_t |
1130Permissions:
1131.Sy Read-Only
1132.Ed
1133.Pp
1134The
1135.Sy MAC_PROP_ADV_10GFDX_CAP
1136property describes whether or not 10 Gbit/s full-duplex support is
1137advertised.
1138.It Sy MAC_PROP_EN_10GFDX_CAP
1139.Bd -filled -compact
1140Type:
1141.Sy uint8_t |
1142Permissions:
1143.Sy Read/Write
1144.Ed
1145.Pp
1146The
1147.Sy MAC_PROP_EN_10GFDX_CAP
1148property describes whether or not 10 Gbit/s full-duplex support is
1149enabled.
1150.It Sy MAC_PROP_ADV_40GFDX_CAP
1151.Bd -filled -compact
1152Type:
1153.Sy uint8_t |
1154Permissions:
1155.Sy Read-Only
1156.Ed
1157.Pp
1158The
1159.Sy MAC_PROP_ADV_40GFDX_CAP
1160property describes whether or not 40 Gbit/s full-duplex support is
1161advertised.
1162.It Sy MAC_PROP_EN_40GFDX_CAP
1163.Bd -filled -compact
1164Type:
1165.Sy uint8_t |
1166Permissions:
1167.Sy Read/Write
1168.Ed
1169.Pp
1170The
1171.Sy MAC_PROP_EN_40GFDX_CAP
1172property describes whether or not 40 Gbit/s full-duplex support is
1173enabled.
1174.It Sy MAC_PROP_ADV_100GFDX_CAP
1175.Bd -filled -compact
1176Type:
1177.Sy uint8_t |
1178Permissions:
1179.Sy Read-Only
1180.Ed
1181.Pp
1182The
1183.Sy MAC_PROP_ADV_100GFDX_CAP
1184property describes whether or not 100 Gbit/s full-duplex support is
1185advertised.
1186.It Sy MAC_PROP_EN_100GFDX_CAP
1187.Bd -filled -compact
1188Type:
1189.Sy uint8_t |
1190Permissions:
1191.Sy Read/Write
1192.Ed
1193.Pp
1194The
1195.Sy MAC_PROP_EN_100GFDX_CAP
1196property describes whether or not 100 Gbit/s full-duplex support is
1197enabled.
1198.El
1199.Ss Private Properties
1200In addition to the defined properties above, drivers are allowed to
1201define private properties.
1202These private properties are device-specific properties.
1203All private properties share the same constant,
1204.Sy MAC_PROP_PRIVATE .
1205Properties are distinguished by a name, which is a character string.
1206The list of such private properties is defined when registering with mac in the
1207.Sy m_priv_props
1208member of the
1209.Xr mac_register 9S
1210structure.
1211.Pp
1212The driver may define whatever semantics it wants for these private
1213properties.
1214They will not be listed when running
1215.Xr dladm 1M ,
1216unless explicitly requested by name.
1217All such properties should start with a leading underscore character and then
1218consist of alphanumeric ASCII characters and additional underscores or hyphens.
1219.Pp
1220Properties of type
1221.Sy MAC_PROP_PRIVATE
1222may show up in all three property related entry points:
1223.Xr mc_propinfo 9E ,
1224.Xr mc_getprop 9E ,
1225and
1226.Xr mc_setprop 9E .
Device drivers should tell the different properties apart by using the
.Xr strcmp 9F
function to compare the property's name to the set of names that it knows about.
When a driver encounters a name that it doesn't know, it should treat it
like any other unknown property.
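.Pp
A name-based lookup of this kind might look like the following sketch.
The property names, the
.Vt example_priv_t
structure, and the use of a plain integer for the value are all hypothetical
simplifications, and returning
.Er ENOTSUP
for an unknown name is an assumption about how the driver treats unknown
properties in general.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical private property names; note the leading underscore. */
#define	EXAMPLE_PROP_RX_COPY	"_rx_copy_thresh"
#define	EXAMPLE_PROP_INTR_RATE	"_intr_throttle"

/* Hypothetical driver state backing the two properties. */
typedef struct example_priv {
	uint32_t ep_rx_copy_thresh;
	uint32_t ep_intr_throttle;
} example_priv_t;

/*
 * Look up a private property by name, as a driver might once it has been
 * handed MAC_PROP_PRIVATE along with the property's name.
 */
int
example_get_private(const example_priv_t *ep, const char *name,
    uint32_t *valp)
{
	if (strcmp(name, EXAMPLE_PROP_RX_COPY) == 0) {
		*valp = ep->ep_rx_copy_thresh;
		return (0);
	}
	if (strcmp(name, EXAMPLE_PROP_INTR_RATE) == 0) {
		*valp = ep->ep_intr_throttle;
		return (0);
	}
	/* Unknown names are treated like any other unknown property. */
	return (ENOTSUP);
}
```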
1232.Sh STATISTICS
The MAC framework defines a couple of different sets of statistics, which
are based on various standards, for devices to implement.
Statistics are retrieved through the
.Xr mc_getstat 9E
entry point.
Some statistics are required for all devices, while a
separate set of statistics is specific to Ethernet.
Not all devices will support every statistic.
In many cases, several device registers will need to be combined to create the
proper stat.
1243.Pp
1244In general, if the device is not keeping track of these statistics, then
1245it is recommended that the driver store these values as a
1246.Sy uint64_t
1247to ensure that overflow does not occur.
1248.Pp
If a device does not support a specific statistic, then it is fine to
return that it is not supported.
The same response should be returned for unrecognized statistics.
1252See
1253.Xr mc_getstat 9E
1254for more information on the proper way to handle these.
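.Pp
The two points above, keeping 64-bit totals in software and treating
unsupported and unrecognized statistics alike, can be sketched as follows.
The statistic identifiers and the
.Vt example_stats_t
structure are hypothetical stand-ins, and the sketch assumes a 32-bit hardware
counter that clears when it is read.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical statistic identifiers standing in for MAC_STAT_* values. */
enum { EX_STAT_IPACKETS, EX_STAT_OPACKETS, EX_STAT_COLLISIONS };

typedef struct example_stats {
	uint64_t es_ipackets;	/* 64-bit software totals */
	uint64_t es_opackets;
} example_stats_t;

/*
 * Fold a 32-bit, clear-on-read hardware counter into a 64-bit total so that
 * the value reported upward does not wrap when the register does.
 */
void
example_stat_fold(uint64_t *total, uint32_t hw_delta)
{
	*total += hw_delta;
}

/* Shape of a statistic lookup: 0 on success, ENOTSUP otherwise. */
int
example_getstat(const example_stats_t *es, unsigned int stat, uint64_t *val)
{
	switch (stat) {
	case EX_STAT_IPACKETS:
		*val = es->es_ipackets;
		return (0);
	case EX_STAT_OPACKETS:
		*val = es->es_opackets;
		return (0);
	default:
		/* Unsupported and unrecognized statistics look the same. */
		return (ENOTSUP);
	}
}
```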
1255.Ss General Device Statistics
The following statistics are based on MIB-II statistics from both RFC
1213 and RFC 1573.
1258.Bl -tag -width Ds
1259.It Sy MAC_STAT_IFSPEED
1260The device's current speed in bits per second.
1261.It Sy MAC_STAT_MULTIRCV
1262The total number of received multicast packets.
1263.It Sy MAC_STAT_BRDCSTRCV
1264The total number of received broadcast packets.
1265.It Sy MAC_STAT_MULTIXMT
1266The total number of transmitted multicast packets.
1267.It Sy MAC_STAT_BRDCSTXMT
The total number of transmitted broadcast packets.
1269.It Sy MAC_STAT_NORCVBUF
1270The total number of packets discarded by the hardware due to a lack of
1271receive buffers.
1272.It Sy MAC_STAT_IERRORS
1273The total number of errors detected on input.
1274.It Sy MAC_STAT_UNKNOWNS
1275The total number of received packets that were discarded because they
1276were of an unknown protocol.
1277.It Sy MAC_STAT_NOXMTBUF
1278The total number of outgoing packets dropped due to a lack of transmit
1279buffers.
1280.It Sy MAC_STAT_OERRORS
1281The total number of outgoing packets that resulted in errors.
1282.It Sy MAC_STAT_COLLISIONS
1283Total number of collisions encountered by the transmitter.
1284.It Sy MAC_STAT_RBYTES
1285The total number of
1286.Sy bytes
1287received by the device, regardless of packet type.
1288.It Sy MAC_STAT_IPACKETS
1289The total number of
1290.Sy packets
1291received by the device, regardless of packet type.
1292.It Sy MAC_STAT_OBYTES
1293The total number of
1294.Sy bytes
1295transmitted by the device, regardless of packet type.
1296.It Sy MAC_STAT_OPACKETS
1297The total number of
1298.Sy packets
1299sent by the device, regardless of packet type.
1300.It Sy MAC_STAT_UNDERFLOWS
1301The total number of packets that were smaller than the minimum sized
1302packet for the device and were therefore dropped.
1303.It Sy MAC_STAT_OVERFLOWS
1304The total number of packets that were larger than the maximum sized
1305packet for the device and were therefore dropped.
1306.El
1307.Ss Ethernet Specific Statistics
1308The following statistics are specific to Ethernet devices.
1309They refer to values from RFC 1643 and include various MII/GMII specific stats.
1310Many of these are also defined in IEEE 802.3.
1311.Bl -tag -width Ds
1312.It Sy ETHER_STAT_ADV_CAP_1000FDX
1313Indicates that the device is advertising support for 1 Gbit/s
1314full-duplex operation.
1315.It Sy ETHER_STAT_ADV_CAP_1000HDX
1316Indicates that the device is advertising support for 1 Gbit/s
1317half-duplex operation.
1318.It Sy ETHER_STAT_ADV_CAP_100FDX
1319Indicates that the device is advertising support for 100 Mbit/s
1320full-duplex operation.
1321.It Sy ETHER_STAT_ADV_CAP_100GFDX
1322Indicates that the device is advertising support for 100 Gbit/s
1323full-duplex operation.
1324.It Sy ETHER_STAT_ADV_CAP_100HDX
1325Indicates that the device is advertising support for 100 Mbit/s
1326half-duplex operation.
1327.It Sy ETHER_STAT_ADV_CAP_100T4
1328Indicates that the device is advertising support for 100 Mbit/s
1329100BASE-T4 operation.
1330.It Sy ETHER_STAT_ADV_CAP_10FDX
1331Indicates that the device is advertising support for 10 Mbit/s
1332full-duplex operation.
1333.It Sy ETHER_STAT_ADV_CAP_10GFDX
1334Indicates that the device is advertising support for 10 Gbit/s
1335full-duplex operation.
1336.It Sy ETHER_STAT_ADV_CAP_10HDX
1337Indicates that the device is advertising support for 10 Mbit/s
1338half-duplex operation.
1339.It Sy ETHER_STAT_ADV_CAP_2500FDX
1340Indicates that the device is advertising support for 2.5 Gbit/s
1341full-duplex operation.
1342.It Sy ETHER_STAT_ADV_CAP_40GFDX
1343Indicates that the device is advertising support for 40 Gbit/s
1344full-duplex operation.
1345.It Sy ETHER_STAT_ADV_CAP_5000FDX
1346Indicates that the device is advertising support for 5.0 Gbit/s
1347full-duplex operation.
1348.It Sy ETHER_STAT_ADV_CAP_ASMPAUSE
1349Indicates that the device is advertising support for receiving pause
1350frames.
1351.It Sy ETHER_STAT_ADV_CAP_AUTONEG
1352Indicates that the device is advertising support for auto-negotiation.
1353.It Sy ETHER_STAT_ADV_CAP_PAUSE
1354Indicates that the device is advertising support for generating pause
1355frames.
1356.It Sy ETHER_STAT_ADV_REMFAULT
1357Indicates that the device is advertising support for detecting faults in
1358the remote link peer.
1359.It Sy ETHER_STAT_ALIGN_ERRORS
1360Indicates the number of times an alignment error was generated by the
1361Ethernet device.
1362This is a count of packets that were not an integral number of octets and failed
1363the FCS check.
1364.It Sy ETHER_STAT_CAP_1000FDX
1365Indicates the device supports 1 Gbit/s full-duplex operation.
1366.It Sy ETHER_STAT_CAP_1000HDX
1367Indicates the device supports 1 Gbit/s half-duplex operation.
1368.It Sy ETHER_STAT_CAP_100FDX
1369Indicates the device supports 100 Mbit/s full-duplex operation.
1370.It Sy ETHER_STAT_CAP_100GFDX
1371Indicates the device supports 100 Gbit/s full-duplex operation.
1372.It Sy ETHER_STAT_CAP_100HDX
1373Indicates the device supports 100 Mbit/s half-duplex operation.
1374.It Sy ETHER_STAT_CAP_100T4
1375Indicates the device supports 100 Mbit/s 100BASE-T4 operation.
1376.It Sy ETHER_STAT_CAP_10FDX
1377Indicates the device supports 10 Mbit/s full-duplex operation.
1378.It Sy ETHER_STAT_CAP_10GFDX
1379Indicates the device supports 10 Gbit/s full-duplex operation.
1380.It Sy ETHER_STAT_CAP_10HDX
1381Indicates the device supports 10 Mbit/s half-duplex operation.
1382.It Sy ETHER_STAT_CAP_2500FDX
1383Indicates the device supports 2.5 Gbit/s full-duplex operation.
1384.It Sy ETHER_STAT_CAP_40GFDX
1385Indicates the device supports 40 Gbit/s full-duplex operation.
1386.It Sy ETHER_STAT_CAP_5000FDX
1387Indicates the device supports 5.0 Gbit/s full-duplex operation.
1388.It Sy ETHER_STAT_CAP_ASMPAUSE
1389Indicates that the device supports the ability to receive pause frames.
1390.It Sy ETHER_STAT_CAP_AUTONEG
1391Indicates that the device supports the ability to perform link
1392auto-negotiation.
1393.It Sy ETHER_STAT_CAP_PAUSE
1394Indicates that the device supports the ability to transmit pause frames.
1395.It Sy ETHER_STAT_CAP_REMFAULT
1396Indicates that the device supports the ability of detecting a remote
1397fault in a link peer.
1398.It Sy ETHER_STAT_CARRIER_ERRORS
1399Indicates the number of times that the Ethernet carrier sense condition
1400was lost or not asserted.
1401.It Sy ETHER_STAT_DEFER_XMTS
1402Indicates the number of frames for which the device was unable to
1403transmit the frame due to being busy and had to try again.
1404.It Sy ETHER_STAT_EX_COLLISIONS
1405Indicates the number of frames that failed to send due to an excessive
1406number of collisions.
1407.It Sy ETHER_STAT_FCS_ERRORS
1408Indicates the number of times that a frame check sequence failed.
1409.It Sy ETHER_STAT_FIRST_COLLISIONS
1410Indicates the number of times that a frame was eventually transmitted
1411successfully, but only after a single collision.
1412.It Sy ETHER_STAT_JABBER_ERRORS
1413Indicates the number of frames that were received that were both larger
1414than the maximum packet size and failed the frame check sequence.
1415.It Sy ETHER_STAT_LINK_ASMPAUSE
1416Indicates whether the link is currently configured to accept pause
1417frames.
1418.It Sy ETHER_STAT_LINK_AUTONEG
1419Indicates whether the current link state is a result of
1420auto-negotiation.
1421.It Sy ETHER_STAT_LINK_DUPLEX
1422Indicates the current duplex state of the link.
1423The values used here should be the same as documented for
1424.Sy MAC_PROP_DUPLEX .
1425.It Sy ETHER_STAT_LINK_PAUSE
1426Indicates whether the link is currently configured to generate pause
1427frames.
1428.It Sy ETHER_STAT_LP_CAP_1000FDX
1429Indicates the remote device supports 1 Gbit/s full-duplex operation.
1430.It Sy ETHER_STAT_LP_CAP_1000HDX
1431Indicates the remote device supports 1 Gbit/s half-duplex operation.
1432.It Sy ETHER_STAT_LP_CAP_100FDX
1433Indicates the remote device supports 100 Mbit/s full-duplex operation.
1434.It Sy ETHER_STAT_LP_CAP_100GFDX
1435Indicates the remote device supports 100 Gbit/s full-duplex operation.
1436.It Sy ETHER_STAT_LP_CAP_100HDX
1437Indicates the remote device supports 100 Mbit/s half-duplex operation.
1438.It Sy ETHER_STAT_LP_CAP_100T4
1439Indicates the remote device supports 100 Mbit/s 100BASE-T4 operation.
1440.It Sy ETHER_STAT_LP_CAP_10FDX
1441Indicates the remote device supports 10 Mbit/s full-duplex operation.
1442.It Sy ETHER_STAT_LP_CAP_10GFDX
1443Indicates the remote device supports 10 Gbit/s full-duplex operation.
1444.It Sy ETHER_STAT_LP_CAP_10HDX
1445Indicates the remote device supports 10 Mbit/s half-duplex operation.
1446.It Sy ETHER_STAT_LP_CAP_2500FDX
1447Indicates the remote device supports 2.5 Gbit/s full-duplex operation.
1448.It Sy ETHER_STAT_LP_CAP_40GFDX
1449Indicates the remote device supports 40 Gbit/s full-duplex operation.
1450.It Sy ETHER_STAT_LP_CAP_5000FDX
1451Indicates the remote device supports 5.0 Gbit/s full-duplex operation.
1452.It Sy ETHER_STAT_LP_CAP_ASMPAUSE
1453Indicates that the remote device supports the ability to receive pause
1454frames.
1455.It Sy ETHER_STAT_LP_CAP_AUTONEG
1456Indicates that the remote device supports the ability to perform link
1457auto-negotiation.
1458.It Sy ETHER_STAT_LP_CAP_PAUSE
1459Indicates that the remote device supports the ability to transmit pause
1460frames.
1461.It Sy ETHER_STAT_LP_CAP_REMFAULT
1462Indicates that the remote device supports the ability of detecting a
1463remote fault in a link peer.
1464.It Sy ETHER_STAT_MACRCV_ERRORS
1465Indicates the number of times that the internal MAC layer encountered an
1466error when attempting to receive and process a frame.
1467.It Sy ETHER_STAT_MACXMT_ERRORS
1468Indicates the number of times that the internal MAC layer encountered an
1469error when attempting to process and transmit a frame.
1470.It Sy ETHER_STAT_MULTI_COLLISIONS
1471Indicates the number of times that a frame was eventually transmitted
1472successfully, but only after more than one collision.
1473.It Sy ETHER_STAT_SQE_ERRORS
1474Indicates the number of times that an SQE error occurred.
1475The specific conditions for this error are documented in IEEE 802.3.
1476.It Sy ETHER_STAT_TOOLONG_ERRORS
1477Indicates the number of frames that were received that were longer than
1478the maximum frame size supported by the device.
1479.It Sy ETHER_STAT_TOOSHORT_ERRORS
1480Indicates the number of frames that were received that were shorter than
1481the minimum frame size supported by the device.
1482.It Sy ETHER_STAT_TX_LATE_COLLISIONS
1483Indicates the number of times a collision was detected late on the
1484device.
1485.It Sy ETHER_STAT_XCVR_ADDR
Indicates the address of the MII/GMII transceiver.
.It Sy ETHER_STAT_XCVR_ID
Indicates the ID of the MII/GMII transceiver.
.It Sy ETHER_STAT_XCVR_INUSE
Indicates what kind of transceiver is in use.
The following values may be used:
.Bl -tag -width Ds
.It Sy XCVR_UNDEFINED
The transceiver type is undefined by the hardware.
.It Sy XCVR_NONE
There is no transceiver in use by the hardware.
.It Sy XCVR_10
The transceiver supports 10BASE-T operation.
.It Sy XCVR_100T4
The transceiver supports 100BASE-T4 operation.
.It Sy XCVR_100X
The transceiver supports 100BASE-TX operation.
.It Sy XCVR_100T2
The transceiver supports 100BASE-T2 operation.
.It Sy XCVR_1000X
The transceiver supports 1000BASE-X operation.
This is used for all fiber transceivers.
.It Sy XCVR_1000T
The transceiver supports 1000BASE-T operation.
This is used for all copper transceivers.
1511.El
1512.El
1513.Ss Device Specific kstats
1514In addition to the defined statistics above, if the device driver
1515maintains additional statistics or the device provides additional
1516statistics, it should create its own kstats through the
1517.Xr kstat_create 9F
1518function to allow operators to observe them.
1519.Sh TX STALL DETECTION, DEVICE RESETS, AND FAULT MANAGEMENT
1520Device drivers are the first line of defense for dealing with broken
1521devices and bugs in their firmware.
While most devices will rarely fail, it is important to pay particular
attention to RAS (Reliability, Availability, and Serviceability) when designing
and implementing the device driver.
1525While everything described in this section is optional, it is highly recommended
1526that all new device drivers follow these guidelines.
1527.Pp
1528The Fault Management Architecture (FMA) provides facilities for
1529detecting and reporting various classes of defects and faults.
1530Specifically for networking device drivers, issues that should be
1531detected and reported include:
1532.Bl -bullet -offset indent
1533.It
1534Device internal uncorrectable errors
1535.It
1536Device internal correctable errors
1537.It
1538PCI and PCI Express transport errors
1539.It
1540Device temperature alarms
1541.It
1542Device transmission stalls
1543.It
1544Device communication timeouts
1545.It
High rates of invalid interrupts
1547.El
1548.Pp
1549All such errors fall into three primary categories:
1550.Bl -enum -offset indent
1551.It
1552Errors detected by the Fault Management Architecture
1553.It
1554Errors detected by the device and indicated to the device driver
1555.It
1556Errors detected by the device driver
1557.El
1558.Ss Fault Management Setup and Teardown
1559Drivers should initialize support for the fault management framework by
1560calling
1561.Xr ddi_fm_init 9F
1562from their
1563.Xr attach 9E
1564routine.
1565By registering with the fault management framework, a device driver is given the
1566chance to detect and notice transport errors as well as report other errors that
1567exist.
While a device driver does not need to declare every capability described in
.Xr ddi_fm_init 9F ,
we suggest that device drivers at least register the
.Sy DDI_FM_EREPORT_CAPABLE
capability so as to allow the driver to report issues that it detects.
1574.Pp
1575If the driver registers with the fault management framework during its
1576.Xr attach 9E
1577entry point, it must call
1578.Xr ddi_fm_fini 9F
1579during its
1580.Xr detach 9E
1581entry point.
1582.Ss Transport Errors
1583Many modern networking devices leverage PCI or PCI Express.
1584As such, there are two primary ways that device drivers access data: they either
1585memory map device registers and use routines like
1586.Xr ddi_get8 9F
1587and
1588.Xr ddi_put8 9F
1589or they use direct memory access (DMA).
1590New device drivers should always enable checking of the transport layer by
1591marking their support in the
1592.Xr ddi_device_acc_attr 9S
1593structure and using routines like
1594.Xr ddi_fm_acc_err_get 9F
1595and
1596.Xr ddi_fm_dma_err_get 9F
1597to detect if errors have occurred.
1598.Ss Device Indicated Errors
Many devices have capabilities to announce to a device driver that a
correctable or uncorrectable error has occurred.
1601Other devices have the ability to indicate that various physical issues have
1602occurred such as a fan failing or a temperature sensor having fired.
1603.Pp
1604Drivers should wire themselves to receive notifications when these
1605events occur.
1606The means and capabilities will vary from device to device.
1607For example, some devices will generate information about these notifications
1608through special interrupts.
1609Other devices may have a register that software can poll.
1610In the cases where polling is required, driver writers should try not to poll
1611too frequently and should generally only poll when the device is actively being
1612used, e.g. between calls to the
1613.Xr mc_start 9E
1614and
1615.Xr mc_stop 9E
1616entry points.
1617.Ss Driver Transmit Stall Detection
1618One of the primary responsibilities of a hardened device driver is to
1619perform transmit stall detection.
The core idea behind tx stall detection is that the driver should record the
last time that it observed data being successfully transmitted.
Most devices should be transmitting data on a regular basis as long as the link
is up.
If a device is not, then this may indicate that the device is stuck and needs to
be reset.
1626At this time, the MAC framework does not provide any resources for performing
1627these checks; however, polling on each individual transmit ring for the last
1628completion time while something is actively being transmitted through the use of
1629routines such as
1630.Xr timeout 9F
1631may be a reasonable starting point.
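.Pp
One possible shape for such a check is sketched below.
The
.Vt example_tx_ring_t
structure, its members, and the threshold are all hypothetical; a real driver
would also need to protect this state with the ring's lock and would obtain the
current time from the kernel rather than from a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-ring bookkeeping; times are in nanoseconds. */
typedef struct example_tx_ring {
	uint32_t etr_pending;		/* descriptors handed to hardware */
	uint64_t etr_last_done;		/* time of the last completion */
} example_tx_ring_t;

/* Called when a descriptor is handed to the hardware. */
void
example_tx_submit(example_tx_ring_t *r, uint64_t now)
{
	/* Start the clock when the ring goes from idle to busy. */
	if (r->etr_pending == 0)
		r->etr_last_done = now;
	r->etr_pending++;
}

/* Called from the transmit completion path. */
void
example_tx_complete(example_tx_ring_t *r, uint64_t now)
{
	r->etr_pending--;
	r->etr_last_done = now;
}

/*
 * Periodic check, e.g. from a timeout(9F) callback.  An idle ring is never
 * considered stalled; a busy ring is stalled once no completion has been
 * seen for longer than the threshold.
 */
bool
example_tx_stalled(const example_tx_ring_t *r, uint64_t now,
    uint64_t threshold)
{
	if (r->etr_pending == 0)
		return (false);
	return (now - r->etr_last_done > threshold);
}
```

Only checking rings with work outstanding avoids reporting a stall on a link
that is simply idle.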
1632.Ss Driver Command Timeout Detection
1633Each device is programmed in different ways.
1634Some devices are programmed through asynchronous commands while others are
1635programmed by writing directly to memory mapped registers.
If a device replies to commands asynchronously, then the device driver
should set reasonable timeouts for all such commands and plan on detecting when
they expire.
1638If a timeout occurs, the driver should presume that there is an issue with the
1639hardware and proceed to abort the command or reset the device.
1640.Pp
1641Many devices do not have such a communication mechanism.
1642However, whenever there is some activity where the device driver must wait, then
1643it should be prepared for the fact that the device may never get back to
1644it and react appropriately by performing some kind of device reset.
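.Pp
The timeout decision itself is small, as the following sketch suggests.
The
.Vt example_cmd_t
structure is hypothetical, as is the use of
.Er EINPROGRESS
and
.Er ETIMEDOUT
to report the outcome; how the driver waits between checks is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of one outstanding asynchronous command. */
typedef struct example_cmd {
	uint64_t ec_issued;	/* time the command was sent */
	uint64_t ec_timeout;	/* how long the driver is willing to wait */
	bool ec_done;		/* set when the device's reply arrives */
} example_cmd_t;

/*
 * Returns 0 if the command completed, EINPROGRESS if the driver should keep
 * waiting, or ETIMEDOUT if the device should be presumed unhealthy and the
 * command aborted or the device reset.
 */
int
example_cmd_check(const example_cmd_t *cmd, uint64_t now)
{
	if (cmd->ec_done)
		return (0);
	if (now - cmd->ec_issued >= cmd->ec_timeout)
		return (ETIMEDOUT);
	return (EINPROGRESS);
}
```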
1645.Ss Reacting to Errors
1646When any of the above categories of errors has been triggered, the
1647behavior that the device driver should take depends on the kind of
1648error.
If a fatal error occurred, for example, a transport error, a transmit stall,
or an uncorrectable error indicated by the device, then it is
important that the driver take the following steps:
1652.Bl -enum -offset indent
1653.It
1654Set a flag in the device driver's state that indicates that it has hit
1655an error condition.
1656When this error condition flag is asserted, transmitted packets should be
1657accepted and dropped and actions that would require writing to the device state
1658should fail with an error.
1659This flag should remain until the device has been successfully restarted.
1660.It
If the error was not a transport error that the fault management
architecture itself detected and reported, then
1663the device driver should post an
1664.Sy ereport
1665indicating what has occurred with the
1666.Xr ddi_fm_ereport_post 9F
1667function.
1668.It
1669The device driver should indicate that the device's service was lost
1670with a call to
1671.Xr ddi_fm_service_impact 9F
1672using the symbol
1673.Sy DDI_SERVICE_LOST .
1674.It
1675At this point the device driver should issue a device reset through some
1676device-specific means.
1677.It
1678When the device reset has been completed, then the device driver should
1679restore all of the programmed state to the device.
1680This includes things like the current MTU, advertised auto-negotiation speeds,
1681MAC address filters, and more.
1682.It
1683Finally, when service has been restored, the device driver should call
1684.Xr ddi_fm_service_impact 9F
1685using the symbol
1686.Sy DDI_SERVICE_RESTORED .
1687.El
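.Pp
The effect of the error flag on the rest of the driver can be sketched as
follows.
Everything here is a hypothetical userland model: the
.Vt example_state_t
structure stands in for the driver's soft state, a single MTU value stands in
for all programmed state, the choice of
.Er EIO
is an assumption, and the calls to
.Xr ddi_fm_ereport_post 9F
and
.Xr ddi_fm_service_impact 9F
are only marked by comments.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical soft state; the saved fields survive a device reset. */
typedef struct example_state {
	bool es_error;		/* set on a fatal error, cleared on restart */
	uint32_t es_mtu;	/* software copy of programmed state */
	uint32_t es_hw_mtu;	/* stands in for a device register */
	uint64_t es_tx_dropped;	/* frames dropped while in error */
	uint64_t es_tx_sent;	/* frames passed to the hardware */
} example_state_t;

/* Note the fatal error; the ereport and DDI_SERVICE_LOST would follow. */
void
example_fatal(example_state_t *s)
{
	s->es_error = true;
}

/* Transmit: accept the frame either way, but drop it while in error. */
void
example_tx(example_state_t *s)
{
	if (s->es_error)
		s->es_tx_dropped++;
	else
		s->es_tx_sent++;
}

/* Operations that would write device state fail while in error. */
int
example_guarded_set_mtu(example_state_t *s, uint32_t mtu)
{
	if (s->es_error)
		return (EIO);
	s->es_mtu = s->es_hw_mtu = mtu;
	return (0);
}

/* Reset the device, replay the saved state, then clear the flag. */
void
example_reset_and_restore(example_state_t *s)
{
	s->es_hw_mtu = 0;		/* device-specific reset */
	s->es_hw_mtu = s->es_mtu;	/* restore programmed state */
	s->es_error = false;		/* DDI_SERVICE_RESTORED goes here */
}
```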
1688.Pp
1689When a non-fatal error occurs, then the device driver should submit an
1690ereport and should optionally mark the device degraded using
1691.Xr ddi_fm_service_impact 9F
1692with the
1693.Sy DDI_SERVICE_DEGRADED
1694value depending on the nature of the problem that has occurred.
1695.Pp
1696Device drivers should never make the decision to remove a device from
1697service based on errors that have occurred nor should they panic the
1698system.
1699Rather, the device driver should always try to notify the operating system with
1700various ereports and allow its policy decisions to occur.
1701The decision to retire a device lies in the hands of the fault management
1702architecture.
1703It knows more about the operator's intent and the surrounding system's state
1704than the device driver itself does and it will make the call to offline and
1705retire the device if it is required.
1706.Ss Device Resets
1707When resetting a device, a device driver must exercise caution.
1708If a device driver has not been written to plan for a device reset, then it
1709may not correctly restore the device's state after such a reset.
1710Such state should be stored in the instance's private state data as the MAC
1711framework does not know about device resets and will not inform the
1712device again about the expected, programmed state.
1713.Pp
1714One wrinkle with device resets is that many networking cards show up as
1715multiple PCI functions on a single device, for example, each port may
1716show up as a separate function and thus have a separate instance of the
1717device driver attached.
When resetting a function, device driver writers should carefully read the
device programming manuals and verify whether a reset impacts only the
stalled function or all functions across the device.
1721.Pp
1722If the only way to reset a given function is through the device, then
1723this may require more coordination and work on the part of the device
1724driver to ensure that all the other instances are correctly restored.
1725In cases where this occurs, some devices offer ways of injecting
1726interrupts onto those other functions to notify them that this is
1727occurring.
.Sh MBLKS AND DMA
The networking stack manages framed data through the use of the
.Xr mblk 9S
structure.
The mblk allows for a single message to be made up of individual blocks.
Each block is linked to the next through its
.Sy b_cont
member.
It also allows for multiple messages to be chained together through the
use of the
.Sy b_next
member.
While the networking stack works with these structures, device drivers generally
work with DMA regions.
There are two different strategies that device drivers use to move data
between the two: copying and binding.
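.Pp
To make the two kinds of links concrete, the sketch below walks both.
It uses a pared-down stand-in structure,
.Sy example_mblk_t ,
that carries only the four members needed here; the real mblk_t has more
members, and the helper names are hypothetical.
In a real driver,
.Xr msgdsize 9F
already provides the data byte count of a message.
.Bd -literal
```c
#include <stddef.h>
#include <stdint.h>

/* Pared-down stand-in for mblk_t; only the members used here. */
typedef struct example_mblk {
	struct example_mblk	*b_next;	/* next message in chain */
	struct example_mblk	*b_cont;	/* next block of message */
	uint8_t			*b_rptr;	/* first valid byte */
	uint8_t			*b_wptr;	/* one past last valid byte */
} example_mblk_t;

/* Total bytes in one message: follow b_cont and sum each block. */
static size_t
example_msg_len(const example_mblk_t *mp)
{
	size_t len = 0;

	for (; mp != NULL; mp = mp->b_cont)
		len += (size_t)(mp->b_wptr - mp->b_rptr);
	return (len);
}

/* Number of messages in a chain: follow b_next. */
static size_t
example_chain_count(const example_mblk_t *mp)
{
	size_t count = 0;

	for (; mp != NULL; mp = mp->b_next)
		count++;
	return (count);
}
```
.Ed
.Pp
A transmit routine that is handed a chain would typically iterate over
.Sy b_next
in the outer loop and over
.Sy b_cont
in the inner loop.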
.Ss Copying Data
The first way that device drivers handle interfacing between the two is
by having two separate regions of memory.
One is memory which has been allocated for DMA through a call to
.Xr ddi_dma_mem_alloc 9F
and the other is the memory associated with the message block.
.Pp
In this case, a driver will use
.Xr bcopy 9F
to copy memory between the two distinct regions.
When transmitting a packet, it will copy the memory from the mblk_t to the DMA
region.
When receiving a packet, it will allocate a mblk_t through the
.Xr allocb 9F
routine, copy the memory across with
.Xr bcopy 9F ,
and then advance the mblk_t's
.Sy b_wptr
member by the number of bytes copied.
.Pp
If, when receiving, memory is not available for a new message block,
then the frame should be skipped and effectively dropped.
A kstat should be bumped when such an occasion occurs.
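.Pp
The receive side of the copying approach can be sketched as follows.
This is a userland model and not driver code:
.Sy example_allocb
is a hypothetical stand-in for
.Xr allocb 9F
built on malloc with a pretend memory budget,
.Sy memcpy
stands in for
.Xr bcopy 9F ,
and the counter stands in for a kstat.
.Bd -literal
```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Pared-down stand-in for mblk_t with its data stored inline. */
typedef struct example_mblk {
	uint8_t	*b_rptr;	/* first valid byte */
	uint8_t	*b_wptr;	/* one past the last valid byte */
	uint8_t	b_data[];
} example_mblk_t;

/* Stand-in for a kstat that counts frames dropped for lack of memory. */
typedef struct example_stats {
	uint64_t	es_rx_nomem;
} example_stats_t;

/* Pretend memory budget so that exhaustion can be modeled. */
static size_t example_mem_left = 2048;

/* Stand-in for allocb(9F); the budget is never returned in this model. */
static example_mblk_t *
example_allocb(size_t size)
{
	example_mblk_t *mp;

	if (size > example_mem_left)
		return (NULL);
	if ((mp = malloc(sizeof (*mp) + size)) == NULL)
		return (NULL);
	example_mem_left -= size;
	mp->b_rptr = mp->b_wptr = mp->b_data;
	return (mp);
}

/* Copy one received frame out of the DMA buffer into a new block. */
static example_mblk_t *
example_rx_copy(const uint8_t *dmabuf, size_t len, example_stats_t *stp)
{
	example_mblk_t *mp = example_allocb(len);

	if (mp == NULL) {
		stp->es_rx_nomem++;	/* drop the frame and count it */
		return (NULL);
	}
	memcpy(mp->b_wptr, dmabuf, len);
	mp->b_wptr += len;		/* mark the copied bytes valid */
	return (mp);
}
```
.Ed
.Pp
Note that the DMA buffer is immediately reusable once the copy completes,
regardless of how long the stack holds on to the message.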
.Ss Binding Data
An alternative approach to copying data is to use DMA binding.
When using DMA binding, the OS takes care of mapping between DMA memory and
normal kernel memory.
The exact process is a bit different between transmit and receive.
.Pp
When transmitting, a device driver has an mblk_t and needs to call the
.Xr ddi_dma_addr_bind_handle 9F
function to bind it to an already existing DMA handle.
At that point, it will receive various DMA cookies that it can use to obtain the
addresses to program the device with for transmitting data.
Once the transmit is done, the driver must then make sure to call
.Xr freemsg 9F
to release the data.
It must not call
.Xr freemsg 9F
before it receives an interrupt from the device indicating that the data
has been transmitted, otherwise it risks sending arbitrary kernel
memory.
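.Pp
The ordering constraint on
.Xr freemsg 9F
can be modeled with a small descriptor ring.
The sketch below is a userland illustration under assumed names: the
.Sy example_
types and functions are hypothetical,
.Sy free
stands in for
.Xr freemsg 9F ,
and
.Sy example_tx_intr
stands in for the device's completion interrupt.
.Bd -literal
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define	EXAMPLE_TX_NDESC	4

/* One transmit descriptor and the message currently bound to it. */
typedef struct example_tx_desc {
	void	*etd_mp;	/* bound message, NULL when unused */
	bool	etd_done;	/* device has reported completion */
} example_tx_desc_t;

typedef struct example_tx_ring {
	example_tx_desc_t	etr_desc[EXAMPLE_TX_NDESC];
	size_t			etr_nfreed;
} example_tx_ring_t;

/* Hand a message to a descriptor; fails if the slot is still in use. */
static bool
example_tx_post(example_tx_ring_t *rp, size_t slot, void *mp)
{
	example_tx_desc_t *dp = &rp->etr_desc[slot];

	if (dp->etd_mp != NULL)
		return (false);
	dp->etd_mp = mp;
	dp->etd_done = false;
	return (true);
}

/* Stand-in for the interrupt reporting that a descriptor was sent. */
static void
example_tx_intr(example_tx_ring_t *rp, size_t slot)
{
	rp->etr_desc[slot].etd_done = true;
}

/* Free only those messages whose descriptors have completed. */
static size_t
example_tx_recycle(example_tx_ring_t *rp)
{
	size_t i, n = 0;

	for (i = 0; i < EXAMPLE_TX_NDESC; i++) {
		example_tx_desc_t *dp = &rp->etr_desc[i];

		if (dp->etd_mp == NULL || !dp->etd_done)
			continue;
		free(dp->etd_mp);	/* freemsg(9F) in a real driver */
		dp->etd_mp = NULL;
		n++;
	}
	rp->etr_nfreed += n;
	return (n);
}
```
.Ed
.Pp
In this model a message that has been posted but not yet completed is never
released, which is the property the text above requires.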
.Pp
When receiving data, the device driver can perform a similar operation.
First, it must bind the DMA memory into the kernel's virtual memory address
space through a call to the
.Xr ddi_dma_addr_bind_handle 9F
function if it has not already.
Once it has, it must then call
.Xr desballoc 9F
to try to create a new mblk_t which leverages the associated memory.
It can then pass that mblk_t up to the stack.
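.Pp
The loaning of a bound receive buffer can be sketched in the same userland
style.
The structures below are hypothetical stand-ins:
.Sy example_frtn_t
models the free routine argument that
.Xr desballoc 9F
takes,
.Sy example_rx_loan
models the call to
.Xr desballoc 9F
itself, and
.Sy example_freemsg
models what happens when the stack eventually frees the message.
.Bd -literal
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a free routine: a callback and its argument. */
typedef struct example_frtn {
	void	(*free_func)(void *);
	void	*free_arg;
} example_frtn_t;

/* Stand-in for an mblk_t that borrows someone else's buffer. */
typedef struct example_mblk {
	uint8_t		*b_rptr;
	uint8_t		*b_wptr;
	example_frtn_t	*b_frtn;
} example_mblk_t;

/* Hypothetical receive buffer that has already been DMA bound. */
typedef struct example_rx_buf {
	uint8_t		erb_data[2048];
	bool		erb_loaned;	/* true while the stack holds it */
	example_frtn_t	erb_frtn;
} example_rx_buf_t;

/* Free routine: the stack is done, so the buffer can be reused. */
static void
example_rx_buf_return(void *arg)
{
	example_rx_buf_t *bp = arg;

	bp->erb_loaned = false;
}

/* Wrap a filled buffer without copying; NULL means drop the frame. */
static example_mblk_t *
example_rx_loan(example_rx_buf_t *bp, size_t len)
{
	example_mblk_t *mp;

	if (bp->erb_loaned || len > sizeof (bp->erb_data))
		return (NULL);
	if ((mp = malloc(sizeof (*mp))) == NULL)
		return (NULL);
	bp->erb_frtn.free_func = example_rx_buf_return;
	bp->erb_frtn.free_arg = bp;
	bp->erb_loaned = true;
	mp->b_rptr = bp->erb_data;
	mp->b_wptr = bp->erb_data + len;
	mp->b_frtn = &bp->erb_frtn;
	return (mp);
}

/* What the stack does once it has consumed the message. */
static void
example_freemsg(example_mblk_t *mp)
{
	mp->b_frtn->free_func(mp->b_frtn->free_arg);
	free(mp);
}
```
.Ed
.Pp
While a buffer is on loan in this model, the driver cannot hand it back to
the device, so it needs some other buffer to take its place in the ring.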
.Ss Considerations
When deciding which of these options to use, there are many different
considerations that must be made.
The answer as to whether to bind memory or to copy data is not always simple.
.Pp
The first thing to remember is that DMA resources may be finite on a
given platform.
Consider the case of receiving data.
A device driver that binds one of its receive descriptors may not get it back
for quite some time as it may be used by the kernel until an application
actually consumes it.
Device drivers that try to bind memory for receive often work with the
constraint that they must be able to replace that DMA memory with another DMA
buffer.
If the memory were not replaced, then eventually the device would not be able
to receive additional data into the ring.
.Pp
On the other hand, particularly for larger frames, copying every packet
from one buffer to another can be a source of additional latency and
memory waste in the system.
For larger copies, the cost of copying may dwarf any potential cost of
performing DMA binding.
.Pp
Device driver authors who are unsure of what to do should first employ
the copying method to simplify the act of writing the device driver.
The copying method is simpler and also allows the device driver author not to
worry about allocated DMA memory that is still outstanding when the driver is
asked to unload.
.Pp
If device driver writers are worried about the cost, it is recommended
to control the decision as to whether to copy or bind DMA data
with a separate private property for each of transmitting and receiving.
Each private property should indicate the frame size at which
to switch from one method to the other.
This way, data can be gathered to determine what the impact of each method is on
a given platform.
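.Pp
A minimal sketch of such a size-based switch is shown below.
The structure, its member names, and the choice that frames shorter than the
threshold are copied are all assumptions made for illustration; the values
would be filled in from whatever private properties the driver defines.
.Bd -literal
```c
#include <stddef.h>

typedef enum example_dma_method {
	EXAMPLE_DMA_COPY,
	EXAMPLE_DMA_BIND
} example_dma_method_t;

/* Hypothetical tunables, one threshold per direction, in bytes. */
typedef struct example_tunables {
	size_t	et_tx_copy_thresh;
	size_t	et_rx_copy_thresh;
} example_tunables_t;

/* Frames shorter than the threshold are copied; others are bound. */
static example_dma_method_t
example_pick_method(size_t frame_len, size_t copy_thresh)
{
	if (frame_len < copy_thresh)
		return (EXAMPLE_DMA_COPY);
	return (EXAMPLE_DMA_BIND);
}
```
.Ed
.Pp
Setting a threshold larger than the maximum frame size makes the driver in
this sketch always copy, which gives a simple baseline to measure against.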
.Sh SEE ALSO
.Xr dladm 1M ,
.Xr driver.conf 4 ,
.Xr ieee802.3 5 ,
.Xr dlpi 7P ,
.Xr _fini 9E ,
.Xr _info 9E ,
.Xr _init 9E ,
.Xr attach 9E ,
.Xr close 9E ,
.Xr detach 9E ,
.Xr mc_close 9E ,
.Xr mc_getcapab 9E ,
.Xr mc_getprop 9E ,
.Xr mc_getstat 9E ,
.Xr mc_multicst 9E ,
.Xr mc_open 9E ,
.Xr mc_propinfo 9E ,
.Xr mc_setpromisc 9E ,
.Xr mc_setprop 9E ,
.Xr mc_start 9E ,
.Xr mc_stop 9E ,
.Xr mc_tx 9E ,
.Xr mc_unicst 9E ,
.Xr open 9E ,
.Xr allocb 9F ,
.Xr bcopy 9F ,
.Xr ddi_dma_addr_bind_handle 9F ,
.Xr ddi_dma_mem_alloc 9F ,
.Xr ddi_fm_acc_err_get 9F ,
.Xr ddi_fm_dma_err_get 9F ,
.Xr ddi_fm_ereport_post 9F ,
.Xr ddi_fm_fini 9F ,
.Xr ddi_fm_init 9F ,
.Xr ddi_fm_service_impact 9F ,
.Xr ddi_get8 9F ,
.Xr ddi_put8 9F ,
.Xr desballoc 9F ,
.Xr freemsg 9F ,
.Xr kstat_create 9F ,
.Xr mac_alloc 9F ,
.Xr mac_fini_ops 9F ,
.Xr mac_hcksum_get 9F ,
.Xr mac_hcksum_set 9F ,
.Xr mac_init_ops 9F ,
.Xr mac_link_update 9F ,
.Xr mac_lso_get 9F ,
.Xr mac_maxsdu_update 9F ,
.Xr mac_prop_info_set_default_link_flowctrl 9F ,
.Xr mac_prop_info_set_default_str 9F ,
.Xr mac_prop_info_set_default_uint32 9F ,
.Xr mac_prop_info_set_default_uint64 9F ,
.Xr mac_prop_info_set_default_uint8 9F ,
.Xr mac_prop_info_set_perm 9F ,
.Xr mac_prop_info_set_range_uint32 9F ,
.Xr mac_register 9F ,
.Xr mac_rx 9F ,
.Xr mac_unregister 9F ,
.Xr mod_install 9F ,
.Xr mod_remove 9F ,
.Xr strcmp 9F ,
.Xr timeout 9F ,
.Xr cb_ops 9S ,
.Xr ddi_device_acc_attr 9S ,
.Xr dev_ops 9S ,
.Xr mac_callbacks 9S ,
.Xr mac_register 9S ,
.Xr mblk 9S ,
.Xr modldrv 9S ,
.Xr modlinkage 9S
.Rs
.%A McCloghrie, K.
.%A Rose, M.
.%T RFC 1213 Management Information Base for Network Management of
.%T TCP/IP-based internets: MIB-II
.%D March 1991
.Re
.Rs
.%A McCloghrie, K.
.%A Kastenholz, F.
.%T RFC 1573 Evolution of the Interfaces Group of MIB-II
.%D January 1994
.Re
.Rs
.%A Kastenholz, F.
.%T RFC 1643 Definitions of Managed Objects for the Ethernet-like
.%T Interface Types
.Re