xref: /illumos-gate/usr/src/uts/common/io/usb/hcd/README (revision 8119dad84d6416f13557b0ba8e2aaf9064cbcfd3)
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
/*
 * Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */


			SOLARIS USB BANDWIDTH ANALYSIS

1.Introduction

  This document discusses the USB bandwidth allocation scheme and the protocol
  overheads used for both full and high speed host controller drivers. This
  information is derived from the USB 2.0 specification, the "Bandwidth Analysis
  Whitepaper" which is posted on www.usb.org, and other resources.

  The target audience for this whitepaper is USB software & hardware designers
  and engineers, and other interested people. The reader should be familiar with
  the Universal Serial Bus Specification version 2.0, the OpenHCI Specification
  1.0a and the Enhanced HCI Specification 1.0.

2.Full speed bus

  The following overheads, formulas and scheme are applicable both to full speed
  host controllers and also to high speed hub Transaction Translators (TT),
  which perform full/low speed transactions.

  o Timing and data rate calculations

    - Timing calculations

      1 sec			1000 ms or 1000000000 ns
      1 ms			1 frame

    - Data rate calculations

      1 ms			1500 bytes or 12000 bits (per frame)
      668 ns			1 byte or 8 bits

      1 full speed bit time	83.54 ns

  o Protocol Overheads and Bandwidth numbers

    - Protocol Overheads

      (Refer to section 5.11.3 of the USB 2.0 specification & page 2 of the
       USB Bandwidth Analysis document)

      Non Isochronous           9107 ns                 14 bytes
      Isochronous Output        6265 ns                 10 bytes
      Isochronous Input         7268 ns                 11 bytes
      Low-speed overhead       64060 ns                 97 bytes
      Hub LS overhead*           668 ns                  1 byte
      SOF                       4010 ns                  6 bytes
      EOF                       2673 ns                  4 bytes

      Host Delay*              Specific to hardware     18 bytes
      Low-Speed clock*         Slower than Full speed    8x

    - Bandwidth numbers

      (Refer to section 7.3.5 of the OHCI specification 1.0a & page 2 of the
       USB Bandwidth Analysis document)

      Maximum bandwidth available                     1500 bytes/frame
      Maximum Non Periodic bandwidth                   197 bytes/frame
      Maximum Periodic bandwidth                      1293 bytes/frame

      NOTE:

      1.Hub specific low speed overhead

        The time provided by the Host Controller for hubs to enable Low Speed
        ports. This is a minimum of 4 full speed bit times.

        overhead = 2 x Hub_LS_Setup
                 = 2 x (4 x 83.54) = 668.32 ns = 1 byte.

      2.Host delay will be specific to particular hardware. The following host
        delay is for the RIO USB OHCI host controller (provided by Ken Ward,
        RIO USB hardware person). It is just an example of how to calculate
        the "host delay" for a given USB host controller implementation.

        Ex: Assuming ED (Endpoint Descriptor)/TD's (Transfer Descriptor) are not
            streaming in Schizo (PCI bridge) and no cache hits for an ED or TD:

            To read an ED or TD or data:

            PCI_ARB_DELAY + PCI_ADDRESS + SCHIZO_RETRY
            PCI_ARB_DELAY + PCI_ADDRESS + SCHIZO_TRDY +
                        DATA + Core_overhead

            Where,

            PCI_ARB_DELAY = 2000ns
            PCI_ADDRESS = 30ns
            SCHIZO_RETRY = 60ns
            SCHIZO_TRDY = 60ns
            DATA = 240ns (Always read 64 bytes ...)
            Core_overhead = 240 + 30 * (MPS/4) + 83.54 * (MPS/4) + 4 * 83.54
            = ~3400ns

            Now multiply by 3 for ED+TD+DATA = 10200ns = ~128 bits or 16 bytes.

            This is probably on the optimistic side, only using 2us for the
            PCI_ARB_DELAY.

            If there is a USB cache hit, the time it takes for an ED or TD is:

            CORE SYNC DELAY + CACHE_HIT CHECK + 30 * (MPS/4) + CORE OVERHEAD

            240 + 30 + 120 + 1000ns ~ 1400ns, or ~ 2 bytes

        Total Host delay will be 18 bytes.

      3.The Low-Speed clock is eight times slower than full speed, i.e. 1/8th
        of the full speed.

      4.For non-periodic transfers, reserve bandwidth for at least one
        low-speed device transaction per frame. According to the USB Bandwidth
        Analysis white paper and also as per OHCI Specification 1.0a, section
        7.3.5, page 123, one low-speed transaction takes 0x628 full speed bits
        (197 bytes), which comes to around 13% of USB frame time.

      5.Maximum Periodic bandwidth is calculated using the following formula

        Maximum Periodic bandwidth  = Maximum bandwidth available
                - SOF - EOF - Maximum Non Periodic bandwidth.

  o Bus Transaction Formulas

    (Refer to section 5.11.3 of the USB 2.0 specification)

    - Full-Speed:

      Protocol overhead + ((MaxPacketSize * 7) / 6) + Host_Delay

    - Low-Speed:

      Protocol overhead + Hub LS overhead +
                (Low-Speed clock * ((MaxPacketSize * 7) / 6)) + Host_Delay
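
    As a rough illustration, the two formulas above can be written in C using
    the byte values from the protocol overhead table. This is only a sketch
    and not the driver's actual code: the helper names fs_bus_bytes and
    ls_bus_bytes are hypothetical, the caller supplies the protocol overhead,
    and truncating integer division for the 7/6 bit-stuffing factor is an
    assumption.

```c
#include <stdint.h>

/* Byte values taken from the protocol overhead table above. */
#define	FS_HOST_DELAY		18	/* host delay, hardware specific */
#define	FS_HUB_LS_OVERHEAD	1	/* hub low-speed setup */
#define	FS_LOW_SPEED_CLOCK	8	/* low speed is 8x slower */

/* Worst-case bit stuffing adds one bit for every six data bits. */
static uint32_t
bit_stuffed(uint32_t max_packet_size)
{
	return ((max_packet_size * 7) / 6);
}

/* Full-speed transaction cost in bytes (hypothetical helper). */
uint32_t
fs_bus_bytes(uint32_t proto_overhead, uint32_t max_packet_size)
{
	return (proto_overhead + bit_stuffed(max_packet_size) +
	    FS_HOST_DELAY);
}

/* Low-speed transaction cost in full-speed bytes (hypothetical helper). */
uint32_t
ls_bus_bytes(uint32_t proto_overhead, uint32_t max_packet_size)
{
	return (proto_overhead + FS_HUB_LS_OVERHEAD +
	    (FS_LOW_SPEED_CLOCK * bit_stuffed(max_packet_size)) +
	    FS_HOST_DELAY);
}
```

    Under these assumptions, a 64-byte full-speed endpoint with the 14-byte
    non-isochronous overhead costs 14 + 74 + 18 = 106 bytes per frame.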

  o Periodic Schedule

    Figure 5.5 in the OHCI specification 1.0a gives information on periodic
    scheduling, the polling intervals that are supported, and other details
    for the OHCI host controller.

    - The host controller processes one interrupt endpoint descriptor list every
      frame. The lower five bits of the current frame number are used as an
      index into an array of 32 interrupt endpoint descriptor lists or periodic
      frame lists found in the HCCA (Host Controller Communication Area). This
      means each list is revisited once every 32ms. The host controller driver
      sets up the interrupt lists to visit any given endpoint descriptor in as
      many lists as necessary to provide the interrupt granularity required for
      that endpoint. See figure 5.5 in OHCI specification 1.0a.

    - Isochronous endpoint descriptors are added at the end of 1ms interrupt
      endpoint descriptors.

    - The host controller driver maintains an array of 32 frame bandwidth lists
      to save bandwidth allocated in each USB frame.

      Please refer to section 5.2.7.2 of OHCI specification 1.0a, page 61, for
      more details.
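
    The list selection described above (lower five bits of the frame number)
    can be sketched as follows. The function and macro names are hypothetical
    and chosen only for this illustration.

```c
#include <stdint.h>

#define	OHCI_NUM_INTR_LISTS	32	/* interrupt lists in the HCCA */

/* Index of the interrupt list visited in a given frame. */
uint32_t
ohci_intr_list_index(uint32_t frame_number)
{
	/* 32 is a power of two, so masking keeps the lower five bits. */
	return (frame_number & (OHCI_NUM_INTR_LISTS - 1));
}
```

    Frames 1 and 33 both select list 1, which is why each list is revisited
    once every 32ms.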

  o Bandwidth Allocation Scheme

    The OHCI host controller driver goes through the following steps to
    allocate the bandwidth needed for an interrupt or isochronous endpoint:

    - Calculate the bandwidth required for the given endpoint using the bus
      transaction formula and protocol overhead calculations mentioned in
      the previous section.

    - Compare the bandwidth available in the least allocated frame list out of
      the 32 frame bandwidth lists against the bandwidth required by this
      endpoint. If the requirement exceeds the limit, return an error.

    - Find the static node to which the given endpoint needs to be linked
      so that it will be polled as per the required polling interval. This value
      varies based on polling interval and current bandwidth load on this
      schedule. See figure 5.5 in OHCI specification 1.0a.

      Ex: If the polling interval is 4ms, then the endpoint will be linked to
          one of the four static nodes (range 3-6) in the 4ms column of figure
          5.5 in OHCI specification 1.0a.

    - Depending on the polling interval, we need to add the above calculated
      bandwidth to one or more frame bandwidth lists. Before adding, we need to
      double check the availability of bandwidth in those respective lists. If
      the requirement exceeds the limit, return an error. Otherwise, add this
      bandwidth to all the required frame bandwidth lists.

      Ex: Assume a given polling interval of 4ms and a static node value of 3.
          In this case, we need to add the required bandwidth to the 0,4,8,12,
          16,20,24,28 frame bandwidth lists.
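
    The final step above (check every required frame bandwidth list, then
    commit) might look roughly like the sketch below. It is an illustration
    under stated assumptions, not the OHCI driver's code: the function name
    is hypothetical, and the mapping from a static node to its first frame
    list is not modelled, so the caller passes the first frame directly.

```c
#include <stdint.h>

#define	NUM_FRAME_LISTS		32	/* frame bandwidth lists */
#define	FS_MAX_PERIODIC_BW	1293	/* bytes/frame, from the table above */

/*
 * Reserve `need` bytes in every frame list that is `interval_ms` apart,
 * starting at `first_frame`. Returns 0 on success, or -1 (leaving the
 * lists unchanged) if any of the required lists would exceed the limit.
 */
int
fs_alloc_periodic_bw(uint32_t frame_bw[NUM_FRAME_LISTS],
    uint32_t interval_ms, uint32_t first_frame, uint32_t need)
{
	uint32_t i;

	if (interval_ms == 0 || interval_ms > NUM_FRAME_LISTS ||
	    first_frame >= interval_ms)
		return (-1);

	/* First pass: double check availability in every required list. */
	for (i = first_frame; i < NUM_FRAME_LISTS; i += interval_ms) {
		if (frame_bw[i] + need > FS_MAX_PERIODIC_BW)
			return (-1);
	}

	/* Second pass: commit the bandwidth to all required lists. */
	for (i = first_frame; i < NUM_FRAME_LISTS; i += interval_ms)
		frame_bw[i] += need;

	return (0);
}
```

    With interval_ms = 4 and first_frame = 0, the lists touched are
    0,4,8,12,16,20,24,28, matching the example above.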


3.High speed bus

  o Timing and data rate calculations

    - Timing calculations

      1 sec			1000 ms
      125 us			1 uframe
      1 ms			1 frame or 8 uframes

    - Data rate calculations

      125 us			7500 bytes (per uframe)
      16.66 ns			1 byte or 8 bits

      1 high speed bit time	2.083 ns

  o Protocol Overheads and Bandwidth numbers

    - Protocol Overheads

      (Refer to sections 5.11.3, 8.4.2.2 and 8.4.2.3 of the USB 2.0
       specification)

      Non Isochronous            917 ns                 55 bytes
      Isochronous                634 ns                 38 bytes

      Start split overhead        67 ns                  4 bytes
      Complete split overhead     67 ns                  4 bytes

      SOF                        200 ns                 12 bytes
      EOF                       1167 ns                 70 bytes

      Host Delay*              Specific to hardware     18 bytes

    - Bandwidth numbers

      (Refer to section 5.5.4 of the USB 2.0 specification)

      Maximum bandwidth available                     7500 bytes/uframe
      Maximum Non Periodic bandwidth*                 1500 bytes/uframe
      Maximum Periodic bandwidth*                     5918 bytes/uframe

      NOTE:

      1.Host delay will be specific to particular hardware.

      2.As per USB 2.0 specification section 5.5.4, 20% of bus time is reserved
        for the non-periodic high-speed transfers, whereas periodic high-speed
        transfers will get 80% of the bus time. In one micro-frame or 125us, we
        can transfer 7500 bytes or 60,000 bits. So 20% of 7500 is 1500 bytes.

      3.Maximum Periodic bandwidth is calculated using the following formula

        Maximum Periodic bandwidth  = Maximum bandwidth available
                - SOF - EOF - Maximum Non Periodic bandwidth.
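
      The formula can be checked against the byte values in the tables with a
      small helper. The function name is hypothetical and used only for this
      illustration.

```c
#include <stdint.h>

/* Maximum periodic bandwidth = total - SOF - EOF - non-periodic reserve. */
uint32_t
max_periodic_bw(uint32_t total, uint32_t sof, uint32_t eof,
    uint32_t non_periodic)
{
	return (total - sof - eof - non_periodic);
}
```

      For high speed this gives 7500 - 12 - 70 - 1500 = 5918 bytes/uframe, and
      for full speed 1500 - 6 - 4 - 197 = 1293 bytes/frame.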

  o Bus Transaction Formulas

    (Refer to sections 5.11.3, 8.4.2.2 and 8.4.2.3 of the USB 2.0
     specification)

    - High-Speed (Non-Split transactions):

      (Protocol overhead + ((MaxPacketSize * 7) / 6) +
                Host_Delay) x Number of transactions per micro-frame

    - High-Speed (Split transaction - Device to Host):

      Start Split transaction:

      Protocol overhead + Host_Delay + Start split overhead

      Complete Split transaction:

      Protocol overhead + ((MaxPacketSize * 7) / 6) +
                Host_Delay + Complete split overhead

    - High-Speed (Split transaction - Host to Device):

      Start Split transaction:

      Protocol overhead + ((MaxPacketSize * 7) / 6) +
                Host_Delay + Start split overhead

      Complete Split transaction:

      Protocol overhead + Host_Delay + Complete split overhead
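
    The high speed formulas above can be sketched in C as follows, using the
    byte values from the high speed protocol overhead table. The function
    names are hypothetical, the caller supplies the protocol overhead, and
    truncating integer division for the 7/6 factor is an assumption.

```c
#include <stdint.h>

#define	HS_HOST_DELAY		18	/* bytes, hardware specific */
#define	HS_SPLIT_OVERHEAD	4	/* start or complete split, bytes */

/* Worst-case bit-stuffed payload size in bytes. */
static uint32_t
hs_bit_stuffed(uint32_t max_packet_size)
{
	return ((max_packet_size * 7) / 6);
}

/* Non-split transaction(s) in one micro-frame. */
uint32_t
hs_nonsplit_bytes(uint32_t proto_overhead, uint32_t max_packet_size,
    uint32_t xacts_per_uframe)
{
	return ((proto_overhead + hs_bit_stuffed(max_packet_size) +
	    HS_HOST_DELAY) * xacts_per_uframe);
}

/*
 * Split transaction. `data_in_start` is nonzero for host-to-device
 * transfers, where the payload travels with the start split; for
 * device-to-host transfers the payload travels with the complete split.
 */
void
hs_split_bytes(uint32_t proto_overhead, uint32_t max_packet_size,
    int data_in_start, uint32_t *start_bytes, uint32_t *complete_bytes)
{
	uint32_t base = proto_overhead + HS_HOST_DELAY + HS_SPLIT_OVERHEAD;
	uint32_t data = hs_bit_stuffed(max_packet_size);

	*start_bytes = base + (data_in_start ? data : 0);
	*complete_bytes = base + (data_in_start ? 0 : data);
}
```

    Taking the payload direction as a flag keeps the two split cases in one
    place, since they differ only in which half carries the data.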


  o Interrupt schedule or Start and Complete split masks

    (Refer to sections 3.6.2 & 4.12.2 of the EHCI 1.0 specification)

    - Interrupt schedule or Start split mask

      This field is used for high, full and low speed usb device interrupt
      and isochronous endpoints. It tells the host controller in which
      micro-frame of a given usb frame to initiate a high speed interrupt or
      isochronous transaction. For full/low speed devices, it tells when to
      initiate a "start split" transaction.

	ehci_start_split_mask[15] = /* One byte field */
	/*
	 * For all low/full speed devices, and for high speed devices with a
	 * polling interval of 8 or more micro frames (units of 125us).
	 */
	{0x01,	/*  00000001 */
	0x02,	/*  00000010 */
	0x04,	/*  00000100 */
	0x08,	/*  00001000 */
	0x10,	/*  00010000 */
	0x20,	/*  00100000 */
	0x40,	/*  01000000 */
	0x80,	/*  10000000 */

	/* For high speed devices with a polling interval of 4 micro frames. */
	0x11,	/* 00010001 */
	0x22,	/* 00100010 */
	0x44,	/* 01000100 */
	0x88,	/* 10001000 */

	/* For high speed devices with a polling interval of 2 micro frames. */
	0x55,	/* 01010101 */
	0xaa,	/* 10101010 */

	/* For high speed devices with a polling interval of 1 micro frame. */
	0xff };	/* 11111111 */

    - Complete split mask

      This field is used only for full/low speed usb device interrupt and
      isochronous endpoints. It tells the host controller in which micro frames
      to initiate a "complete split" transaction. Complete split transactions
      can be retried up to 3 times, so bandwidth for the complete split
      transaction is reserved in 3 consecutive micro frames.

	ehci_complete_split_mask[8] = /* One byte field */
	/* Only full/low speed devices */
	{0x0e,	/*  00001110 */
	0x1c,	/*  00011100 */
	0x38,	/*  00111000 */
	0x70,	/*  01110000 */
	0xe0,	/*  11100000 */
	Reserved,	/*  Need FSTN feature */
	Reserved,	/*  Need FSTN feature */
	Reserved};	/*  Need FSTN feature */
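
      The relationship between the two tables can be sketched in compilable C
      as below. This is an illustration, not the driver's code: the function
      names are hypothetical, and returning 0 for the "Reserved" entries is an
      assumption made here because the listing above gives no value for them.

```c
#include <stdint.h>

/* Start split masks, in the same order as the listing above. */
static const uint8_t start_split_mask[15] = {
	0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80,	/* >= 8 uframes */
	0x11, 0x22, 0x44, 0x88,				/* 4 uframes */
	0x55, 0xaa,					/* 2 uframes */
	0xff						/* 1 uframe */
};

/*
 * Return a pointer to the first candidate start split mask for a polling
 * interval given in micro frames, and store the number of candidates
 * in *count.
 */
const uint8_t *
start_mask_candidates(uint32_t interval_uframes, uint32_t *count)
{
	if (interval_uframes >= 8) {
		*count = 8;
		return (&start_split_mask[0]);
	}
	if (interval_uframes >= 4) {
		*count = 4;
		return (&start_split_mask[8]);
	}
	if (interval_uframes >= 2) {
		*count = 2;
		return (&start_split_mask[12]);
	}
	*count = 1;
	return (&start_split_mask[14]);
}

/*
 * Complete split mask for a start split issued in micro frame `uframe`:
 * the three micro frames that follow it. Returns 0 (assumed placeholder)
 * where those three micro frames do not fit in the same frame.
 */
uint8_t
complete_mask_for(uint32_t uframe)
{
	if (uframe > 4)
		return (0);
	return ((uint8_t)(0x0e << uframe));
}
```

      For a start split in micro frame 0 (mask 0x01) this yields 0x0e, and
      for micro frame 4 (mask 0x10) it yields 0xe0, as in the table.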

  o Periodic Schedule

    Figure 4.8 in the EHCI specification gives information on periodic
    scheduling, the polling intervals that are supported, and other
    details for the EHCI host controller.

    - The high speed host controller can support 256, 512 or 1024 periodic frame
      lists. By default all host controllers will support 1024 frame lists. In
      our implementation, we support 1024 frame lists and we do this by first
      constructing 32 periodic frame lists and duplicating the same periodic
      frame lists for a total of 32 times. See figure 4.8 in EHCI specification.

    - The host controller traverses the periodic schedule by constructing an
      array offset reference from the PERIODICLISTBASE & the FRINDEX registers.
      It fetches the element and begins traversing the graph of linked schedule
      data structures. See figure 4.8 in EHCI specification.

    - The host controller processes one interrupt endpoint descriptor list every
      micro frame (125us). This means the same list is revisited 8 times in a
      frame.

    - The host controller driver sets up the interrupt lists to visit any given
      endpoint descriptor in as many lists as necessary to provide the interrupt
      granularity required for that endpoint.

    - For isochronous transfers, we use only transfer descriptors, with no
      endpoint descriptors (unlike OHCI). Transfer descriptors are added at
      the beginning of the periodic schedule.

    - For EHCI, the bandwidth requirement depends on the usb device speed:

      For a high speed usb device, you only need high speed bandwidth. For a
      full/low speed device connected through a high speed hub, you need both
      high speed bandwidth and TT (transaction translator) bandwidth.

      High speed bandwidth information is saved in an EHCI data structure and TT
      bandwidth is saved in the high speed hub's usb device data structure. Each
      TT acts as a full speed host controller, and its bandwidth allocation
      scheme, overhead calculations and other details are similar to those of a
      full speed host controller. Refer to the "Full speed bus" section for more
      details.

    - The EHCI host controller driver maintains an array of 32 frame lists to
      store the high speed bandwidth allocated in each frame. Each frame list
      also has eight micro frame lists, which save the bandwidth allocated in
      each micro frame of that particular frame.

  o Bandwidth Allocation Scheme

    (Refer to sections 3.6.2 & 4.12.2 of the EHCI 1.0 specification)

    High speed Non Split Transaction (for High speed devices only):

    For a given high speed interrupt or isochronous endpoint, the EHCI host
    controller driver goes through the following steps to allocate the
    bandwidth needed for this endpoint.

    - Calculate the bandwidth required for the given endpoint using the formula
      and overhead calculations mentioned in the previous section.

    - Compare the bandwidth available in the least allocated frame list out of
      the 32 frame lists against the bandwidth required by this endpoint. If
      the requirement exceeds the limit, return an error.

    - Map the given high speed endpoint's polling interval, expressed in micro
      frames (units of 125us), to an interrupt list path based on a millisecond
      value. For example, an endpoint with a polling interval of 16 micro
      frames will map to an interrupt list path of 2ms.

    - Find the static node to which the given endpoint needs to be linked
      so that it will be polled at its required polling interval. This varies
      based on polling interval and current bandwidth load on this schedule.

      Ex: If the polling interval is 32 micro frames, its corresponding frame
          polling interval is 4ms, and the endpoint will be linked to one of
          the four static nodes (range 3-6) in the 4ms column of figure 4.8 in
          EHCI specification.

    - Depending on the polling interval, we need to add the above calculated
      bandwidth to one or more frame bandwidth lists, and also to one or more
      micro frame bandwidth lists for that particular frame bandwidth list.
      Before adding, we need to double check the availability of bandwidth in
      those respective lists. If the needed bandwidth is not available,
      return an error. Otherwise, add this bandwidth to all the required frame
      and micro frame lists.

      Ex: Assume the given endpoint's polling interval is 32 micro frames and
          the static node value is 3. In this case, we need to add the required
          bandwidth to the 0,4,8,12,16,20,24,28 frame bandwidth lists, and the
          micro frame bandwidth information is saved using the
          ehci_start_split_mask matrix. For this example, we need to use any
          one of the first 8 of its 15 entries to save micro frame bandwidth.
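
    The interval mapping and the frame plus micro frame bookkeeping described
    above could be sketched as follows. All names here are hypothetical, the
    polling interval is counted in 125us micro frames, and the mapping from a
    static node to its first frame is not modelled, so the caller passes the
    first frame and the chosen start split mask directly.

```c
#include <stdint.h>

#define	EHCI_NUM_FRAMES		32	/* frame bandwidth lists */
#define	EHCI_UFRAMES_PER_FRAME	8
#define	HS_MAX_PERIODIC_BW	5918	/* bytes/uframe, from the table above */

/* Bandwidth (bytes) reserved in each micro frame of each frame. */
typedef struct hs_bw_table {
	uint32_t uframe_bw[EHCI_NUM_FRAMES][EHCI_UFRAMES_PER_FRAME];
} hs_bw_table_t;

/* Map a polling interval in micro frames to a frame interval in ms. */
uint32_t
hs_interval_to_ms(uint32_t interval_uframes)
{
	uint32_t ms = interval_uframes / EHCI_UFRAMES_PER_FRAME;

	if (ms == 0)
		ms = 1;		/* polled in every frame */
	if (ms > EHCI_NUM_FRAMES)
		ms = EHCI_NUM_FRAMES;
	return (ms);
}

/*
 * Reserve `need` bytes in every micro frame selected by `mask`, in every
 * frame that is `interval_ms` apart starting at `first_frame`.
 * Returns 0 on success, -1 (table unchanged) if any slot is too full.
 */
int
hs_alloc_bw(hs_bw_table_t *t, uint32_t interval_ms, uint32_t first_frame,
    uint8_t mask, uint32_t need)
{
	uint32_t f, u;
	int pass;

	if (interval_ms == 0 || first_frame >= interval_ms)
		return (-1);

	/* Pass 0 checks availability; pass 1 commits the bandwidth. */
	for (pass = 0; pass < 2; pass++) {
		for (f = first_frame; f < EHCI_NUM_FRAMES; f += interval_ms) {
			for (u = 0; u < EHCI_UFRAMES_PER_FRAME; u++) {
				if (!(mask & (1 << u)))
					continue;
				if (pass == 1)
					t->uframe_bw[f][u] += need;
				else if (t->uframe_bw[f][u] + need >
				    HS_MAX_PERIODIC_BW)
					return (-1);
			}
		}
	}
	return (0);
}
```

    A polling interval of 32 micro frames maps to 4ms, so with first_frame = 0
    and mask 0x01 the reservation lands in micro frame 0 of frames
    0,4,8,12,16,20,24,28.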

    High speed split transactions (for full and low speed devices only):

    For a given full/low speed interrupt or isochronous endpoint, we need both
    high speed and TT bandwidths. The TT bandwidth allocation is the same as
    the full speed bus bandwidth allocation. Please refer to the "Full speed
    bus" bandwidth allocation section for more details.

    The EHCI driver goes through the following steps to allocate the high
    speed bandwidth needed for this full/low speed endpoint.

    - Calculate the bandwidth required for the given endpoint using the formula
      and overhead calculations mentioned in the previous section. In this
      case, we need to calculate the bandwidth needed for the Start split and
      Complete split transactions separately.

    - Compare the bandwidth available in the least allocated frame list out of
      the 32 frame lists against the bandwidth required by this endpoint. If
      the requirement exceeds the limit, return an error.

    - Find the static node to which the given endpoint needs to be linked
      so that it will be polled as per the required polling interval. This
      value varies based on polling interval and current bandwidth load on
      this schedule.

      Ex: If the polling interval is 4ms, then the endpoint will be linked to
          one of the four static nodes (range 3-6) in the 4ms column of figure
          4.8 in EHCI specification.

    - Depending on the polling interval, we need to add the above calculated
      Start and Complete split transaction bandwidth to one or more frame
      bandwidth lists and also to one or more micro frame bandwidth lists for
      that particular frame bandwidth list. In this case, the Start split
      transaction needs bandwidth in one micro frame, whereas the Complete
      split transaction needs bandwidth in the next three micro frames of
      that particular frame or the next frame. Before adding, we need to
      double check the availability of bandwidth in those respective lists. If
      the needed bandwidth is not available, return an error. Otherwise, add
      this bandwidth to all the required lists.

      Ex: Assume the given polling interval is 4ms and the static node value is
          3. In this case, we need to add the required Start and Complete split
          bandwidth to the 0,4,8,12,16,20,24,28 frame bandwidth lists. The
          micro frame bandwidth is stored using the ehci_start_split_mask &
          ehci_complete_split_mask matrices. In this case, we need to use any
          of the first 8 entries to save micro frame bandwidth.

          Assume we found that the first micro frame bandwidth list of each of
          the 0,4,8,12,16,20,24,28 frame lists can be used for this endpoint.
          It means we need to initiate the "start split transaction" in the
          first micro frame of the 0,4,8,12,16,20,24,28 frames.

          Start split mask = 0x01,	/*  00000001 */

          For this "start split mask", the "complete split mask" should be

          Complete split mask = 0x0e, /*  00001110 */

          It means try "complete split transactions" in the second, third or
          fourth micro frames of the 0,4,8,12,16,20,24,28 frames.
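
    For a single frame, the check-then-commit step for a split transaction
    could be sketched as below. This is an illustration with hypothetical
    names, not the EHCI driver's code. It covers only the high speed side for
    one frame: a caller would repeat it for each required frame (checking all
    of them before committing any), and TT bandwidth is accounted separately.

```c
#include <stdint.h>

#define	UFRAMES_PER_FRAME	8
#define	HS_UFRAME_LIMIT		5918	/* bytes/uframe, from the table above */

/*
 * Try to reserve a split transaction in one frame's eight micro frame
 * slots: `ss_bytes` where `ss_mask` has a bit set and `cs_bytes` where
 * `cs_mask` has a bit set. Returns 0 on success, -1 if it does not fit
 * (in which case uframe_bw is left unchanged).
 */
int
split_reserve_frame(uint32_t uframe_bw[UFRAMES_PER_FRAME], uint8_t ss_mask,
    uint8_t cs_mask, uint32_t ss_bytes, uint32_t cs_bytes)
{
	uint32_t add[UFRAMES_PER_FRAME];
	uint32_t u;

	/* Work out the extra load per micro frame and check the limit. */
	for (u = 0; u < UFRAMES_PER_FRAME; u++) {
		add[u] = 0;
		if (ss_mask & (1 << u))
			add[u] += ss_bytes;
		if (cs_mask & (1 << u))
			add[u] += cs_bytes;
		if (uframe_bw[u] + add[u] > HS_UFRAME_LIMIT)
			return (-1);
	}

	/* Everything fits, so commit the reservation. */
	for (u = 0; u < UFRAMES_PER_FRAME; u++)
		uframe_bw[u] += add[u];
	return (0);
}
```

    With a start split mask of 0x01 and a complete split mask of 0x0e, the
    start split bytes land in the first micro frame and the complete split
    bytes in the second, third and fourth, as in the example above.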

4.Reference

  - USB2.0, OHCI and EHCI Specifications

    http://www.usb.org/developers/docs

  - USB bandwidth analysis from Intel

    http://www.usb.org/developers/whitepapers
