xref: /illumos-gate/usr/src/uts/common/io/usb/hcd/README (revision 24da5b34f49324ed742a340010ed5bd3d4e06625)
1#
2# CDDL HEADER START
3#
4# The contents of this file are subject to the terms of the
5# Common Development and Distribution License, Version 1.0 only
6# (the "License").  You may not use this file except in compliance
7# with the License.
8#
9# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
10# or http://www.opensolaris.org/os/licensing.
11# See the License for the specific language governing permissions
12# and limitations under the License.
13#
14# When distributing Covered Code, include this CDDL HEADER in each
15# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
16# If applicable, add the following below this CDDL HEADER, with the
17# fields enclosed by brackets "[]" replaced with your own identifying
18# information: Portions Copyright [yyyy] [name of copyright owner]
19#
20# CDDL HEADER END
21#
22/*
23 * Copyright 2003 Sun Microsystems, Inc.  All rights reserved.
24 * Use is subject to license terms.
25 */
26
27#pragma ident	"%Z%%M%	%I%	%E% SMI"
28
29			SOLARIS USB BANDWIDTH ANALYSIS
30
311.Introduction
32
  This document discusses the USB bandwidth allocation scheme and the protocol
  overheads used by both the full and high speed host controller drivers. This
  information is derived from the USB 2.0 specification, the "Bandwidth Analysis
  Whitepaper" which is posted on www.usb.org, and other resources.

  The target audience for this whitepaper is USB software and hardware designers
  and engineers, and other interested people. The reader should be familiar with
  the Universal Serial Bus Specification version 2.0, the OpenHCI Specification
  1.0a and the Enhanced HCI Specification 1.0.
42
432.Full speed bus
44
45  The following overheads, formulas and scheme are applicable both to full speed
46  host controllers and also to high speed hub Transaction Translators (TT),
47  which perform full/low speed transactions.
48
49  o Timing and data rate calculations
50
51    - Timing calculations
52
53      1 sec			1000 ms or 1000000000 ns
54      1 ms			1 frame
55
56    - Data rate calculations
57
58      1 ms			1500 bytes or 12000 bits (per  frame)
59      668 ns			1 byte or 8 bits
60
61      1 full speed bit time	83.54 ns
62
63  o Protocol Overheads and Bandwidth numbers
64
65    - Protocol Overheads
66
      (Refer to section 5.11.3 of the USB 2.0 specification and page 2 of the
       USB Bandwidth Analysis document)

      Non Isochronous            9107 ns                 14 bytes
      Isochronous Output         6265 ns                 10 bytes
      Isochronous Input          7268 ns                 11 bytes
      Low-speed overhead        64060 ns                 97 bytes
      Hub LS overhead*            668 ns                  1 byte
      SOF                        4010 ns                  6 bytes
      EOF                        2673 ns                  4 bytes

      Host Delay*                Specific to hardware    18 bytes
      Low-Speed clock*           Slower than full speed   8 times
80
81    - Bandwidth numbers
82
83      (Refer 7.3.5 section of OHCI specification 1.0a & page 2 of USB Bandwidth
84       Analysis document)
85
86      Maximum bandwidth available		      1500 bytes/frame
87      Maximum Non Periodic bandwidth	  	       197 bytes/frame
88      Maximum Periodic bandwidth		      1293 bytes/frame
89
90      NOTE:
91
92      1.Hub specific low speed overhead
93
        The time provided by the Host Controller for hubs to enable Low Speed
        ports. The minimum is 4 full speed bit times.

        overhead = 2 x Hub_LS_Setup
                 = 2 x (4 x 83.54) = 668.32 nanoseconds = 1 byte.
99
      2.Host delay is specific to the particular hardware. The following host
        delay is for the RIO USB OHCI host controller (provided by Ken Ward, RIO
        USB hardware person). It is just an example of how to calculate the
        "host delay" for a given USB host controller implementation.

        Ex: Assuming ED (Endpoint Descriptor)/TD's (Transfer Descriptor) are not
            streaming in Schizo (PCI bridge) and no cache hits for an ED or TD:

            To read an ED or TD or data:

            PCI_ARB_DELAY + PCI_ADDRESS + SCHIZO_RETRY
            PCI_ARB_DELAY + PCI_ADDRESS + SCHIZO_TRDY +
                DATA + CORE_OVERHEAD

            Where,

            PCI_ARB_DELAY = 2000ns
            PCI_ADDRESS   = 30ns
            SCHIZO_RETRY  = 60ns
            SCHIZO_TRDY   = 60ns
            DATA          = 240ns (Always read 64 bytes ...)
            CORE_OVERHEAD = 240 + 30 * (MPS/4) + 83.54 * (MPS/4) + 4 * 83.54
            = ~3400ns

            Now multiply by 3 for ED+TD+DATA = 10200ns = ~128 bits or 16 bytes.

            This is probably on the optimistic side, only using 2us for the
            PCI_ARB_DELAY.

            If there is a USB cache hit, the time it takes for an ED or TD is:

            CORE_SYNC_DELAY + CACHE_HIT_CHECK + 30 * (MPS/4) + CORE_OVERHEAD

            240 + 30 + 120 + 1000ns = ~1400ns, or ~2 bytes

        Total Host delay will be 18 bytes.
136
      3.The Low-Speed clock is eight times slower than full speed, i.e. 1/8th of
        the full speed clock.

      4.For non-periodic transfers, reserve bandwidth for at least one low-speed
        device transaction per frame. According to the USB Bandwidth Analysis
        white paper and the OHCI Specification 1.0a, section 7.3.5, page 123,
        one low-speed transaction takes 0x628 full speed bit times (197 bytes),
        which comes to around 13% of USB frame time.

      5.Maximum Periodic bandwidth is calculated using the following formula:

        Maximum Periodic bandwidth = Maximum bandwidth available
            - SOF - EOF - Maximum Non Periodic bandwidth.
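
        The full speed bandwidth numbers can be cross-checked with a few lines
        of C. This is only a sketch: the macro and function names are made up
        for illustration and are not taken from the driver sources. The values
        come from the tables and notes above.

```c
/* Full speed numbers from the tables above, in bytes per 1 ms frame. */
#define	FS_MAX_BANDWIDTH	1500	/* 12000 bits per frame */
#define	FS_SOF			6
#define	FS_EOF			4
#define	FS_LS_TRANSACTION_BITS	0x628	/* one low-speed transaction */

/* Non periodic reservation: one low-speed transaction, in bytes. */
static unsigned int
fs_max_non_periodic(void)
{
	return (FS_LS_TRANSACTION_BITS / 8);
}

/* Maximum bandwidth - SOF - EOF - maximum non periodic bandwidth. */
static unsigned int
fs_max_periodic(void)
{
	return (FS_MAX_BANDWIDTH - FS_SOF - FS_EOF - fs_max_non_periodic());
}
```

        fs_max_non_periodic() yields the 197 bytes and fs_max_periodic() the
        1293 bytes listed under "Bandwidth numbers".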
150
151  o Bus Transaction Formulas
152
153    (Refer 5.11.3 section of USB2.0 specification)
154
155    - Full-Speed:
156
157      Protocol overhead + ((MaxPacketSize * 7) / 6 ) + Host_Delay
158
159    - Low-Speed:
160
161      Protocol overhead + Hub LS overhead +
162		(Low-Speed clock  * ((MaxPacketSize * 7) / 6 )) + Host_Delay
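
    The two formulas can be written down as a minimal C sketch. The names are
    hypothetical, the protocol overhead is passed in by the caller (one of the
    values from the "Protocol Overheads" table), and truncating integer
    division for the bit stuffing term is an assumption of this sketch.

```c
#include <stdint.h>

/* Values from the "Protocol Overheads" table, in bytes. */
#define	HUB_LS_OVERHEAD		1
#define	HOST_DELAY		18
#define	LS_CLOCK_FACTOR		8	/* low speed is 8 times slower */

/* Worst case bit stuffing: (MaxPacketSize * 7) / 6. */
static uint32_t
bit_stuffed(uint32_t max_packet_size)
{
	return ((max_packet_size * 7) / 6);
}

/* Full-Speed: Protocol overhead + stuffed payload + Host_Delay. */
static uint32_t
fs_transaction_bytes(uint32_t protocol_overhead, uint32_t max_packet_size)
{
	return (protocol_overhead + bit_stuffed(max_packet_size) +
	    HOST_DELAY);
}

/* Low-Speed: adds the hub LS overhead and scales the payload by 8. */
static uint32_t
ls_transaction_bytes(uint32_t protocol_overhead, uint32_t max_packet_size)
{
	return (protocol_overhead + HUB_LS_OVERHEAD +
	    (LS_CLOCK_FACTOR * bit_stuffed(max_packet_size)) + HOST_DELAY);
}
```

    For example, a full speed non isochronous endpoint (14 bytes of overhead)
    with a MaxPacketSize of 64 needs 14 + 74 + 18 = 106 bytes per transaction.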
163
164  o Periodic Schedule
165
    Figure 5.5 in the OHCI specification 1.0a gives information on periodic
    scheduling, the different polling intervals that are supported, and other
    details for the OHCI host controller.

    - The host controller processes one interrupt endpoint descriptor list every
      frame. The lower five bits of the current frame number are used as an
      index into an array of 32 interrupt endpoint descriptor lists or periodic
      frame lists found in the HCCA (Host Controller Communication Area). This
      means each list is revisited once every 32ms. The host controller driver
      sets up the interrupt lists to visit any given endpoint descriptor in as
      many lists as necessary to provide the interrupt granularity required for
      that endpoint. See figure 5.5 in OHCI specification 1.0a.

    - Isochronous endpoint descriptors are added at the end of the 1ms interrupt
      endpoint descriptors.

    - The host controller driver maintains an array of 32 frame bandwidth lists
      to save the bandwidth allocated in each USB frame.

      Please refer to section 5.2.7.2 of OHCI specification 1.0a, page 61, for
      more details.
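
    The list selection described in the first item above amounts to masking
    the frame number. A minimal sketch, with hypothetical names:

```c
#include <stdint.h>

#define	NUM_INTR_LISTS	32	/* interrupt lists in the HCCA */

/* Index of the interrupt list processed in the given frame. */
static uint32_t
intr_list_index(uint32_t frame_number)
{
	/* Keep only the lower five bits of the frame number. */
	return (frame_number & (NUM_INTR_LISTS - 1));
}
```

    Frames 0, 32, 64, ... all select list 0, which is why each list is
    revisited once every 32ms.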
187
188  o Bandwidth Allocation Scheme
189
    The OHCI host controller driver goes through the following steps to
    allocate the bandwidth needed for an interrupt or isochronous endpoint:

    - Calculate the bandwidth required for the given endpoint using the bus
      transaction formula and protocol overhead calculations mentioned in the
      previous section.

    - Compare the bandwidth available in the least allocated frame list out of
      the 32 frame bandwidth lists against the bandwidth required by this
      endpoint. If the requirement exceeds the available bandwidth, return an
      error.
201
202    - Find out the static node to which the given endpoint needs to be linked
203      so that it will be polled as per the required polling interval. This value
204      varies based on polling interval and current bandwidth load on this
205      schedule. See figure 5.5 in OHCI specification 1.0a.
206
207      Ex: If a polling interval is 4ms, then, the endpoint will be linked to one
208          of the four static nodes (range 3-6) in the 4ms column of figure 5.5
209          in OHCI specification 1.0a.
210
211    - Depending on the polling interval, we need to add the above calculated
212      bandwidth to one or more frame bandwidth lists. Before adding, we need to
213      double check the availability of bandwidth in those respective lists. If
214      this exceeds the limit, then, return an error. Add this bandwidth to all
215      the required frame bandwidth lists.
216
      Ex: Assume a given polling interval of 4ms and a static node value of 3.
          In this case, we need to add the required bandwidth to the
          0,4,8,12,16,20,24,28 frame bandwidth lists.
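
    The last step can be sketched in C as follows. This is not the driver's
    code: the names are hypothetical, and the static node is replaced by the
    first frame it covers (0 in the example above) because the node-to-frame
    mapping comes from figure 5.5 and is not reproduced here. The sketch checks
    every affected frame bandwidth list first and only then commits, so a
    failed request leaves the lists unchanged.

```c
#include <stdint.h>

#define	NUM_FRAME_LISTS		32
#define	FS_MAX_PERIODIC		1293	/* bytes per frame, see above */

/*
 * Add "bytes" to every frame bandwidth list visited by an endpoint with
 * the given polling interval (in ms, a power of two up to 32).
 * Returns 0 on success, -1 if any list would exceed the limit.
 */
static int
alloc_frame_bandwidth(uint32_t frame_bw[NUM_FRAME_LISTS], uint32_t interval,
    uint32_t first_frame, uint32_t bytes)
{
	uint32_t i;

	if (interval == 0 || interval > NUM_FRAME_LISTS ||
	    first_frame >= interval)
		return (-1);

	/* Pass 1: double check availability in all required lists. */
	for (i = first_frame; i < NUM_FRAME_LISTS; i += interval) {
		if (frame_bw[i] + bytes > FS_MAX_PERIODIC)
			return (-1);
	}

	/* Pass 2: add the bandwidth to all required lists. */
	for (i = first_frame; i < NUM_FRAME_LISTS; i += interval)
		frame_bw[i] += bytes;

	return (0);
}
```

    With interval 4 and first_frame 0 the loops touch lists 0,4,8,12,16,20,24
    and 28, matching the example above.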
220
221
2223.High speed bus
223
224  o Timing and data rate calculations
225
226    - Timing calculations
227
228      1 sec			1000 ms
229      125 us			1 uframe
230      1 ms			1 frame or 8  uframes
231
232    - Data rate calculations
233
234      125 us			7500 bytes (per uframe)
235      16.66 ns			1 byte or 8 bits
236
237      1 high speed bit time	2.083 ns
238
239  o Protocol Overheads and Bandwidth numbers
240
241    - Protocol Overheads
242
243      (Refer 5.11.3, 8.4.2.2 and 8.4.2.3 sections of USB2.0 specification)
244
245      Non Isochronous	  	917 ns			55 bytes
246      Isochronous 		634 ns			38 bytes
247
248      Start split  overhead 	 67 ns		  	 4 bytes
249      Complete split  overhead 	 67 ns		  	 4 bytes
250
251      SOF		  	200 ns			12 bytes
      EOF		       1167 ns 			70 bytes
253
254      Host Delay*		 Specific to hardware 	18 bytes
255
256    - Bandwidth numbers
257
258      (Refer 5.5.4 section of USB2.0 specification)
259
260      Maximum bandwidth available		      7500 bytes/uframe
261      Maximum Non Periodic bandwidth*		      1500 bytes/uframe
262      Maximum Periodic bandwidth*		      5918 bytes/uframe
263
264      NOTE:
265
266      1.Host delay will be specific to particular hardware.
267
      2.As per USB 2.0 specification section 5.5.4, 20% of bus time is reserved
        for non-periodic high-speed transfers, whereas periodic high-speed
        transfers get 80% of the bus time. In one micro-frame or 125us, we
        can transfer 7500 bytes or 60,000 bits. So 20% of 7500 is 1500 bytes.
272
273      3.Maximum Periodic bandwidth is calculated using the following formula
274
275        Maximum Periodic bandwidth  = Maximum bandwidth available
276		- SOF - EOF -  Maximum Non Periodic bandwidth.
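
        As a cross-check, the high speed numbers can be derived in C. The
        names are illustrative only. The 480 Mb/s signaling rate is not stated
        in this document but corresponds to the 2.083 ns bit time given above;
        the SOF and EOF byte counts are taken from the overhead table.

```c
#define	HS_BIT_RATE		480000000ULL	/* bits per second */
#define	UFRAMES_PER_SEC		8000ULL		/* 1 sec / 125 us */
#define	HS_SOF			12		/* bytes */
#define	HS_EOF			70		/* bytes */

/* Bytes that fit into one 125 us micro-frame. */
static unsigned int
hs_max_bandwidth(void)
{
	return ((unsigned int)(HS_BIT_RATE / UFRAMES_PER_SEC / 8));
}

/* 20% of the bus time is reserved for non periodic transfers. */
static unsigned int
hs_max_non_periodic(void)
{
	return (hs_max_bandwidth() * 20 / 100);
}

/* Maximum bandwidth - SOF - EOF - maximum non periodic bandwidth. */
static unsigned int
hs_max_periodic(void)
{
	return (hs_max_bandwidth() - HS_SOF - HS_EOF -
	    hs_max_non_periodic());
}
```

        The three functions return 7500, 1500 and 5918 bytes per micro-frame
        respectively.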
277
278  o Bus Transaction Formulas
279
    (Refer to sections 5.11.3, 8.4.2.2 and 8.4.2.3 of the USB 2.0 specification)
281
282    - High-Speed (Non-Split transactions):
283
284      (Protocol overhead + ((MaxPacketSize * 7) / 6 ) +
285		Host_Delay) x Number of transactions per micro-frame
286
287    - High-Speed (Split transaction - Device to Host):
288
289      Start Split transaction:
290
291      Protocol overhead  + Host_Delay + Start split overhead
292
293      Complete Split transaction:
294
295      Protocol overhead  + ((MaxPacketSize * 7) / 6 ) +
296		Host_Delay + Complete split overhead
297
298    - High-Speed (Split transaction - Host to Device):
299
300      Start Split transaction:
301
      Protocol overhead + ((MaxPacketSize * 7) / 6 ) +
		Host_Delay + Start split overhead
304
305      Complete Split transaction:
306
307      Protocol overhead  + Host_Delay + Complete split overhead
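
    The high speed formulas differ only in where the data payload is counted:
    in the complete split for device to host transfers, and in the start split
    for host to device transfers. A minimal sketch with hypothetical names; the
    protocol overhead is supplied by the caller and truncating integer division
    is an assumption of this sketch.

```c
#include <stdint.h>

/* Values from the high speed "Protocol Overheads" table, in bytes. */
#define	START_SPLIT_OVERHEAD	4
#define	COMPLETE_SPLIT_OVERHEAD	4
#define	HOST_DELAY		18

/* Worst case bit stuffing: (MaxPacketSize * 7) / 6. */
static uint32_t
bit_stuffed(uint32_t max_packet_size)
{
	return ((max_packet_size * 7) / 6);
}

/* Non-split: per transaction cost times transactions per micro-frame. */
static uint32_t
hs_nonsplit_bytes(uint32_t overhead, uint32_t mps, uint32_t ntrans)
{
	return ((overhead + bit_stuffed(mps) + HOST_DELAY) * ntrans);
}

/* Start split: carries the payload only for host to device transfers. */
static uint32_t
hs_start_split_bytes(uint32_t overhead, uint32_t mps, int host_to_device)
{
	uint32_t data = host_to_device ? bit_stuffed(mps) : 0;

	return (overhead + data + HOST_DELAY + START_SPLIT_OVERHEAD);
}

/* Complete split: carries the payload only for device to host transfers. */
static uint32_t
hs_complete_split_bytes(uint32_t overhead, uint32_t mps, int host_to_device)
{
	uint32_t data = host_to_device ? 0 : bit_stuffed(mps);

	return (overhead + data + HOST_DELAY + COMPLETE_SPLIT_OVERHEAD);
}
```

    For example, a non isochronous high speed endpoint (55 bytes of overhead)
    with a MaxPacketSize of 512 and one transaction per micro-frame needs
    55 + 597 + 18 = 670 bytes.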
308
309
310  o Interrupt schedule or Start and Complete split masks
311
312    (Refer 3.6.2 & 4.12.2 sections of EHCI 1.0 specification)
313
314    - Interrupt schedule or Start split mask
315
      This field is used for high, full and low speed usb device interrupt
      and isochronous endpoints. It tells the host controller in which
      micro-frame of a given usb frame to initiate a high speed interrupt or
      isochronous transaction. For full/low speed devices, it tells when to
      initiate a "start split" transaction.
321
	ehci_start_split_mask[15] = /* One byte field */
	/*
	 * For all low/full speed devices, and for high speed devices with
	 * a polling interval greater than or equal to 8 micro-frames (one
	 * micro-frame is 125us).
	 */
	{0x01,	/* 00000001 */
	0x02,	/* 00000010 */
	0x04,	/* 00000100 */
	0x08,	/* 00001000 */
	0x10,	/* 00010000 */
	0x20,	/* 00100000 */
	0x40,	/* 01000000 */
	0x80,	/* 10000000 */

	/* For high speed devices with a polling interval of 4 micro-frames. */
	0x11,	/* 00010001 */
	0x22,	/* 00100010 */
	0x44,	/* 01000100 */
	0x88,	/* 10001000 */

	/* For high speed devices with a polling interval of 2 micro-frames. */
	0x55,	/* 01010101 */
	0xaa,	/* 10101010 */

	/* For high speed devices with a polling interval of 1 micro-frame. */
	0xff };	/* 11111111 */
348
349    - Complete split mask
350
      This field is used only for full/low speed usb device interrupt and
      isochronous endpoints. It tells the host controller in which micro frames
      to initiate a "complete split" transaction. Complete split transactions
      can be retried up to 3 times, so bandwidth for the complete split
      transaction is reserved in 3 consecutive micro frames.
356
357	ehci_complete_split_mask[8] = /* One byte field */
358	/* Only full/low speed devices */
359	{0x0e,	/*  00001110 */
360	0x1c,	/*  00011100 */
361	0x38,	/*  00111000 */
362	0x70,	/*  01110000 */
363	0xe0,	/*  11100000 */
364	Reserved ,	/*  Need FSTN feature  */
365	Reserved ,	/*  Need FSTN feature  */
366	Reserved};	/*  Need FSTN feature */
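
    Both tables follow a regular bit pattern, which the sketch below
    reproduces. The function names are hypothetical, and the polling interval
    is assumed to be a power of two expressed in micro frames. Start split
    positions 5 to 7 return 0 here because the corresponding complete split
    masks are marked "Reserved" above.

```c
#include <stdint.h>

#define	UFRAMES_PER_FRAME	8

/*
 * Start split mask: set one bit every "uframe_interval" micro frames,
 * beginning at "first_uframe". Intervals of 8 or more set a single bit.
 */
static uint8_t
make_start_split_mask(uint32_t uframe_interval, uint32_t first_uframe)
{
	uint32_t step = (uframe_interval < UFRAMES_PER_FRAME) ?
	    uframe_interval : UFRAMES_PER_FRAME;
	uint32_t u;
	uint8_t mask = 0;

	if (step == 0 || first_uframe >= step)
		return (0);

	for (u = first_uframe; u < UFRAMES_PER_FRAME; u += step)
		mask |= (uint8_t)(1u << u);

	return (mask);
}

/*
 * Complete split mask: the three micro frames that follow the start
 * split micro frame. Returns 0 for the reserved (FSTN) cases.
 */
static uint8_t
make_complete_split_mask(uint32_t start_uframe)
{
	if (start_uframe > 4)
		return (0);

	return ((uint8_t)(0x0e << start_uframe));
}
```

    make_start_split_mask(4, 1) gives 0x22 and make_complete_split_mask(0)
    gives 0x0e, as in the tables.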
367
368  o Periodic Schedule
369
370    The figure 4.8 in EHCI specification gives you information on periodic
371    scheduling, different polling intervals that are supported, and other
372    details for the EHCI host controller.
373
374    - The high speed host controller can support 256, 512 or 1024 periodic frame
375      lists. By default all host controllers will support 1024 frame lists. In
376      our implementation, we support 1024 frame lists and we do this by first
377      constructing 32 periodic frame lists and duplicating the same periodic
378      frame lists for a total of 32 times. See figure 4.8 in EHCI specification.
379
380    - The host controller traverses the periodic schedule by constructing an
381      array offset reference from the PERIODICLISTBASE & the FRINDEX registers.
382      It fetches the element and begins traversing the graph of linked schedule
383      data structure. See fig 4.8 in EHCI specification.
384
    - The host controller processes one interrupt endpoint descriptor list every
      micro frame (125us). This means the same list is revisited 8 times in a
      frame.
387
388    - The host controller driver sets up the interrupt lists to visit any given
389      endpoint descriptor in as many lists as necessary to provide the interrupt
390      granularity required for that endpoint.
391
392    - For isochronous transfers, we use only transfer descriptors but no
393      endpoint descriptors as in OHCI. Transfer descriptors are added at the
394      beginning of the periodic schedule.
395
    - For EHCI, the bandwidth requirement depends on the usb device speed:

      For a high speed usb device, you only need high speed bandwidth. For a
      full/low speed device connected through a high speed hub, you need both
      high speed bandwidth and TT (transaction translator) bandwidth.

      High speed bandwidth information is saved in an EHCI data structure and TT
      bandwidth is saved in the high speed hub's usb device data structure. Each
      TT acts as a full speed host controller, and its bandwidth allocation
      scheme, overhead calculations and other details are similar to those of a
      full speed host controller. Refer to the "Full speed bus" section for more
      details.
408      details.
409
    - The EHCI host controller driver maintains an array of 32 frame lists to
      store the high speed bandwidth allocated in each frame. Each frame list
      also has eight micro frame lists, which save the bandwidth allocated in
      each micro frame of that particular frame.
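
    One possible shape for this bookkeeping is sketched below. The type and
    function names are hypothetical and do not come from the EHCI driver; in
    particular, keeping a per-frame total alongside the micro frame counters
    is an assumption of this sketch. The index helper reflects the fact that
    the 1024 entry periodic frame list is built from 32 lists repeated 32
    times.

```c
#include <stdint.h>

#define	NUM_FRAME_LISTS		32
#define	UFRAMES_PER_FRAME	8

/* Bandwidth allocated in one frame, in bytes. */
typedef struct frame_bandwidth {
	uint32_t total;				/* whole frame */
	uint32_t uframe[UFRAMES_PER_FRAME];	/* per micro frame */
} frame_bandwidth_t;

/* Map an entry of the 1024 entry periodic list to its bandwidth list. */
static uint32_t
bandwidth_list_index(uint32_t periodic_list_entry)
{
	return (periodic_list_entry % NUM_FRAME_LISTS);
}

/* Record "bytes" in one micro frame of one frame. */
static void
add_uframe_bandwidth(frame_bandwidth_t *bw, uint32_t frame, uint32_t uframe,
    uint32_t bytes)
{
	bw[frame].uframe[uframe] += bytes;
	bw[frame].total += bytes;
}
```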
414
415  o Bandwidth Allocation Scheme
416
417    (Refer 3.6.2 & 4.12.2 sections of EHCI 1.0 specification)
418
419    High speed Non Split Transaction (for High speed devices only):
420
421    For a given high speed interrupt or isochronous endpoint, the EHCI host
422    controller driver will go through the following steps to allocate
423    bandwidth needed for this endpoint.
424
425    - Calculate the bandwidth required for given endpoint using the formula and
426      overhead calculations mentioned in previous section.
427
428    - Compare the bandwidth available in the least allocated frame list out of
429      the 32 frame lists against the bandwidth required by this endpoint. If
430      this exceeds the limit, then, return an error.
431
    - Map a given high speed endpoint's polling interval in micro frames to an
      interrupt list path based on a millisecond value. For example, an endpoint
      with a polling interval of 16 micro frames (16 x 125us) will map to an
      interrupt list path of 2ms.
435
436    - Find out the static node to which the given endpoint needs to be linked
437      so that it will be polled at its required polling interval. This varies
438      based on polling interval and current bandwidth load on this schedule.
439
      Ex: If the polling interval is 32 micro frames, its corresponding frame
          polling interval is 4ms, and the endpoint will be linked to one of the
          four static nodes (range 3-6) in the 4ms column of figure 4.8 in EHCI
          specification.
444
445    - Depending on the polling interval, we need to add the above calculated
446      bandwidth to one or more frame bandwidth lists, and also to one or more
447      micro frame bandwidth lists for that particular frame bandwidth list.
448      Before adding, we need to double check the availability of bandwidth in
449      those respective lists. If needed bandwidth is not available, then,
450      return an error. Otherwise add this bandwidth to all the required frame
451      and micro frame lists.
452
      Ex: Assume the given endpoint's polling interval is 32 micro frames and
          the static node value is 3. In this case, we need to add the required
          bandwidth to the 0,4,8,12,16,20,24,28 frame bandwidth lists, and the
          micro frame bandwidth information is saved using the
          ehci_start_split_mask matrix. For this example, we need to use any one
          of the 15 entries to save micro frame bandwidth.
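
    The mapping and the final allocation step for a high speed endpoint can be
    sketched as follows. All names are hypothetical. The static node is
    replaced by the first frame it covers, and the micro frames are selected
    with a start split mask value such as 0x01. Treating every polling interval
    of 8 micro frames or less as a 1ms frame interval is an assumption of this
    sketch.

```c
#include <stdint.h>

#define	NUM_FRAME_LISTS		32
#define	UFRAMES_PER_FRAME	8
#define	HS_MAX_PERIODIC		5918	/* bytes per micro frame */

/* Polling interval in micro frames -> interrupt list path in ms. */
static uint32_t
uframes_to_frame_interval(uint32_t uframe_interval)
{
	if (uframe_interval <= UFRAMES_PER_FRAME)
		return (1);
	return (uframe_interval / UFRAMES_PER_FRAME);
}

/* Returns 0 on success, -1 if any micro frame would exceed the limit. */
static int
hs_alloc_bandwidth(uint32_t bw[NUM_FRAME_LISTS][UFRAMES_PER_FRAME],
    uint32_t uframe_interval, uint32_t first_frame, uint8_t uframe_mask,
    uint32_t bytes)
{
	uint32_t interval = uframes_to_frame_interval(uframe_interval);
	uint32_t f, u;

	if (uframe_interval == 0 || interval > NUM_FRAME_LISTS ||
	    first_frame >= interval)
		return (-1);

	/* Pass 1: check every selected micro frame of every frame. */
	for (f = first_frame; f < NUM_FRAME_LISTS; f += interval) {
		for (u = 0; u < UFRAMES_PER_FRAME; u++) {
			if ((uframe_mask & (1u << u)) &&
			    bw[f][u] + bytes > HS_MAX_PERIODIC)
				return (-1);
		}
	}

	/* Pass 2: commit the bandwidth. */
	for (f = first_frame; f < NUM_FRAME_LISTS; f += interval) {
		for (u = 0; u < UFRAMES_PER_FRAME; u++) {
			if (uframe_mask & (1u << u))
				bw[f][u] += bytes;
		}
	}

	return (0);
}
```

    A polling interval of 32 micro frames with first_frame 0 and mask 0x01
    adds the bandwidth to the first micro frame of frames 0,4,8,...,28.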
458
459      High speed split transactions (for full and low speed devices only):
460
461      For a given full/low speed interrupt or isochronous endpoint, we need both
462      high speed and TT bandwidths. The TT bandwidth allocation is same as full
463      speed bus bandwidth allocation. Please refer to the "full speed bus"
464      bandwidth allocation section for more details.
465
466      The EHCI driver will go through the following steps to allocate high speed
467      bandwidth needed for  this full/low speed endpoint.
468
      - Calculate the bandwidth required for a given endpoint using the formula
        and overhead calculations mentioned in the previous section. In this
        case, we need to calculate the bandwidth needed for the Start split and
        Complete split transactions separately.
473
474      - Compare the bandwidth available in the least allocated frame list out of
475        32 frame lists against the bandwidth required by this endpoint. If this
476        exceeds the limit, then, return an error.
477
478      - Find out the static node to which the given endpoint needs to be linked
479        so that it will be polled as per the required polling interval. This
480        value varies based on polling interval and current bandwidth load on
481        this schedule.
482
483        Ex: If a polling interval is  4ms, then the endpoint will be linked to
484            one of the four static nodes (range 3-6) in the 4ms column of figure
485            4.8 in EHCI specification.
486
      - Depending on the polling interval, we need to add the above calculated
        Start and Complete split transaction bandwidth to one or more frame
        bandwidth lists and also to one or more micro frame bandwidth lists for
        that particular frame bandwidth list. In this case, the Start split
        transaction needs bandwidth in one micro frame, whereas the Complete
        split transaction needs bandwidth in the next three micro frames of
        that particular frame or the next frame. Before adding, we need to
        double check the availability of bandwidth in those respective lists.
        If the needed bandwidth is not available, return an error. Otherwise
        add this bandwidth to all the required lists.
497
        Ex: Assume the given polling interval is 4ms and the static node value
            is 3. In this case, we need to add the required Start and Complete
            split bandwidth to the 0,4,8,12,16,20,24,28 frame bandwidth lists.
            The micro frame bandwidth lists are stored using the
            ehci_start_split_mask and ehci_complete_split_mask matrices. In this
            case, we need to use any of the first 8 entries to save micro frame
            bandwidth.

            Assume we found that the following micro frame bandwidth lists of
            the 0,4,8,12,16,20,24,28 frame lists can be used for this endpoint.
            It means we need to initiate the "start split transaction" in the
            first micro frame of the 0,4,8,12,16,20,24,28 frames.

            Start split mask = 0x01,	/*  00000001 */

            For this "start split mask", the "complete split mask" should be

            Complete split mask = 0x0e, /*  00001110 */

            It means try "complete split transactions" in the second, third or
            fourth micro frames of the 0,4,8,12,16,20,24,28 frames.
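
      The split transaction bookkeeping can be sketched in C as shown below.
      The names are hypothetical and the static node is again replaced by the
      first frame it covers. The sketch only handles complete split masks that
      stay within the same frame (the non-reserved entries of
      ehci_complete_split_mask); spilling into the next frame is not modelled.

```c
#include <stdint.h>

#define	NUM_FRAME_LISTS		32
#define	UFRAMES_PER_FRAME	8
#define	HS_MAX_PERIODIC		5918	/* bytes per micro frame */

/* Nonzero if "bytes" fits in every micro frame selected by "mask". */
static int
mask_fits(const uint32_t uframe_bw[UFRAMES_PER_FRAME], uint8_t mask,
    uint32_t bytes)
{
	uint32_t u;

	for (u = 0; u < UFRAMES_PER_FRAME; u++) {
		if ((mask & (1u << u)) &&
		    uframe_bw[u] + bytes > HS_MAX_PERIODIC)
			return (0);
	}
	return (1);
}

/* Add "bytes" to every micro frame selected by "mask". */
static void
mask_add(uint32_t uframe_bw[UFRAMES_PER_FRAME], uint8_t mask, uint32_t bytes)
{
	uint32_t u;

	for (u = 0; u < UFRAMES_PER_FRAME; u++) {
		if (mask & (1u << u))
			uframe_bw[u] += bytes;
	}
}

/* Returns 0 on success, -1 if the bandwidth is not available. */
static int
split_alloc_bandwidth(uint32_t bw[NUM_FRAME_LISTS][UFRAMES_PER_FRAME],
    uint32_t interval_ms, uint32_t first_frame, uint8_t smask,
    uint32_t sbytes, uint8_t cmask, uint32_t cbytes)
{
	uint32_t f;

	if (interval_ms == 0 || interval_ms > NUM_FRAME_LISTS ||
	    first_frame >= interval_ms)
		return (-1);

	/* Pass 1: check start and complete split micro frames. */
	for (f = first_frame; f < NUM_FRAME_LISTS; f += interval_ms) {
		if (!mask_fits(bw[f], smask, sbytes) ||
		    !mask_fits(bw[f], cmask, cbytes))
			return (-1);
	}

	/* Pass 2: commit both reservations. */
	for (f = first_frame; f < NUM_FRAME_LISTS; f += interval_ms) {
		mask_add(bw[f], smask, sbytes);
		mask_add(bw[f], cmask, cbytes);
	}

	return (0);
}
```

      Called with interval_ms 4, first_frame 0, smask 0x01 and cmask 0x0e,
      this reserves the start split bandwidth in the first micro frame and the
      complete split bandwidth in the second, third and fourth micro frames of
      frames 0,4,8,...,28.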
518
5194.Reference
520
521  - USB2.0, OHCI and EHCI Specifications
522
523    http://www.usb.org/developers/docs
524
525  - USB bandwidth analysis from Intel
526
527    http://www.usb.org/developers/whitepapers
528