.. SPDX-License-Identifier: GPL-2.0

=========================================
IAA Compression Accelerator Crypto Driver
=========================================

Tom Zanussi <tom.zanussi@linux.intel.com>

The IAA crypto driver supports compression/decompression compatible
with the DEFLATE compression standard described in RFC 1951, which is
the compression/decompression algorithm exported by this module.

The IAA hardware spec can be found here:

  https://cdrdv2.intel.com/v1/dl/getContent/721858

The iaa_crypto driver is designed to work as a layer underneath
higher-level compression devices such as zswap.

Users can select IAA compress/decompress acceleration by specifying
one of the supported IAA compression algorithms in whatever facility
allows compression algorithms to be selected.

For example, a zswap device can select the IAA 'fixed' mode by
choosing the 'deflate-iaa' crypto compression algorithm::

  # echo deflate-iaa > /sys/module/zswap/parameters/compressor

This will tell zswap to use the IAA 'fixed' compression mode for all
compresses and decompresses.
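Before switching the compressor, it can be useful to confirm that the
algorithm is actually listed by the crypto subsystem.  The following is
a minimal sketch, not part of the driver or of any tool: the function
name crypto_alg_listed is hypothetical, and it assumes 'deflate-iaa'
shows up in a 'name' or 'driver' line of /proc/crypto once the
algorithm is registered.

```shell
# Hypothetical helper: succeed if an algorithm with the given name is
# listed in /proc/crypto (or in the file passed as the second argument).
crypto_alg_listed() {
    local alg="$1" file="${2:-/proc/crypto}"
    # Entries look like "name         : <alg>" and "driver       : <drv>".
    awk -v alg="$alg" '
        ($1 == "name" || $1 == "driver") && $2 == ":" && $3 == alg { found = 1 }
        END { exit !found }' "$file"
}
```

With that in place, `crypto_alg_listed deflate-iaa && echo deflate-iaa > /sys/module/zswap/parameters/compressor`
only switches the compressor when the algorithm is present.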

Currently, there is only one compression mode available, 'fixed'
mode.

The 'fixed' compression mode implements the compression scheme
specified by RFC 1951 and is given the crypto algorithm name
'deflate-iaa'.  (Because the IAA hardware has a 4k history-window
limitation, only buffers <= 4k, or that have been compressed using a
<= 4k history window, are technically compliant with the deflate spec,
which allows for a window of up to 32k.  Because of this limitation,
the IAA fixed mode deflate algorithm is given its own algorithm name
rather than simply 'deflate').


Config options and other setup
==============================

The IAA crypto driver is available via menuconfig using the following
path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator

In the configuration file, the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO.

The IAA crypto driver also supports statistics, which are available
via menuconfig using the following path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression -> Enable Intel(R) IAA Compression Accelerator Statistics

In the configuration file, the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS.

The following config options should also be enabled::

  CONFIG_IRQ_REMAP=y
  CONFIG_INTEL_IOMMU=y
  CONFIG_INTEL_IOMMU_SVM=y
  CONFIG_PCI_ATS=y
  CONFIG_PCI_PRI=y
  CONFIG_PCI_PASID=y
  CONFIG_INTEL_IDXD=m
  CONFIG_INTEL_IDXD_SVM=y
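The options above can be checked against an existing kernel config
file.  This is a sketch only: the function name missing_iaa_config is
hypothetical, and the default path /boot/config-$(uname -r) is an
assumption, since distributions differ in where (and whether) they
install the running kernel's config.

```shell
# Hypothetical helper: print every option from the lists above that is
# not enabled in the given kernel config file.
missing_iaa_config() {
    local cfg="${1:-/boot/config-$(uname -r)}" opt
    for opt in CRYPTO_DEV_IAA_CRYPTO IRQ_REMAP INTEL_IOMMU INTEL_IOMMU_SVM \
               PCI_ATS PCI_PRI PCI_PASID INTEL_IDXD INTEL_IDXD_SVM; do
        # An option counts as enabled when set to y (built in) or m (module).
        grep -Eq "^CONFIG_${opt}=[ym]\$" "$cfg" || echo "CONFIG_${opt}"
    done
}
```

An empty result means all of the listed options are set; the optional
CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS is deliberately not checked.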

IAA is one of the first Intel accelerator IPs that can work in
conjunction with the Intel IOMMU.  There are multiple modes that exist
for testing. Based on IOMMU configuration, there are 3 modes::

  - Scalable
  - Legacy
  - No IOMMU


Scalable mode
-------------

Scalable mode supports Shared Virtual Memory (SVM or SVA). It is
entered when using the kernel boot commandline::

  intel_iommu=on,sm_on

with VT-d turned on in BIOS.

With scalable mode, both shared and dedicated workqueues are available
for use.

For scalable mode, the following BIOS settings should be enabled::

  Socket Configuration > IIO Configuration > Intel VT for Directed I/O (VT-d) > Intel VT for Directed I/O

  Socket Configuration > IIO Configuration > PCIe ENQCMD > ENQCMDS


Legacy mode
-----------

Legacy mode is entered when using the kernel boot commandline::

  intel_iommu=off

or when VT-d is not turned on in BIOS.

If you have booted into Linux and are not sure whether VT-d is on, run
"dmesg | grep -i dmar". If you don't see a number of DMAR devices
enumerated, most likely VT-d is not on.

With legacy mode, only dedicated workqueues are available for use.


No IOMMU mode
-------------

No IOMMU mode is entered when using the kernel boot commandline::

  iommu=off

With no IOMMU mode, only dedicated workqueues are available for use.
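The three command lines above can be told apart with a small helper.
This is a sketch under stated assumptions: the function name
iaa_iommu_mode is hypothetical, it only recognizes the exact options
named in this document, and it cannot tell whether VT-d is enabled in
BIOS.  Any other command line is reported as "unknown".

```shell
# Hypothetical helper: map a kernel command line to the mode names used
# in this document.  Reads /proc/cmdline unless a string is passed in.
iaa_iommu_mode() {
    local cmdline="${1:-$(cat /proc/cmdline)}" tok mode="unknown"
    for tok in $cmdline; do
        # case patterns match the whole token, so "iommu=off" does not
        # match "intel_iommu=off".  The last matching token wins.
        case "$tok" in
            iommu=off)           mode="no-iommu" ;;
            intel_iommu=*sm_on*) mode="scalable" ;;
            intel_iommu=off)     mode="legacy" ;;
        esac
    done
    echo "$mode"
}
```

Run without arguments, it classifies the command line of the running
kernel.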


Usage
=====

accel-config
------------

When loaded, the iaa_crypto driver automatically creates a default
configuration and enables it, and assigns default driver attributes.
If a different configuration or set of driver attributes is required,
the user must first disable the IAA devices and workqueues, reset the
configuration, and then re-register the deflate-iaa algorithm with the
crypto subsystem by removing and reinserting the iaa_crypto module.

The :ref:`iaa_disable_script` in the 'Use Cases'
section below can be used to disable the default configuration.

See :ref:`iaa_default_config` below for details of the default
configuration.

More likely than not, however, and because of the complexity and
configurability of the accelerator devices, the user will want to
configure the device and manually enable the desired devices and
workqueues.

The userspace tool that helps with this is called accel-config.  Using
accel-config to configure a device or to load a previously saved config
is highly recommended.  The device can be controlled via sysfs
directly, but you should do this ONLY if you know exactly what you are
doing.  The following sections do not cover the sysfs interface and
assume you will be using accel-config.

The :ref:`iaa_sysfs_config` section in the appendix below can be
consulted for the sysfs interface details if interested.

The accel-config tool along with instructions for building it can be
found here:

  https://github.com/intel/idxd-config/#readme

Typical usage
-------------

In order for the iaa_crypto module to actually do any
compression/decompression work on behalf of a facility, one or more
IAA workqueues need to be bound to the iaa_crypto driver.

For instance, here's an example of configuring an IAA workqueue and
binding it to the iaa_crypto driver (note that device names are
specified as 'iax' rather than 'iaa' - this is because upstream still
has the old 'iax' device naming in place) ::

  # configure wq1.0

  accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --priority=10 --name="iaa_crypto" --driver-name="crypto" iax1/wq1.0

  accel-config config-engine iax1/engine1.0 --group-id=0

  # enable IAA device iax1

  accel-config enable-device iax1

  # enable wq1.0 on IAX device iax1

  accel-config enable-wq iax1/wq1.0
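The same four commands apply to any IAA device number.  The sketch
below only prints them so they can be reviewed before being run; the
function name iaa_wq_commands is hypothetical, and the options are
copied unchanged from the example above.

```shell
# Hypothetical helper: print (not run) the accel-config commands from
# the example above for IAA device number N.
iaa_wq_commands() {
    local n="$1"
    echo "accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --priority=10 --name=\"iaa_crypto\" --driver-name=\"crypto\" iax${n}/wq${n}.0"
    echo "accel-config config-engine iax${n}/engine${n}.0 --group-id=0"
    echo "accel-config enable-device iax${n}"
    echo "accel-config enable-wq iax${n}/wq${n}.0"
}
```

For example, `iaa_wq_commands 3` prints the commands for iax3/wq3.0.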

Whenever a new workqueue is bound to or unbound from the iaa_crypto
driver, the available workqueues are 'rebalanced' such that work
submitted from a particular CPU is given to the most appropriate
workqueue available.  Current best practice is to configure and bind
at least one workqueue for each IAA device, but as long as there is at
least one workqueue configured and bound to any IAA device in the
system, the iaa_crypto driver will work, albeit most likely not as
efficiently.

The IAA crypto algorithm is operational and compression and
decompression operations are fully enabled following the successful
binding of the first IAA workqueue to the iaa_crypto driver.

Similarly, the IAA crypto algorithm is not operational and compression
and decompression operations are disabled following the unbinding of
the last IAA workqueue from the iaa_crypto driver.

As a result, the IAA crypto algorithms and thus the IAA hardware are
only available when one or more workqueues are bound to the iaa_crypto
driver.

When there are no IAA workqueues bound to the driver, the IAA crypto
algorithms can be unregistered by removing the module.
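Whether any workqueue is currently bound can be checked from the shell.
This is a sketch that rests on an assumption not stated in this
document: that bound workqueues appear as wqX.Y entries in the driver's
sysfs directory, as bound devices generally do for Linux bus drivers.
The function name bound_iaa_wqs is hypothetical.

```shell
# Hypothetical helper: count the workqueues bound to the iaa_crypto
# driver by counting wq* entries in its sysfs driver directory.
bound_iaa_wqs() {
    local dir="${1:-/sys/bus/dsa/drivers/crypto}" entry count=0
    for entry in "$dir"/wq*; do
        # The -e/-L tests skip the unexpanded pattern when nothing matches.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A result of 0 would mean the module can be removed without first
unbinding anything.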


Driver attributes
-----------------

There are a couple of user-configurable driver attributes that can be
used to configure various modes of operation.  They're listed below,
along with their default values.  To set any of these attributes, echo
the appropriate values to the attribute file located under
/sys/bus/dsa/drivers/crypto/

The attribute settings at the time the IAA algorithms are registered
are captured in each algorithm's crypto_ctx and used for all compresses
and decompresses when using that algorithm.

The available attributes are:

  - verify_compress

    Toggle compression verification.  If set, each compress will be
    internally decompressed and the contents verified, returning error
    codes if unsuccessful.  This can be toggled with 0/1::

      echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress

    The default setting is '1' - verify all compresses.

  - sync_mode

    Select mode to be used to wait for completion of each compress
    and decompress operation.

    The crypto async interface support implemented by iaa_crypto
    provides an implementation that satisfies the interface but does
    so in a synchronous manner - it fills and submits the IDXD
    descriptor and then loops around waiting for it to complete before
    returning.  This isn't a problem at the moment, since all existing
    callers (e.g. zswap) wrap any asynchronous callees in a
    synchronous wrapper anyway.

    The iaa_crypto driver does however provide true asynchronous
    support for callers that can make use of it.  In this mode, it
    fills and submits the IDXD descriptor, then returns immediately
    with -EINPROGRESS.  The caller can then either poll for completion
    itself, which requires specific code in the caller which currently
    nothing in the upstream kernel implements, or go to sleep and wait
    for an interrupt signaling completion.  This latter mode is
    supported by current users in the kernel such as zswap via
    synchronous wrappers.  Although it is supported, this mode is
    significantly slower than the synchronous mode that does the
    polling in the iaa_crypto driver previously mentioned.

    This mode can be enabled by writing 'async_irq' to the sync_mode
    iaa_crypto driver attribute::

      echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode

    Async mode without interrupts (caller must poll) can be enabled by
    writing 'async' to it (please see Caveat)::

      echo async > /sys/bus/dsa/drivers/crypto/sync_mode

    The mode that does the polling in the iaa_crypto driver can be
    enabled by writing 'sync' to it::

      echo sync > /sys/bus/dsa/drivers/crypto/sync_mode

    The default mode is 'sync'.

    Caveat: since the only mechanism that iaa_crypto currently implements
    for async polling without interrupts is via the 'sync' mode as
    described earlier, writing 'async' to
    '/sys/bus/dsa/drivers/crypto/sync_mode' will internally enable the
    'sync' mode. This is to ensure correct iaa_crypto behavior until true
    async polling without interrupts is enabled in iaa_crypto.
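The current values of both attributes can be read back from the same
directory.  A minimal sketch follows; the function name
show_iaa_driver_attrs is hypothetical, and the optional directory
argument exists only so the function can be tried against a copy of the
files rather than live sysfs.

```shell
# Hypothetical helper: print the current iaa_crypto driver attributes.
show_iaa_driver_attrs() {
    local dir="${1:-/sys/bus/dsa/drivers/crypto}" attr
    for attr in verify_compress sync_mode; do
        if [ -r "$dir/$attr" ]; then
            echo "$attr=$(cat "$dir/$attr")"
        else
            # Module not loaded, or the attribute file is not readable.
            echo "$attr=<unavailable>"
        fi
    done
}
```

Reading the attributes before and after a change is a quick way to
confirm that a write took effect.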

.. _iaa_default_config:

IAA Default Configuration
-------------------------

When the iaa_crypto driver is loaded, each IAA device has a single
work queue configured for it, with the following attributes::

          mode              "dedicated"
          threshold         0
          size              Total WQ Size from WQCAP
          priority          10
          type              IDXD_WQT_KERNEL
          group             0
          name              "iaa_crypto"
          driver_name       "crypto"

The devices and workqueues are also enabled and therefore the driver
is ready to be used without any additional configuration.

The default driver attributes in effect when the driver is loaded are::

          sync_mode         "sync"
          verify_compress   1

In order to change either the device/work queue or driver attributes,
the enabled devices and workqueues must first be disabled.  In order
to have the new configuration applied to the deflate-iaa crypto
algorithm, it needs to be re-registered by removing and reinserting
the iaa_crypto module.  The :ref:`iaa_disable_script` in the 'Use
Cases' section below can be used to disable the default configuration.

Statistics
==========

If the optional debugfs statistics support is enabled, the IAA crypto
driver will generate statistics which can be accessed in debugfs at::

  # ls -al /sys/kernel/debug/iaa-crypto/
  total 0
  drwxr-xr-x  2 root root 0 Mar  3 07:55 .
  drwx------ 53 root root 0 Mar  3 07:55 ..
  -rw-r--r--  1 root root 0 Mar  3 07:55 global_stats
  -rw-r--r--  1 root root 0 Mar  3 07:55 stats_reset
  -rw-r--r--  1 root root 0 Mar  3 07:55 wq_stats

The global_stats file shows a set of global statistics collected since
the driver has been loaded or reset::

  # cat global_stats
  global stats:
    total_comp_calls: 4300
    total_decomp_calls: 4164
    total_sw_decomp_calls: 0
    total_comp_bytes_out: 5993989
    total_decomp_bytes_in: 5993989
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 136
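Individual counters can be pulled out of this output for scripting.
The sketch below assumes the "key: value" layout shown above; the
function names iaa_stat and iaa_avg_comp_bytes are hypothetical, and
the average is only a rough figure derived from two of the counters.

```shell
# Hypothetical helper: print the value of one counter from a
# global_stats style file.
iaa_stat() {
    local key="$1" file="${2:-/sys/kernel/debug/iaa-crypto/global_stats}"
    # Lines look like "  total_comp_calls: 4300".
    awk -v key="$key:" '$1 == key { print $2 }' "$file"
}

# Average compressed output per compress call, in whole bytes.
iaa_avg_comp_bytes() {
    local file="${1:-/sys/kernel/debug/iaa-crypto/global_stats}" calls bytes
    calls=$(iaa_stat total_comp_calls "$file")
    bytes=$(iaa_stat total_comp_bytes_out "$file")
    if [ "${calls:-0}" -gt 0 ]; then
        echo $(( ${bytes:-0} / calls ))
    else
        echo 0
    fi
}
```

For the sample output above, iaa_avg_comp_bytes gives 1393, that is
5993989 bytes over 4300 calls, rounded down.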

The wq_stats file shows per-wq stats, a set for each iaa device and wq
in addition to some global stats::

  # cat wq_stats
  iaa device:
    id: 1
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 3
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 5
    n_wqs: 1
    comp_calls: 1360
    comp_bytes: 1999776
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 1360
      comp_bytes: 1999776
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 7
    n_wqs: 1
    comp_calls: 2940
    comp_bytes: 3994213
    decomp_calls: 4164
    decomp_bytes: 5993989
    wqs:
      name: iaa_crypto
      comp_calls: 2940
      comp_bytes: 3994213
      decomp_calls: 4164
      decomp_bytes: 5993989
    ...

Writing to 'stats_reset' resets all the stats, including the
per-device and per-wq stats::

  # echo 1 > stats_reset
  # cat wq_stats
  global stats:
    total_comp_calls: 0
    total_decomp_calls: 0
    total_comp_bytes_out: 0
    total_decomp_bytes_in: 0
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 0
    ...


Use cases
=========

Simple zswap test
-----------------

For this example, the kernel should be configured according to the
dedicated mode options described above, and zswap should be enabled as
well::

  CONFIG_ZSWAP=y

This is a simple test that uses deflate-iaa as the compressor for a
swap (zswap) device.  It sets up the zswap device and then uses the
memory_madvise program listed below to forcibly swap out and in a
specified number of pages, demonstrating both compress and decompress.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The first step is to make sure the iaa_crypto module is loaded::

  modprobe iaa_crypto

If the IAA devices and workqueues haven't previously been disabled and
reconfigured, then the default configuration should be in place and no
further IAA configuration is necessary.  See :ref:`iaa_default_config`
above for details of the default configuration.

If the default configuration is in place, you should see the iaa
devices and wq0s enabled::

  # cat /sys/bus/dsa/devices/iax1/state
  enabled
  # cat /sys/bus/dsa/devices/iax1/wq1.0/state
  enabled
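On a system with several IAA devices, the same two checks can be
repeated for each of them.  This is a sketch: the function name
iaa_states is hypothetical, and it assumes each device iaxN has its
first workqueue at iaxN/wqN.0, as in the example above.

```shell
# Hypothetical helper: print the state of every iax device found under
# the IAA config path, and of its wqN.0 workqueue if present.
iaa_states() {
    local base="${1:-/sys/bus/dsa/devices}" dev n
    for dev in "$base"/iax*; do
        [ -r "$dev/state" ] || continue
        n="${dev##*/iax}"          # device number, e.g. 1 for iax1
        echo "iax${n}: $(cat "$dev/state")"
        if [ -r "$dev/wq${n}.0/state" ]; then
            echo "iax${n}/wq${n}.0: $(cat "$dev/wq${n}.0/state")"
        fi
    done
}
```

With the default configuration in place, every line of output should
end in "enabled".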

To demonstrate that the following steps work as expected, these
commands can be used to enable debug output::

  # echo -n 'module iaa_crypto +p' > /sys/kernel/debug/dynamic_debug/control
  # echo -n 'module idxd +p' > /sys/kernel/debug/dynamic_debug/control

Use the following commands to enable zswap::

  # echo 0 > /sys/module/zswap/parameters/enabled
  # echo 50 > /sys/module/zswap/parameters/max_pool_percent
  # echo deflate-iaa > /sys/module/zswap/parameters/compressor
  # echo 1 > /sys/module/zswap/parameters/enabled
  # echo 100 > /proc/sys/vm/swappiness
  # echo never > /sys/kernel/mm/transparent_hugepage/enabled
  # echo 1 > /proc/sys/vm/overcommit_memory

You can now run the zswap workload you want to measure. For
example, using the memory_madvise code below, the following command
will swap in and out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  ...

Now that basic functionality has been demonstrated, the defaults can
be erased and replaced with a different configuration.  To do that,
first disable zswap::

  # echo lzo > /sys/module/zswap/parameters/compressor
  # swapoff -a
  # echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  # echo 0 > /sys/module/zswap/parameters/max_pool_percent
  # echo 0 > /sys/module/zswap/parameters/enabled

Then run the :ref:`iaa_disable_script` in the 'Use Cases' section
below to disable the default configuration.

Finally turn swap back on::

  # swapon -a

Following all that, the IAA device(s) can now be re-configured and
enabled as desired for further testing.  Below is one example.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The below script automatically does that::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

  echo "Reload iaa_crypto module"

  rmmod iaa_crypto
  modprobe iaa_crypto

  echo "End Reload iaa_crypto module"

  #
  # configure iaa wqs and devices
  #
  echo "Configure IAA"
  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      accel-config config-wq --group-id=0 --mode=dedicated --wq-size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver-name="crypto" iax${i}/wq${i}.0
      accel-config config-engine iax${i}/engine${i}.0 --group-id=0
  done

  echo "End Configure IAA"

  #
  # enable iaa wqs and devices
  #
  echo "Enable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo enable iaa iax${i}
      accel-config enable-device iax${i}
      echo enable wq iax${i}/wq${i}.0
      accel-config enable-wq iax${i}/wq${i}.0
  done

  echo "End Enable IAA"
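The loops in the script above step through odd device numbers only,
starting at 1.  The sketch below reproduces just that enumeration so it
can be checked without touching any device; the function name
iaa_device_names is hypothetical, and like the script it assumes the
IAA devices are numbered 1, 3, 5 and so on without gaps.

```shell
# Hypothetical helper: list the device names the loops above visit for
# a given number of IAA instances.
iaa_device_names() {
    local count="$1" i=1 names=""
    while [ "$i" -lt $((count * 2)) ]; do
        # Append with a separating space except for the first name.
        names="${names}${names:+ }iax${i}"
        i=$((i + 2))
    done
    echo "$names"
}
```

For example, `iaa_device_names 3` prints "iax1 iax3 iax5".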

When the workqueues are bound to the iaa_crypto driver, you should
see something similar to the following in dmesg output if you've
enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [   60.752344] idxd 0000:f6:02.0: add_iaa_wq: added wq 000000004068d14d to iaa 00000000c9585ba2, n_wq 1
  [   60.752346] iaa_crypto: rebalance_wq_table: nr_nodes=2, nr_cpus 160, nr_iaa 8, cpus_per_iaa 20
  [   60.752347] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752349] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752350] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752352] iaa_crypto: rebalance_wq_table: assigned wq for cpu=0, node=0 = wq 00000000c8bb4452
  [   60.752354] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752355] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752356] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752358] iaa_crypto: rebalance_wq_table: assigned wq for cpu=1, node=0 = wq 00000000c8bb4452
  [   60.752359] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752360] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752361] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752362] iaa_crypto: rebalance_wq_table: assigned wq for cpu=2, node=0 = wq 00000000c8bb4452
  [   60.752364] iaa_crypto: rebalance_wq_table: iaa=0
  .
  .
  .

Once the workqueues and devices have been enabled, the IAA crypto
algorithms are enabled and available.  When the IAA crypto algorithms
have been successfully enabled, you should see the following dmesg
output::

  [   64.893759] iaa_crypto: iaa_crypto_enable: iaa_crypto now ENABLED

Now run the following zswap-specific setup commands to have zswap use
the 'fixed' compression mode::

  echo 0 > /sys/module/zswap/parameters/enabled
  echo 50 > /sys/module/zswap/parameters/max_pool_percent
  echo deflate-iaa > /sys/module/zswap/parameters/compressor
  echo 1 > /sys/module/zswap/parameters/enabled

  echo 100 > /proc/sys/vm/swappiness
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo 1 > /proc/sys/vm/overcommit_memory

Finally, you can now run the zswap workload you want to measure. For
example, using the code below, the following command will swap in and
out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output if
you've enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203227] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203235] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21ee3dc000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203239] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21ee3dc000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203254] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203256] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21f1551000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203257] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21f1551000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0

In order to unregister the IAA crypto algorithms, and register new
ones using different parameters, any users of the current algorithm
should be stopped and the IAA workqueues and devices disabled.

In the case of zswap, remove the IAA crypto algorithm as the
compressor and turn off swap (to remove all references to
iaa_crypto)::

  echo lzo > /sys/module/zswap/parameters/compressor
  swapoff -a

  echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  echo 0 > /sys/module/zswap/parameters/max_pool_percent
  echo 0 > /sys/module/zswap/parameters/enabled

Once zswap is disabled and no longer using iaa_crypto, the IAA wqs and
devices can be disabled.

.. _iaa_disable_script:

IAA disable script
------------------

The script below disables the IAA workqueues and devices::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

Finally, at this point the iaa_crypto module can be removed, which
will unregister the current IAA crypto algorithms::

  rmmod iaa_crypto


memory_madvise.c (gcc -o memory_madvise memory_madvise.c)::

  #include <stdio.h>
  #include <stdlib.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <linux/mman.h>

  #ifndef MADV_PAGEOUT
  #define MADV_PAGEOUT    21      /* force pages out immediately */
  #endif

  #define PG_SZ           4096

  int main(int argc, char **argv)
  {
        int i, nr_pages = 1;
        int64_t *dump_ptr;
        char *addr, *a;
        int loop = 1;

        if (argc > 1)
                nr_pages = atoi(argv[1]);

        printf("Allocating %d pages to swap in/out\n", nr_pages);

        /* allocate pages */
        addr = mmap(NULL, nr_pages * PG_SZ, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (addr == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        *addr = 1;

        /* initialize data in page to all '*' chars */
        memset(addr, '*', nr_pages * PG_SZ);

        printf("Swapping out %d pages\n", nr_pages);

        /* Tell kernel to swap it out */
        madvise(addr, nr_pages * PG_SZ, MADV_PAGEOUT);

        while (loop > 0) {
                /* Wait for swap out to finish */
                sleep(5);

                a = addr;

                printf("Swapping in %d pages\n", nr_pages);

                /* Access the page ... this will swap it back in again */
                for (i = 0; i < nr_pages; i++) {
                        if (a[0] != '*') {
                                printf("Bad data from decompress!!!!!\n");

                                /* dump the first 100 64-bit words of the bad page */
                                dump_ptr = (int64_t *)a;
                                for (int j = 0; j < 100; j++) {
                                        printf("  page %d data: %#llx\n", i, (unsigned long long)*dump_ptr);
                                        dump_ptr++;
                                }
                        }

                        a += PG_SZ;
                }

                loop--;
        }

        printf("Swapped out and in %d pages\n", nr_pages);

        return 0;
  }

Appendix
========

.. _iaa_sysfs_config:

IAA sysfs config interface
--------------------------

Below is a description of the IAA sysfs interface, which as mentioned
in the main document, should only be used if you know exactly what you
are doing.  Even then, there's no compelling reason to use it directly
since accel-config can do everything the sysfs interface can and in
fact accel-config is based on it under the covers.

The 'IAA config path' is /sys/bus/dsa/devices and contains
subdirectories representing each IAA device, workqueue, engine, and
group.  Note that in the sysfs interface, the IAA devices are actually
named using iax e.g. iax1, iax3, etc. (Note that IAA devices are the
odd-numbered devices; the even-numbered devices are DSA devices and
can be ignored for IAA).

The 'IAA device bind path' is /sys/bus/dsa/drivers/idxd/bind and is
the file that is written to enable an IAA device.

The 'IAA workqueue bind path' is /sys/bus/dsa/drivers/crypto/bind and
is the file that is written to enable an IAA workqueue.

Similarly /sys/bus/dsa/drivers/idxd/unbind and
/sys/bus/dsa/drivers/crypto/unbind are used to disable IAA devices and
workqueues.

The basic sequence of commands needed to set up the IAA devices and
workqueues is:

For each device::
  1) Disable any workqueues enabled on the device.  For example, to
     disable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/unbind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/unbind

  2) Disable the device. For example, to disable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/unbind

  3) Configure the desired workqueues.  For example, to configure
     workqueue 3 on IAA device 3::

       # echo dedicated > /sys/bus/dsa/devices/iax3/wq3.3/mode
       # echo 128 > /sys/bus/dsa/devices/iax3/wq3.3/size
       # echo 0 > /sys/bus/dsa/devices/iax3/wq3.3/group_id
       # echo 10 > /sys/bus/dsa/devices/iax3/wq3.3/priority
       # echo "kernel" > /sys/bus/dsa/devices/iax3/wq3.3/type
       # echo "iaa_crypto" > /sys/bus/dsa/devices/iax3/wq3.3/name
       # echo "crypto" > /sys/bus/dsa/devices/iax3/wq3.3/driver_name

  4) Enable the device. For example, to enable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/bind

  5) Enable the desired workqueues on the device.  For example, to
     enable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/bind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/bind
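The attribute writes in step 3 can be wrapped in a function that takes
the workqueue directory as its argument.  This is a sketch only: the
function name configure_iaa_wq is hypothetical, the values are the ones
used in step 3, and it should be run against real sysfs only after
steps 1 and 2, as root.

```shell
# Hypothetical helper: write the step 3 workqueue attributes into the
# given workqueue directory, stopping at the first failed write.
configure_iaa_wq() {
    local wq_dir="$1"
    echo dedicated  > "$wq_dir/mode"        || return 1
    echo 128        > "$wq_dir/size"        || return 1
    echo 0          > "$wq_dir/group_id"    || return 1
    echo 10         > "$wq_dir/priority"    || return 1
    echo kernel     > "$wq_dir/type"        || return 1
    echo iaa_crypto > "$wq_dir/name"        || return 1
    echo crypto     > "$wq_dir/driver_name" || return 1
}
```

For example, `configure_iaa_wq /sys/bus/dsa/devices/iax3/wq3.3` performs
the same writes as step 3.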
850