.. SPDX-License-Identifier: GPL-2.0

=========================================
IAA Compression Accelerator Crypto Driver
=========================================

Tom Zanussi <tom.zanussi@linux.intel.com>

The IAA crypto driver supports compression/decompression compatible
with the DEFLATE compression standard described in RFC 1951, which is
the compression/decompression algorithm exported by this module.

The IAA hardware spec can be found here:

  https://cdrdv2.intel.com/v1/dl/getContent/721858

The iaa_crypto driver is designed to work as a layer underneath
higher-level compression devices such as zswap.

Users can select IAA compress/decompress acceleration by specifying
one of the supported IAA compression algorithms in whatever facility
allows compression algorithms to be selected.

For example, a zswap device can select the IAA 'fixed' mode
represented by selecting the 'deflate-iaa' crypto compression
algorithm::

  # echo deflate-iaa > /sys/module/zswap/parameters/compressor

This will tell zswap to use the IAA 'fixed' compression mode for all
compresses and decompresses.

Currently, there is only one compression mode available, 'fixed'
mode.

The 'fixed' compression mode implements the compression scheme
specified by RFC 1951 and is given the crypto algorithm name
'deflate-iaa'.  (Because the IAA hardware has a 4k history-window
limitation, only buffers <= 4k, or that have been compressed using a
<= 4k history window, are technically compliant with the deflate spec,
which allows for a window of up to 32k.  Because of this limitation,
the IAA fixed mode deflate algorithm is given its own algorithm name
rather than simply 'deflate').
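
Before selecting 'deflate-iaa' in a facility such as zswap, it can help
to confirm that the algorithm is currently registered with the crypto
subsystem.  The following is a minimal sketch and not part of the
driver: the function name crypto_driver_registered and its optional
file argument are hypothetical, and it assumes that the algorithm is
listed in /proc/crypto with a 'driver' field of 'deflate-iaa'.

```shell
#!/bin/sh
# Sketch only: report whether a crypto driver name appears in a
# /proc/crypto-style listing.  The helper name is hypothetical, and
# 'deflate-iaa' appearing as the "driver" field is an assumption.

crypto_driver_registered() {
    # $1: driver name to look for
    # $2: file to scan (defaults to /proc/crypto)
    awk -v want="$1" '
        # Entries look like "driver       : <name>"
        $1 == "driver" && $2 == ":" && $3 == want { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/crypto}"
}
```

A caller could then run something like
"crypto_driver_registered deflate-iaa && echo registered" before
writing the compressor name to the zswap parameter shown above.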


Config options and other setup
==============================

The IAA crypto driver is available via menuconfig using the following
path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator

In the configuration file the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO.

The IAA crypto driver also supports statistics, which are available
via menuconfig using the following path::

  Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator -> Enable Intel(R) IAA Compression Accelerator Statistics

In the configuration file the option is called CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS.

The following config options should also be enabled::

  CONFIG_IRQ_REMAP=y
  CONFIG_INTEL_IOMMU=y
  CONFIG_INTEL_IOMMU_SVM=y
  CONFIG_PCI_ATS=y
  CONFIG_PCI_PRI=y
  CONFIG_PCI_PASID=y
  CONFIG_INTEL_IDXD=m
  CONFIG_INTEL_IDXD_SVM=y

IAA is one of the first Intel accelerator IPs that can work in
conjunction with the Intel IOMMU.  There are multiple modes that exist
for testing.  Based on IOMMU configuration, there are 3 modes::

  - Scalable
  - Legacy
  - No IOMMU


Scalable mode
-------------

Scalable mode supports Shared Virtual Memory (SVM or SVA).  It is
entered when using the kernel boot commandline::

  intel_iommu=on,sm_on

with VT-d turned on in BIOS.

With scalable mode, both shared and dedicated workqueues are available
for use.

For scalable mode, the following BIOS settings should be enabled::

  Socket Configuration > IIO Configuration > Intel VT for Directed I/O (VT-d) > Intel VT for Directed I/O

  Socket Configuration > IIO Configuration > PCIe ENQCMD > ENQCMDS


Legacy mode
-----------

Legacy mode is entered when using the kernel boot commandline::

  intel_iommu=off

or when VT-d is not turned on in BIOS.
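
As a rough aid, the boot parameters named in this document can be
mapped to a mode name.  The following is a minimal sketch under stated
assumptions: the function name iommu_mode_from_cmdline and the mode
strings it prints are hypothetical, it also recognizes the iommu=off
parameter covered in the 'No IOMMU mode' section, and it only inspects
the command line text, so it cannot tell whether VT-d is actually
turned on in BIOS.

```shell
#!/bin/sh
# Sketch only: classify a kernel command line into the mode names used
# in this document.  Helper name and output strings are hypothetical.

iommu_mode_from_cmdline() (
    # $1: kernel command line text, e.g. "$(cat /proc/cmdline)"
    set -f                      # no pathname expansion while splitting
    mode="unknown"
    for word in $1; do
        case "$word" in
            iommu=off)
                # documented as "No IOMMU mode"
                mode="no-iommu"
                break ;;
            intel_iommu=*)
                # wrap the option list in commas so each option can be
                # matched as a whole word
                opts=",${word#intel_iommu=},"
                case "$opts" in
                    *,off,*)
                        mode="legacy" ;;
                    *,on,*)
                        case "$opts" in
                            *,sm_on,*) mode="scalable" ;;
                        esac ;;
                esac ;;
        esac
    done
    echo "$mode"
)
```

For example, 'iommu_mode_from_cmdline "$(cat /proc/cmdline)"' prints
one of scalable, legacy, no-iommu or unknown for the running kernel.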

If you have booted into Linux and are not sure if VT-d is on, do a
"dmesg | grep -i dmar".  If you don't see a number of DMAR devices
enumerated, most likely VT-d is not on.

With legacy mode, only dedicated workqueues are available for use.


No IOMMU mode
-------------

No IOMMU mode is entered when using the kernel boot commandline::

  iommu=off

With no IOMMU mode, only dedicated workqueues are available for use.


Usage
=====

accel-config
------------

When loaded, the iaa_crypto driver automatically creates a default
configuration and enables it, and assigns default driver attributes.
If a different configuration or set of driver attributes is required,
the user must first disable the IAA devices and workqueues, reset the
configuration, and then re-register the deflate-iaa algorithm with the
crypto subsystem by removing and reinserting the iaa_crypto module.

The :ref:`iaa_disable_script` in the 'Use Cases'
section below can be used to disable the default configuration.

See :ref:`iaa_default_config` below for details of the default
configuration.

More likely than not, however, and because of the complexity and
configurability of the accelerator devices, the user will want to
configure the device and manually enable the desired devices and
workqueues.

The userspace tool to help do that is called accel-config.  Using
accel-config to configure the device or to load a previously saved
config is highly recommended.  The device can be controlled via sysfs
directly, but this comes with the warning that you should do so ONLY
if you know exactly what you are doing.  The following sections will
not cover the sysfs interface but assume you will be using
accel-config.

The :ref:`iaa_sysfs_config` section in the appendix below can be
consulted for the sysfs interface details if interested.

The accel-config tool along with instructions for building it can be
found here:

  https://github.com/intel/idxd-config/#readme

Typical usage
-------------

In order for the iaa_crypto module to actually do any
compression/decompression work on behalf of a facility, one or more
IAA workqueues need to be bound to the iaa_crypto driver.

For instance, here's an example of configuring an IAA workqueue and
binding it to the iaa_crypto driver (note that device names are
specified as 'iax' rather than 'iaa' - this is because upstream still
has the old 'iax' device naming in place) ::

  # configure wq1.0

  accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --name="iaa_crypto" --driver-name="crypto" iax1/wq1.0

  # enable IAA device iax1

  accel-config enable-device iax1

  # enable wq1.0 on IAX device iax1

  accel-config enable-wq iax1/wq1.0

Whenever a new workqueue is bound to or unbound from the iaa_crypto
driver, the available workqueues are 'rebalanced' such that work
submitted from a particular CPU is given to the most appropriate
workqueue available.  Current best practice is to configure and bind
at least one workqueue for each IAA device, but as long as there is at
least one workqueue configured and bound to any IAA device in the
system, the iaa_crypto driver will work, albeit most likely not as
efficiently.

The IAA crypto algorithm is operational and compression and
decompression operations are fully enabled following the successful
binding of the first IAA workqueue to the iaa_crypto driver.

Similarly, the IAA crypto algorithm is not operational and compression
and decompression operations are disabled following the unbinding of
the last IAA workqueue from the iaa_crypto driver.
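
Whether at least one workqueue is currently usable by the driver can
be estimated from sysfs.  The following is a minimal sketch, not a
definitive check: the function name count_bound_iaa_wqs is
hypothetical, and it approximates "bound to the iaa_crypto driver" as
"state is enabled and driver_name is crypto", using the per-workqueue
state and driver_name files that appear elsewhere in this document.

```shell
#!/bin/sh
# Sketch only: count IAA workqueues that look bound to iaa_crypto.
# The helper name is hypothetical; "bound" is approximated as
# state == enabled and driver_name == crypto.

count_bound_iaa_wqs() (
    # $1: devices directory (defaults to /sys/bus/dsa/devices)
    # $2: driver_name to match (defaults to "crypto")
    root="${1:-/sys/bus/dsa/devices}"
    want="${2:-crypto}"
    count=0
    for wq in "$root"/iax*/wq*; do
        # skip unmatched globs and entries without both attributes
        [ -r "$wq/state" ] && [ -r "$wq/driver_name" ] || continue
        state=$(cat "$wq/state")
        drv=$(cat "$wq/driver_name")
        if [ "$state" = "enabled" ] && [ "$drv" = "$want" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
)
```

A result of 0 suggests that no workqueue is bound and that the
algorithm is therefore not operational.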

As a result, the IAA crypto algorithms and thus the IAA hardware are
only available when one or more workqueues are bound to the iaa_crypto
driver.

When there are no IAA workqueues bound to the driver, the IAA crypto
algorithms can be unregistered by removing the module.


Driver attributes
-----------------

There are a couple of user-configurable driver attributes that can be
used to configure various modes of operation.  They're listed below,
along with their default values.  To set any of these attributes, echo
the appropriate values to the attribute file located under
/sys/bus/dsa/drivers/crypto/

The attribute settings at the time the IAA algorithms are registered
are captured in each algorithm's crypto_ctx and used for all compresses
and decompresses when using that algorithm.

The available attributes are:

  - verify_compress

    Toggle compression verification.  If set, each compress will be
    internally decompressed and the contents verified, returning error
    codes if unsuccessful.  This can be toggled with 0/1::

      echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress

    The default setting is '1' - verify all compresses.

  - sync_mode

    Select the mode to be used to wait for completion of each compress
    and decompress operation.

    The crypto async interface support implemented by iaa_crypto
    provides an implementation that satisfies the interface but does
    so in a synchronous manner - it fills and submits the IDXD
    descriptor and then loops around waiting for it to complete before
    returning.  This isn't a problem at the moment, since all existing
    callers (e.g. zswap) wrap any asynchronous callees in a
    synchronous wrapper anyway.

    The iaa_crypto driver does however provide true asynchronous
    support for callers that can make use of it.  In this mode, it
    fills and submits the IDXD descriptor, then returns immediately
    with -EINPROGRESS.  The caller can then either poll for completion
    itself, which requires specific code in the caller which currently
    nothing in the upstream kernel implements, or go to sleep and wait
    for an interrupt signaling completion.  This latter mode is
    supported by current users in the kernel such as zswap via
    synchronous wrappers.  Although it is supported, this mode is
    significantly slower than the synchronous mode that does the
    polling in the iaa_crypto driver previously mentioned.

    This mode can be enabled by writing 'async_irq' to the sync_mode
    iaa_crypto driver attribute::

      echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode

    Async mode without interrupts (caller must poll) can be enabled by
    writing 'async' to it::

      echo async > /sys/bus/dsa/drivers/crypto/sync_mode

    The mode that does the polling in the iaa_crypto driver can be
    enabled by writing 'sync' to it::

      echo sync > /sys/bus/dsa/drivers/crypto/sync_mode

    The default mode is 'sync'.

.. _iaa_default_config:

IAA Default Configuration
-------------------------

When the iaa_crypto driver is loaded, each IAA device has a single
work queue configured for it, with the following attributes::

  mode              "dedicated"
  threshold         0
  size              Total WQ Size from WQCAP
  priority          10
  type              IDXD_WQT_KERNEL
  group             0
  name              "iaa_crypto"
  driver_name       "crypto"

The devices and workqueues are also enabled and therefore the driver
is ready to be used without any additional configuration.

The default driver attributes in effect when the driver is loaded are::

  sync_mode         "sync"
  verify_compress   1

In order to change either the device/work queue or driver attributes,
the enabled devices and workqueues must first be disabled.
In order to have the new configuration applied to the deflate-iaa
crypto algorithm, it needs to be re-registered by removing and
reinserting the iaa_crypto module.  The :ref:`iaa_disable_script` in
the 'Use Cases' section below can be used to disable the default
configuration.

Statistics
==========

If the optional debugfs statistics support is enabled, the IAA crypto
driver will generate statistics which can be accessed in debugfs at::

  # ls -al /sys/kernel/debug/iaa-crypto/
  total 0
  drwxr-xr-x  2 root root 0 Mar  3 09:35 .
  drwx------ 47 root root 0 Mar  3 09:35 ..
  -rw-r--r--  1 root root 0 Mar  3 09:35 max_acomp_delay_ns
  -rw-r--r--  1 root root 0 Mar  3 09:35 max_adecomp_delay_ns
  -rw-r--r--  1 root root 0 Mar  3 09:35 max_comp_delay_ns
  -rw-r--r--  1 root root 0 Mar  3 09:35 max_decomp_delay_ns
  -rw-r--r--  1 root root 0 Mar  3 09:35 stats_reset
  -rw-r--r--  1 root root 0 Mar  3 09:35 total_comp_bytes_out
  -rw-r--r--  1 root root 0 Mar  3 09:35 total_comp_calls
  -rw-r--r--  1 root root 0 Mar  3 09:35 total_decomp_bytes_in
  -rw-r--r--  1 root root 0 Mar  3 09:35 total_decomp_calls
  -rw-r--r--  1 root root 0 Mar  3 09:35 wq_stats

Most of the above statistics are self-explanatory.
The wq_stats file shows per-wq stats, a set for each iaa device and
wq, in addition to some global stats::

  # cat wq_stats
  global stats:
    total_comp_calls: 100
    total_decomp_calls: 100
    total_comp_bytes_out: 22800
    total_decomp_bytes_in: 22800
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 0

  iaa device:
    id: 1
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 3
    n_wqs: 1
    comp_calls: 0
    comp_bytes: 0
    decomp_calls: 0
    decomp_bytes: 0
    wqs:
      name: iaa_crypto
      comp_calls: 0
      comp_bytes: 0
      decomp_calls: 0
      decomp_bytes: 0

  iaa device:
    id: 5
    n_wqs: 1
    comp_calls: 100
    comp_bytes: 22800
    decomp_calls: 100
    decomp_bytes: 22800
    wqs:
      name: iaa_crypto
      comp_calls: 100
      comp_bytes: 22800
      decomp_calls: 100
      decomp_bytes: 22800

Writing 0 to 'stats_reset' resets all the stats, including the
per-device and per-wq stats::

  # echo 0 > stats_reset
  # cat wq_stats
  global stats:
    total_comp_calls: 0
    total_decomp_calls: 0
    total_comp_bytes_out: 0
    total_decomp_bytes_in: 0
    total_completion_einval_errors: 0
    total_completion_timeout_errors: 0
    total_completion_comp_buf_overflow_errors: 0
    ...


Use cases
=========

Simple zswap test
-----------------

For this example, the kernel should be configured according to the
dedicated mode options described above, and zswap should be enabled as
well::

  CONFIG_ZSWAP=y

This is a simple test that uses deflate-iaa as the compressor for a
swap (zswap) device.
It sets up the zswap device and then uses the memory_madvise program
listed below to forcibly swap out and in a specified number of pages,
demonstrating both compress and decompress.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The first step is to make sure the iaa_crypto module is loaded::

  modprobe iaa_crypto

If the IAA devices and workqueues haven't previously been disabled and
reconfigured, then the default configuration should be in place and no
further IAA configuration is necessary.  See :ref:`iaa_default_config`
above for details of the default configuration.

If the default configuration is in place, you should see the iaa
devices and wq0s enabled::

  # cat /sys/bus/dsa/devices/iax1/state
  enabled
  # cat /sys/bus/dsa/devices/iax1/wq1.0/state
  enabled

To demonstrate that the following steps work as expected, these
commands can be used to enable debug output::

  # echo -n 'module iaa_crypto +p' > /sys/kernel/debug/dynamic_debug/control
  # echo -n 'module idxd +p' > /sys/kernel/debug/dynamic_debug/control

Use the following commands to enable zswap::

  # echo 0 > /sys/module/zswap/parameters/enabled
  # echo 50 > /sys/module/zswap/parameters/max_pool_percent
  # echo deflate-iaa > /sys/module/zswap/parameters/compressor
  # echo zsmalloc > /sys/module/zswap/parameters/zpool
  # echo 1 > /sys/module/zswap/parameters/enabled
  # echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
  # echo 100 > /proc/sys/vm/swappiness
  # echo never > /sys/kernel/mm/transparent_hugepage/enabled
  # echo 1 > /proc/sys/vm/overcommit_memory

You can now run the zswap workload you want to measure.
For example, using the memory_madvise code below, the following
command will swap in and out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  ...

Now that basic functionality has been demonstrated, the defaults can
be erased and replaced with a different configuration.  To do that,
first disable zswap::

  # echo lzo > /sys/module/zswap/parameters/compressor
  # swapoff -a
  # echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  # echo 0 > /sys/module/zswap/parameters/max_pool_percent
  # echo 0 > /sys/module/zswap/parameters/enabled

Then run the :ref:`iaa_disable_script` in the 'Use Cases' section
below to disable the default configuration.

Finally turn swap back on::

  # swapon -a

Following all that, the IAA device(s) can now be re-configured and
enabled as desired for further testing.  Below is one example.

The zswap test expects the work queues for each IAA device on the
system to be configured properly as a kernel workqueue with a
workqueue driver_name of "crypto".

The below script automatically does that::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

  #
  # configure iaa wqs and devices
  #
  echo "Configure IAA"
  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      accel-config config-wq --group-id=0 --mode=dedicated --wq-size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver-name="crypto" iax${i}/wq${i}.0
  done

  echo "End Configure IAA"

  #
  # enable iaa wqs and devices
  #
  echo "Enable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo enable iaa iax${i}
      accel-config enable-device iax${i}
      echo enable wq iax${i}/wq${i}.0
      accel-config enable-wq iax${i}/wq${i}.0
  done

  echo "End Enable IAA"

When the workqueues are bound to the iaa_crypto driver, you should
see something similar to the following in dmesg output if you've
enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [   60.752344] idxd 0000:f6:02.0: add_iaa_wq: added wq 000000004068d14d to iaa 00000000c9585ba2, n_wq 1
  [   60.752346] iaa_crypto: rebalance_wq_table: nr_nodes=2, nr_cpus 160, nr_iaa 8, cpus_per_iaa 20
  [   60.752347] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752349] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752350] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752352] iaa_crypto: rebalance_wq_table: assigned wq for cpu=0, node=0 = wq 00000000c8bb4452
  [   60.752354] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752355] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752356] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752358] iaa_crypto: rebalance_wq_table: assigned wq for cpu=1, node=0 = wq 00000000c8bb4452
  [   60.752359] iaa_crypto: rebalance_wq_table: iaa=0
  [   60.752360] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0)
  [   60.752361] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0)
  [   60.752362] iaa_crypto: rebalance_wq_table: assigned wq for cpu=2, node=0 = wq 00000000c8bb4452
  [   60.752364] iaa_crypto: rebalance_wq_table: iaa=0
  .
  .
  .

Once the workqueues and devices have been enabled, the IAA crypto
algorithms are enabled and available.
When the IAA crypto algorithms have been successfully enabled, you
should see the following dmesg output::

  [   64.893759] iaa_crypto: iaa_crypto_enable: iaa_crypto now ENABLED

Now run the following zswap-specific setup commands to have zswap use
the 'fixed' compression mode::

  echo 0 > /sys/module/zswap/parameters/enabled
  echo 50 > /sys/module/zswap/parameters/max_pool_percent
  echo deflate-iaa > /sys/module/zswap/parameters/compressor
  echo zsmalloc > /sys/module/zswap/parameters/zpool
  echo 1 > /sys/module/zswap/parameters/enabled
  echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled

  echo 100 > /proc/sys/vm/swappiness
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo 1 > /proc/sys/vm/overcommit_memory

Finally, you can now run the zswap workload you want to measure.  For
example, using the code below, the following command will swap in and
out 100 pages::

  ./memory_madvise 100

  Allocating 100 pages to swap in/out
  Swapping out 100 pages
  Swapping in 100 pages
  Swapped out and in 100 pages

You should see something like the following in the dmesg output if
you've enabled debug output (echo -n 'module iaa_crypto +p' >
/sys/kernel/debug/dynamic_debug/control)::

  [  404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096
  [  404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192
  [  404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568
  [  404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203227] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203235] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21ee3dc000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203239] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21ee3dc000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0
  [  409.203254] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228
  [  409.203256] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21f1551000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096
  [  409.203257] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21f1551000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0

In order to unregister the IAA crypto algorithms, and register new
ones using different parameters, any users of the current algorithm
should be stopped and the IAA workqueues and devices disabled.

In the case of zswap, remove the IAA crypto algorithm as the
compressor and turn off swap (to remove all references to
iaa_crypto)::

  echo lzo > /sys/module/zswap/parameters/compressor
  swapoff -a

  echo 0 > /sys/module/zswap/parameters/accept_threshold_percent
  echo 0 > /sys/module/zswap/parameters/max_pool_percent
  echo 0 > /sys/module/zswap/parameters/enabled

Once zswap is disabled and no longer using iaa_crypto, the IAA wqs and
devices can be disabled.

.. _iaa_disable_script:

IAA disable script
------------------

The below script automatically does that::

  #!/bin/bash

  echo "IAA devices:"
  lspci -d:0cfe
  echo "# IAA devices:"
  lspci -d:0cfe | wc -l

  #
  # count iaa instances
  #
  iaa_dev_id="0cfe"
  num_iaa=$(lspci -d:${iaa_dev_id} | wc -l)
  echo "Found ${num_iaa} IAA instances"

  #
  # disable iaa wqs and devices
  #
  echo "Disable IAA"

  for ((i = 1; i < ${num_iaa} * 2; i += 2)); do
      echo disable wq iax${i}/wq${i}.0
      accel-config disable-wq iax${i}/wq${i}.0
      echo disable iaa iax${i}
      accel-config disable-device iax${i}
  done

  echo "End Disable IAA"

Finally, at this point the iaa_crypto module can be removed, which
will unregister the current IAA crypto algorithms::

  rmmod iaa_crypto


memory_madvise.c (gcc -o memory_madvise memory_madvise.c)::

  #include <stdio.h>
  #include <stdlib.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <linux/mman.h>

  #ifndef MADV_PAGEOUT
  #define MADV_PAGEOUT    21      /* force pages out immediately */
  #endif

  #define PG_SZ           4096

  int main(int argc, char **argv)
  {
          int i, nr_pages = 1;
          int64_t *dump_ptr;
          char *addr, *a;
          int loop = 1;

          if (argc > 1)
                  nr_pages = atoi(argv[1]);

          printf("Allocating %d pages to swap in/out\n", nr_pages);

          /* allocate pages */
          addr = mmap(NULL, nr_pages * PG_SZ, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
          if (addr == MAP_FAILED) {
                  perror("mmap");
                  return 1;
          }
          *addr = 1;

          /* initialize data in page to all '*' chars */
          memset(addr, '*', nr_pages * PG_SZ);

          printf("Swapping out %d pages\n", nr_pages);

          /* Tell kernel to swap it out */
          madvise(addr, nr_pages * PG_SZ, MADV_PAGEOUT);

          while (loop > 0) {
                  /* Wait for swap out to finish */
                  sleep(5);

                  a = addr;

                  printf("Swapping in %d pages\n", nr_pages);

                  /* Access the page ... this will swap it back in again */
                  for (i = 0; i < nr_pages; i++) {
                          if (a[0] != '*') {
                                  printf("Bad data from decompress!!!!!\n");

                                  dump_ptr = (int64_t *)a;
                                  for (int j = 0; j < 100; j++) {
                                          printf("  page %d data: %#llx\n", i,
                                                 (unsigned long long)*dump_ptr);
                                          dump_ptr++;
                                  }
                          }

                          a += PG_SZ;
                  }

                  loop--;
          }

          printf("Swapped out and in %d pages\n", nr_pages);

          return 0;
  }

Appendix
========

.. _iaa_sysfs_config:

IAA sysfs config interface
--------------------------

Below is a description of the IAA sysfs interface, which as mentioned
in the main document, should only be used if you know exactly what you
are doing.  Even then, there's no compelling reason to use it directly
since accel-config can do everything the sysfs interface can and in
fact accel-config is based on it under the covers.

The 'IAA config path' is /sys/bus/dsa/devices and contains
subdirectories representing each IAA device, workqueue, engine, and
group.  Note that in the sysfs interface, the IAA devices are actually
named using iax e.g. iax1, iax3, etc. (Note that IAA devices are the
odd-numbered devices; the even-numbered devices are DSA devices and
can be ignored for IAA).

The 'IAA device bind path' is /sys/bus/dsa/drivers/idxd/bind and is
the file that is written to enable an IAA device.

The 'IAA workqueue bind path' is /sys/bus/dsa/drivers/crypto/bind and
is the file that is written to enable an IAA workqueue.

Similarly /sys/bus/dsa/drivers/idxd/unbind and
/sys/bus/dsa/drivers/crypto/unbind are used to disable IAA devices and
workqueues.

The basic sequence of commands needed to set up the IAA devices and
workqueues is:

For each device:

  1) Disable any workqueues enabled on the device.  For example to
     disable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/unbind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/unbind

  2) Disable the device.  For example to disable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/unbind

  3) Configure the desired workqueues.  For example, to configure
     workqueue 3 on IAA device 3::

       # echo dedicated > /sys/bus/dsa/devices/iax3/wq3.3/mode
       # echo 128 > /sys/bus/dsa/devices/iax3/wq3.3/size
       # echo 0 > /sys/bus/dsa/devices/iax3/wq3.3/group_id
       # echo 10 > /sys/bus/dsa/devices/iax3/wq3.3/priority
       # echo "kernel" > /sys/bus/dsa/devices/iax3/wq3.3/type
       # echo "iaa_crypto" > /sys/bus/dsa/devices/iax3/wq3.3/name
       # echo "crypto" > /sys/bus/dsa/devices/iax3/wq3.3/driver_name

  4) Enable the device.  For example to enable IAA device 3::

       # echo iax3 > /sys/bus/dsa/drivers/idxd/bind

  5) Enable the desired workqueues on the device.  For example to
     enable workqueues 0 and 1 on IAA device 3::

       # echo wq3.0 > /sys/bus/dsa/drivers/crypto/bind
       # echo wq3.1 > /sys/bus/dsa/drivers/crypto/bind
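
The workqueue configuration step can be wrapped in a small helper so
that the same attribute values are written for any device and
workqueue.  The following is a minimal sketch, assuming the attribute
files and values used in this appendix: the function name
iaa_sysfs_config_wq and its optional third argument (an alternate
devices directory, useful for dry runs) are hypothetical, and the
device and workqueue must already be disabled before it is run against
real hardware.

```shell
#!/bin/sh
# Sketch only: write the workqueue attributes used in this document
# for one workqueue.  The helper name and the optional devices
# directory argument are hypothetical.

iaa_sysfs_config_wq() (
    # $1: device name, e.g. iax3
    # $2: workqueue name, e.g. wq3.3
    # $3: devices directory (defaults to /sys/bus/dsa/devices)
    dir="${3:-/sys/bus/dsa/devices}/$1/$2"

    if [ ! -d "$dir" ]; then
        echo "no such workqueue: $dir" >&2
        exit 1
    fi

    # stop at the first attribute write that fails
    echo dedicated  > "$dir/mode" &&
    echo 128        > "$dir/size" &&
    echo 0          > "$dir/group_id" &&
    echo 10         > "$dir/priority" &&
    echo kernel     > "$dir/type" &&
    echo iaa_crypto > "$dir/name" &&
    echo crypto     > "$dir/driver_name"
)
```

For example, "iaa_sysfs_config_wq iax3 wq3.3" writes the same values
as the individual echo commands in the configuration step.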