31 In the original wq implementation, a multi-threaded (MT) wq had one
32 worker thread per CPU and a single-threaded (ST) wq had one worker
33 thread system-wide. A single MT wq needed to keep around the same
60 * Use per-CPU unified worker pools shared by all wq to provide
82 For threaded workqueues, special-purpose threads, called [k]workers, execute
85 worker-pools.
87 The cmwq design differentiates between the user-facing workqueues that
89 which manages worker-pools and processes the queued work items.
91 There are two worker-pools, one for normal work items and the other
93 worker-pools to serve work items queued on unbound workqueues - the
98 Each per-CPU BH worker pool contains only one pseudo worker which represents
110 When a work item is queued to a workqueue, the target worker-pool is
112 and appended on the shared worklist of the worker-pool. For example,
114 be queued on the worklist of either normal or highpri worker-pool that is associated to the CPU the issuer is running on.
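For illustration, a minimal sketch (the work item and function names here are hypothetical, not from this document) of queueing a work item on a bound workqueue; as described above, the item is appended to the worklist of the normal worker-pool associated with the CPU the issuer is running on::

  #include <linux/printk.h>
  #include <linux/workqueue.h>

  static void frob_fn(struct work_struct *work)
  {
          pr_info("frob work item executing\n");
  }

  static DECLARE_WORK(frob_work, frob_fn);

  static void frob_kick(void)
  {
          /* system_wq is a bound workqueue, so the item lands on the shared
           * worklist of the normal worker-pool of the current CPU. */
          queue_work(system_wq, &frob_work);
  }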
123 Each worker-pool bound to an actual CPU implements concurrency
124 management by hooking into the scheduler. The worker-pool is notified
130 workers on the CPU, the worker-pool doesn't start execution of a new work, but, when the last running worker goes to sleep, it immediately schedules a new worker so that the CPU doesn't sit idle while there are pending work items.
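As a hedged illustration of this behaviour (again with hypothetical names), two items queued on the same CPU do not serialize behind a sleeping worker, because the pool wakes another worker as soon as the running one sleeps::

  #include <linux/delay.h>
  #include <linux/printk.h>
  #include <linux/workqueue.h>

  static void sleepy_fn(struct work_struct *work)
  {
          msleep(100);            /* the worker goes to sleep */
  }

  static void quick_fn(struct work_struct *work)
  {
          pr_info("ran without waiting for sleepy_fn to finish\n");
  }

  static DECLARE_WORK(sleepy_work, sleepy_fn);
  static DECLARE_WORK(quick_work, quick_fn);

  static void frob_demo(void)
  {
          /* Both items target CPU 0's normal worker-pool.  When sleepy_fn
           * sleeps, the pool wakes another worker to run quick_fn. */
          queue_work_on(0, system_wq, &sleepy_work);
          queue_work_on(0, system_wq, &quick_work);
  }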
152 wq's that have a rescue-worker reserved for execution under memory
153 pressure. Otherwise, it is possible that the worker-pool deadlocks waiting
162 removal. ``alloc_workqueue()`` takes three arguments - ``@name``, ``@flags`` and ``@max_active``.
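A minimal sketch, with hypothetical names, of allocating such a workqueue with the three arguments; ``WQ_MEM_RECLAIM`` is the flag that reserves the rescue-worker mentioned above::

  #include <linux/errno.h>
  #include <linux/workqueue.h>

  static struct workqueue_struct *frob_wq;

  static int frob_init(void)
  {
          /* @name, @flags, @max_active (0 selects the default limit) */
          frob_wq = alloc_workqueue("frob_wq", WQ_MEM_RECLAIM, 0);
          if (!frob_wq)
                  return -ENOMEM;
          return 0;
  }

  static void frob_exit(void)
  {
          destroy_workqueue(frob_wq);
  }

``destroy_workqueue()`` drains and releases the queue once it is no longer needed.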
172 ``flags``
173 ---------
177 workqueues are always per-CPU and all BH work items are executed in the queueing CPU's softirq context in the queueing order.
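As a brief, hedged sketch (hypothetical names; assumes a kernel recent enough to provide ``WQ_BH`` and ``system_bh_wq``), queueing a BH work item that will then run in softirq context on the queueing CPU::

  #include <linux/workqueue.h>

  static void frob_bh_fn(struct work_struct *work)
  {
          /* Executes in the queueing CPU's softirq context; must not sleep. */
  }

  static DECLARE_WORK(frob_bh_work, frob_bh_fn);

  static void frob_bh_kick(void)
  {
          queue_work(system_bh_wq, &frob_bh_work);
  }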
188 worker-pools which host workers which are not bound to any
191 worker-pools try to start execution of work items as soon as
215 worker-pool of the target CPU. Highpri worker-pools are
218 Note that normal and highpri worker-pools don't interact with
226 worker-pool from starting execution. This is useful for bound
233 non-CPU-intensive work items can delay execution of CPU intensive work items.
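A hedged sketch (the queue names are hypothetical) of how the flags described above are combined at allocation time::

  #include <linux/errno.h>
  #include <linux/workqueue.h>

  static struct workqueue_struct *scan_wq, *crunch_wq;

  static int frob_create_queues(void)
  {
          /* Unbound: items may run on any CPU and start as soon as possible. */
          scan_wq = alloc_workqueue("frob_scan", WQ_UNBOUND, 0);
          if (!scan_wq)
                  return -ENOMEM;

          /* Bound, served by the highpri (elevated nice level) worker-pools
           * and excluded from concurrency management because the work is
           * expected to hog CPU cycles. */
          crunch_wq = alloc_workqueue("frob_crunch",
                                      WQ_HIGHPRI | WQ_CPU_INTENSIVE, 0);
          if (!crunch_wq) {
                  destroy_workqueue(scan_wq);
                  return -ENOMEM;
          }
          return 0;
  }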
239 ``max_active``
240 --------------
245 at the same time per CPU. This is always a per-CPU attribute, even for unbound workqueues.
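For illustration, a hedged sketch (hypothetical names and limits) of passing ``@max_active`` at allocation time and adjusting it later::

  #include <linux/errno.h>
  #include <linux/workqueue.h>

  static struct workqueue_struct *copy_wq;

  static int copy_init(void)
  {
          /* At most 16 work items of this wq may execute at the same time
           * per CPU. */
          copy_wq = alloc_workqueue("frob_copy", 0, 16);
          if (!copy_wq)
                  return -ENOMEM;

          /* The limit can also be changed after allocation. */
          workqueue_set_max_active(copy_wq, 4);
          return 0;
  }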
389 worker on the same CPU. This makes unbound workqueues behave as per-cpu
424 item starts execution, workqueue makes a best-effort attempt to ensure
443 kernel, there exists a pronounced trade-off between locality and utilization
450 testing with dm-crypt clearly illustrates this trade-off.
452 The tests are run on a CPU with 12-cores/24-threads split across four L3
454 ``/dev/dm-0`` is a dm-crypt device created on an NVMe SSD (Samsung 990 PRO) and
458 Scenario 1: Enough issuers and work spread across the machine
459 -------------------------------------------------------------
463 $ fio --filename=/dev/dm-0 --direct=1 --rw=randrw --bs=32k --ioengine=libaio \
464 --iodepth=64 --runtime=60 --numjobs=24 --time_based --group_reporting \
465 --name=iops-test-job --verify=sha512
467 There are 24 issuers, each issuing 64 IOs concurrently. ``--verify=sha512``
474 .. list-table::
476 :header-rows: 1
478 * - Affinity
479 - Bandwidth (MiBps)
480 - CPU util (%)
482 * - system
483 - 1159.40 ±1.34
484 - 99.31 ±0.02
486 * - cache
487 - 1166.40 ±0.89
488 - 99.34 ±0.01
490 * - cache (strict)
491 - 1166.00 ±0.71
492 - 99.35 ±0.01
496 machine but the cache-affine ones outperform by 0.6% thanks to improved
500 Scenario 2: Fewer issuers, enough work for saturation
501 -----------------------------------------------------
505 $ fio --filename=/dev/dm-0 --direct=1 --rw=randrw --bs=32k \
506 --ioengine=libaio --iodepth=64 --runtime=60 --numjobs=8 \
507 --time_based --group_reporting --name=iops-test-job --verify=sha512
509 The only difference from the previous scenario is ``--numjobs=8``. There are
513 .. list-table::
515 :header-rows: 1
517 * - Affinity
518 - Bandwidth (MiBps)
519 - CPU util (%)
521 * - system
522 - 1155.40 ±0.89
523 - 97.41 ±0.05
525 * - cache
526 - 1154.40 ±1.14
527 - 96.15 ±0.09
529 * - cache (strict)
530 - 1112.00 ±4.64
531 - 93.26 ±0.35
543 Scenario 3: Even fewer issuers, not enough work to saturate
544 -----------------------------------------------------------
548 $ fio --filename=/dev/dm-0 --direct=1 --rw=randrw --bs=32k \
549 --ioengine=libaio --iodepth=64 --runtime=60 --numjobs=4 \
550 --time_based --group_reporting --name=iops-test-job --verify=sha512
552 Again, the only difference is ``--numjobs=4``. With the number of issuers
556 .. list-table::
558 :header-rows: 1
560 * - Affinity
561 - Bandwidth (MiBps)
562 - CPU util (%)
564 * - system
565 - 993.60 ±1.82
566 - 75.49 ±0.06
568 * - cache
569 - 973.40 ±1.52
570 - 74.90 ±0.07
572 * - cache (strict)
573 - 828.20 ±4.49
574 - 66.84 ±0.29
580 Conclusion and Recommendations
581 ------------------------------
588 While the loss of work-conservation in certain scenarios hurts, it is a lot
599 ``WQ_CPU_INTENSIVE`` per-cpu workqueue. There is no real advantage to the
605 * The loss of work-conservation in non-strict affinity scopes is likely
608 work-conservation in most cases. As such, it is possible that future
650 pod_node [0]=-1
656 pool[01] ref= 1 nice=-20 idle/workers= 2/ 2 cpu= 0
658 pool[03] ref= 1 nice=-20 idle/workers= 2/ 2 cpu= 1
660 pool[05] ref= 1 nice=-20 idle/workers= 2/ 2 cpu= 2
662 pool[07] ref= 1 nice=-20 idle/workers= 2/ 2 cpu= 3
666 pool[11] ref= 1 nice=-20 idle/workers= 1/ 1 cpus=0000000f
667 pool[12] ref= 2 nice=-20 idle/workers= 1/ 1 cpus=00000003
668 pool[13] ref= 2 nice=-20 idle/workers= 1/ 1 cpus=0000000c
670 Workqueue CPU -> pool
696 events 18545 0 6.1 0 5 - -
697 events_highpri 8 0 0.0 0 0 - -
698 events_long 3 0 0.0 0 0 - -
699 events_unbound 38306 0 0.1 - 7 - -
700 events_freezable 0 0 0.0 0 0 - -
701 events_power_efficient 29598 0 0.2 0 0 - -
702 events_freezable_pwr_ef 10 0 0.0 0 0 - -
703 sock_diag_events 0 0 0.0 0 0 - -
706 events 18548 0 6.1 0 5 - -
707 events_highpri 8 0 0.0 0 0 - -
708 events_long 3 0 0.0 0 0 - -
709 events_unbound 38322 0 0.1 - 7 - -
710 events_freezable 0 0 0.0 0 0 - -
711 events_power_efficient 29603 0 0.2 0 0 - -
712 events_freezable_pwr_ef 10 0 0.0 0 0 - -
713 sock_diag_events 0 0 0.0 0 0 - -
760 Non-reentrance Conditions
763 Workqueue guarantees that a work item cannot be re-entrant if the following
771 executed by at most one worker system-wide at any given time.
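A hedged sketch (hypothetical names) of a pattern that leans on this guarantee: a self-requeueing work item never races with another instance of itself, so state touched only by the work function needs no extra locking against a second invocation::

  #include <linux/jiffies.h>
  #include <linux/workqueue.h>

  static void frob_poll_fn(struct work_struct *work);
  static DECLARE_DELAYED_WORK(frob_poll_work, frob_poll_fn);

  static void frob_poll_fn(struct work_struct *work)
  {
          /* At most one worker executes this item at any given time, so this
           * function never runs concurrently with itself. */
          schedule_delayed_work(&frob_poll_work, HZ);
  }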
781 .. kernel-doc:: include/linux/workqueue.h
783 .. kernel-doc:: kernel/workqueue.c