=================================================================
CPU Scheduler implementation hints for architecture specific code
=================================================================

	Nick Piggin, 2005

Context switch
==============

1. Runqueue locking

By default, the switch_to arch function is called with the runqueue
locked. This is usually not a problem unless switch_to itself needs to
take the runqueue lock, typically because the context switch performs
a wake-up operation. See arch/ia64/include/asm/switch_to.h for an example.

To request that the scheduler call switch_to with the runqueue unlocked,
you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file
(typically the one where switch_to is defined).

Unlocked context switches introduce only a very minor performance
penalty to the core scheduler implementation in the CONFIG_SMP case.
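
The effect of this choice can be illustrated with a small user-space
model. This is only a sketch: struct model_rq and the model_* functions
are hypothetical stand-ins for the runqueue lock and the arch switch_to,
not kernel code, and the unlocked_ctxsw argument plays the role that
__ARCH_WANT_UNLOCKED_CTXSW plays at compile time.

```c
#include <stdbool.h>

/* Hypothetical model of a runqueue; none of this is kernel code. */
struct model_rq {
	bool locked;	/* stands in for the runqueue spinlock */
	int wakeups;	/* wake-ups completed during the switch */
	int deadlocks;	/* lock requested while it was already held */
};

/* Returns false where a real spinlock would spin forever. */
static bool model_rq_lock(struct model_rq *rq)
{
	if (rq->locked) {
		rq->deadlocks++;
		return false;
	}
	rq->locked = true;
	return true;
}

static void model_rq_unlock(struct model_rq *rq)
{
	rq->locked = false;
}

/* An arch switch_to that needs the runqueue lock for a wake-up. */
static void model_switch_to(struct model_rq *rq)
{
	if (!model_rq_lock(rq))
		return;
	rq->wakeups++;
	model_rq_unlock(rq);
}

/* The core scheduler side: drop the lock first only if asked to. */
static void model_context_switch(struct model_rq *rq, bool unlocked_ctxsw)
{
	model_rq_lock(rq);		/* next task is picked under the lock */
	if (unlocked_ctxsw)
		model_rq_unlock(rq);
	model_switch_to(rq);
	if (!unlocked_ctxsw)
		model_rq_unlock(rq);
}
```

In this model, a locked switch records a deadlock and no wake-up, while
an unlocked switch completes the wake-up and leaves the lock released.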

CPU idle
========
Your cpu_idle routines need to obey the following rules:

1. Preemption should now be disabled over idle routines. It should
   only be enabled to call schedule() and then disabled again.

2. need_resched/TIF_NEED_RESCHED is only ever set, and will never
   be cleared until the running task has called schedule(). Idle
   threads need only ever query need_resched, and may never set or
   clear it.

3. When cpu_idle finds that need_resched() is true, it should call
   schedule(). It should not call schedule() otherwise.

4. The only time interrupts need to be disabled when checking
   need_resched is if we are about to sleep the processor until
   the next interrupt (this doesn't provide any protection of
   need_resched; it prevents losing an interrupt):

	4a. A common problem with this type of sleep appears to be::

	        local_irq_disable();
	        if (!need_resched()) {
	                local_irq_enable();
	                /* resched interrupt arrives here */
	                __asm__("sleep until next interrupt");
	        }

5. TIF_POLLING_NRFLAG can be set by idle routines that do not
   need an interrupt to wake them up when need_resched goes high.
   In other words, they must be periodically polling need_resched,
   although it may be reasonable to do some background work or enter
   a low CPU priority.

      - 5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
	an interrupt sleep, it needs to be cleared and then a memory
	barrier issued (followed by a test of need_resched with
	interrupts disabled, as explained in 4).
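
The race in 4a, and one way around it, can be sketched with a
user-space model of a single CPU. Everything here is hypothetical
(struct model_cpu and the model_* helpers are not kernel APIs). The
safe variant assumes the architecture can enable interrupts and enter
the sleep as one atomic step, as x86 does with the `sti; hlt` sequence;
whether a given architecture offers this is for the arch code to verify.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU; not kernel code. */
struct model_cpu {
	bool irqs_on;		/* interrupts enabled? */
	bool irq_pending;	/* raised but not yet delivered */
	bool need_resched;	/* set only by the interrupt handler */
	bool stuck_asleep;	/* slept with need_resched already set */
};

/* Deliver a pending resched interrupt if interrupts are enabled. */
static void model_deliver(struct model_cpu *c)
{
	if (c->irqs_on && c->irq_pending) {
		c->irq_pending = false;
		c->need_resched = true;
	}
}

static void model_raise_resched_irq(struct model_cpu *c)
{
	c->irq_pending = true;
	model_deliver(c);
}

/*
 * Sleep until the next interrupt. A pending interrupt wakes the CPU
 * at once; otherwise no further interrupt arrives in this model.
 */
static void model_sleep(struct model_cpu *c)
{
	if (c->irq_pending) {
		model_deliver(c);
		return;
	}
	c->stuck_asleep = c->need_resched;
}

/* The pattern from 4a: interrupts are enabled before the sleep. */
static void model_racy_idle(struct model_cpu *c, bool irq_in_window)
{
	c->irqs_on = false;
	if (!c->need_resched) {
		c->irqs_on = true;
		if (irq_in_window)
			model_raise_resched_irq(c);	/* handled right here */
		model_sleep(c);
	}
}

/* Interrupts stay off until an atomic "enable and sleep" step. */
static void model_safe_idle(struct model_cpu *c, bool irq_in_window)
{
	c->irqs_on = false;
	if (!c->need_resched) {
		if (irq_in_window)
			model_raise_resched_irq(c);	/* held pending */
		c->irqs_on = true;	/* modelled as atomic with the sleep */
		model_sleep(c);
	} else {
		c->irqs_on = true;
	}
}
```

With an interrupt arriving in the window, the racy variant ends up
asleep with need_resched set, while the safe variant is woken by the
pending interrupt and can go on to call schedule().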

arch/x86/kernel/process.c has examples of both polling and
sleeping idle functions.
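
The handshake behind TIF_POLLING_NRFLAG can also be sketched in user
space. This is an illustrative model only, not the x86 code referred to
above: model_flags and the MODEL_* bits are hypothetical stand-ins for
the thread flags, and C11 atomics stand in for the kernel's bit
operations and memory barriers. The idle side only sets and clears its
own polling bit; it never touches the need_resched bit.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NEED_RESCHED	(1u << 0)
#define MODEL_POLLING		(1u << 1)

/* Hypothetical stand-in for the idle task's thread flags. */
static atomic_uint model_flags;

/*
 * Waker side: set need_resched. Returns true if an interrupt must be
 * sent because the idle CPU was not polling at that moment.
 */
static bool model_resched_cpu(void)
{
	unsigned int old = atomic_fetch_or(&model_flags, MODEL_NEED_RESCHED);

	return !(old & MODEL_POLLING);
}

/* Idle side: announce that need_resched is being polled. */
static void model_start_polling(void)
{
	atomic_fetch_or(&model_flags, MODEL_POLLING);
}

/*
 * Idle side, before an interrupt sleep: clear the polling bit, issue a
 * full barrier, then re-test need_resched. Returns true only if it is
 * still safe to sleep.
 */
static bool model_stop_polling_and_may_sleep(void)
{
	atomic_fetch_and(&model_flags, ~MODEL_POLLING);
	atomic_thread_fence(memory_order_seq_cst);
	return !(atomic_load(&model_flags) & MODEL_NEED_RESCHED);
}
```

A waker that saw the polling bit sends no interrupt, so the re-test
after the barrier is what stops the idle CPU from sleeping through that
request. In real arch code the re-test would additionally be done with
interrupts disabled, as rule 4 describes.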


Possible arch/ problems
=======================

Possible arch problems I found (and either tried to fix or didn't):

ia64 - Is the safe_halt call racy vs interrupts? (Does it sleep?) (See #4a)

sh64 - Is sleeping racy vs interrupts? (See #4a)

sparc - IRQs on at this point(?), change local_irq_save to _disable.
      - TODO: needs secondary CPUs to disable preempt (See #1)