<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/source/rss.xsl.xml"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
    <title>Changes in entry-percpu.h</title>
    <description></description>
    <language>en</language>
    <copyright>Copyright 2015</copyright>
    <generator>Java</generator>
    <item>
        <title>a737737cdb9c94e40a9926cdc2320f874c05d709 - s390/percpu: Infrastructure for more efficient this_cpu operations</title>
        <link>http://kernelsources.org:8080/source/history/linux/arch/s390/include/asm/entry-percpu.h#a737737cdb9c94e40a9926cdc2320f874c05d709</link>
        <description>s390/percpu: Infrastructure for more efficient this_cpu operations

With the intended removal of PREEMPT_NONE this_cpu operations based on
atomic instructions, guarded with preempt_disable()/preempt_enable() pairs
become more expensive: the preempt_disable() / preempt_enable() pairs are
not optimized away anymore during compile time.

In particular the conditional call to preempt_schedule_notrace() after
preempt_enable() adds additional code and register pressure.

E.g. this simple C code sequence

DEFINE_PER_CPU(long, foo);
long bar(long a) { return this_cpu_add_return(foo, a); }

generates this code:

  11a976:       eb af f0 68 00 24       stmg    %r10,%r15,104(%r15)
  11a97c:       b9 04 00 ef             lgr     %r14,%r15
  11a980:       b9 04 00 b2             lgr     %r11,%r2
  11a984:       e3 f0 ff c8 ff 71       lay     %r15,-56(%r15)
  11a98a:       e3 e0 f0 98 00 24       stg     %r14,152(%r15)
  11a990:       eb 01 03 a8 00 6a       asi     936,1            &lt;- __preempt_count_add(1)
  11a996:       c0 10 00 d2 ac b5       larl    %r1,1b70300      &lt;- address of percpu var
  11a9a0:       e3 10 23 b8 00 08       ag      %r1,952          &lt;- add percpu offset
  11a9a6:       eb ab 10 00 00 e8       laag    %r10,%r11,0(%r1) &lt;- atomic op
  11a9ac:       eb ff 03 a8 00 6e       alsi    936,-1           &lt;- __preempt_count_dec_and_test()
  11a9b2:       a7 54 00 05             jnhe    11a9bc &lt;bar+0x4c&gt;
  11a9b6:       c0 e5 00 76 d1 bd       brasl   %r14,ff4d30 &lt;preempt_schedule_notrace&gt;
  11a9bc:       b9 e8 b0 2a             agrk    %r2,%r10,%r11
  11a9c0:       eb af f0 a0 00 04       lmg     %r10,%r15,160(%r15)
  11a9c6        07 fe                   br      %r14

Even though the above example is more or less the worst case, since the
branch to preempt_schedule_notrace() requires a stackframe, which
otherwise wouldn&apos;t be necessary, there is also the conditional jnhe branch
instruction.

Get rid of the conditional branch with the following code sequence:

  11a8e6:       c0 30 00 d0 c5 0d       larl    %r3,1b33300
  11a8ec:       b9 04 00 43             lgr     %r4,%r3
  11a8f0:       eb 00 43 c0 00 52       mviy    960,4
  11a8f6:       e3 40 03 b8 00 08       ag      %r4,952
  11a8fc:       eb 52 40 00 00 e8       laag    %r5,%r2,0(%r4)
  11a902:       eb 00 03 c0 00 52       mviy    960,0
  11a908:       b9 08 00 25             agr     %r2,%r5
  11a90c        07 fe                   br      %r14

The general idea is that this_cpu operations based on atomic instructions
are guarded with mviy instructions:

- The first mviy instruction writes the register number, which contains
  the percpu address variable to lowcore. This also indicates that a
  percpu code section is executed.

- The first instruction following the mviy instruction must be the ag
  instruction which adds the percpu offset to the percpu address register.

- Afterwards the atomic percpu operation follows.

- Then a second mviy instruction writes a zero to lowcore, which indicates
  the end of the percpu code section.

- In case of an interrupt/exception/nmi the register number which was
  written to lowcore is copied to the exception frame (pt_regs), and a zero
  is written to lowcore.

- On return to the previous context it is checked if a percpu code section
  was executed (saved register number not zero), and if the process was
  migrated to a different cpu. If the percpu offset was already added to
  the percpu address register (instruction address does _not_ point to the
  ag instruction) the content of the percpu address register is adjusted so
  it points to percpu variable of the new cpu.

Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;

            List of files:
            /linux/arch/s390/include/asm/entry-percpu.h</description>
        <pubDate>Tue, 26 May 2026 07:56:56 +0200</pubDate>
        <dc:creator>Heiko Carstens &lt;hca@linux.ibm.com&gt;</dc:creator>
    </item>
</channel>
</rss>
