/linux/arch/parisc/include/asm/ |
H A D | futex.h | diff d0585d742ff2d82accd26c661c60a6d260429c4a Tue Jan 04 22:44:32 CET 2022 John David Anglin <dave.anglin@bell.net> parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap. Particularly for userspace, it is difficult to implement software atomic support. Page faults in critical regions can cause processes to sleep and block the forward progress of other processes. Thus, it is essential that page faults be disabled in critical regions. For performance reasons, we also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the critical region. Fortunately, parisc has the "stbys,e" instruction. When the leftmost byte of a word is addressed, this instruction triggers all the exceptions of a normal store but it does not write to memory. Thus, we can use it to trigger COW breaks outside the critical region without modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e" instruction, we still need to disable pagefaults around the critical region. If a fault occurs in the critical region, we return -EAGAIN. I had to add a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that returning -EAGAIN caused problems for some processes even though it is listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it might be possible to allow processes to be scheduled when they are on the gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test time by about 10%.
In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from arch_futex_atomic_op_inuser(). It is always called with page faults disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
|
/linux/arch/parisc/kernel/ |
H A D | asm-offsets.c | diff d0585d742ff2d82accd26c661c60a6d260429c4a Tue Jan 04 22:44:32 CET 2022 John David Anglin <dave.anglin@bell.net> parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap. Particularly for userspace, it is difficult to implement software atomic support. Page faults in critical regions can cause processes to sleep and block the forward progress of other processes. Thus, it is essential that page faults be disabled in critical regions. For performance reasons, we also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the critical region. Fortunately, parisc has the "stbys,e" instruction. When the leftmost byte of a word is addressed, this instruction triggers all the exceptions of a normal store but it does not write to memory. Thus, we can use it to trigger COW breaks outside the critical region without modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e" instruction, we still need to disable pagefaults around the critical region. If a fault occurs in the critical region, we return -EAGAIN. I had to add a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that returning -EAGAIN caused problems for some processes even though it is listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it might be possible to allow processes to be scheduled when they are on the gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test time by about 10%.
In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from arch_futex_atomic_op_inuser(). It is always called with page faults disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
|
H A D | syscall.S | diff d0585d742ff2d82accd26c661c60a6d260429c4a Tue Jan 04 22:44:32 CET 2022 John David Anglin <dave.anglin@bell.net> parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap. Particularly for userspace, it is difficult to implement software atomic support. Page faults in critical regions can cause processes to sleep and block the forward progress of other processes. Thus, it is essential that page faults be disabled in critical regions. For performance reasons, we also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the critical region. Fortunately, parisc has the "stbys,e" instruction. When the leftmost byte of a word is addressed, this instruction triggers all the exceptions of a normal store but it does not write to memory. Thus, we can use it to trigger COW breaks outside the critical region without modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e" instruction, we still need to disable pagefaults around the critical region. If a fault occurs in the critical region, we return -EAGAIN. I had to add a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that returning -EAGAIN caused problems for some processes even though it is listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it might be possible to allow processes to be scheduled when they are on the gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test time by about 10%.
In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from arch_futex_atomic_op_inuser(). It is always called with page faults disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
|