2f681ba4 | 11-Nov-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: move thread info into task
This selects the THREAD_INFO_IN_TASK option for UM and changes the way that the current task is discovered. This is trivial though, as UML already tracks the current task in cpu_tasks[] and this can be used to retrieve it.
Also remove the signal handler code that copies the thread information into the IRQ stack. It is obsolete now, which also means that the mentioned race condition cannot happen anymore.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Hajime Tazaki <thehajime@gmail.com>
Link: https://patch.msgid.link/20241111102910.46512-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
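The lookup described in this commit message can be pictured with a small user-space sketch. It is not the kernel code: the struct layouts, `NR_CPUS` value, and the helpers `get_current_task()` and `get_current_thread_info()` are hypothetical stand-ins, and only `cpu_tasks[]` and the idea of embedding the thread info in the task come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct thread_info {
    unsigned long flags;
};

struct task_struct {
    /* With THREAD_INFO_IN_TASK the thread info is embedded in the task. */
    struct thread_info thread_info;
    int pid;
};

#define NR_CPUS 1 /* assumption for this sketch */

/* Per-CPU record of the running task, which the message says UML keeps. */
static struct task_struct *cpu_tasks[NR_CPUS];

/* Hypothetical helper: look the current task up directly. */
static struct task_struct *get_current_task(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_tasks[cpu];
}

/* The thread info is then reached through the task, not through the stack. */
static struct thread_info *get_current_thread_info(int cpu)
{
    return &get_current_task(cpu)->thread_info;
}
```

Because the thread info is reached through the task, nothing has to be copied onto another stack when a signal handler switches stacks, which is consistent with the removal described above.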
|
ce6e85a1 | 03-Nov-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: remove broken double fault detection
The show_stack function had some code to detect double faults. However, the logic is wrong and it would e.g. trigger if a WARNING happened inside an IRQ.
Remove it without trying to add new logic. The current behaviour, which will just fault repeatedly until the IRQ stack is used up and the host kills UML, seems to be good enough.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20241103150506.1367695-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
b69f22df | 03-Nov-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: remove duplicate UM_NSEC_PER_SEC definition
Just remove the first entry as there is a second later on.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20241103150506.1367695-4-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
37c69115 | 03-Nov-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: remove file sync for stub data
There is no need to sync the stub code to "disk" for the other process to see the correct memory. Drop the fsync there and remove the helper function.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20241103150506.1367695-3-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
0b8b2668 | 10-Oct-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: insert scheduler ticks when userspace does not yield
In time-travel mode userspace can do a lot of work without any time passing. Unfortunately, this can result in OOM situations as the RCU core code will never be run.
Work around this by keeping track of userspace processes that do not yield for a large number of operations. When this happens, insert a jiffy into the sched_clock clock to account time against the process and cause the bookkeeping to run.
As sched_clock is used for tracing, it is useful to keep it in sync between the different VMs. As such, try to remove added ticks again when the actual clock ticks.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20241010142537.1134685-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
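The accounting scheme in this message can be sketched as follows. This is only an illustration under stated assumptions: the struct, the function names, the threshold, and the jiffy length (HZ=100) are all hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* All names and constants below are hypothetical. */
#define NSEC_PER_JIFFY     10000000ULL /* assumes HZ=100 */
#define NO_YIELD_THRESHOLD 1000U       /* arbitrary illustration value */

struct tt_clock {
    uint64_t real_ns;     /* simulated time, advances only on real ticks */
    uint64_t inserted_ns; /* extra time inserted for busy userspace */
    unsigned int ops_without_yield;
};

/* Value that a sched_clock()-like function would report. */
static uint64_t tt_sched_clock(const struct tt_clock *c)
{
    return c->real_ns + c->inserted_ns;
}

/* Called for each userspace operation handled without time passing. */
static void tt_account_userspace_op(struct tt_clock *c)
{
    if (++c->ops_without_yield >= NO_YIELD_THRESHOLD) {
        c->inserted_ns += NSEC_PER_JIFFY; /* insert one jiffy */
        c->ops_without_yield = 0;
    }
}

/* Called when the actual clock advances by one jiffy. */
static void tt_real_tick(struct tt_clock *c)
{
    uint64_t before = tt_sched_clock(c);

    c->real_ns += NSEC_PER_JIFFY;
    /* Take back one inserted jiffy, if any, so the clocks converge again. */
    if (c->inserted_ns >= NSEC_PER_JIFFY)
        c->inserted_ns -= NSEC_PER_JIFFY;
    c->ops_without_yield = 0;

    /* The reported clock must never go backwards. */
    assert(tt_sched_clock(c) >= before);
}
```

In this sketch a real tick that cancels an inserted jiffy leaves the reported clock unchanged, so the inserted time is absorbed rather than accumulated.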
|
188b64f2 | 10-Oct-2024 |
Johannes Berg <johannes.berg@intel.com> |
um: remove fault_catcher infrastructure
This was perhaps intended to do _nofault copies, but the real reason is lost to history. Remove this; it's not needed, and using longjmp() out of the middle of the signal handler, with all the state it has modified, is not going to be a good idea anyway.
Link: https://patch.msgid.link/20241010224513.901c4d390b3e.Ia74742668b44603c1ca23dd36f90e964e6e7ee55@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
68b9883c | 19-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: Discover host_task_size from envp
When loading the UML binary, the host kernel will place the stack at the highest possible address. It will then map the program name and environment variables onto the start of the stack.
As such, an easy way to figure out the host_task_size is to use the highest pointer to an environment variable as a reference.
Ensure that this works by disabling address layout randomization and re-executing UML in case it was enabled.
This increases the available TASK_SIZE for 64 bit UML considerably.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240919124511.282088-9-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
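A minimal sketch of the "highest environment pointer" idea from this message is shown below. The helper name `estimate_task_size()`, the fixed 4096-byte page size, and the choice to round up to the next page boundary are assumptions made for illustration; the patch itself may compute the value differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_ASSUMED 4096UL /* assumption; real code would query the host */

/*
 * Hypothetical helper: find the end of the highest environment string and
 * round it up to a page boundary as an estimate of the top of the usable
 * address space. Returns 0 for an empty environment.
 */
static uintptr_t estimate_task_size(char **envp)
{
    uintptr_t top = 0;

    assert(envp != NULL);
    for (; *envp != NULL; envp++) {
        /* One past the terminating NUL of this variable. */
        uintptr_t end = (uintptr_t)*envp + strlen(*envp) + 1;

        if (end > top)
            top = end;
    }

    return (top + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1);
}
```

The estimate is only meaningful when the strings really sit at the top of the stack mapping, which is why the message pairs it with disabling address layout randomization and re-executing.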
|
32e8eaf2 | 19-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: use execveat to create userspace MMs
Using clone will not undo features that have been enabled by libc. An example of this already happening is rseq, which could cause the kernel to read/write memory of the userspace process. In the future the standard library might also use mseal by default to protect itself, which would also thwart our attempts at unmapping everything.
Solve all this by taking a step back and doing an execve into a tiny static binary that sets up the minimal environment required for the stub without using any standard library. That way we have a clean execution environment that is fully under the control of UML.
Note that this changes things a bit, as the FDs are no longer shared with the kernel. Instead, we explicitly share the FDs for the physical memory and all existing iomem regions. Doing this is fine, as iomem regions cannot be added at runtime.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240919124511.282088-3-benjamin@sipsolutions.net
[use pipe() instead of pipe2(), remove unneeded close() calls]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
5a695127 | 13-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: always use the internal copy of the FP registers
When switching from userspace to the kernel, all registers including the FP registers are copied into the kernel and restored later on. As such, the true source for the FP register state is actually already in the kernel and they should never be grabbed from the userspace process.
Change the various places to simply copy the data from the internal FP register storage area. Note that on i386 the format of PTRACE_GETFPREGS and PTRACE_GETFPXREGS is different enough that conversion would be needed. With this patch, -EINVAL is returned if the non-native format is requested.
The upside is that this patch fixes setting registers via ptrace (which simply did not work before) and also fixes setting floating point registers using the mcontext on signal return on i386.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240913133845.964292-1-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
242fef36 | 16-Sep-2024 |
Tiwei Bie <tiwei.btw@antgroup.com> |
um: Fix the definition for physmem_size
Currently physmem_size is defined as long long but declared locally as unsigned long long before using it in separate .c files. Make them match by defining physmem_size as unsigned long long and also move the declaration to a common header to allow the compiler to check it.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20240916045950.508910-5-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
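The reason a shared declaration helps can be shown in a single-file sketch. The three commented sections stand in for a common header and two separate .c files; the helper `physmem_pages()` is hypothetical, and only the name and type of `physmem_size` come from the message.

```c
#include <assert.h>

/* --- Stand-in for the common header: one declaration for everyone. --- */
extern unsigned long long physmem_size;

/*
 * --- Stand-in for the defining .c file. ---
 * Because the declaration above is visible here, a mismatched definition
 * such as "long long physmem_size;" would be rejected by the compiler as
 * a conflicting type instead of going unnoticed.
 */
unsigned long long physmem_size;

/*
 * --- Stand-in for a user in another .c file. ---
 * It relies on the shared declaration rather than a local extern.
 * physmem_pages() is a hypothetical helper for illustration only.
 */
static unsigned long long physmem_pages(unsigned long long page_size)
{
    assert(page_size != 0);
    return physmem_size / page_size;
}
```

With file-local extern declarations, each translation unit is compiled against its own idea of the type, and the mismatch is not diagnosed at compile time.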
|
71fae9df | 13-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: Remove unused os_getpgrp function
The function is not used anywhere.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240913134442.967599-5-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
377c23c5 | 13-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: Remove unused os_stop_process
The function is not used anywhere.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240913134442.967599-4-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
47e17496 | 13-Sep-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: Remove unused os_process_parent
The function is not used anywhere.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240913134442.967599-3-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
fe6abeba | 26-Aug-2024 |
Tiwei Bie <tiwei.btw@antgroup.com> |
um: Remove the declaration of user_thread function
This function has never been defined since its declaration was introduced by commit 1da177e4c3f4 ("Linux-2.6.12-rc2").
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
bcf3d957 | 03-Jul-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: refactor TLB update handling
Conceptually, we want the memory mappings to always be up to date and represent whatever is in the TLB. To ensure that, we need to sync them over in the userspace case and for the kernel we need to process the mappings.
The kernel will call flush_tlb_* if page table entries that were valid before become invalid. Unfortunately, this is not the case if entries are added.
As such, change both flush_tlb_* and set_ptes to track the memory range that has to be synchronized. For the kernel, we need to execute a flush_tlb_kern_* immediately, but we can wait for the first page fault in case of set_ptes. For userspace, in contrast, we only store that a range of memory needs to be synced and do so whenever we switch to that process.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240703134536.1161108-13-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
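The range tracking described in this message can be sketched as a single pending interval that is widened on every update and consumed when the sync actually happens. The struct and both function names are hypothetical, and using one merged range per address space is an assumption of this sketch, not something the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-address-space record of what still has to be synced. */
struct sync_range {
    unsigned long start; /* inclusive */
    unsigned long end;   /* exclusive; start == end means nothing pending */
};

/* Called from both the flush and the set-PTE paths: widen the pending range. */
static void mark_range_for_sync(struct sync_range *r,
                                unsigned long start, unsigned long end)
{
    assert(start < end);

    if (r->start == r->end) {
        /* Nothing pending yet, so adopt the new range as is. */
        r->start = start;
        r->end = end;
        return;
    }
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/*
 * Called at the sync point (for userspace, when switching to the process).
 * Hands out the pending range and clears it; returns false if there is
 * nothing to do.
 */
static bool take_pending_range(struct sync_range *r,
                               unsigned long *start, unsigned long *end)
{
    if (r->start == r->end)
        return false;

    *start = r->start;
    *end = r->end;
    r->start = r->end = 0;
    return true;
}
```

Merging into one interval may sync more memory than strictly necessary when two updates are far apart; it trades precision for very cheap bookkeeping.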
|
573a446f | 03-Jul-2024 |
Benjamin Berg <benjamin.berg@intel.com> |
um: simplify and consolidate TLB updates
The HVC update was mostly used to compress consecutive calls into one. This is mostly relevant for userspace where it is already handled by the syscall stub code.
Simplify the whole logic and consolidate it for both kernel and userspace. This does remove the sequential syscall compression for the kernel; however, that shouldn't be the main factor in most runs.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240703134536.1161108-12-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
3c83170d | 03-Jul-2024 |
Benjamin Berg <benjamin@sipsolutions.net> |
um: Delay flushing syscalls until the thread is restarted
As running the syscalls is expensive due to context switches, we should do so as late as possible in case more syscalls need to be queued later on. This will also benefit a later move to a SECCOMP enabled userspace as in that case the need for extra context switches is removed entirely.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Link: https://patch.msgid.link/20240703134536.1161108-9-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
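The benefit of flushing late can be seen in a small queueing sketch. Everything here is hypothetical (the struct, the queue size, and the function names); the `flushes` counter simply stands in for the expensive context switch that the message wants to avoid.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUED 16 /* arbitrary illustration value */

/* Hypothetical record of one queued stub syscall. */
struct queued_call {
    long nr;
    long arg;
};

struct stub_queue {
    struct queued_call calls[MAX_QUEUED];
    size_t count;
    unsigned int flushes; /* counts the expensive switches to the stub */
};

/* Run everything that is queued in one go. */
static void flush_queue(struct stub_queue *q)
{
    if (q->count == 0)
        return;
    /* A real implementation would switch to the stub and run the calls. */
    q->flushes++;
    q->count = 0;
}

/* Queue a call without running it; flush early only if the queue is full. */
static void queue_call(struct stub_queue *q, long nr, long arg)
{
    assert(q != NULL);
    if (q->count == MAX_QUEUED)
        flush_queue(q);
    q->calls[q->count].nr = nr;
    q->calls[q->count].arg = arg;
    q->count++;
}

/* The only regular flush point: right before the thread runs again. */
static void restart_thread(struct stub_queue *q)
{
    flush_queue(q);
}
```

In this sketch three queued calls followed by one restart cost a single flush, where flushing after every call would have cost three.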
|