#
04e83267 |
| 15-Dec-2024 |
Bojan Novković <bnovkov@FreeBSD.org> |
x86: Allow sharing of perfomance counter interrupts
This patch refactors the Performance Counter interrupt setup code to allow sharing the interrupt line between multiple drivers. More specifically,
x86: Allow sharing of perfomance counter interrupts
This patch refactors the Performance Counter interrupt setup code to allow sharing the interrupt line between multiple drivers. More specifically, Performance Counter interrupts are used by both hwpmc(4) and hwt(4)'s upcoming Intel Processor Trace backend.
Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D46420
show more ...
|
Revision tags: release/14.2.0 |
|
#
90fb07ed |
| 01-Oct-2024 |
Elliott Mitchell <ehem+freebsd@m5p.com> |
intr/x86: add ioapic_drv_t to reduce number of casts in IO-APIC implementation
void * is handy when you truly do not care about the type. Yet there is so much casting back and forth in the IO-APIC
intr/x86: add ioapic_drv_t to reduce number of casts in IO-APIC implementation
void * is handy when you truly do not care about the type. Yet there is so much casting back and forth in the IO-APIC code as to be hazardous. Achieve better static checking by the compiler using a typedef.
Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1457
show more ...
|
#
ea4e4449 |
| 13-Oct-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
apic: add ioapic_get_dev() method
which returns apic device_t by apic_id, if there exists the pci representer
Sponsored by: Advanced Micro Devices (AMD) Sponsored by: The FreeBSD Foundation MFC aft
apic: add ioapic_get_dev() method
which returns apic device_t by apic_id, if there exists the pci representer
Sponsored by: Advanced Micro Devices (AMD) Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
Revision tags: release/13.4.0, release/14.1.0 |
|
#
4e328681 |
| 10-May-2024 |
Ed Maste <emaste@FreeBSD.org> |
Increase IOAPIC_MAX_ID to 255 (from 254)
A test system provided by AMD panicked with "madt_parse_apics: I/O APIC ID 255 too high". I/O APIC ID 255 is acceptable, so increase the limit.
Reviewed by
Increase IOAPIC_MAX_ID to 255 (from 254)
A test system provided by AMD panicked with "madt_parse_apics: I/O APIC ID 255 too high". I/O APIC ID 255 is acceptable, so increase the limit.
Reviewed by: jhb, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D45157
show more ...
|
Revision tags: release/13.3.0, release/14.0.0 |
|
#
95ee2897 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
4d846d26 |
| 10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
show more ...
|
Revision tags: release/13.2.0, release/12.4.0 |
|
#
c8113dad |
| 20-Oct-2022 |
Ed Maste <emaste@FreeBSD.org> |
Increase MAX_APIC_ID safeguard to 0x800
MAX_APIC_ID must be at least twice MAXCPU. Increase it to 0x800 so that it is possible to set MAXCPU to 512 or 1024 in a custom kernel config file.
Note tha
Increase MAX_APIC_ID safeguard to 0x800
MAX_APIC_ID must be at least twice MAXCPU. Increase it to 0x800 so that it is possible to set MAXCPU to 512 or 1024 in a custom kernel config file.
Note that increasing this limit does not itself cause any allocations to be larger; it just allows madt_parse_cpu() to process higher APIC IDs.
APIC IDs may be sparse and so we can waste memory. This is independent of this change, but becomes more of an issue as the maximum APIC ID grows. This should be addressed with future work.
Reviewed by: royger MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D37067
show more ...
|
Revision tags: release/13.1.0 |
|
#
e0516c75 |
| 13-Jan-2022 |
Roger Pau Monné <royger@FreeBSD.org> |
x86/apic: remove apic_ops
All supported Xen instances by FreeBSD provide a local APIC implementation, so there's no need to replace the native local APIC implementation anymore.
Leave just the ipi_
x86/apic: remove apic_ops
All supported Xen instances by FreeBSD provide a local APIC implementation, so there's no need to replace the native local APIC implementation anymore.
Leave just the ipi_vectored hook in order to be able to override it with an implementation based on event channels if the underlying local APIC is not virtualized by hardware. Note the hook cannot use ifuncs, because at the point where ifuncs are resolved the kernel doesn't yet know whether it will benefit from using the optimization.
Sponsored by: Citrix Systems R&D Reviewed by: kib Differential revision: https://reviews.freebsd.org/D33917
show more ...
|
#
62d09b46 |
| 06-Dec-2021 |
Mark Johnston <markj@FreeBSD.org> |
x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing calibration.
Change lapic_setup to avoid configuring the ti
x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing calibration.
Change lapic_setup to avoid configuring the timer when booting, and move calibration and initial configuration to a new lapic routine, lapic_calibrate_timer. This calibration will be initiated from cpu_initclocks(), before an eventtimer is selected.
Reviewed by: kib, jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33206
show more ...
|
Revision tags: release/12.3.0, release/13.0.0, release/12.2.0 |
|
#
ab6c81a2 |
| 01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
x86: clean up empty lines in .c and .h files
|
#
c7aa572c |
| 31-Jul-2020 |
Glen Barber <gjb@FreeBSD.org> |
MFH
Sponsored by: Rubicon Communications, LLC (netgate.com)
|
#
aba10e13 |
| 25-Jul-2020 |
Alexander Motin <mav@FreeBSD.org> |
Allow swi_sched() to be called from NMI context.
For purposes of handling hardware error reported via NMIs I need a way to escape NMI context, being too restrictive to do something significant.
To
Allow swi_sched() to be called from NMI context.
For purposes of handling hardware error reported via NMIs I need a way to escape NMI context, being too restrictive to do something significant.
To do it this change introduces new swi_sched() flag SWI_FROMNMI, making it careful about used KPIs. On platforms allowing IPI sending from NMI context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI, otherwise it works just like SWI_DELAY. To handle the delayed SWIs this patch calls clk_intr_event on every hardclock() tick.
MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25754
show more ...
|
#
e2c0e292 |
| 16-Jul-2020 |
Glen Barber <gjb@FreeBSD.org> |
MFH
Sponsored by: Rubicon Communications, LLC (netgate.com)
|
#
dc43978a |
| 14-Jul-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
amd64: allow parallel shootdown IPIs
Stop using smp_ipi_mtx to protect global shootdown state, and move/multiply the global state into pcpu. Now each CPU can initiate shootdown IPI independently fr
amd64: allow parallel shootdown IPIs
Stop using smp_ipi_mtx to protect global shootdown state, and move/multiply the global state into pcpu. Now each CPU can initiate shootdown IPI independently from other CPUs. Initiator enters critical section, then fills its local PCPU shootdown info (pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu, my_cpuid) for each target cpu. After that IPI is sent to all targets which scan for zeroed scoreboard generation words. Upon finding such word the shootdown data is read from corresponding cpu' pcpu, and generation is set. Meantime initiator loops waiting for all zeroed generations in scoreboard to update.
Initiator does not disable interrupts, which should allow non-invalidation IPIs from deadlocking, it only needs to disable preemption to pin itself to the instance of the pcpu smp_tlb data.
The generation is set before the actual invalidation is performed in handler. It is safe because target CPU cannot return to userspace before handler finishes. In principle only NMI can preempt the handler, but NMI would see the kernel handler frame and not touch not-invalidated user page table.
Handlers loop until they do not see zeroed scoreboard generations. This, together with hardware keeping one pending IPI in LAPIC IRR should prevent lost shootdowns.
Notes. 1. The code does protect writes to LAPIC ICR with exclusion. I believe this is fine because we in fact do not send IPIs from interrupt handlers. More for !x2APIC mode where ICR access for write requires two registers write, we disable interrupts around it. If considered incorrect, I can add per-cpu spinlock around ipi_send(). 2. Scoreboard lines owned by given target CPU can be padded to the cache line, to reduce ping-pong.
Reviewed by: markj (previous version) Discussed with: alc Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks Differential revision: https://reviews.freebsd.org/D25510
show more ...
|
Revision tags: release/11.4.0 |
|
#
59abbffa |
| 31-Jan-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357270 through r357349.
|
#
1c29da02 |
| 31-Jan-2020 |
Mark Johnston <markj@FreeBSD.org> |
Reimplement stack capture of running threads on i386 and amd64.
After r355784 the td_oncpu field is no longer synchronized by the thread lock, so the stack capture interrupt cannot be delievered pre
Reimplement stack capture of running threads on i386 and amd64.
After r355784 the td_oncpu field is no longer synchronized by the thread lock, so the stack capture interrupt cannot be delievered precisely. Fix this using a loop which drops the thread lock and restarts if the wrong thread was sampled from the stack capture interrupt handler.
Change the implementation to use a regular interrupt instead of an NMI. Now that we drop the thread lock, there is no advantage to the latter.
Simplify the KPIs. Remove stack_save_td_running() and add a return value to stack_save_td(). On platforms that do not support stack capture of running threads, stack_save_td() returns EOPNOTSUPP. If the target thread is running in user mode, stack_save_td() returns EBUSY.
Reviewed by: kib Reported by: mjg, pho Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D23355
show more ...
|
Revision tags: release/12.1.0, release/11.3.0 |
|
#
2aaf9152 |
| 18-Mar-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead@r345275
|
#
ff511f1f |
| 11-Mar-2019 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r344996
|
#
2e43efd0 |
| 06-Mar-2019 |
John Baldwin <jhb@FreeBSD.org> |
Drop "All rights reserved" from my copyright statements.
Reviewed by: rgrimes MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D19485
|
Revision tags: release/12.0.0 |
|
#
da2d1e9d |
| 29-Aug-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r338298 through r338391.
|
#
fd036dea |
| 28-Aug-2018 |
John Baldwin <jhb@FreeBSD.org> |
Dynamically allocate IRQ ranges on x86.
Previously, x86 used static ranges of IRQ values for different types of I/O interrupts. Interrupt pins on I/O APICs and 8259A PICs used IRQ values from 0 to
Dynamically allocate IRQ ranges on x86.
Previously, x86 used static ranges of IRQ values for different types of I/O interrupts. Interrupt pins on I/O APICs and 8259A PICs used IRQ values from 0 to 254. MSI interrupts used a compile-time-defined range starting at 256, and Xen event channels used a compile-time-defined range after MSI. Some recent systems have more than 255 I/O APIC interrupt pins which resulted in those IRQ values overflowing into the MSI range triggering an assertion failure.
Replace statically assigned ranges with dynamic ranges. Do a single pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to determine the total number of IRQs required. Allocate the interrupt source and interrupt count arrays dynamically once this pass has completed. To minimize runtime complexity these arrays are only sized once during bootup. The PIC range is determined by the PICs present in the system. The MSI and Xen ranges continue to use a fixed size, though this does make it possible to turn the MSI range size into a tunable in the future.
As a result, various places are updated to use dynamic limits instead of constants. In addition, the vmstat(8) utility has been taught to understand that some kernels may treat 'intrcnt' and 'intrnames' as pointers rather than arrays when extracting interrupt stats from a crashdump. This is determined by the presence (vs absence) of a global 'nintrcnt' symbol.
This change reverts r189404 which worked around a buggy BIOS which enumerated an I/O APIC twice (using the same memory mapped address for both entries but using an IRQ base of 256 for one entry and a valid IRQ base for the second entry). Making the "base" of MSI IRQ values dynamic avoids the panic that r189404 worked around, and there may now be valid I/O APICs with an IRQ base above 256 which this workaround would incorrectly skip.
If in the future the issue reported in PR 130483 reoccurs, we will have to add a pass over the I/O APIC entries in the MADT to detect duplicates using the memory mapped address and use some strategy to choose the "correct" one.
While here, reserve room in intrcnts for the Hyper-V counters.
PR: 229429, 130483 Reviewed by: kib, royger, cem Tested by: royger (Xen), kib (DMAR) Approved by: re (gjb) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16861
show more ...
|
#
7847e041 |
| 24-Aug-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r338026 through r338297, and resolve conflicts.
|
#
a5688189 |
| 19-Aug-2018 |
John Baldwin <jhb@FreeBSD.org> |
Remove some vestiges of IPI_LAZYPMAP on i386.
The support for lazy pmap invalidations on i386 was removed in r281707. This removes the constant for the IPI and stops accounting for it when sizing th
Remove some vestiges of IPI_LAZYPMAP on i386.
The support for lazy pmap invalidations on i386 was removed in r281707. This removes the constant for the IPI and stops accounting for it when sizing the interrupt count arrays.
Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16801
show more ...
|
Revision tags: release/11.2.0 |
|
#
315fbaec |
| 23-Feb-2018 |
Ed Maste <emaste@FreeBSD.org> |
Correct pseudo misspelling in sys/ comments
contrib code and #define in intel_ata.h unchanged.
|
#
bd50262f |
| 17-Jan-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
PTI for amd64.
The implementation of the Kernel Page Table Isolation (KPTI) for amd64, first version. It provides a workaround for the 'meltdown' vulnerability. PTI is turned off by default for now
PTI for amd64.
The implementation of the Kernel Page Table Isolation (KPTI) for amd64, first version. It provides a workaround for the 'meltdown' vulnerability. PTI is turned off by default for now, enable with the loader tunable vm.pmap.pti=1.
The pmap page table is split into kernel-mode table and user-mode table. Kernel-mode table is identical to the non-PTI table, while usermode table is obtained from kernel table by leaving userspace mappings intact, but only leaving the following parts of the kernel mapped:
kernel text (but not modules text) PCPU GDT/IDT/user LDT/task structures IST stacks for NMI and doublefault handlers.
Kernel switches to user page table before returning to usermode, and restores full kernel page table on the entry. Initial kernel-mode stack for PTI trampoline is allocated in PCPU, it is only 16 qwords. Kernel entry trampoline switches page tables. then the hardware trap frame is copied to the normal kstack, and execution continues.
IST stacks are kept mapped and no trampoline is needed for NMI/doublefault, but of course page table switch is performed.
On return to usermode, the trampoline is used again, iret frame is copied to the trampoline stack, page tables are switched and iretq is executed. The case of iretq faulting due to the invalid usermode context is tricky, since the frame for fault is appended to the trampoline frame. Besides copying the fault frame and original (corrupted) frame to kstack, the fault frame must be patched to make it look as if the fault occured on the kstack, see the comment in doret_iret detection code in trap().
Currently kernel pages which are mapped during trampoline operation are identical for all pmaps. They are registered using pmap_pti_add_kva(). Besides initial registrations done during boot, LDT and non-common TSS segments are registered if user requested their use. In principle, they can be installed into kernel page table per pmap with some work. Similarly, PCPU can be hidden from userspace mapping using trampoline PCPU page, but again I do not see much benefits besides complexity.
PDPE pages for the kernel half of the user page tables are pre-allocated during boot because we need to know pml4 entries which are copied to the top-level paging structure page, in advance on a new pmap creation. I enforce this to avoid iterating over the all existing pmaps if a new PDPE page is needed for PTI kernel mappings. The iteration is a known problematic operation on i386.
The need to flush hidden kernel translations on the switch to user mode make global tables (PG_G) meaningless and even harming, so PG_G use is disabled for PTI case. Our existing use of PCID is incompatible with PTI and is automatically disabled if PTI is enabled. PCID can be forced on only for developer's benefit.
MCE is known to be broken, it requires IST stack to operate completely correctly even for non-PTI case, and absolutely needs dedicated IST stack because MCE delivery while trampoline did not switched from PTI stack is fatal. The fix is pending.
Reviewed by: markj (partially) Tested by: pho (previous version) Discussed with: jeff, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
show more ...
|