#
9f1a007d |
| 26-Mar-2019 |
Justin Hibbits <jhibbits@FreeBSD.org> |
powerpc64: Micro-optimize moea64 native pmap tlbie
* Cache moea64_need_lock in a local variable; gcc generates slightly better code this way, it doesn't need to reload the value from memory each r
powerpc64: Micro-optimize moea64 native pmap tlbie
* Cache moea64_need_lock in a local variable; gcc generates slightly better code this way, it doesn't need to reload the value from memory each read. * VPN cropping is only needed on PowerPC ISA 2.02 and older cores, a subset of those that need serialization, so move this under the need_lock check, so those that don't need the lock don't even need to check this.
show more ...
|
#
bc94b700 |
| 22-Mar-2019 |
Justin Hibbits <jhibbits@FreeBSD.org> |
powerpc: Re-merge isa3 HPT with moea64 native HPT
r345402 fixed the bug that led to the split of the ISA 3.0 HPT handling from the existing manager. The cause of the bug was gcc moving the register
powerpc: Re-merge isa3 HPT with moea64 native HPT
r345402 fixed the bug that led to the split of the ISA 3.0 HPT handling from the existing manager. The cause of the bug was gcc moving the register holding VPN to a different register (not r0), which triggered bizarre behaviors. With the fix, things work, so they can be re-merged. No performance lost with the merge.
show more ...
|
#
091a23cb |
| 22-Mar-2019 |
Justin Hibbits <jhibbits@FreeBSD.org> |
powerpc64: Handle the modern (2.05+) implementaiton of tlbie
By happenstance gcc4 puts 'vpn' into r0 in all uses of TLBIE(), but modern gcc does not. Also, the single-argument form of tlbie zeros a
powerpc64: Handle the modern (2.05+) implementaiton of tlbie
By happenstance gcc4 puts 'vpn' into r0 in all uses of TLBIE(), but modern gcc does not. Also, the single-argument form of tlbie zeros all unused arguments, making the modern tlbie instruction use r0 as the RS field (LPID).
The vpn argument has the bottom 12 bits cleared (the input having been left-shifted by 12 bits), which just so happens, on the POWER9 and previous incarnations, to be the number of LPID bits supported. With those bits being zero, the instruction:
tlbie r0, r0
will invalidate the VPN in r0, in LPAR 0 (ignoring the upper bits of r0 for the RS field). One build with gcc8 yields:
tlbie r9, r0
with r0 having arbitrary contents, not equal to r9. This leads to strange crashes, behaviors, and panics, due to the requested TLB entry not actually being invalidated.
As the moea64_native must work on both old and new, we explicitly zero out r0 so that it can work with only the single argument, built with base gcc and modern gcc. isa3_hashtb takes a different approach, encoding the two-argument form, soas not to explicitly clobber r0, and instead let the compiler decide.
Reported by: Brandon Bergren Tested by: Brandon Bergren MFC after: 1 week
show more ...
|
Revision tags: release/12.0.0, release/11.2.0 |
|
#
ebf95d96 |
| 14-Jun-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Split the PowerISA 3.0 HPT implementation from historic
PowerISA 3.0 makes several changes to not only the format of the HPT but also the behavior surrounding it. For instance, TLBIE no longer requ
Split the PowerISA 3.0 HPT implementation from historic
PowerISA 3.0 makes several changes to not only the format of the HPT but also the behavior surrounding it. For instance, TLBIE no longer requires serialization. Removing this lock cuts buildworld time in half on a 18-core/72-thread POWER9 system, demonstrating that this lock is highly contended on such a system.
There was odd behavior observed trying to make this change in a backwards-compatible manner in moea64_native.c, so the best option was to fully split it, and largely revert the original changes adding POWER9 support to the original file.
Suggested by: nwhitehorn
show more ...
|
#
402c7806 |
| 14-Jun-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Fix CTR formatting for moea64_native bootstrap
On very large memory systems 'size' can become 2GB or larger, resulting in a negative value being formatted. Also, moea64_pteg_count is already a long
Fix CTR formatting for moea64_native bootstrap
On very large memory systems 'size' can become 2GB or larger, resulting in a negative value being formatted. Also, moea64_pteg_count is already a long, so format it as such.
show more ...
|
#
5ab39b65 |
| 26-May-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
On POWER9 clear the HID0_RADIX before enabling the page tables
POWER9 supports Radix page tables in addition to Hashed page tables. When Radix page tables are in use, the TLB is cut in half, so tha
On POWER9 clear the HID0_RADIX before enabling the page tables
POWER9 supports Radix page tables in addition to Hashed page tables. When Radix page tables are in use, the TLB is cut in half, so that half of the TLB is used for the page walk cache. This is the default behavior, however FreeBSD currently does not support Radix tables. Clear this bit so that we can use the full TLB. Do this in the MMU logic so that configuration can be localized to the specific translation format. Once we do support Radix tables, the setup for that will be localized to the Radix MMU kobj.
show more ...
|
#
204d7432 |
| 26-May-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Only crop the VPN on POWER4 and derivatives for TLBIE operations
Summary: PowerISA 2.03 and later require bits 14:65 in the RB register argument, which is the full value of the vpn argument post-shi
Only crop the VPN on POWER4 and derivatives for TLBIE operations
Summary: PowerISA 2.03 and later require bits 14:65 in the RB register argument, which is the full value of the vpn argument post-shift. Only POWER4, POWER4+, and PPC970* need the upper 16 bits cropped.
With this change FreeBSD can boot to multi-user on POWER9.
Reviewed by: nwhitehorn Differential Revision: https://reviews.freebsd.org/D15581
show more ...
|
#
b00df92b |
| 14-May-2018 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Final fix for alignment issues with the page table first patched with r333273 and partially reverted with r333594.
Older CPUs implement addition of offsets into the page table by a bitwise OR rather
Final fix for alignment issues with the page table first patched with r333273 and partially reverted with r333594.
Older CPUs implement addition of offsets into the page table by a bitwise OR rather than actual addition, which only works if the table is aligned at a multiple of its own size (they also require it to be aligned at a multiple of 256KB). Newer ones do not have that requirement, but it hardly matters to enforce it anyway.
The original code was failing on newer systems with huge amounts of RAM (> 512 GB), in which the page table was 4 GB in size. Because the bootstrap memory allocator took its alignment parameter as an int, this turned into a 0, removing any alignment constraint at all and making the MMU fail. The first round of this patch (r333273) fixed this case by aligning it at 256 KB, which broke older CPUs. Fix this instead by widening the alignment parameter.
show more ...
|
#
b9ff14e6 |
| 14-May-2018 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Revert changes to hash table alignment in r333273, which booting on all G5 systems, pending further analysis.
|
#
10d0cdfc |
| 05-May-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Add support for powernv POWER9 MMU initialization
The POWER9 MMU (PowerISA 3.0) is slightly different from current configurations, using a partition table even for hypervisor mode, and dropping the
Add support for powernv POWER9 MMU initialization
The POWER9 MMU (PowerISA 3.0) is slightly different from current configurations, using a partition table even for hypervisor mode, and dropping the SDR1 register. Key off the newly early-enabled CPU features flags for the new architecture, and configure the MMU appropriately.
The POWER9 MMU ignores the "PSIZ" field in the PTCR, and expects a 64kB table. As we are enabled for powernv (hypervisor mode, no VMs), only initialize partition table entry 0, and zero out the rest. The actual contents of the register are identical to SDR1 from previous architectures.
Along with this, fix a bug in the page table allocation with very large memory. The table can be allocated on any 256k boundary. The bootstrap_alloc alignment argument is an int, and with large amounts of memory passing the size of the table as the alignment will overflow an integer. Hard-code the alignment at 256k as wider alignment is not necessary.
Reviewed by: nwhitehorn Tested by: Breno Leitao Relnotes: Yes
show more ...
|
#
f9edb09d |
| 07-Mar-2018 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Move the powerpc64 direct map base address from zero to high memory. This accomplishes a few things: - Makes NULL an invalid address in the kernel, which is useful for catching bugs. - Lays groundw
Move the powerpc64 direct map base address from zero to high memory. This accomplishes a few things: - Makes NULL an invalid address in the kernel, which is useful for catching bugs. - Lays groundwork for radix-tree translation on POWER9, which requires the direct map be at high memory. - Similarly lays groundwork for a direct map on 64-bit Book-E.
The new base address is chosen as the base of the fourth radix quadrant (the minimum kernel address in this translation mode) and because all supported CPUs ignore at least the first two bits of addresses in real mode, allowing direct-map addresses to be used in real-mode handlers. This is required by Linux and is part of the architecture standard starting in POWER ISA 3, so can be relied upon.
Reviewed by: jhibbits, Breno Leitao Differential Revision: D14499
show more ...
|
#
bce6d88b |
| 17-Feb-2018 |
Justin Hibbits <jhibbits@FreeBSD.org> |
Merge AIM and Book-E PCPU fields
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC kernel. As more work is done, the struct may be optimized further.
Reviewed by: nw
Merge AIM and Book-E PCPU fields
This is part of a long-term goal of merging Book-E and AIM into a single GENERIC kernel. As more work is done, the struct may be optimized further.
Reviewed by: nwhitehorn
show more ...
|
#
71e3c308 |
| 27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/powerpc: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - e
sys/powerpc: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.
show more ...
|
#
312fb3d8 |
| 25-Nov-2017 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Invalidate TLB at boot using the correct IS settings on newer-than-POWER5 CPUs.
MFC after: 3 weeks
|
Revision tags: release/10.4.0, release/11.1.0, release/11.0.1, release/11.0.0, release/10.3.0 |
|
#
b626f5a7 |
| 04-Jan-2016 |
Glen Barber <gjb@FreeBSD.org> |
MFH r289384-r293170
Sponsored by: The FreeBSD Foundation
|
#
a5d8944a |
| 19-Nov-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Catch up with head (r291075).
|
#
4a38fe54 |
| 17-Nov-2015 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Make native page table access endian-safe. Even on CPUs running in little-endian mode, the hardware page table is big-endian. This is a no-op on all currently supported systems.
MFC after: 1 month
|
Revision tags: release/10.2.0 |
|
#
98e0ffae |
| 27-May-2015 |
Simon J. Gerraty <sjg@FreeBSD.org> |
Merge sync of head
|
#
e6e746bf |
| 25-Mar-2015 |
Glen Barber <gjb@FreeBSD.org> |
MFH: r278968-r280640
Sponsored by: The FreeBSD Foundation
|
#
fa1e92b6 |
| 04-Mar-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
|
#
ca65be80 |
| 04-Mar-2015 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r279313 through r279595.
|
#
072aeeb6 |
| 02-Mar-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Merge r278538 through r279514.
|
#
d4eb568e |
| 27-Feb-2015 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
Fix unitialized variable.
|
#
0d56a8cb |
| 26-Feb-2015 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r279163 through r279308.
|
#
827cc9b9 |
| 24-Feb-2015 |
Nathan Whitehorn <nwhitehorn@FreeBSD.org> |
New pmap implementation for 64-bit PowerPC processors. The main focus of this change is to improve concurrency: - Drop global state stored in the shadow overflow page table (and all other global st
New pmap implementation for 64-bit PowerPC processors. The main focus of this change is to improve concurrency: - Drop global state stored in the shadow overflow page table (and all other global state) - Remove all global locks - Use per-PTE lock bits to allow parallel page insertion - Reconstruct state when requested for evicted PTEs instead of buffering it during overflow
This drops total wall time for make buildworld on a 32-thread POWER8 system by a factor of two and system time by a factor of three, providing performance 20% better than similarly clocked Core i7 Xeons per-core. Performance on smaller SMP systems, where PMAP lock contention was not as much of an issue, is nearly unchanged.
Tested on: POWER8, POWER5+, G5 UP, G5 SMP (64-bit and 32-bit kernels) Merged from: user/nwhitehorn/ppc64-pmap-rework Looked over by: jhibbits, andreast MFC after: 3 months Relnotes: yes Sponsored by: FreeBSD Foundation
show more ...
|