#
a905c589 |
| 04-Nov-2024 |
Doug Moore <dougm@FreeBSD.org> |
pctrie: breakup pctrie_root_store
Break up pctrie_root_store into the part that casts the root to a smr_pctnode_t *, and the rest. The rest is just pctrie_node_store, and the casting part can be use
pctrie: breakup pctrie_root_store
Break up pctrie_root_store into the part that casts the root to a smr_pctnode_t *, and the rest. The rest is just pctrie_node_store, and the casting part can be used in a few more places.
This is strictly a code-cleanup change, with no functional change expected.
Reviewed by: bnovkov Differential Revision: https://reviews.freebsd.org/D47347
show more ...
|
#
0d965bc0 |
| 26-Oct-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: improve iter nbr search
pctrie_toval(node) can be applied to either a leaf or an internal node; in the latter case it provides the address of the pn_owner field. In a couple of places w
subr_pctrie: improve iter nbr search
pctrie_toval(node) can be applied to either a leaf or an internal node; in the latter case it provides the address of the pn_owner field. In a couple of places where a neighbor search is about to begin for an iterator, the current code distinguishes the leaf and non-leaf cases in a way that isn't really necessary. This change shrinks each function by 16 bytes, and by a branch instruction.
Reviewed by: bnovkov Differential Revision: https://reviews.freebsd.org/D47207
show more ...
|
#
d2d0d6cb |
| 21-Oct-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: fix a comment
A comment used least > instead of greatest <. Fix it.
|
#
d0b225d1 |
| 08-Oct-2024 |
Doug Moore <dougm@FreeBSD.org> |
swap_pager: use iterators in swp_pager_meta_build
Add a method to use an iterator for pctrie insertion; this should improve performance when the last search ended near the place where the new item w
swap_pager: use iterators in swp_pager_meta_build
Add a method to use an iterator for pctrie insertion; this should improve performance when the last search ended near the place where the new item will be inserted.
Add an iterator argument to swp_pager_meta_build, so that the lookups and insertions it does can be faster in the common case when keys are bunched close together, or appear in sequence.
Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D46848
show more ...
|
#
9147a0c9 |
| 06-Oct-2024 |
Doug Moore <dougm@FreeBSD.org> |
pctrie: don't assign to root
User pctrie_root_store(*, PCTRIE_LOCKED) to change the root value of a pctrie, to ensure proper synchronization when smr is in use.
Reviewed by: alc Differential Revisi
pctrie: don't assign to root
User pctrie_root_store(*, PCTRIE_LOCKED) to change the root value of a pctrie, to ensure proper synchronization when smr is in use.
Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D46968
show more ...
|
#
fd1d6662 |
| 13-Sep-2024 |
Doug Moore <dougm@FreeBSD.org> |
pctrie: create iterator
Define a pctrie iterator type. A pctrie iterator is a wrapper around a pctrie that remembers a position in the trie where the last search left off, and where a new search can
pctrie: create iterator
Define a pctrie iterator type. A pctrie iterator is a wrapper around a pctrie that remembers a position in the trie where the last search left off, and where a new search can resume. When the next search is for an item very near in the trie to where the last search left off, iter-based search is faster because instead of starting from the root, the search usually only has to back up one or two steps up the root-to-last-search path to find the branch that leads to the new search target.
Every kind of lookup (plain, lookup_ge, lookup_le) that can begin with the trie root can begin with an iterator instead. An iterator can also do a relative search ("look for the item 4 greater than the last item I found") because it remembers where that last search ended. It can also search within limits ("look for the item bigger than this one, but it has to be less than 100"), which can save time when the next item beyond the limits and that is known before we actually know what that item it is. An iterator can also be used to remove an item that has already been found, without having to search for it again.
Iterators are vulnerable to unsynchronized data changes. If the iterator is created with a lock held, and that lock is released and acquired again, there's no guarantee that the iterator path remains valid.
Reviewed by: markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D45627
show more ...
|
Revision tags: release/13.4.0 |
|
#
d19851f0 |
| 13-Jun-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: add a word to a comment
No functional changes. Reported by: alc
|
#
c0d0bc2b |
| 13-Jun-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: add leaf callbacks to pctrie_reclaim
PCTRIE_RECLAIM frees all the interior nodes in a pctrie, but is little used because most trie-destroyers want to free leaves of the tree too. Add PC
subr_pctrie: add leaf callbacks to pctrie_reclaim
PCTRIE_RECLAIM frees all the interior nodes in a pctrie, but is little used because most trie-destroyers want to free leaves of the tree too. Add PCTRIE_RECLAIM_CALLBACK, with two extra arguments, a callback function and an auxiliary argument, that is invoked on every non-NULL leaf in the tree as the tree is destroyed.
Reviewed by: rlibby, kib (previous version) Differential Revision: https://reviews.freebsd.org/D45565
show more ...
|
#
bbf81f46 |
| 06-Jun-2024 |
Ryan Libby <rlibby@FreeBSD.org> |
pctrie: add combined insert/lookup operations
In several places in code, we do a pctrie lookup followed by a pctrie insert. Provide a few flavors of combined lookup/insert. This may save a portion
pctrie: add combined insert/lookup operations
In several places in code, we do a pctrie lookup followed by a pctrie insert. Provide a few flavors of combined lookup/insert. This may save a portion of the work from walking a large pctrie twice.
The general idea is that while we walk the trie during insert, we also do the same kind of tracking work that we do during pctrie_lookup_ge or pctrie_lookup_le, and we pass out a pctrie node from where such a lookup may continue.
Reviewed by: dougm (previous version), kib (previous version), markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D45394
show more ...
|
#
749c249d |
| 03-Jun-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: use ilog2(x) instead of fls(x)-1
In three instances where fls(x)-1 is used, the compiler does not know that x is nonzero and so adds needless zero checks. Using ilog(x) instead saves,
subr_pctrie: use ilog2(x) instead of fls(x)-1
In three instances where fls(x)-1 is used, the compiler does not know that x is nonzero and so adds needless zero checks. Using ilog(x) instead saves, in each instance, about 4 instructions, including a conditional, and 16 or so bytes, on an amd64 build.
Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D45330
show more ...
|
#
e3537f92 |
| 03-Jun-2024 |
Doug Moore <dougm@FreeBSD.org> |
Revert "subr_pctrie: use ilog2(x) instead of fls(x)-1"
This reverts commit 574ef650695088d56ea12df7da76155370286f9f.
|
#
574ef650 |
| 03-Jun-2024 |
Doug Moore <dougm@FreeBSD.org> |
subr_pctrie: use ilog2(x) instead of fls(x)-1
In three instances where fls(x)-1 is used, the compiler does not know that x is nonzero and so adds needless zero checks. Using ilog(x) instead saves,
subr_pctrie: use ilog2(x) instead of fls(x)-1
In three instances where fls(x)-1 is used, the compiler does not know that x is nonzero and so adds needless zero checks. Using ilog(x) instead saves, in each instance, about 4 instructions, including a conditional, and 16 or so bytes, on an amd64 build.
Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D45330
show more ...
|
Revision tags: release/14.1.0, release/13.3.0, release/14.0.0 |
|
#
3b7ffacd |
| 21-Aug-2023 |
Doug Moore <dougm@FreeBSD.org> |
pctrie: change for vm_radix compatibility
Restructure parts of pctrie code to make it more compatible with the needs of vm_radix code.
1. End passing function pointers for memory management.
By br
pctrie: change for vm_radix compatibility
Restructure parts of pctrie code to make it more compatible with the needs of vm_radix code.
1. End passing function pointers for memory management.
By breaking insertion into two functions, the call for allocating memory can happen at the top level and be inlined, rather than happening via an function pointer to a memory allocator.
By changing the remove function slightly, freeing of memory, when necessary, can happen at the top level and be inlined.
By turning the reclamation code into two functions, one for starting iteration over to-be-freed nodes and the other continuing it, all the freeing can happen at the top level and be inlined.
2. Offer a version of remove that does not panic and returns the freed value (or NULL). 3. Offer a 'replace' operation, to replace one leaf with another that has the same key.
These are three of the roadblocks that prevent code sharing between pctrie and vm_radix code.
Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D41396
show more ...
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
ac0572e6 |
| 30-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_tree: compute slot from keybarr
The computation of keybarr(), the function that determines when a search has failed at a non-leaf node, can be done in a way that computes the 'slot' value when
radix_tree: compute slot from keybarr
The computation of keybarr(), the function that determines when a search has failed at a non-leaf node, can be done in a way that computes the 'slot' value when keybarr() fails, which is exactly when slot() would next be invoked. Computing things this way saves space in search loops.
This reduces the amd64 coding of the search loop in vm_radix_lookup from 40 bytes to 28 bytes.
Reviewed by: alc Tested by: pho (as part of a larger change) Differential Revision: https://reviews.freebsd.org/D41235
show more ...
|
#
38f5cb1b |
| 30-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_tree: redefine the clev field
The clev field in the node struct is almost always multiplied by WIDTH; occasionally, it is incremented and then multiplied by WIDTH. Instructions can be saved by
radix_tree: redefine the clev field
The clev field in the node struct is almost always multiplied by WIDTH; occasionally, it is incremented and then multiplied by WIDTH. Instructions can be saved by storing it always multiplied by WIDTH.
For the computation of slot(), this just eliminates a multiplication. For trimkey(), where the caller always adds one to clev before passing it as an argument, this change has the caller, not the caller, do that. Trimkey() handles it not by adding WIDTH to the input parameter, but by shifting COUNT, and not 1. That produces the same result, and it relieves keybarr of the need to test to avoid shifting by more than 63 bits, since level is always <= 63.
This takes 3 instrutions and 14 bytes out of the basic lookup loop on amd64.
Reviewed by: kib Tested by: pho (as part of a larger change) Differential Revision: https://reviews.freebsd.org/D41226
show more ...
|
#
2d2bcba7 |
| 28-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
Every path in a radix trie ends with a leaf or a NULL. By replacing NULL (non-leaf) pointers with NULL leaves, there is a NULL test removed from every iteration of an index-based search loop.
This s
Every path in a radix trie ends with a leaf or a NULL. By replacing NULL (non-leaf) pointers with NULL leaves, there is a NULL test removed from every iteration of an index-based search loop.
This speeds up radix trie searches by few percent. If there are any radix tries that are not initialized with the init() function, but instead depend on zeroing everything being proper initialization, this will break those tries.
Reviewed by: alc, kib Tested by: pho (as part of a larger change) Differential Revision: https://reviews.freebsd.org/D41171
show more ...
|
#
6f251ef2 |
| 19-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: simplify ge, le lookups
Replace the implementations of lookup_le and lookup_ge with ones that do not use a stack or climb back up the tree, and instead exploit the popmap field to quickl
radix_trie: simplify ge, le lookups
Replace the implementations of lookup_le and lookup_ge with ones that do not use a stack or climb back up the tree, and instead exploit the popmap field to quickly identify the place to resume searching if the straightforward indexed search fails.
The code size of the original functions shrinks by a combined 160 bytes on amd64, and the cumulative cycle count per invocation of the two functions together is reduced 20% in a buildworld test.
Reviewed by: alc, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D40936
show more ...
|
#
16e01c05 |
| 09-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: avoid code duplication in insert
Two cases in the insert routine are written differently, when they're really doing the same thing. Writing that case only once saves 208 bytes in the com
radix_trie: avoid code duplication in insert
Two cases in the insert routine are written differently, when they're really doing the same thing. Writing that case only once saves 208 bytes in the compiled vm_radix_insert code and reduces instructions executed by about 2%. Reviewed by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D40807
show more ...
|
#
8df38859 |
| 07-Jul-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: replace node count with popmap
Replace the 'count' field in a trie node with a bitmap that identifies non-NULL children. Drop the 'last' field, and use the last bit set in the bitmap ins
radix_trie: replace node count with popmap
Replace the 'count' field in a trie node with a bitmap that identifies non-NULL children. Drop the 'last' field, and use the last bit set in the bitmap instead. In lookup_le, lookup_ge, remove, and reclaim_all, use the bitmap to find the previous/next/only/every non-null child in constant time by examining the bitmask instead of looping across array elements and null-checking them one-by-one.
A buildworld test suggests that this reduces the cycle count on those functions that eliminate some null-checks by 4.9%, 1.5%, 0.0% and 13.3%. Reviewed by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D40775
show more ...
|
#
da72505f |
| 27-Jun-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: pass fewer params to node_get
Let node_get calculate it's own owner value. Don't pass the count parameter, since it's always 2. Save 16 bytes in insert(). Move, without modifying, slot a
radix_trie: pass fewer params to node_get
Let node_get calculate it's own owner value. Don't pass the count parameter, since it's always 2. Save 16 bytes in insert(). Move, without modifying, slot and trimkey to handle use-before-declaration problem. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D40723
show more ...
|
#
9cfed089 |
| 27-Jun-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: clean up overlong lines
This is purely a cosmetic change. vm_radix.c has lines that reach past column 80 and this change cleans that up. The associated changes to subr_pctrie.c are just
radix_trie: clean up overlong lines
This is purely a cosmetic change. vm_radix.c has lines that reach past column 80 and this change cleans that up. The associated changes to subr_pctrie.c are just to keep mirroring vm_radix.c. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D40764
show more ...
|
#
72c3a43b |
| 27-Jun-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: skip compare in lookup_le, lookup_ge
In _lookup_ge, where a loop "looks for an available edge or val within the current bisection node" (to quote the code comment), the value of index ha
radix_trie: skip compare in lookup_le, lookup_ge
In _lookup_ge, where a loop "looks for an available edge or val within the current bisection node" (to quote the code comment), the value of index has already been modified to guarantee that it is the least value than can be found in the non-NULL child node being examined. Therefore, if the non-NULL child is a leaf, there's no need to compare 'index' to anything, and the value can just be returned.
The same is true for _lookup_le with 'most' replacing 'least'. Reviewed by: alc Tested by: pho Differential Revision: https://reviews.freebsd.org/D40746
show more ...
|
#
a42d8fe0 |
| 25-Jun-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: simplify trimkey functions
Replacing a branch and two shifts with a single masking operation saves 64 bytes the pair of functions lookup_le and lookup_ge on amd64. Refresh the associate
radix_trie: simplify trimkey functions
Replacing a branch and two shifts with a single masking operation saves 64 bytes the pair of functions lookup_le and lookup_ge on amd64. Refresh the associated comments. Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D40722
show more ...
|
#
e8efee29 |
| 24-Jun-2023 |
Doug Moore <dougm@FreeBSD.org> |
radix_trie: avoid reloading radix node
In the vm_radix:remove loop that searches for the last child, load that child once, without loading it again after the search is over. Change KASSERTS from ind
radix_trie: avoid reloading radix node
In the vm_radix:remove loop that searches for the last child, load that child once, without loading it again after the search is over. Change KASSERTS from index check to NULL node check. Reviewed by: alc Differential Revision: https://reviews.freebsd.org/D40721
show more ...
|