Lines Matching full:we

142     // We mostly have one conditional branch, and in extremely rare cases have
234 // We have to insert the new block immediately after the current one as we in splitEdge()
235 // don't know what layout-successor relationships the successor has and we in splitEdge()
246 // we might have *broken* fallthrough and so need to inject a new in splitEdge()
256 // Update the unconditional branch now that we've added one. in splitEdge()
274 // If this is the only edge to the successor, we can just replace it in the in splitEdge()
275 // CFG. Otherwise we need to add a new entry in the CFG for the new in splitEdge()
323 /// FIXME: It's really frustrating that we have to do this, but SSA-form in MIR
324 /// isn't what you might expect. We may have multiple entries in PHI nodes for
325 /// a single predecessor. This makes CFG-updating extremely complex, so here we
336 // First we scan the operands of the PHI looking for duplicate entries in canonicalizePHIOperands()
337 // a particular predecessor. We retain the operand index of each duplicate in canonicalizePHIOperands()
349 // FIXME: It is really frustrating that we have to use a quadratic in canonicalizePHIOperands()
353 // Note that we have to process these backwards so that we don't in canonicalizePHIOperands()
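
The duplicate removal these comments describe can be illustrated with a small standalone sketch. This is not the pass's MIR code; it assumes a hypothetical (value, predecessor-id) operand list purely to show the quadratic duplicate scan and the backwards erase order:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified PHI: a list of (value, predecessor-id) operands.
using PhiOperands = std::vector<std::pair<int, int>>;

// Drop duplicate entries for the same predecessor, keeping the first one.
void canonicalizePhi(PhiOperands &Ops) {
  std::vector<std::size_t> DupIndices;
  std::vector<int> SeenPreds;
  // Quadratic scan over the operands, mirroring the FIXME above.
  for (std::size_t I = 0; I != Ops.size(); ++I) {
    int Pred = Ops[I].second;
    bool Dup = false;
    for (int P : SeenPreds)
      if (P == Pred)
        Dup = true;
    if (Dup)
      DupIndices.push_back(I); // retain the operand index of each duplicate
    else
      SeenPreds.push_back(Pred);
  }
  // Process the duplicates backwards so erasing one entry doesn't shift the
  // indices of the entries we still have to erase.
  for (auto It = DupIndices.rbegin(); It != DupIndices.rend(); ++It)
    Ops.erase(Ops.begin() + static_cast<std::ptrdiff_t>(*It));
}
```
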
367 /// Helper to scan a function for loads vulnerable to misspeculation that we
370 /// We use this to avoid making changes to functions where there is nothing we
389 // We found a load. in hasVulnerableLoad()
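
A minimal sketch of that early-exit scan, using an illustrative per-instruction summary rather than real MachineInstrs (the struct and field names are assumptions for the example):

```cpp
#include <vector>

struct InstrSummary {
  bool MayLoad = false; // reads memory
  bool IsFence = false; // LFENCE-like barrier, not a checkable load
};

// Return true at the first checkable load; returning false lets the pass
// skip a function in which there is nothing it would harden.
bool hasVulnerableLoad(const std::vector<std::vector<InstrSummary>> &Blocks) {
  for (const auto &Block : Blocks)
    for (const InstrSummary &I : Block)
      if (I.MayLoad && !I.IsFence)
        return true; // we found a load
  return false;
}
```
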
403 // Only run if this pass is force-enabled or we detect the relevant function in runOnMachineFunction()
421 // We support an alternative hardening technique based on a debug flag. in runOnMachineFunction()
433 // Do a quick scan to see if we have any checkable loads. in runOnMachineFunction()
436 // See if we have any conditional branching blocks that we will need to trace in runOnMachineFunction()
440 // If we have no interesting conditions or loads, nothing to do here. in runOnMachineFunction()
452 // If we have loads being hardened and we've asked for call and ret edges to in runOnMachineFunction()
455 // We need to insert an LFENCE at the start of the function to suspend any in runOnMachineFunction()
459 // FIXME: We could skip this for functions which unconditionally return in runOnMachineFunction()
466 // If we guarded the entry with an LFENCE and have no conditionals to protect in runOnMachineFunction()
467 // in blocks, then we're done. in runOnMachineFunction()
469 // We may have changed the function's code at this point to insert fences. in runOnMachineFunction()
475 // pointer so we pick up any misspeculation in our caller. in runOnMachineFunction()
479 // as we don't need any initial state. in runOnMachineFunction()
497 // We're going to need to trace predicate state throughout the function's in runOnMachineFunction()
510 // We may also enter basic blocks in this function via exception handling in runOnMachineFunction()
511 // control flow. Here, if we are hardening interprocedurally, we need to in runOnMachineFunction()
531 // If we are going to harden calls and jumps we need to unfold their memory in runOnMachineFunction()
535 // Then we trace predicate state through the indirect branches. in runOnMachineFunction()
540 // Now that we have the predicate state available at the start of each block in runOnMachineFunction()
542 // as we go. in runOnMachineFunction()
563 /// We include this as an alternative mostly for the purpose of comparison. The
568 // First, we scan the function looking for blocks that are reached along edges in hardenEdgesWithLFENCE()
569 // that we might want to harden. in hardenEdgesWithLFENCE()
582 // Add all the non-EH-pad successors to the blocks we want to harden. We in hardenEdgesWithLFENCE()
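
A rough sketch of the edge collection behind this LFENCE-based fallback, under a simplified block model (the Block type and its fields are illustrative, not the pass's types):

```cpp
#include <set>
#include <vector>

struct Block {
  bool EndsWithBranch = false; // terminators start with a branch we analyze
  bool IsEHPad = false;
  std::vector<Block *> Succs;
};

// Collect every block reached along an edge we might want to harden: each
// non-EH-pad successor of a block with more than one successor. Each
// collected block would then get an LFENCE inserted at its start.
std::set<Block *> blocksToFence(const std::vector<Block *> &Blocks) {
  std::set<Block *> Harden;
  for (Block *B : Blocks) {
    if (B->Succs.size() <= 1 || !B->EndsWithBranch)
      continue; // no conditional edges leave this block
    for (Block *Succ : B->Succs)
      if (!Succ->IsEHPad)
        Harden.insert(Succ);
  }
  return Harden;
}
```
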
603 // we need to trace through. in collectBlockCondInfo()
609 // We want to reliably handle any conditional branch terminators in the in collectBlockCondInfo()
610 // MBB, so we manually analyze the branch. We can handle all of the in collectBlockCondInfo()
616 // edge. For each conditional edge, we track the target and the opposite in collectBlockCondInfo()
619 // edge, we inject a separate cmov for each conditional branch with in collectBlockCondInfo()
622 // directly implement that. We don't bother trying to optimize either of in collectBlockCondInfo()
625 // instruction count. This late, we simply assume the minimal number of in collectBlockCondInfo()
634 // Once we've handled all the terminators, we're done. in collectBlockCondInfo()
638 // If we see a non-branch terminator, we can't handle anything so bail. in collectBlockCondInfo()
644 // If we see an unconditional branch, reset our state, clear any in collectBlockCondInfo()
652 // If we get an invalid condition, we have an indirect branch or some in collectBlockCondInfo()
653 // other unanalyzable "fallthrough" case. We model this as a nullptr for in collectBlockCondInfo()
654 // the destination so we can still guard any conditional successors. in collectBlockCondInfo()
660 // We still want to harden the edge to `L1`. in collectBlockCondInfo()
667 // We have a vanilla conditional branch, add it to our list. in collectBlockCondInfo()
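
The terminator walk described by these comments can be approximated standalone. The model below is hypothetical (the real pass analyzes X86 branch opcodes on MachineInstrs), but it shows the bottom-up scan, the bail-out on a non-branch terminator, and the reset on unconditional or unanalyzable branches while still collecting the conditional branches that guard edges like the one to `L1`:

```cpp
#include <vector>

enum class TermKind { NonBranch, CondBranch, UncondBranch, IndirectBranch };
struct Term { TermKind Kind; };

struct BlockCondInfo {
  std::vector<const Term *> CondBrs;   // conditional branches to instrument
  const Term *FallthroughBr = nullptr; // trailing jmp, or an unanalyzable
                                       // branch whose destination is unknown
};

// Scan one block's terminators from the bottom up.
BlockCondInfo collectCondInfo(const std::vector<Term> &Terminators) {
  BlockCondInfo Info;
  for (auto It = Terminators.rbegin(); It != Terminators.rend(); ++It) {
    const Term &T = *It;
    if (T.Kind == TermKind::NonBranch) {
      Info.CondBrs.clear(); // can't handle anything here, bail
      break;
    }
    if (T.Kind == TermKind::UncondBranch ||
        T.Kind == TermKind::IndirectBranch) {
      Info.CondBrs.clear();    // reset our state; for the indirect case the
      Info.FallthroughBr = &T; // destination is simply treated as unknown
      continue;
    }
    Info.CondBrs.push_back(&T); // a vanilla conditional branch
  }
  return Info;
}
```
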
693 // Collect the inserted cmov instructions so we can rewrite their uses of the in tracePredStateThroughCFG()
698 // jumps where we need to update this register along each edge. in tracePredStateThroughCFG()
729 // First, we split the edge to insert the checking block into a safe in tracePredStateThroughCFG()
746 // We will wire each cmov to each other, but need to start with the in tracePredStateThroughCFG()
755 // Note that we intentionally use an empty debug location so that in tracePredStateThroughCFG()
798 // Decrement the successor count now that we've split one of the edges. in tracePredStateThroughCFG()
799 // We need to keep the count of edges to the successor accurate in order in tracePredStateThroughCFG()
805 // Since we may have split edges and changed the number of successors, in tracePredStateThroughCFG()
806 // normalize the probabilities. This avoids doing it each time we split an in tracePredStateThroughCFG()
810 // Finally, we need to insert cmovs into the "fallthrough" edge. Here, we in tracePredStateThroughCFG()
811 // need to intersect the other condition codes. We can do this by just in tracePredStateThroughCFG()
814 // If we have no fallthrough to protect (perhaps it is an indirect jump?) in tracePredStateThroughCFG()
819 "We should never have more than one edge to the unconditional " in tracePredStateThroughCFG()
853 // We use make_early_inc_range here so we can remove instructions if needed in unfoldCallAndJumpLoads()
859 // We only care about loading variants of these instructions. in unfoldCallAndJumpLoads()
878 // We cannot mitigate far jumps or calls, but we also don't expect them in unfoldCallAndJumpLoads()
899 // Use the generic unfold logic now that we know we're dealing with in unfoldCallAndJumpLoads()
901 // FIXME: We don't have test coverage for all of these! in unfoldCallAndJumpLoads()
911 // If we were able to compute an unfolded reg class, any failure here in unfoldCallAndJumpLoads()
962 // We use the SSAUpdater to insert PHI nodes for the target addresses of in tracePredStateThroughIndirectBranches()
963 // indirect branches. We don't actually need the full power of the SSA updater in tracePredStateThroughIndirectBranches()
964 // in this particular case as we always have immediately available values, but in tracePredStateThroughIndirectBranches()
972 // We need to know what blocks end up reached via indirect branches. We in tracePredStateThroughIndirectBranches()
1001 // We cannot mitigate far jumps or calls, but we also don't expect them in tracePredStateThroughIndirectBranches()
1025 // We have definitely found an indirect branch. Verify that there are no in tracePredStateThroughIndirectBranches()
1026 // preceding conditional branches as we don't yet support that. in tracePredStateThroughIndirectBranches()
1050 // Keep track of the cmov instructions we insert so we can return them. in tracePredStateThroughIndirectBranches()
1053 // If we didn't find any indirect branches with targets, nothing to do here. in tracePredStateThroughIndirectBranches()
1057 // We found indirect branches and targets that need to be instrumented to in tracePredStateThroughIndirectBranches()
1065 // We don't expect EH pads to ever be reached via an indirect branch. If in tracePredStateThroughIndirectBranches()
1066 // this is desired for some reason, we could simply skip them here rather in tracePredStateThroughIndirectBranches()
1071 // We should never end up threading EFLAGS into a block to harden in tracePredStateThroughIndirectBranches()
1079 // We can't handle having non-indirect edges into this block unless this is in tracePredStateThroughIndirectBranches()
1080 // the only successor and we can synthesize the necessary target address. in tracePredStateThroughIndirectBranches()
1082 // If we've already handled this by extracting the target directly, in tracePredStateThroughIndirectBranches()
1087 // Otherwise, we have to be the only successor. We generally expect this in tracePredStateThroughIndirectBranches()
1089 // split already. We don't however need to worry about EH pad successors in tracePredStateThroughIndirectBranches()
1105 // Now we need to compute the address of this block and install it as a in tracePredStateThroughIndirectBranches()
1106 // synthetic target in the predecessor. We do this at the bottom of the in tracePredStateThroughIndirectBranches()
1137 // Materialize the needed SSA value of the target. Note that we need the in tracePredStateThroughIndirectBranches()
1139 // branch back to itself. We can do this here because at this point, every in tracePredStateThroughIndirectBranches()
1151 // Check directly against a relocated immediate when we can. in tracePredStateThroughIndirectBranches()
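
At each block reachable via an indirect branch, the inserted check amounts to the comparison below; this is a standalone sketch of the idea, not the emitted CMP/CMOV sequence (which compares against a relocated immediate when possible):

```cpp
#include <cstdint>

// Compare the address we actually jumped through against this block's own
// address; a mismatch means the indirect branch target was misspeculated.
std::uint64_t checkIndirectTarget(std::uint64_t PredState,
                                  std::uint64_t JumpedThroughAddr,
                                  std::uint64_t ThisBlockAddr) {
  return (JumpedThroughAddr != ThisBlockAddr) ? ~std::uint64_t{0} : PredState;
}
```
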
1224 // Otherwise we've def'ed it, and it is live. in isEFLAGSLive()
1227 // While at this instruction, also check if we use and kill EFLAGS in isEFLAGSLive()
1233 // If we didn't find anything conclusive (neither definitely alive or in isEFLAGSLive()
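
A simplified standalone version of that backwards scan, using an illustrative per-instruction flags summary in place of the real MachineOperand queries:

```cpp
#include <vector>

struct FlagsEffect {
  bool DefsEFLAGS = false;  // instruction writes EFLAGS
  bool DefIsDead = false;   // ...and the def is marked dead
  bool KillsEFLAGS = false; // instruction is the last use of EFLAGS
};

// Scan backwards from the insertion point: a live (non-dead) def means the
// flags are live there, a kill means they are dead, and if the scan reaches
// the top of the block we fall back to whether EFLAGS live into the block.
bool isEFLAGSLive(const std::vector<FlagsEffect> &InstrsBeforePoint,
                  bool LiveIntoBlock) {
  for (auto It = InstrsBeforePoint.rbegin(); It != InstrsBeforePoint.rend();
       ++It) {
    if (It->DefsEFLAGS)
      return !It->DefIsDead; // otherwise we've def'ed it, and it is live
    if (It->KillsEFLAGS)
      return false; // used and killed before our point, so not live
  }
  return LiveIntoBlock; // nothing conclusive either way
}
```
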
1241 /// We call this routine once the initial predicate state has been established
1246 /// currently valid predicate state. We have to do these two things together
1247 /// because the SSA updater only works across blocks. Within a block, we track
1250 /// This operates in two passes over each block. First, we analyze the loads in
1253 /// amenable to hardening. We have to process these first because the two
1254 /// strategies may interact -- later hardening may change what strategy we wish
1255 /// to use. We also will analyze data dependencies between loads and avoid
1257 /// address. We also skip hardening loads already behind an LFENCE as that is
1260 /// Second, we actively trace the predicate state through the block, applying
1261 /// the hardening steps we determined necessary in the first pass as we go.
1263 /// These two passes are applied to each basic block. We operate one block at a
1276 // value which we would have checked, we can omit any checks on them. in tracePredStateThroughBlocksAndHarden()
1282 // hardened. During this walk we propagate load dependence for address in tracePredStateThroughBlocksAndHarden()
1285 // we check to see if any registers used in the address will have been in tracePredStateThroughBlocksAndHarden()
1290 // FIXME: We should consider an aggressive mode where we continue to keep as in tracePredStateThroughBlocksAndHarden()
1294 // Note that we only need this pass if we are actually hardening loads. in tracePredStateThroughBlocksAndHarden()
1297 // We naively assume that all def'ed registers of an instruction have in tracePredStateThroughBlocksAndHarden()
1309 // LFENCE to be a speculation barrier, so if we see an LFENCE, there is in tracePredStateThroughBlocksAndHarden()
1336 // If we have at least one (non-frame-index, non-RIP) register operand, in tracePredStateThroughBlocksAndHarden()
1337 // and neither operand is load-dependent, we need to check the load. in tracePredStateThroughBlocksAndHarden()
1349 // If any register operand is dependent, this load is dependent and we in tracePredStateThroughBlocksAndHarden()
1351 // FIXME: Is this true in the case where we are hardening loads after in tracePredStateThroughBlocksAndHarden()
1358 // post-load hardening, and we aren't already going to harden one of the in tracePredStateThroughBlocksAndHarden()
1387 // hardening strategy we have elected. Note that we do this in a second in tracePredStateThroughBlocksAndHarden()
1388 // pass specifically so that we have the complete set of instructions for in tracePredStateThroughBlocksAndHarden()
1389 // which we will do post-load hardening and can defer it in certain in tracePredStateThroughBlocksAndHarden()
1393 // We cannot both require hardening the def of a load and its address. in tracePredStateThroughBlocksAndHarden()
1416 // interference, we want to try and sink any hardening as far as in tracePredStateThroughBlocksAndHarden()
1419 // Sink the instruction we'll need to harden as far as we can down in tracePredStateThroughBlocksAndHarden()
1423 // If we managed to sink this instruction, update everything so we in tracePredStateThroughBlocksAndHarden()
1424 // harden that instruction when we reach it in the instruction in tracePredStateThroughBlocksAndHarden()
1428 // we're done. in tracePredStateThroughBlocksAndHarden()
1432 // Otherwise, add this to the set of defs we harden. in tracePredStateThroughBlocksAndHarden()
1440 // Mark the resulting hardened register as such so we don't re-harden. in tracePredStateThroughBlocksAndHarden()
1447 // even if we couldn't find the specific load used, or were able to in tracePredStateThroughBlocksAndHarden()
1448 // avoid hardening it for some reason. Note that here we cannot break in tracePredStateThroughBlocksAndHarden()
1449 // out afterward as we may still need to handle any call aspect of this in tracePredStateThroughBlocksAndHarden()
1455 // After we finish hardening loads we handle interprocedural hardening if in tracePredStateThroughBlocksAndHarden()
1469 // Otherwise we have a call. We need to handle transferring the predicate in tracePredStateThroughBlocksAndHarden()
1481 // Currently, we only track data-dependent loads within a basic block. in tracePredStateThroughBlocksAndHarden()
1482 // FIXME: We should see if this is necessary or if we could be more in tracePredStateThroughBlocksAndHarden()
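
A rough standalone sketch of the first pass's bookkeeping, assuming a simplified per-load summary; the real strategy selection applies more heuristics than shown here (data-invariance of the load and its users, EFLAGS interference, sinking opportunities, hardenable register classes):

```cpp
#include <set>
#include <vector>

struct LoadSummary {
  int Id = 0;
  std::vector<int> AddrRegs;       // registers feeding the address
  int DefReg = 0;                  // register the load defines
  bool PostLoadHardenable = false; // value can be masked after the load
};

// First pass over one block: decide a hardening strategy per load and track
// which registers carry values derived from already-checked loads, so loads
// addressed purely through such values can skip their own check.
void planBlockHardening(const std::vector<LoadSummary> &Loads,
                        std::set<int> &HardenPostLoad,
                        std::set<int> &HardenAddr) {
  std::set<int> LoadDepRegs;
  for (const LoadSummary &L : Loads) {
    bool AddrIsLoadDep = false;
    for (int R : L.AddrRegs)
      if (LoadDepRegs.count(R))
        AddrIsLoadDep = true;
    if (AddrIsLoadDep) {
      LoadDepRegs.insert(L.DefReg); // dependent load: no extra check needed
      continue;
    }
    if (L.PostLoadHardenable)
      HardenPostLoad.insert(L.Id); // mask the loaded value afterwards
    else
      HardenAddr.insert(L.Id);     // mask the address registers instead
    LoadDepRegs.insert(L.DefReg);
  }
}
```
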
1500 // We directly copy the FLAGS register and rely on later lowering to clean in saveEFLAGS()
1520 /// stack pointer. The state is essentially a single bit, but we merge this in
1528 // to stay canonical on 64-bit. We should compute this somehow and support in mergePredStateIntoSP()
1549 // We know that the stack pointer will have any preserved predicate state in in extractPredStateFromSP()
1550 // its high bit. We just want to smear this across the other bits. Turns out, in extractPredStateFromSP()
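
The bit manipulation these comments describe is easy to demonstrate standalone. The sketch below assumes a typical user-space stack pointer whose high 17 bits are clear, and relies on arithmetic right shift of a signed value, matching the SHL/OR and SAR sequence the pass emits:

```cpp
#include <cassert>
#include <cstdint>

// Merge the all-zeros/all-ones predicate state into the high bits of the
// stack pointer before a call or return.
std::uint64_t mergePredStateIntoSP(std::uint64_t SP, std::uint64_t PredState) {
  return SP | (PredState << 47); // shift the single bit of state up high
}

// Recover the state on the other side: smear the high bit across the whole
// register with an arithmetic right shift by 63 (x86 SAR).
std::uint64_t extractPredStateFromSP(std::uint64_t SP) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(SP) >> 63);
}

int main() {
  std::uint64_t SP = 0x00007fffdeadbee0ULL; // canonical user-space pointer
  assert(extractPredStateFromSP(mergePredStateIntoSP(SP, 0)) == 0);
  assert(extractPredStateFromSP(mergePredStateIntoSP(SP, ~0ULL)) == ~0ULL);
  return 0;
}
```
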
1578 // harden it if we're covering fixed address loads as well. in hardenLoadAddr()
1592 // For both RIP-relative addressed loads or absolute loads, we cannot in hardenLoadAddr()
1596 // FIXME: When using a segment base (like TLS does) we end up with the in hardenLoadAddr()
1597 // dynamic address being the base plus -1 because we can't mutate the in hardenLoadAddr()
1629 // Otherwise, we can directly update this operand and remove it. in hardenLoadAddr()
1633 // If there are none left, we're done. in hardenLoadAddr()
1642 // If EFLAGS are live and we don't have access to instructions that avoid in hardenLoadAddr()
1643 // clobbering EFLAGS we need to save and restore them. This in turn makes in hardenLoadAddr()
1656 // If this is a vector register, we'll need somewhat custom logic to handle in hardenLoadAddr()
1664 // FIXME: We could skip this at the cost of longer encodings with AVX-512 in hardenLoadAddr()
1740 // We need to avoid touching EFLAGS so shift out all but the least in hardenLoadAddr()
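
Stripped of the EFLAGS bookkeeping and the vector-register special cases, address hardening amounts to OR-ing the predicate state into every general-purpose register that feeds the address, as in this simplified sketch:

```cpp
#include <cstdint>
#include <vector>

// On a misspeculated path the predicate state is all-ones, so each address
// register becomes all-ones and the load can no longer reveal the contents
// of an attacker-chosen location. RIP-relative and frame-index components
// have no register to poison and are handled (or skipped) separately.
void hardenAddressRegs(std::vector<std::uint64_t> &AddrRegValues,
                       std::uint64_t PredState) {
  for (std::uint64_t &Reg : AddrRegValues)
    Reg |= PredState;
}
```
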
1774 // See if we can sink hardening the loaded value. in sinkPostLoadHardenedInst()
1779 // We need to find a single use which we can sink the check. We can in sinkPostLoadHardenedInst()
1784 // If we're already going to harden this use, it is data invariant, it in sinkPostLoadHardenedInst()
1788 // If we've already decided to harden a non-load, we must have sunk in sinkPostLoadHardenedInst()
1816 // We already have a single use, this would make two. Bail. in sinkPostLoadHardenedInst()
1820 // interfering EFLAGS, we can't sink the hardening to it. in sinkPostLoadHardenedInst()
1825 // If this instruction defines multiple registers bail as we won't harden in sinkPostLoadHardenedInst()
1830 // If this register isn't a virtual register we can't walk uses of sanely, in sinkPostLoadHardenedInst()
1831 // just bail. Also check that its register class is one of the ones we in sinkPostLoadHardenedInst()
1847 // Update which MI we're checking now. in sinkPostLoadHardenedInst()
1857 // We only support hardening virtual registers. in canHardenRegister()
1864 // We don't support post-load hardening of vectors. in canHardenRegister()
1871 // require REX prefix, we may not be able to satisfy that constraint when in canHardenRegister()
1873 // FIXME: This seems like a pretty lame hack. The way this comes up is when we in canHardenRegister()
1898 /// larger than the predicate state register. FIXME: We should support vector
1946 /// We can harden a non-leaking load into a register without touching the
1947 /// address by just hiding all of the loaded bits during misspeculation. We use
1948 /// an `or` instruction to do this because we set up our poison value as all
1961 // Because we want to completely replace the uses of this def'ed value with in hardenPostLoad()
1968 // use. Note that we insert the instructions to compute this *after* the in hardenPostLoad()
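
Because the predicate state is all-ones exactly on misspeculated paths, the core of post-load hardening is a single OR; a minimal sketch of the value-level effect:

```cpp
#include <cstdint>

// Hide the loaded bits during misspeculation: the result is the poison value
// (all-ones) on a bad path and the original value on a good path.
std::uint64_t hardenLoadedValue(std::uint64_t LoadedValue,
                                std::uint64_t PredState) {
  return LoadedValue | PredState;
}
```
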
1983 /// Returns implicitly perform a load which we need to harden. Without hardening
1990 /// We can harden this by introducing an LFENCE that will delay any load of the
1992 /// speculated), or we can harden the address used by the implicit load: the
1995 /// If we are not using an LFENCE, hardening the stack pointer has an additional
2010 // No need to fence here as we'll fence at the return site itself. That in hardenReturnInstr()
2011 // handles more cases than we can handle here. in hardenReturnInstr()
2014 // Take our predicate state, shift it to the high 17 bits (so that we keep in hardenReturnInstr()
2016 // extract it when we return (speculatively). in hardenReturnInstr()
2025 /// First, we need to send the predicate state into the called function. We do
2028 /// For tail calls, this is all we need to do.
2030 /// For calls where we might return and resume the control flow, we need to
2034 /// We also need to verify that we intended to return to this location in the
2041 /// The way we verify that we returned to the correct location is by preserving
2044 /// was left by the RET instruction when it popped `%rsp`. Alternatively, we can
2046 /// call. We compare this intended return address against the address
2048 /// mismatch, we have detected misspeculation and can poison our predicate
2059 // Tail call, we don't return to this function. in tracePredStateThroughCall()
2060 // FIXME: We should also handle noreturn calls. in tracePredStateThroughCall()
2063 // We don't need to fence before the call because the function should fence in tracePredStateThroughCall()
2064 // in its entry. However, we do need to fence after the call returns. in tracePredStateThroughCall()
2073 // First, we transfer the predicate state into the called function by merging in tracePredStateThroughCall()
2078 // If this call is also a return, it is a tail call and we don't need anything in tracePredStateThroughCall()
2080 // instructions and no successors, this call does not return so we can also in tracePredStateThroughCall()
2086 // machine instruction. We will lower extra symbols attached to call in tracePredStateThroughCall()
2096 // If we have no red zones or if the function returns twice (possibly without in tracePredStateThroughCall()
2097 // using the `ret` instruction) like setjmp, we need to save the expected in tracePredStateThroughCall()
2101 // If we don't have red zones, we need to compute the expected return in tracePredStateThroughCall()
2108 // the stack). But that isn't our primary goal, so we only use it as in tracePredStateThroughCall()
2112 // rematerialization in the register allocator. We somehow need to force in tracePredStateThroughCall()
2116 // FIXME: It is even less clear why MachineCSE can't just fold this when we in tracePredStateThroughCall()
2137 // If we didn't pre-compute the expected return address into a register, then in tracePredStateThroughCall()
2139 // stack immediately after the call. As the very first instruction, we load it in tracePredStateThroughCall()
2152 // Now we extract the callee's predicate state from the stack pointer. in tracePredStateThroughCall()
2155 // Test the expected return address against our actual address. If we can in tracePredStateThroughCall()
2157 // we compute it. in tracePredStateThroughCall()
2160 // FIXME: Could we fold this with the load? It would require careful EFLAGS in tracePredStateThroughCall()
2178 // Now conditionally update the predicate state we just extracted if we ended in tracePredStateThroughCall()
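
The return-address verification boils down to the comparison below (in the generated code, a CMP followed by a CMOVNE into the predicate state); this standalone sketch assumes the expected and actual return addresses are available as plain integers:

```cpp
#include <cstdint>

// Poison the freshly extracted predicate state if we did not come back to the
// location we intended to return to, which indicates a misspeculated return.
std::uint64_t checkReturnAddress(std::uint64_t ExtractedPredState,
                                 std::uint64_t ExpectedRetAddr,
                                 std::uint64_t ActualRetAddr) {
  return (ExpectedRetAddr != ActualRetAddr) ? ~std::uint64_t{0}
                                            : ExtractedPredState;
}
```
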
2203 /// will be adequately hardened already, we want to ensure that they are
2207 /// execution. We forcibly unfolded all relevant loads above and so will always
2208 /// have an opportunity to post-load harden here, we just need to scan for cases
2220 // We don't need to harden either far calls or far jumps as they are in hardenIndirectCallOrJumpInstr()
2228 // We should never see a loading instruction at this point, as those should in hardenIndirectCallOrJumpInstr()
2242 // Try to lookup a hardened version of this register. We retain a reference in hardenIndirectCallOrJumpInstr()
2243 // here as we want to update the map to track any newly computed hardened in hardenIndirectCallOrJumpInstr()
2247 // If we don't have a hardened register yet, compute one. Otherwise, just use in hardenIndirectCallOrJumpInstr()
2250 // FIXME: It is a little suspect that we use partially hardened registers that in hardenIndirectCallOrJumpInstr()
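
A hedged sketch of the target hardening this describes, with an illustrative hardened-register cache keyed by a virtual register id (the names and map types are assumptions for the example, not the pass's data structures):

```cpp
#include <cstdint>
#include <map>

// OR the predicate state into the register holding the call/jump target so a
// misspeculated path ends up with an all-ones (non-canonical) target, and
// cache the result so the same register is not re-hardened.
std::uint64_t hardenIndirectTarget(int TargetReg, std::uint64_t PredState,
                                   const std::map<int, std::uint64_t> &RegVals,
                                   std::map<int, std::uint64_t> &HardenedVals) {
  auto It = HardenedVals.find(TargetReg);
  if (It != HardenedVals.end())
    return It->second; // reuse the previously hardened copy
  std::uint64_t Hardened = RegVals.at(TargetReg) | PredState;
  HardenedVals[TargetReg] = Hardened;
  return Hardened;
}
```
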