AMDGPUAtomicOptimizer.cpp - OpenGrok cross reference for /freebsd/contrib/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp

Lines Matching full:lane
10 /// This pass optimizes atomic operations by using a single lane of a wavefront
235   // If the pointer operand is divergent, then each lane is doing an atomic  in visitAtomicRMWInst()
243   // If the value operand is divergent, each lane is contributing a different  in visitAtomicRMWInst()
329   // If the value operand is divergent, each lane is contributing a different  in visitIntrinsicInst()
439   // Pick an arbitrary lane from 0..31 and an arbitrary lane from 32..63 and  in buildReduction()
481     // Combine lane 15 into lanes 16..31 (and, for wave 64, lane 47 into lanes  in buildScan()
494       // Combine lane 31 into lanes 32..63.  in buildScan()
534     // Copy the old lane 15 to the new lane 16.  in buildShiftRight()
539       // Copy the old lane 31 to the new lane 32.  in buildShiftRight()
544       // Copy the old lane 47 to the new lane 48.  in buildShiftRight()
583   // Use llvm.cttz intrinsic to find the lowest remaining active lane.  in buildScanIteratively()
606   // Set bit to zero of current active lane so that for next iteration llvm.cttz  in buildScanIteratively()
607   // return the next active lane  in buildScanIteratively()
674   // lane invocations, we need to record the entry and exit BB's.  in optimizeAtomic()
679   // entire atomic operation in a helper-lane check. We do not want any helper  in optimizeAtomic()
681   // in any cross-lane communication, and we use a branch on whether the lane is  in optimizeAtomic()
714   // below us. If we counted each lane linearly starting from 0, a lane is  in optimizeAtomic()
733   // For atomic sub, perform scan with add operation and allow one lane to  in optimizeAtomic()
750   // If we have a divergent value in each lane, we need to combine the value  in optimizeAtomic()
767         // Read the value from the last lane, which has accumulated the values  in optimizeAtomic()
768         // of each active lane in the wavefront. This will be our new value  in optimizeAtomic()
830   // We only want a single lane to enter our new control flow, and we do this  in optimizeAtomic()
831   // by checking if there are any active lanes below us. Only one lane will  in optimizeAtomic()
838   // We need to introduce some new control flow to force a single lane to be  in optimizeAtomic()
846   // At this point, we have split the I's block to allow one lane in wavefront  in optimizeAtomic()
853   // single lane done updating the final reduced value.  in optimizeAtomic()
891   // Clone the original atomic operation into single lane, replacing the  in optimizeAtomic()
907     // We need to broadcast the value who was the lowest active lane (the first  in optimizeAtomic()
908     // lane) to all other lanes in the wavefront. We use an intrinsic for this,  in optimizeAtomic()
914     // get our individual lane's slice into the result. We use the lane offset  in optimizeAtomic()
916     // from the first lane, to get our lane's index into the atomic result.  in optimizeAtomic()
959       // For fadd/fsub the first active lane of LaneOffset should be the  in optimizeAtomic()
968       // first active lane.  in optimizeAtomic()
973       // Need a final PHI to reconverge to above the helper lane branch mask.  in optimizeAtomic()
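
The matched comments above outline the overall idea: combine each active lane's value with a scan, let a single lane perform one atomic, then broadcast the returned value and add each lane's offset to recover its individual result. The following is a minimal CPU-side sketch of that idea for an integer add, not the pass itself. It models a 64-lane wavefront with a plain bitmask, and every name in it (kWaveSize, lowestActiveLane, waveAtomicAdd) is hypothetical and does not appear in AMDGPUAtomicOptimizer.cpp. The iterative scan mirrors the "find the lowest remaining active lane, then clear its bit" loop described for buildScanIteratively(), using a portable loop in place of llvm.cttz.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// Hypothetical model of one wavefront; the real pass emits LLVM IR instead.
constexpr int kWaveSize = 64;

// Index of the lowest set bit (stand-in for llvm.cttz). Requires mask != 0.
int lowestActiveLane(uint64_t mask) {
    int lane = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++lane;
    }
    return lane;
}

// Returns, for every active lane, the value that lane would have observed
// had it issued its own atomic add. Inactive lanes are left at 0.
std::array<uint32_t, kWaveSize> waveAtomicAdd(
        std::atomic<uint32_t>& mem,
        uint64_t activeMask,
        const std::array<uint32_t, kWaveSize>& values) {
    std::array<uint32_t, kWaveSize> result{};
    if (activeMask == 0) {
        return result;  // no active lanes, nothing to do
    }

    // Exclusive scan over active lanes, lowest lane first.
    std::array<uint32_t, kWaveSize> laneOffset{};
    uint32_t total = 0;
    uint64_t remaining = activeMask;
    while (remaining != 0) {
        int lane = lowestActiveLane(remaining);
        laneOffset[lane] = total;              // sum of all lower active lanes
        total += values[lane];
        remaining &= ~(uint64_t{1} << lane);   // clear bit for next iteration
    }

    // A single lane issues one atomic with the fully reduced value.
    uint32_t old = mem.fetch_add(total);

    // "Broadcast" the old value and add each lane's offset.
    remaining = activeMask;
    while (remaining != 0) {
        int lane = lowestActiveLane(remaining);
        result[lane] = old + laneOffset[lane];
        remaining &= ~(uint64_t{1} << lane);
    }
    return result;
}
```

With lanes 1, 3 and 40 active and contributing 5, 7 and 2 to a location holding 100, the location ends at 114 and the three lanes receive 100, 105 and 112, the same values they could have seen from three separate atomics issued in lane order.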