Lines Matching full:atomic

10 /// This pass optimizes atomic operations by using a single lane of a wavefront
11 /// to perform the atomic operation, thus reducing contention on that memory
13 /// The atomic optimizer uses the following strategies to compute scan and reduced
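
For context, the header lines above describe the core idea of the pass: instead of every lane in a wavefront issuing its own contended atomic, the per-lane contributions are combined within the wavefront and a single lane performs one atomic for the whole group. Below is a rough, hypothetical CUDA sketch of that idea; warp shuffles stand in for the DPP/readlane sequences the pass actually emits at the IR level, and the kernel names are illustrative, not taken from the pass.

// Hypothetical CUDA sketch of the idea described above; assumes fully
// active warps. Warp intrinsics stand in for AMDGPU DPP/readlane.
#include <cstdio>

__global__ void naiveAdd(int *counter) {
  atomicAdd(counter, 1);            // one contended atomic per lane
}

__global__ void waveReducedAdd(int *counter) {
  unsigned mask = __activemask();
  int lane = threadIdx.x % warpSize;

  // Reduce each lane's contribution (here: 1) across the warp.
  int v = 1;
  for (int off = warpSize / 2; off > 0; off /= 2)
    v += __shfl_down_sync(mask, v, off);

  // A single lane issues one atomic carrying the whole warp's sum.
  if (lane == 0)
    atomicAdd(counter, v);
}

int main() {
  int *counter = nullptr;
  cudaMallocManaged(&counter, sizeof(int));
  *counter = 0;
  waveReducedAdd<<<2, 32>>>(counter);
  cudaDeviceSynchronize();
  printf("counter = %d\n", *counter); // 64, same result as the naive kernel
  cudaFree(counter);
  return 0;
}

Both forms leave the counter at the same final value; the reduced form issues one atomic per warp instead of one per active lane.
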
36 #define DEBUG_TYPE "amdgpu-atomic-optimizer"
196 // Early exit for unhandled address space atomic instructions. in visitAtomicRMWInst()
226 // Only 32 and 64 bit floating point atomic ops are supported. in visitAtomicRMWInst()
235 // If the pointer operand is divergent, then each lane is doing an atomic in visitAtomicRMWInst()
244 // value to the atomic calculation. We can only optimize divergent values if in visitAtomicRMWInst()
245 // we have DPP available on our subtarget (for DPP strategy), and the atomic in visitAtomicRMWInst()
255 // If we get here, we can optimize the atomic using a single wavefront-wide in visitAtomicRMWInst()
256 // atomic operation to do the calculation for the entire wavefront, so in visitAtomicRMWInst()
330 // value to the atomic calculation. We can only optimize divergent values if in visitIntrinsicInst()
331 // we have DPP available on our subtarget (for DPP strategy), and the atomic in visitIntrinsicInst()
349 // If we get here, we can optimize the atomic using a single wavefront-wide in visitIntrinsicInst()
350 // atomic operation to do the calculation for the entire wavefront, so in visitIntrinsicInst()
357 // Use the builder to create the non-atomic counterpart of the specified
365 llvm_unreachable("Unhandled atomic op"); in buildNonAtomicBinOp()
589 // Get the value required for the atomic operation in buildScanIteratively()
629 llvm_unreachable("Unhandled atomic op"); in getIdentityValueForAtomicOp()
678 // If we're optimizing an atomic within a pixel shader, we need to wrap the in optimizeAtomic()
679 // entire atomic operation in a helper-lane check. We do not want any helper in optimizeAtomic()
703 // This is the value in the atomic operation we need to combine in order to in optimizeAtomic()
704 // reduce the number of atomic operations. in optimizeAtomic()
733 // For atomic sub, perform scan with add operation and allow one lane to in optimizeAtomic()
769 // which we will provide to the atomic operation. in optimizeAtomic()
783 llvm_unreachable("Atomic Optimizer is disabled for None strategy"); in optimizeAtomic()
788 llvm_unreachable("Unhandled atomic op"); in optimizeAtomic()
792 // The new value we will be contributing to the atomic operation is the in optimizeAtomic()
815 // These operations with a uniform value are idempotent: doing the atomic in optimizeAtomic()
821 // The new value we will be contributing to the atomic operation is the in optimizeAtomic()
891 // Clone the original atomic operation into a single lane, replacing the in optimizeAtomic()
902 // Create a PHI node to get our new atomic result into the exit block. in optimizeAtomic()
913 // Now that we have the result of our single atomic operation, we need to in optimizeAtomic()
915 // we previously calculated combined with the atomic result value we got in optimizeAtomic()
916 // from the first lane, to get our lane's index into the atomic result. in optimizeAtomic()
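
The optimizeAtomic() fragments above (around source lines 902-916) describe the recombination step: the single atomic returns the old memory value, that value is broadcast from the lane that executed it, and each lane then applies its own exclusive prefix of the wavefront's contributions to recover what its original per-lane atomic would have returned. A hypothetical CUDA sketch of that step, again with warp intrinsics as a stand-in, assuming a fully active warp, and with an invented helper name:

// Hypothetical sketch of the recombination step for an atomic add.
__device__ int warpAtomicAddPerLaneResult(int *ptr, int v) {
  unsigned mask = __activemask();
  int lane = threadIdx.x % warpSize;

  // Inclusive prefix sum of v across the warp, then shift to exclusive.
  int incl = v;
  for (int off = 1; off < warpSize; off *= 2) {
    int n = __shfl_up_sync(mask, incl, off);
    if (lane >= off)
      incl += n;
  }
  int excl = incl - v;

  // The last lane holds the warp total; one lane performs the single atomic.
  int total = __shfl_sync(mask, incl, warpSize - 1);
  int old = 0;
  if (lane == 0)
    old = atomicAdd(ptr, total);

  // Broadcast the returned old value and offset it by each lane's prefix.
  old = __shfl_sync(mask, old, 0);
  return old + excl;
}

Each lane sees the same return value it would have gotten from its own atomicAdd, while memory is touched by a single atomic per warp.
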
925 llvm_unreachable("Atomic Optimizer is disabled for None strategy"); in optimizeAtomic()
932 llvm_unreachable("Unhandled atomic op"); in optimizeAtomic()
981 // Replace the original atomic instruction with the new one. in optimizeAtomic()
991 "AMDGPU atomic optimizations", false, false)
995 "AMDGPU atomic optimizations", false, false) in INITIALIZE_PASS_DEPENDENCY()