These are all the places where Nanite shaders mention atomic
or Interlocked***
HLSL intrinsics.
- Unsigned 64-bit atomic max (which Metal 2.4 supports): NaniteWritePixel.ush. This does not require 64-bit image atomics; only 64-bit buffer atomics.
- Says memory accesses have atomic behaviors but does not call atomic instructions: VirtualShadowMapPageManagement.usf.
- Only requires 32-bit atomics: VirtualShadowMapStats.ush.
- Only requires 32-bit atomics: DistanceFieldShadowing.usf.
- Only requires 32-bit atomics: NaniteCulling.ush.
If my interpretation is correct, Nanite doesn’t require actual 64-bit atomics. I can see how someone got it running on macOS and iOS. They force-enabled the feature using Metal 2.4’s UInt64
atomic min/max. I thought the work distribution queue required a massive 64-bit counter, because more than 4 billion work chunks exist; I guess that understanding is incorrect.
Edit: There’s some bad news. Metal 64-bit atomic min/max only runs Apple8 hardware (A15/M2). The A14/M1 were released with Metal 2.3, which didn’t have these instructions. I made a Swift script and accompanying Metal shader for anyone to reproduce (below). NaniteWritePixel.ush includes a 32-bit lock-based workaround on Nvidia 1080/2080, which doesn’t need 64-bit atomics.
That could work on Apple7 GPUs, and might be how someone got Nanite working on them. This workaround’s nature could explain why Nanite had messed up depth order. It also explains the lower performance. I can test this theory by deploying Nanite-enabled UE5 to my iPhone 13, which has an Apple8-family GPU.
Swift script
import Foundation
import Metal
let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
let library = device.makeDefaultLibrary()!
let function = library.makeFunction(name: "testAtomic64")!
let pipeline = try! library.device.makeComputePipelineState(function: function)
// A new `MTLBuffer` is always zero-initialized.
let buffer = device.makeBuffer(length: 8, options: .storageModeShared)!
let cmdbuf = commandQueue.makeCommandBuffer()!
let enc = cmdbuf.makeComputeCommandEncoder()!
enc.setComputePipelineState(pipeline)
enc.setBuffer(buffer, offset: 0, index: 0)
let oneSize = MTLSizeMake(1, 1, 1)
enc.dispatchThreads(oneSize, threadsPerThreadgroup: oneSize)
enc.endEncoding()
cmdbuf.commit()
cmdbuf.waitUntilCompleted()
let result = buffer.contents().assumingMemoryBound(to: UInt64.self)
precondition(result.pointee == 7)
print("Success! \(result.pointee) == 7")
Metal shader
#include <metal_stdlib>
using namespace metal;
kernel void testAtomic64(device atomic_ulong *buffer [[buffer(0)]])
{
atomic_max_explicit(buffer, 7, memory_order_relaxed);
}
Xcode output, which happens when MTLCompilerService encounters an unsupported feature in AIR
2022-09-03 23:50:04.632240-0400 TestAtomic64[27891:1448661] Metal GPU Frame Capture Enabled
2022-09-03 23:50:04.632593-0400 TestAtomic64[27891:1448661] Metal API Validation Enabled
2022-09-03 23:50:04.679200-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.679246-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 1 try
2022-09-03 23:50:04.699454-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.699508-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 2 try
2022-09-03 23:50:04.719754-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.719804-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 3 try
2022-09-03 23:50:14.782645-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:14.782777-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 1 try
2022-09-03 23:50:24.846962-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:24.847071-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 2 try
2022-09-03 23:50:34.914092-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:34.914213-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 3 try
TestAtomic64/main.swift:15: Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=AGXMetalG13X Code=3 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}
2022-09-03 23:50:34.915280-0400 TestAtomic64[27891:1448661] TestAtomic64/main.swift:15: Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=AGXMetalG13X Code=3 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}