Lumen & Nanite on MacOS

This is nice, but I’m still waiting for them to do something about Apple Silicon Macs basically not running UE5.

2 Likes

Hey, I’m new to Unreal Engine Forums but not new to Unreal Engine itself. I read through the website’s tutorials ~2 years ago and played with the editor a little. I was really excited when seeing UE5 for the first time, but suspicious when they only mentioned Nanite working on AMD/NVIDIA. I’m really skilled in the Metal API (>1000 hours of experience), and I would be interested in helping out. It seems the Reddit thread was legit, but this was not emphasized properly. Nanite has been run on macOS and iOS, but its performance may be poor. We did not get source code, but we may have enough information to reproduce what happened.

At the WWDC, Apple released Metal 3. The new Metal Shading Language specification includes 64-bit atomics, including atomic_ulong. Earlier in this thread, someone said the lack of that feature was blocking Nanite. I can confirm that feature did not exist, as up to Metal 2.3 (maybe 2.4), the largest atomic size was 32-bit. I don’t know how 64-bit got enabled in Metal 3 (either a hidden hardware feature or emulation), but it doesn’t matter.

Epic might not really want to help Apple users because of their feud and ongoing lawsuit with Apple. If you guys help me out, give me enough time, I see no reason why we can’t bring full-fledged Nanite to macOS. I’m very experienced with Git/GitHub, MSL, C programming, forum posting, and most of all: bypassing artificial restrictions on technology. Perhaps I should ping @gladhu who made the Reddit thread, but also posted on UE forums.

4 Likes

Might be worth waiting until the programmable rasterizer rewrite for Nanite is finished and included in a major release, then seeing whether that comes with macOS support.

3 Likes

Looking at the MSL specification, atomic_ulong was added in Metal 2.4. That is good news because we don’t have to wait until macOS Ventura to make Nanite accessible. I have Xcode 14 beta with the macOS 13 SDK + Metal 3 installed, but UE5 can work with macOS 12 + Metal 2.4.

Edit: Metal 64-bit atomics only support max and min operations, not the full feature set of addition, compare-exchange, load, etc. This means they might not be very helpful. Also, it’s not clear whether they work exclusively on Apple GPUs.

Although regular updates to the UE5 software might be helpful, Nanite has been working before the programmable rasterizer was rewritten. It would be wise to halt investigation until that is finished, saving time if they support macOS. We need to know this rewrite’s expected release date and what it entails before going either way.

Assuming that we investigate Nanite now:

Our first steps should be just reproducing the compilation of UE5 with Nanite working. The GitHub repository here includes macOS binary files. In my work as an open-source software engineer, raw binary files are a rare occurrence. It’s generally bad practice because it’s a security vulnerability and impossible to reproduce. They might run on x86_64 but not arm64. I have seen one example where a major company put binaries in the GitHub repo, and they converted the binaries into source code after someone raised an issue. The most common place for binaries is the “Releases” section of a repository, not the Git source tree. I’m not sure whether that practice is standard with Unreal Engine.

In short, create a repo that includes only source code files, along with instructions for how to compile UE5 with them. I made a preliminary repo at ue5-nanite-macos for this source code.

4 Likes

With 5.02 Lumen is now working on my iMac 2020 and it is glorious!

Specs for those who are interested:

[Image: Screen Shot 2022-07-21 at 1.41.37 PM]

Hey guys, I think we have some good news. I was taking a look at the UE5.1 Roadmap when I came across this:

With Unreal Engine 5.0 and earlier versions, the Unreal Editor was running as an x86_64 (Intel) executable on M1 macOS computers. This approach did not leverage all the performance benefits of the platform.
The UE team is actively working on porting the Unreal Editor and its dependencies to support natively the Apple Silicon platform, thus offering increased performance and stability.

3 Likes

That’s support of the CPU instruction set (compiling into an ARM binary), but does not explicitly state anything regarding GPU. If anyone has news regarding GPU specifically, I would like to know. If not, perhaps we could contact Epic directly to ask whether they’re working on GPU support for Nanite?

Also, this is the Unreal Editor running on ARM, not the engine itself. Lumen and Nanite are part of the Unreal Engine, which already makes production apps for ARM on iOS. Probably the same for macOS.

1 Like

Does anyone know if there’s a moderator we could ping to ask whether Epic is working on Nanite for macOS/iOS?

2 Likes

Hey there @philipturner! Officially, there’s no word yet on whether Nanite on macOS will become available for 5.1. However, the sentiment that everyone’s waiting on it has not gone unnoticed. Increased support for macOS across the board is a goal for 5.1 itself, but that’s not specific to Nanite. If any official information is put out on the subject, I’ll be the first to let you know!

5 Likes

Thanks for the info! I’ve gotten to the point where I can compile UE5 + Unreal Editor using the macOS 13 SDK. Heads up, there’s already a build error when using Xcode 14 beta. Under the section about compiling unmodified ue5-main in philipturner/ue5-nanite-macos, I describe an error where an Info.plist doesn’t generate by default. Have you been testing the new M1 build against Xcode 14 beta, or should I make a pull request to fix this problem?

For reference when I start posting images of Lumen and Nanite:

  • Mac: 32-core, Apple7 family, 10.4 TFLOPS, 32 GB
  • iPhone: 5-core, Apple8 family, 1.5 TFLOPS, 6 GB
  • The Mac is roughly 7 times faster than the iPhone (10.4 / 1.5 ≈ 6.9).

[Image: Screen Shot 2022-09-01 at 10.32.59 PM]

1 Like

I am extra not qualified to make that call, but I’d recommend preparing a PR if you get things rolling, as I’m not extremely well versed in the engine code myself, especially on the macOS side. I’m mostly around to assist users of the binary engine builds. In my report, though, I’ll be leaving a link back to this post. Good luck and let me know if you need anything!

1 Like

Can anyone determine whether Virtual Shadow Maps are supported on Apple platforms? They’re like a third counterpart to Lumen and Nanite, but not emphasized as much in the original UE5 demo from 2020. The docs say they’re designed to be the same resolution as Nanite, and have the same platform restrictions as Nanite. I also see 64-bit atomics mentioned in the platform restrictions :grimacing:

Edit: This seems promising if we need the functionality of 64-bit atomics, and can shake up the implementation to utilize locks. It doesn’t require acquire-release memory order, as I assume the atomic_xchg intrinsic defaults to relaxed ordering.

If I click this, will it enable Nanite? The “Software Ray Tracing” option mentions Lumen in the tooltip, and it’s enabled.

Also, any plans to support MetalFX upscaling in the next release? FSR runs on Vulkan, but UE uses bare Metal not MoltenVK. Apple is most certainly not collaborating with Epic to get this supported, but AMD and Nvidia ensured their APIs work with UE5.

Edit: The Reddit post said “Metal coming soon” regarding Temporal Super Resolution.

5 Likes

These are all the places where Nanite shaders mention atomic or Interlocked*** HLSL intrinsics.

If my interpretation is correct, Nanite doesn’t require actual 64-bit atomics. I can see how someone got it running on macOS and iOS. They force-enabled the feature using Metal 2.4’s UInt64 atomic min/max. I thought the work distribution queue required a massive 64-bit counter, because more than 4 billion work chunks exist; I guess that understanding is incorrect.
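To make the role of the 64-bit atomic max concrete, here is a minimal CPU-side sketch in Swift. The bit layout (depth in the high 32 bits, an opaque payload in the low 32 bits), the assumption that a larger depth value means "closer", and the names `pack` and `unpack` are all mine for illustration; none of this is taken from the UE5 source. A plain `max` stands in for the GPU atomic.

```swift
// Assumed layout (hypothetical, for illustration only):
// depth in the high 32 bits, payload in the low 32 bits.
func pack(depth: UInt32, payload: UInt32) -> UInt64 {
    return (UInt64(depth) << 32) | UInt64(payload)
}

func unpack(_ word: UInt64) -> (depth: UInt32, payload: UInt32) {
    return (UInt32(truncatingIfNeeded: word >> 32), UInt32(truncatingIfNeeded: word))
}

// Three fragments landing on the same pixel, in arbitrary order.
let fragments: [(depth: UInt32, payload: UInt32)] = [
    (5, 0xAAAA), (9, 0xBBBB), (7, 0xCCCC),
]

// Plain `max` stands in for a 64-bit atomic max on the pixel.
var pixel: UInt64 = 0
for fragment in fragments {
    pixel = max(pixel, pack(depth: fragment.depth, payload: fragment.payload))
}

// Because depth occupies the most significant bits, the fragment with the
// largest depth wins and its payload travels with it in the same word.
let winner = unpack(pixel)
print(winner.depth, winner.payload)
```

Under those assumptions, a single max operation is enough to do the depth test and the payload write together, which is why min/max alone may be sufficient.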

Edit: There’s some bad news. Metal 64-bit atomic min/max only runs on Apple8 hardware (A15/M2). The A14/M1 were released with Metal 2.3, which didn’t have these instructions. I made a Swift script and accompanying Metal shader for anyone to reproduce (below). NaniteWritePixel.ush includes a 32-bit lock-based workaround on Nvidia 1080/2080, which doesn’t need 64-bit atomics.

That could work on Apple7 GPUs, and might be how someone got Nanite working on them. This workaround’s nature could explain why Nanite had messed up depth order. It also explains the lower performance. I can test this theory by deploying Nanite-enabled UE5 to my iPhone 13, which has an Apple8-family GPU.

Swift script
import Foundation
import Metal

let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
let library = device.makeDefaultLibrary()!
let function = library.makeFunction(name: "testAtomic64")!
let pipeline = try! library.device.makeComputePipelineState(function: function)

// A new `MTLBuffer` is always zero-initialized.
let buffer = device.makeBuffer(length: 8, options: .storageModeShared)!
let cmdbuf = commandQueue.makeCommandBuffer()!
let enc = cmdbuf.makeComputeCommandEncoder()!
enc.setComputePipelineState(pipeline)
enc.setBuffer(buffer, offset: 0, index: 0)
let oneSize = MTLSizeMake(1, 1, 1)
enc.dispatchThreads(oneSize, threadsPerThreadgroup: oneSize)
enc.endEncoding()
cmdbuf.commit()
cmdbuf.waitUntilCompleted()

let result = buffer.contents().assumingMemoryBound(to: UInt64.self)
precondition(result.pointee == 7)
print("Success! \(result.pointee) == 7")

Metal shader
#include <metal_stdlib>
using namespace metal;

kernel void testAtomic64(device atomic_ulong *buffer [[buffer(0)]])
{
    atomic_max_explicit(buffer, 7, memory_order_relaxed);
}

Xcode output, which happens when MTLCompilerService encounters an unsupported feature in AIR
2022-09-03 23:50:04.632240-0400 TestAtomic64[27891:1448661] Metal GPU Frame Capture Enabled
2022-09-03 23:50:04.632593-0400 TestAtomic64[27891:1448661] Metal API Validation Enabled
2022-09-03 23:50:04.679200-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.679246-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 1 try
2022-09-03 23:50:04.699454-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.699508-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 2 try
2022-09-03 23:50:04.719754-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:04.719804-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 3 try
2022-09-03 23:50:14.782645-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:14.782777-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 1 try
2022-09-03 23:50:24.846962-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:24.847071-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 2 try
2022-09-03 23:50:34.914092-0400 TestAtomic64[27891:1448661] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-03 23:50:34.914213-0400 TestAtomic64[27891:1448661] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 3 try
TestAtomic64/main.swift:15: Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=AGXMetalG13X Code=3 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}
2022-09-03 23:50:34.915280-0400 TestAtomic64[27891:1448661] TestAtomic64/main.swift:15: Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=AGXMetalG13X Code=3 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}

3 Likes

I sincerely apologize for sending you guys so many notifications, but I think this ought to be emphasized.

Apple plans for hardware features several years before they’re implemented. Apple may have added atomic UInt64 min/max precisely to get Nanite running on M2. In early 2020 they saw the UE5 demo and wanted Apple Silicon to support it for the planned “Metal/Macs for gaming” focus. Then relations with Epic derailed (late 2020), but the chip design was already established.

This explains why they strangely added only one 64-bit atomic instruction when they could have added all of Shader Model 6.6 functionality.


Until further notice, all information about the Nanite on Mac/iOS investigation will be posted on the ue5-nanite-macos repository’s README.

3 Likes

I’m getting excited. Hopefully soon Nanite will come to MacOS Intel.

2 Likes

I’m striving to minimize how much I post on this thread, but an update is overdue. I got Nanite “running” on macOS, whatever you define running to be. In my case, running sometimes means freezing my entire Mac and requiring that I reboot it. The iGPU encountered some kind of infinite loop.

Also, the crash. This is new territory - something that nobody has described a workaround for. I’m currently investigating it, although help from someone who knows the UE5 code base would make this happen much faster.

I replicated the source code that @gladhu had made public. The shaders once had a hack that enabled Nanite through 32-bit texture atomics. @gladhu made a hack around the hack, because Metal only supports 32-bit atomics through buffers. UE5NanitePort replaced each atomic modify with a regular read + write. This is inherently thread-unsafe, and may explain the graphical glitches surrounding incorrect depth/occlusion.
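To illustrate why a separate read + write can produce wrong depth order, here is a small deterministic simulation in Swift. It is not the UE5NanitePort code. The `Fragment` type, the helper names, and the convention that a larger depth value is closer are assumptions made for this sketch. Two "threads" are interleaved by hand so the outcome is reproducible.

```swift
// Hypothetical types and names; this simulates one pixel and two "threads".
struct Fragment {
    let depth: UInt32   // assumption: larger value = closer to the camera
    let color: UInt32
}

// Step 1 of the non-atomic update: read the current depth.
func readDepth(_ pixel: Fragment) -> UInt32 {
    return pixel.depth
}

// Step 2: write if the candidate beats the depth that was read earlier.
// By the time this runs, `seenDepth` may already be stale.
func writeIfCloser(_ pixel: inout Fragment, seenDepth: UInt32, candidate: Fragment) {
    if candidate.depth > seenDepth {
        pixel = candidate
    }
}

let near = Fragment(depth: 9, color: 0xFF0000)
let far = Fragment(depth: 5, color: 0x0000FF)

// Racy interleaving: both threads read before either one writes.
var racyPixel = Fragment(depth: 0, color: 0)
let seenByA = readDepth(racyPixel)
let seenByB = readDepth(racyPixel)
writeIfCloser(&racyPixel, seenDepth: seenByB, candidate: near)  // thread B
writeIfCloser(&racyPixel, seenDepth: seenByA, candidate: far)   // thread A

// Atomic-style update: each read-compare-write finishes before the next starts.
var atomicPixel = Fragment(depth: 0, color: 0)
for candidate in [near, far] {
    writeIfCloser(&atomicPixel, seenDepth: readDepth(atomicPixel), candidate: candidate)
}

// The racy pixel ends up holding the farther fragment; the atomic one does not.
print(racyPixel.depth, atomicPixel.depth)
```

In the racy case the farther fragment overwrites the nearer one because its comparison used a stale depth, which is consistent with the occlusion glitches described above.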

Since then, Epic removed the 32-bit texture atomic workaround, so that Nanite only runs on DX12/Vulkan devices with 64-bit atomics. I just thought of an entirely different way to run Nanite without needing 64-bit atomics or texture atomics. It’s thread-safe by nature, unlike the previous lock-based workaround. It runs not only on macOS (Apple + Intel), but also DX11. I pitched the idea to @SupportiveEntity in a PM because it’s excruciatingly long.

Nanite through 32-bit atomics

I’m planning to implement and explain it in the AtomicsWorkaround directory, so it might be worth checking that periodically. In short, you have to think theoretically regarding information transfer. 64-bit atomics are required because depth must be synchronized with color. In rasterization pipelines, this is called z-buffering. Nanite performs rasterization through a compute shader.

However, the depth data is only 24 bits. You’re transferring 56 bits of information when doing a 64-bit atomic max. So what if you separated the 24 bits of depth, then broke the remaining 32 bits of color into 8 bit chunks? Then, rearranged them like so:

  • 24 bits of depth + 8 bits color data = 32-bit word
  • 24 bits of depth + 8 bits color data = 32-bit word
  • 24 bits of depth + 8 bits color data = 32-bit word
  • 24 bits of depth + 8 bits color data = 32-bit word

The depth is duplicated 4 times, but it works! Four 32-bit memory chunks that can be atomically modified. The actual implementation is more complex, and it splits color into two chunks of 16 bits. It uses locks, but differently than the previous lock-based workaround. I’m not describing that here for brevity.
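The four-word layout from the list above can be sketched on the CPU in Swift. This is only the simplified scheme from this post, not the actual AtomicsWorkaround implementation (which, as noted, uses two 16-bit chunks and locks). The helper names `splitIntoWords`, `mergeWords`, and `maxEach` are hypothetical, a plain `max` stands in for each 32-bit atomic max, and a larger depth value is assumed to be closer.

```swift
// Layout per 32-bit word (from the post): 24 bits of depth on top, 8 bits of color below.
func splitIntoWords(depth: UInt32, color: UInt32) -> [UInt32] {
    let depth24 = depth & 0x00FF_FFFF
    return (0..<4).map { (i: Int) -> UInt32 in
        let colorByte = (color >> UInt32(8 * i)) & 0xFF
        return (depth24 << 8) | colorByte
    }
}

// Reassemble the 32-bit color from the low byte of each word.
func mergeWords(_ words: [UInt32]) -> (depth: UInt32, color: UInt32) {
    var color: UInt32 = 0
    for (i, word) in words.enumerated() {
        color |= (word & 0xFF) << UInt32(8 * i)
    }
    return (words[0] >> 8, color)
}

// Stand-in for four independent 32-bit atomic max operations on one pixel.
func maxEach(_ pixel: inout [UInt32], with candidate: [UInt32]) {
    for i in pixel.indices {
        pixel[i] = max(pixel[i], candidate[i])
    }
}

var pixel = [UInt32](repeating: 0, count: 4)
maxEach(&pixel, with: splitIntoWords(depth: 5, color: 0x11223344))
maxEach(&pixel, with: splitIntoWords(depth: 9, color: 0xAABBCCDD))
maxEach(&pixel, with: splitIntoWords(depth: 7, color: 0xFFFFFFFF))

// Depth sits in the top bits of every word, so the depth-9 fragment wins in
// all four words, even against a farther fragment whose color bytes are all 0xFF.
let result = mergeWords(pixel)
print(result.depth, String(result.color, radix: 16))
```

One caveat of this simplified version: if two fragments have exactly the same depth, each word keeps the larger color byte independently, so the merged color can mix bytes from both fragments. That may be part of why the real implementation needs extra machinery.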

3 Likes

I ran the AtomicsWorkaround script on my A15 device with iOS 16, and 64-bit atomics seem to not work. Is anybody on this thread planning to get an M2 Mac, iPhone 14 Pro, or upgrade to an M2 Pro/Max/Ultra/Extreme machine this fall? If so, and you’re willing to help me investigate, that would be awesome.

2022-09-13 16:22:11.102662-0400 AtomicsWorkaroundA15[1515:82192] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-13 16:22:11.111895-0400 AtomicsWorkaroundA15[1515:82192] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-13 16:22:11.122466-0400 AtomicsWorkaroundA15[1515:82192] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED
2022-09-13 16:22:11.122514-0400 AtomicsWorkaroundA15[1515:82192] MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 3 try
2022-09-13 16:22:11.122653-0400 AtomicsWorkaroundA15[1515:82192] Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED

Perhaps Apple exposed the 64-bit atomic instruction from recent AMD devices. If so, would someone with an Intel Mac be willing to test this hypothesis?

2 Likes

Alright. So I got Lumen working. It’s glorious! Works for me 100% on my Intel iMac 5k with the latest versions of both macOS and UE5 installed. Waiting for Nanite. :pray:

2 Likes


(My specs)

1 Like

Of course Lumen is working for you, you’re on an Intel Mac. For the folks with Apple Silicon chips… it’s a little more complicated.

1 Like