Looking for clarification on SIMD instruction support

Hi Everyone,

In short, I was hoping someone could explain, or point me towards the documentation for, correctly using SIMD instructions in our code: what to configure in Build.cs/Target.cs, what to #ifdef-guard SIMD code with, etc.

As far as I have been able to test, for all our target platforms SIMD instructions seem to be available and usable by default without changing anything.

Based on what I read in the docs that I could find, I thought that you would need to add “MinCpuArchX64 = MinimumCpuArchitectureX64.AVX2” to the Target.cs file, but even without that, code using SIMD intrinsics compiles correctly, emits the expected SIMD instructions, and functions as expected. I’m guessing that the version of the MSVC compiler we are using implicitly supports these or something…

One thing that adding “MinCpuArchX64 = MinimumCpuArchitectureX64.AVX2” to the Target.cs file does do is cause preprocessor macros such as __AVX__ and __AVX2__ to get defined somewhere. And according to the UE docs, it means that the game will not launch on a PC that doesn’t support at least AVX2, making this an explicit requirement (not sure where this would appear, but I guess it would be something we would have to declare when publishing, for example).

Strangely, the default value for MinCpuArchX64 appears to be MinimumCpuArchitectureX64.Default, which is defined as “None” in our version of UE (5.6 ish). So I would assume that if we just used SIMD instructions without setting MinCpuArchX64, the application would appear to support CPUs that don’t support these instructions, but would inexplicably crash when you attempt to launch the game… I would also have thought that using SIMD instructions without that setting would lead to a compiler error, to runtime crashes, or maybe just to the compiler emitting non-SIMD instructions for the related code.

Anyway, I would appreciate it if we could get some clarification on this matter.

Thanks,

James

Hello, looks like this one slipped through the cracks!

Hi James,

no idea what happened here, but I was only just assigned this ticket. Sorry for the delay!

It depends on which compiler you’re using. As you already noticed, MSVC always supports all SIMD intrinsics regardless of what CPU architecture is configured. In MSVC, that setting controls what code the compiler will generate on its own (e.g. via auto-vectorization), but all intrinsics are always available. Clang treats this somewhat differently (intrinsics generally require the corresponding instruction set to be enabled), so in that case the setting matters. Additionally, when the target is set to AVX or higher, the compiler will use the VEX encoding for all SIMD instructions, which lets them have three operands instead of two. This tends to decrease the number of SIMD instructions generated by about 5-10%, since a lot of register-register moves become unnecessary as a result.
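To make the compiler difference concrete, here is a minimal sketch (plain C++, not engine code) of guarding intrinsics with the compiler-provided macros so that the same source builds whether or not AVX2 is the configured minimum. The names SumInt32 and EXAMPLE_X86 are made up for illustration, and treating _M_X64 as "SSE2 is available" is an assumption based on SSE2 being part of x86-64.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
  #include <immintrin.h>
  #define EXAMPLE_X86 1   // hypothetical macro, only used by this sketch
#else
  #define EXAMPLE_X86 0
#endif

// Hypothetical helper: sums Count 32-bit integers (wrapping on overflow).
inline std::int32_t SumInt32(const std::int32_t* Data, std::size_t Count)
{
    std::size_t i = 0;
    std::uint32_t Total = 0;

#if EXAMPLE_X86 && defined(__AVX2__)
    // Only compiled when the build targets AVX2 (so Clang accepts the intrinsics).
    __m256i Acc = _mm256_setzero_si256();
    for (; i + 8 <= Count; i += 8)
    {
        Acc = _mm256_add_epi32(Acc, _mm256_loadu_si256(reinterpret_cast<const __m256i*>(Data + i)));
    }
    alignas(32) std::uint32_t Lanes[8];
    _mm256_store_si256(reinterpret_cast<__m256i*>(Lanes), Acc);
    for (std::uint32_t Lane : Lanes) { Total += Lane; }
#elif EXAMPLE_X86 && (defined(__SSE2__) || defined(_M_X64))
    // SSE2 path: assumed safe on every x86-64 target.
    __m128i Acc = _mm_setzero_si128();
    for (; i + 4 <= Count; i += 4)
    {
        Acc = _mm_add_epi32(Acc, _mm_loadu_si128(reinterpret_cast<const __m128i*>(Data + i)));
    }
    alignas(16) std::uint32_t Lanes[4];
    _mm_store_si128(reinterpret_cast<__m128i*>(Lanes), Acc);
    for (std::uint32_t Lane : Lanes) { Total += Lane; }
#endif

    // Scalar tail; this is also the whole implementation on non-x86 targets.
    for (; i < Count; ++i) { Total += static_cast<std::uint32_t>(Data[i]); }
    return static_cast<std::int32_t>(Total);
}
```

With MSVC the AVX2 branch would also compile without the guard, but then nothing stops it from running on a CPU that lacks AVX2. Keeping the guard makes the code behave the same way under both compilers.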

If you compile an executable with AVX or AVX2 instructions enabled (or use those intrinsics) and then try to run it on a machine without that support, it will crash with an “invalid opcode” exception the first time it tries to execute such an instruction.
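If you want to avoid that crash while keeping a lower minimum spec, one common approach is to check the CPU at runtime and only call AVX2 code when it is supported. The following is a rough sketch of such a check, not something taken from the engine: CpuSupportsAvx2 is a hypothetical name, the MSVC branch is written for cl.exe, and clang-cl is deliberately left out and conservatively reports "not supported".

```cpp
#if defined(_MSC_VER) && !defined(__clang__) && (defined(_M_X64) || defined(_M_IX86))
  #include <intrin.h>     // __cpuid, __cpuidex
  #include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper: returns true if the CPU and OS both support AVX2.
inline bool CpuSupportsAvx2()
{
#if (defined(__GNUC__) || defined(__clang__)) && !defined(_MSC_VER) && (defined(__x86_64__) || defined(__i386__))
    // GCC/Clang builtin.
    return __builtin_cpu_supports("avx2") != 0;
#elif defined(_MSC_VER) && !defined(__clang__) && (defined(_M_X64) || defined(_M_IX86))
    int Regs[4];
    __cpuid(Regs, 0);
    if (Regs[0] < 7) { return false; }              // CPUID leaf 7 not available

    __cpuid(Regs, 1);
    const bool bOsXsave = (Regs[2] & (1 << 27)) != 0;  // OS uses XSAVE/XGETBV
    const bool bAvx     = (Regs[2] & (1 << 28)) != 0;  // CPU supports AVX
    if (!bOsXsave || !bAvx) { return false; }

    // XCR0 bits 1 and 2: the OS saves and restores XMM and YMM state.
    if ((_xgetbv(0) & 0x6) != 0x6) { return false; }

    __cpuidex(Regs, 7, 0);
    return (Regs[1] & (1 << 5)) != 0;               // EBX bit 5 = AVX2
#else
    // Other compilers/architectures are not covered by this sketch.
    return false;
#endif
}
```

The AVX2 code itself still has to live somewhere the compiler is allowed to emit it (for example a separate translation unit built with AVX2 enabled), and the check has to run before any such instruction executes.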

The “__AVX__” and “__AVX2__” defines get set by the compiler itself when you pass the architecture flags.

Setting “MinimumCpuArchitectureX64” to “Default” doesn’t mean “nothing”, it just means “the minimum UE supports”, which is (as of this writing) SSE4.2. In any x86-64 program, you can always use at least SSE2-level intrinsics, since SSE2 support is required as part of x86-64. About two years ago we bumped the minspec for UE5 to SSE4.2. That’s the minimum the engine supports at this point, so UBT doesn’t support targeting anything lower than that.

Hope this helps!

-Fabian
