This is primarily a question for Steve Robb, as I believe he drove the original change in 2021.
Now that UE5 requires C++20 across all platforms, would it be possible to revisit the definition of UTF8CHAR in GenericPlatform.h?
Today, the type is still declared as:
enum UTF8CHAR : unsigned char {};
UTF8CHAR briefly became char8_t in CL16487510 when __cpp_char8_t was defined, but that branch was removed after an ABI‑mismatch report: some modules were still being built in C++17 while others opted into C++20, so their signatures [Content removed] Today, every shipped toolchain and console SDK compiles the engine itself in C++20 (at least at the language level from what I can tell), so the mixed‑standard situation that forced the rollback shouldn’t arise. In the original UDN thread, the plan was to keep the enum until C++20 became the baseline, which it now is.
With that in mind, could we restore (or even make unconditional) the char8_t definition from CL16487510?
#if defined(__cpp_char8_t)
using UTF8CHAR = char8_t;
#else
enum UTF8CHAR : unsigned char {};
#endif
For the past few weeks, I’ve been building with UTF8CHAR aliased to char8_t. The only follow-up work was clearing a handful of UTF8CHAR <=> ANSICHAR overload ambiguities, which were straightforward fixes. Making char8_t the engine‑wide type should help prevent future implicit conversion ambiguities from creeping in.
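For anyone wondering what such an ambiguity can look like, here is a minimal sketch of one plausible shape. It is not taken from the engine: DescribeChar, DescribeAnsi, DescribeUtf8, and DemoOverloads are hypothetical names, and the two type definitions are local stand-ins so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstring>

// Local stand-ins for the engine types, so the sketch is self-contained.
using ANSICHAR = char;
#if defined(__cpp_char8_t)
using UTF8CHAR = char8_t;
#else
enum UTF8CHAR : unsigned char {};
#endif

// Hypothetical overload pair (not an actual engine function).
inline const char* DescribeChar(ANSICHAR) { return "ansi"; }
inline const char* DescribeChar(UTF8CHAR) { return "utf8"; }

// DescribeChar(65);
//   enum UTF8CHAR: only the ANSICHAR overload is viable, because int does
//                  not implicitly convert to an enumeration.
//   char8_t:       int -> char and int -> char8_t are both integral
//                  conversions of equal rank, so the call is ambiguous.

// Naming the intended character type resolves the call under either definition.
inline const char* DescribeAnsi(int Code) { return DescribeChar(static_cast<ANSICHAR>(Code)); }
inline const char* DescribeUtf8(int Code) { return DescribeChar(static_cast<UTF8CHAR>(Code)); }

inline void DemoOverloads()
{
    assert(std::strcmp(DescribeAnsi(65), "ansi") == 0);
    assert(std::strcmp(DescribeUtf8(65), "utf8") == 0);
}
```

The explicit static_cast at the call site is only one possible fix; whether the ambiguities I mentioned match this exact pattern will depend on the call sites involved.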
Day‑to‑day development already feels safer: whenever we pass UTF‑8 text to C libraries like SQLite, we can reinterpret_cast u8‑prefixed literals to the const char* parameters those functions expect, confident that the bytes are genuine UTF‑8 and that we’re not relying on undefined behavior.
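As a rough illustration of that pattern, the sketch below passes a u8 literal to a function taking const char*. The names Utf8Char, CApiByteLength, ToCApi, and DemoCApiCall are hypothetical; CApiByteLength merely stands in for a real C library entry point, and the pre-C++20 fallback alias exists only so the snippet builds outside the engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__cpp_char8_t)
using Utf8Char = char8_t;  // C++20: u8"" literals are arrays of const char8_t
#else
using Utf8Char = char;     // pre-C++20 fallback so the sketch still compiles
#endif

// Stand-in for a C API that expects UTF-8 bytes as const char*.
inline std::size_t CApiByteLength(const char* Utf8Text)
{
    return std::strlen(Utf8Text);
}

// The aliasing rules allow any object to be read through a char pointer,
// so viewing char8_t data as const char* is well defined.
inline const char* ToCApi(const Utf8Char* Text)
{
    return reinterpret_cast<const char*>(Text);
}

inline void DemoCApiCall()
{
    // U+00E9 (e with acute accent) encodes as two bytes in UTF-8,
    // so the literal below is six bytes long.
    assert(CApiByteLength(ToCApi(u8"h\u00E9llo")) == 6);
}
```

Keeping the cast inside one small helper makes every crossing from UTF‑8 text into a C interface easy to find and audit.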
Are there any remaining target platforms, SDKs, or other concerns that would block this change? If not, would Epic be open to a pull request that reinstates the char8_t feature switch (or simply makes it the default)?
Thank you for looking into this. Please let me know if further details would be helpful!
~ Connor Widtfeldt