Source Encoding & Perforce Corruption

Whenever I submit to Perforce, anyone who downloads a UTF-encoded file receives a corrupted version of it, which shows up as what appear to be Chinese symbols.

Per the documentation, this happens because submitting a UTF-encoded file to Perforce with a typemap entry of ‘text’ corrupts the file: Perforce misinterprets the encoding.
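To make the "Chinese symbols" symptom concrete, here is a minimal Python sketch of one plausible mechanism: a byte-level newline translation (LF to CRLF), of the kind applied to files treated as plain text, run over UTF-16 LE data. That this is exactly what happens in Perforce here is an assumption, not something confirmed in this thread, and both function names are made up for illustration.

```python
def simulate_text_newline_translation(raw: bytes) -> bytes:
    """Hypothetical stand-in for a byte-level LF -> CRLF translation."""
    return raw.replace(b"\n", b"\r\n")


def decode_utf16_le_lenient(raw: bytes) -> str:
    """Decode as UTF-16 LE, dropping a dangling trailing byte if present."""
    even_length = len(raw) - (len(raw) % 2)
    return raw[:even_length].decode("utf-16-le")


original = "a\nb"
utf16_bytes = original.encode("utf-16-le")  # bytes: 61 00 0A 00 62 00
translated = simulate_text_newline_translation(utf16_bytes)  # 61 00 0D 0A 00 62 00
garbled = decode_utf16_le_lenient(translated)

# The inserted 0x0D shifts every following 2-byte unit by one byte,
# so "b" (62 00) is now read as 00 62, i.e. U+6200, a CJK ideograph.
print([hex(ord(ch)) for ch in garbled])
```

Because ASCII letters end up in the high byte of each 16-bit unit, they land in the CJK ranges, which would match the symptom described above.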

However, the “Perforce Setup” documentation says to do exactly that: everything is marked as ‘text’ in the typemap, which, as far as I can tell, results in the corruption when these files are submitted.

When I create a brand new C++ class from the C++ Wizard in the editor, the file appears to be saved automatically as UTF-16 LE. The documentation even says that all text is stored in editor memory in this format:

All strings in Unreal Engine 4 (UE4) are stored in memory in UTF-16 format as FStrings or TCHAR arrays

So can someone help me understand what I am missing? There appears to be a direct connection between being told to set up the typemap as ‘text’ and the engine by default saving files in a UTF encoding that the ‘text’ typemap then corrupts.

What’s really odd is that only some of the submitted files are corrupted, seemingly at random. DefaultGame.ini was corrupted, but DefaultInput.ini and DefaultEngine.ini are fine. DefaultGame.ini is UTF-16 LE while the other two are UTF-8, and I’m not sure what triggers them to be saved this way.

Some header (.h) files are likewise saved as UTF-16, which causes the corruption, while others are saved as UTF-8 and are fine.
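One way to check which files are affected before submitting is to look at the byte-order mark (BOM) at the start of each file. The following is a minimal Python sketch that uses only the standard `codecs` constants. The function names are illustrative, and a file without a BOM is simply reported as `no-bom` (it may be plain ASCII or BOM-less UTF-8).

```python
import codecs

# Longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> str:
    """Return an encoding name based on a leading BOM, or 'no-bom'."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return "no-bom"


def sniff_file(path: str) -> str:
    """Read only the first four bytes of a file and classify its BOM."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))
```

Calling `sniff_file` on DefaultGame.ini and on the generated .h files should make it easy to see which ones came out as UTF-16 LE.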

On a last note - this may be unrelated, but it seems odd. The C++ Wizard doesn’t generate the correct include path in the .cpp: the header is placed in public/Players/PvCreatureCharacter, but the .cpp only contains “include PVCreatureCharacter.h”. That path is wrong now that the standard has switched to the “Include What You Use” method, and it causes the Wizard’s recompile to fail. (Is this perhaps also resulting in a bad save of the encoding, since the file never fully leaves the editor’s memory?)

Any help would be really appreciated, as I cannot find an explanation of why some files are UTF-8 and others UTF-16, and I’m afraid of corrupting my Perforce database at some point by overwriting my local files with badly saved revisions.

If you copy certain characters into your code and save it, it will change the encoding type of the file.

Ah’ha! You got me on the right track. I noticed that the corrupted files had my copyright notice filled in, while the non-corrupted ones didn’t.

Turns out you don’t want that little copyright symbol (©) in there if you plan to use Perforce. Once I removed it, my files started generating as UTF-8 again. That looks like it was the issue. Thanks!
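For anyone hitting the same thing, a quick way to find the offending characters is to scan the source text for anything outside ASCII. Here is a minimal Python sketch. The function name and the sample copyright line (including "Example Studio") are made up for illustration.

```python
def find_non_ascii(text: str):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_number, column, ch))
    return hits


# Hypothetical generated header with a copyright symbol (U+00A9) in it.
sample = "// Copyright \u00a9 Example Studio\n#pragma once\n"
hits = find_non_ascii(sample)
print(hits)
```

Replacing the symbol with plain "(c)" keeps the file pure ASCII, which is presumably why the files went back to being saved as UTF-8.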