Textual file formats

Alright, this is a pretty big request.

I really like diffing things. I like being able to see what changed from one revision to another. With the exception of C++ code, Unreal’s binary file formats make this nearly impossible. And given that a lot of the game’s logic can be buried quite deeply (defaults tab, collapsed blueprints inside collapsed blueprints inside collapsed blueprints, etc) or just very difficult to visually diff (blueprints in general) this has already bitten me a few times where I’ve lost track of what changed.

I’ve worked on projects where everything, everything, was stored in text files. It was wonderful. Worked out great.

I’d love it if UE4 took the same approach.

Realistically, I assume some data baking would be necessary for great runtime performance. But there are data serialization libraries designed for this, that use a single abstract layout, and can use either compressed binary or text formats as backend storage systems. The two big ones I’m aware of are Google’s Protocol Buffers and Cap’n Proto, both of which would work quite well and may - judging from what I’ve seen of UE4’s serialization - actually end up with smaller files, when all is said and done.

(Cap’n Proto specifically is built to be insanely fast and compact, although it doesn’t yet support MSVC.)
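To make the "single abstract layout, two storage backends" idea concrete, here is a minimal sketch using only the Python standard library. It is not the Protocol Buffers or Cap’n Proto API, and `LightAsset` and its fields are hypothetical names invented for illustration. It only shows that one layout can round-trip through both a diff-friendly text form and a compact binary form.

```python
import json
import struct
from dataclasses import dataclass, asdict


@dataclass
class LightAsset:
    """Hypothetical asset type; stands in for one abstract layout."""
    name: str
    intensity: float
    cast_shadows: bool

    def to_text(self) -> str:
        # Diff-friendly backend: sorted keys, one field per line.
        return json.dumps(asdict(self), indent=2, sort_keys=True) + "\n"

    @classmethod
    def from_text(cls, text: str) -> "LightAsset":
        return cls(**json.loads(text))

    def to_binary(self) -> bytes:
        # Compact backend: uint16 name length, UTF-8 name, float64, bool.
        raw_name = self.name.encode("utf-8")
        return struct.pack(f"<H{len(raw_name)}sd?", len(raw_name), raw_name,
                           self.intensity, self.cast_shadows)

    @classmethod
    def from_binary(cls, blob: bytes) -> "LightAsset":
        (name_len,) = struct.unpack_from("<H", blob, 0)
        raw_name, intensity, cast_shadows = struct.unpack_from(
            f"<{name_len}sd?", blob, 2)
        return cls(raw_name.decode("utf-8"), intensity, cast_shadows)
```

The text form would be what goes into source control, and the binary form would be what a baking step produces for the runtime.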

So here’s the series of questions, just so I can figure out where Epic stands on this:

  • Are there any internal plans to do this already?
  • If someone went ahead and tried to make these changes, would it have any chance of making it into the official package?
  • Is there some diff method I’m not aware of?
  • Is there some major reason to stick to binary file formats that I’m not aware of?

Thanks for reading and for any commentary :)

Source/Runtime/JsonUtilities looks promising. I haven’t tried it out, but it looks like it can convert any UStruct (the parent of UClass, which in turn describes the UObject types used to represent assets) to JSON and back.

Hi ZorbaTHut,

The editor actually has built-in text diffing support for any .uasset file, as well as a special graphical diff tool for blueprints. I’ve written a blog post about asset diffing that will go up in a little while, but here is a summary.

You’ll need to store your project in a supported source control system for the built-in tools to work. The editor currently has source control plugins for both Perforce and SVN. Perforce is what we use internally at Epic, and it is now free for up to 20 users (and 20 workspaces; you’ll probably end up using 2 or 3 per human user). SVN is totally free and there are a number of cloud SVN providers if you don’t want to host your own server.

You can tell if your editor is currently connected to source control by the little icon on the top right of the main frame, next to the [Enter console command] prompt. If the icon is green, you are already connected, but if it shows up as a red no sign then click on it to enter your server settings. Next, you need to tell the editor where it can find your favorite text diff tool in Editor Preferences. It defaults to p4merge, but almost any diff tool should work (e.g., Beyond Compare, Araxis Merge, Tortoise Merge, etc…)

Once you’re configured, there’ll be additional options in the context-click menu in the content browser that let you view revision history and diff assets against each other or diff your current revision against @head.

I think there was an issue in packaging 4.0.1 that left out the SVN binaries from the installed version, so if you use SVN you’ll need to copy them into the Binaries/Win64 directory too (will see if I can find the instructions for that).

Cheers,
Michael Noland

Here are the instructions for adding the SVN binaries:

Cheers,
Michael Noland

I managed to find the built-in support after I made the above post, and it’s pretty nice, but . . . it’s not as powerful as true text files would be. You’re never going to manage to roll all of a source control system’s functionality into UEd. For example, I don’t see, at least so far, any way to view all the files in a given changelist. I don’t see any practical way to merge branches or deal with merge conflicts. These can certainly be solved one-by-one but for every missing feature implemented I bet I can come up with two more that aren’t yet implemented.

Meanwhile, putting everything into a textual format would fix that problem instantly.

The visual diffing is great, don’t get me wrong - for simple things, it’s probably more useful than a text format would be - but I strongly question whether it will solve the entire problem or not.

And of course this is all going to be multiplied by a factor of ten if/when Git support is introduced because Git UIs are a nightmare, and then you have to do it again for Mercurial or Bitbucket etc etc etc.

It’s certainly possible to work around all of this, but that’s time and effort and a source of bugs; at my day job we have exactly two tools that generate binary output files. One of them is used by precisely one person, the other is a frequent source of merge errors. If you’re expecting a significant number of games to be written in Blueprint/Cascade/etc then I suspect this will be a friction point for quite a while.

Of course this is where I say “. . . and you’re the one who has to decide on priorities and you probably have bigger things to worry about atm”, so don’t worry I’m not expecting this to change overnight :) but I’m curious whether this would fall into the “we explicitly don’t want to do that” bucket or the “we haven’t gotten around to that” bucket.

Hi ZorbaTHut,

We don’t have any plans to switch to a text format for .uasset files.

There are certainly advantages to text files, but merging a complicated data structure even in textual format is likely to be very error-prone (it’d be very easy to break the data structure when hand-merging, producing something corrupt), and parsing text files would have a significant impact on editor load speeds, etc.

In practice, we haven’t seen many collaboration issues using exclusive checkout to prevent conflicts, especially since the unit of granularity is now one asset instead of an entire cluster of assets like in UE3. Obviously this doesn’t work well when using something like Git: it was never designed for binary files and there’s no concept of locking a file in the first place (I’d guess Mercurial would be similar but I’ve not used it personally).

RE: Other source control APIs, we’ve got a pluggable interface for the editor SCC integration, nothing else in the editor UI needs to change when adding support for another package (at least in theory). We’re not aiming to completely replace use of a SCC tool, just consolidate the asset-centric actions in the contexts where they make sense. This is probably an area that will see improvement over time as well.

Cheers,
Michael Noland

I too would love to see a text format for blueprints. I agree that merging them by hand can cause problems but the same can be said for any source file. The impact on load speeds due to parsing text can be minimized by caching the binary version and updating it only when the text version changes.
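The caching idea can be sketched roughly as follows. This is an illustration only, not anything UE4 does: `load_asset`, the JSON text form, and the cache layout are all assumptions. The cache is keyed on a hash of the text content, so the slow text parse runs only when the text actually changes.

```python
import hashlib
import json
import pickle
from pathlib import Path


def load_asset(text_path: Path, cache_dir: Path) -> tuple[dict, bool]:
    """Return (asset, cache_hit). Hypothetical helper, not a UE4 API."""
    source = text_path.read_bytes()
    # Key the cache on the text content, so any edit invalidates it.
    digest = hashlib.sha256(source).hexdigest()
    cache_file = cache_dir / f"{digest}.bin"

    if cache_file.exists():
        # Fast path: skip text parsing entirely.
        return pickle.loads(cache_file.read_bytes()), True

    # Slow path: parse the text form, then store the binary form.
    asset = json.loads(source)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(pickle.dumps(asset))
    return asset, False
```

The cache directory would stay out of source control, and since `pickle` is only safe on trusted data, it should only ever hold locally generated files.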

I know this thread is old, but I thought I’d throw in my 2c after a few weeks of working with UE4.

To Michael’s point about not seeing collaboration issues, already on a small team (~8) we have had multiple issues, especially in keystone ‘assets’ like the main character blueprint or main level. With levels at least we can break things down into sub-levels for higher granularity (though that introduces problems with referencing that forces us to use tags or some other mechanism…), but they’re still very opaque files, and for BPs the options to increase granularity are limited and clunky.

Merging script code in general should absolutely be possible. Assuming the text format is readable and stable (i.e. it doesn’t change when it’s re-saved with no changes), it might still get complicated, but that’s a function of the complexity of the code being merged, and the person merging it will know that and act appropriately.
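As a rough illustration of what "stable" means here, the sketch below uses JSON as a stand-in text format (an assumption; nothing in this thread says what format Epic would pick). With sorted keys and fixed indentation, the output does not depend on in-memory ordering, and re-saving an unchanged file reproduces it byte for byte. `save_text` and `resave` are hypothetical names.

```python
import json


def save_text(asset: dict) -> str:
    # Sorted keys and fixed indentation make the output independent of
    # the order in which fields happen to sit in memory.
    return json.dumps(asset, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def resave(text: str) -> str:
    # Load and save again without touching anything.
    return save_text(json.loads(text))
```

A format with this property produces an empty diff for a no-op save, which is what makes "revert unchanged files" and hand-merging practical.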

To the point of corrupting files, I’ve twice been able to corrupt my BPs somehow (once in a way that just crashed the editor every time the project loaded), and have had no recourse to fixing them other than reverting and starting again because I can’t diff the file to see what’s wrong and manually fix it.

Even outside of BPs, having all assets have a text representation would help people understand the changes others are making, and help with integration of changes between versions. We’re now taking the approach of porting a BP that becomes a workflow bottleneck to C++, so we can have the benefits of a text format, but taking the hit in iteration time and non-coder visibility.

After several months of serious development using Blueprints we are also seeing a lot of these headaches with the binary-only format. We are considering a major refactoring to move more of the critical code down to C++ for this reason.

Primarily it’s the “corruption” of critical and large BPs that is a major issue. We have a base class BP which crashes the editor constantly (but seems ok in the game).

Also mysterious error and warning messages coming from the BP compiler and no simple way to track down what it’s talking about.

Another big problem is the lack of good global search; the built-in BP search is very unreliable.

I think these issues would be mitigated if there was an at least somewhat stable text format to look at.

Yeah, I’m continuing to work with the file formats as is, but they’re . . . not good. The diff tool doesn’t work properly - it misses things that are actual differences, and “revert unchanged files” never reverts anything, because even saving a file with no changes still changes the binary representation.

If I had to pick one thing about UE4 that I felt was a major problem it would be the file formats. Everything else is a minor annoyance at best; the binary file formats are a continuous source of issues.

Loving the hell out of UE4 so far…

Though have to agree with ZorbaTHut: during development time a more accessible format such as XML, not just for blueprints but for any .uasset, would be preferable.

Coming from CryENGINE, we regularly dealt with their internal level_editor.xml to like auto-assemble our levels or delete faulty things without having to revert to older saves/revisions.

Right now I have a blueprint I cannot save because it has some links to external private packages…can’t edit the thing as it won’t let me save it anyway. XML may just be able to solve this for me by letting me scan the text representation and delete / correct the faulty part manually.
Status just now…I’m considering dropping down to C++ for the BP logic as well.

That might be due to updated internal timestamps…

Probably, but that brings up the inevitable question, why does it even have internal timestamps. It just makes it impossible to revert unchanged files, and it’s not like UE4’s built-in “Revert unchanged files” does anything smarter.
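For illustration, a smarter "unchanged" check could hash only the stable content and skip volatile metadata. The sketch below works on plain dictionaries; the field name `saved_at` is a hypothetical stand-in for an internal timestamp, since the real .uasset layout isn't described in this thread.

```python
import hashlib
import json

# Hypothetical name for a volatile field such as a save timestamp.
VOLATILE_FIELDS = {"saved_at"}


def content_hash(asset: dict) -> str:
    # Drop volatile fields, then hash a canonical encoding of the rest.
    stable = {k: v for k, v in asset.items() if k not in VOLATILE_FIELDS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(old: dict, new: dict) -> bool:
    # True when the two versions differ only in volatile fields.
    return content_hash(old) == content_hash(new)
```

A "revert unchanged files" command built on a comparison like this would treat a re-save that only bumped the timestamp as a no-op.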

I agree with this thread. Binary formats are a problem for many reasons. My two biggest issues are:

  1. The uasset files constantly change even when I don’t touch the assets and when a visual diff shows no changes. This is a real pain in version control and collaboration among other things.
  2. At least once a day I seem to get into a state where the UE4 editor crashes and then each time I start it I get an immediate crash. Clearly something is messed up in one of the uasset files. At present, this is impossible for me to fix and I lose a lot of work. Textual file formats would help a lot here.

Epic please consider.

Thanks,
-X

How about making a utility which can translate a uasset file into a text or XML file and back?

That way if a uasset file gets corrupted, we could use that tool to try and fix things instead of losing all our work. Alternatively, such a tool could be helpful for verifying that nothing important has changed (e.g., except for a time-stamp) or build other tools to more easily analyze changes in a project, facilitate collaboration, and so on.

This would allow those who want to use textual file formats to do so but not impact the speed of loading in the editor for those who don’t.

This is a solved problem: FlatBuffers.

You use text formats for version control friendliness and a schema compiler to validate patches and hand-edits. You build binary assets offline for the runtime so there’s no cost in performance to the end user. We’ve used this with great success.
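A toy version of that workflow might look like the sketch below. It does not use the FlatBuffers API or its schema language; `SCHEMA`, `validate`, and `bake` are hypothetical names, and the standard library's `json` and `struct` modules stand in for the text format and the baked binary.

```python
import json
import struct

# Hypothetical schema: field name -> (Python type, struct format code).
SCHEMA = {"health": (int, "i"), "speed": (float, "d")}


def validate(asset: dict) -> list[str]:
    """Return a list of schema violations; empty means the asset is valid."""
    errors = []
    for field, (py_type, _) in SCHEMA.items():
        if field not in asset:
            errors.append(f"missing field: {field}")
        elif type(asset[field]) is not py_type:
            errors.append(f"wrong type for {field}: expected {py_type.__name__}")
    for field in sorted(asset.keys() - SCHEMA.keys()):
        errors.append(f"unknown field: {field}")
    return errors


def bake(text: str) -> bytes:
    """Validate a hand-edited text asset, then pack it for the runtime."""
    asset = json.loads(text)
    errors = validate(asset)
    if errors:
        raise ValueError("; ".join(errors))
    fmt = "<" + "".join(code for _, code in SCHEMA.values())
    return struct.pack(fmt, *(asset[name] for name in SCHEMA))
```

A bad merge or hand-edit then fails at bake time with a readable message, rather than producing a corrupt binary that crashes on load.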

The big problem with binary uassets is that the metadata schemas are generated from code, and very fragile against migrations. Monkeypatching text with scripts is not just a convenience, it’s necessary for refactoring data which otherwise falls through the cracks on schema change. I’d prefer an error-prone solution to the status quo which is no solution at all. You have to trust your developers.

Also, don’t tell us that a feature which other mainstream engines support, and which is helpful to thousands of developers around the world, isn’t feasible for hand-wavey reasons (see for reference: http://docs.unity3d.com/Manual/TextualSceneFormat.html).

This thread is long dead, I know, but I just have to chime in here and say that the binary uasset files are something we constantly struggle with. A big “agreed!” to everything everyone has said in favor of text files. We have to tread lightly because merge conflicts are nearly impossible to resolve (and even when you do resolve them, you always have this sinking feeling that you overlooked something). We are always wondering why files we didn’t change (or did we? who knows!) show that they have been changed. I’ve lost track of how many times an asset has somehow become corrupted and crashed the editor, so that the only solution is to throw away your changes and go back to an earlier version. And on and on.

I use the GUI-based diff tool inside the editor, and I’m grateful for it, but I don’t see how it could ever completely fill the need that people are talking about here. A great companion to what we really need, no doubt, but never a completely suitable alternative.

I understand that Epic isn’t planning on switching to text uasset files, so a most-of-the-way-there compromise would be if there was a tool that took a uasset file and spit out a decent text-based representation of it. If we had that tool (does it already exist somewhere? maybe it got created in the interim?) then we could write some scripts to satisfy many of the needs people are talking about in this thread.

Just 2 months ago I was told they made “a lot of progress” towards new text based assets, but requests for any further information were completely ignored like usual.

It’s an ongoing research project: when or if it will ever be ready for use is still unknown. If it reaches that point, we’ll definitely yell it from the rooftops.

Cheers,
Michael Noland