Textual file formats

They mentioned it in the last preview Twitch stream.

What’s holding it back so long? You already have a way to export assets as text.

Assets are not 100% reflection-based serialization; they contain custom-serialized data and bulk data. The output of the reflection-based serialization (e.g., T3D) is also not very nice to look at or work with, and can be fragile (e.g., if you were to edit one half of a reciprocal pointer while renaming an object but not the other half, you’re going to get an error or silent failure on import). There’s also load-time performance to consider, to ensure there is no unacceptable regression.
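To make the reciprocal-pointer fragility concrete, here is a minimal Python sketch of the kind of consistency check an importer could run over hand-edited text. The record layout (a dict mapping each object name to an `outer` and a `children` list) and the function name are assumptions made up for illustration; this is not the T3D format or any engine API.

```python
# Hypothetical text-asset model: each object names its parent ("outer")
# and each parent lists its children. Both halves of the link must agree.
def find_broken_links(objects):
    """Return a sorted list of problems; an empty list means consistent."""
    problems = []
    for name, obj in objects.items():
        outer = obj.get("outer")
        if outer is not None:
            if outer not in objects:
                problems.append(f"{name}: outer '{outer}' does not exist")
            elif name not in objects[outer].get("children", []):
                problems.append(f"{name}: not listed in children of '{outer}'")
        for child in obj.get("children", []):
            if child not in objects:
                problems.append(f"{name}: child '{child}' does not exist")
            elif objects[child].get("outer") != name:
                problems.append(f"{name}: child '{child}' has a different outer")
    return problems if not problems else sorted(problems)

# A rename applied to only one half of the link: the child became
# "DoorMesh", but the parent still lists the old name "Mesh".
half_renamed = {
    "Door": {"outer": None, "children": ["Mesh"]},
    "DoorMesh": {"outer": "Door", "children": []},
}
consistent = {
    "Door": {"outer": None, "children": ["DoorMesh"]},
    "DoorMesh": {"outer": "Door", "children": []},
}

for problem in find_broken_links(half_renamed):
    print(problem)
```

A check like this only turns a silent failure into a reported one; it does nothing to make the text itself easier to edit safely.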

There’s a big difference between a proof-of-concept that works for only fully reflected assets and a robust solution for all assets that is readable when diffing and has a decent chance of auto-merge working when users make non-conflicting edits.

Michael Noland

What if the bulk data was stored separately (in a .ubulk file next to the .uasset)? Merging a mesh or texture doesn’t make any sense so that’d be one problem solved.

Files that are ‘tied together’ by fiat are a nightmare when doing merges; it’d be entirely up to human effort to avoid doing ‘accept mine’ on the text file and ‘accept theirs’ on the binary file, or vice versa. Bulk data isn’t the only custom-serialized stuff though; UStaticMesh::Serialize is a good example of a bunch of custom-serialized properties that you probably do want in the mergeable file rather than in sidecars.
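As a rough illustration of the mismatch described above, the following Python sketch has the text half record a hash of its binary sidecar, so that a mixed "mine on one file, theirs on the other" resolution can at least be detected on load. The scheme, the `bulk_sha256` field, and the function names are all hypothetical; nothing here reflects how the engine actually stores assets.

```python
import hashlib

def sidecar_digest(bulk_bytes):
    """SHA-256 of the binary sidecar contents, as a hex string."""
    return hashlib.sha256(bulk_bytes).hexdigest()

def write_text_asset(properties, bulk_bytes):
    """Hypothetical: the text half stores the hash of its binary sidecar."""
    return {
        "properties": dict(properties),
        "bulk_sha256": sidecar_digest(bulk_bytes),
    }

def halves_match(text_asset, bulk_bytes):
    """True if the sidecar on disk is the one the text half was saved with."""
    return text_asset["bulk_sha256"] == sidecar_digest(bulk_bytes)

mine_bulk = b"mesh data, my version"
theirs_bulk = b"mesh data, their version"
mine_text = write_text_asset({"LODCount": 3}, mine_bulk)

print(halves_match(mine_text, mine_bulk))    # text and sidecar from the same save
print(halves_match(mine_text, theirs_bulk))  # text is mine, sidecar is theirs
```

This detects a bad pairing after the fact but does not prevent it, which is the human-effort problem the post is pointing at.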

Michael Noland

Seems like all of that custom serialization code has to be rewritten at some point to support this. The earlier the better!

Chiming in yet again to add more color. Although I love the idea of someday getting to the point where I can see uassets in a form that is actually understandable by humans (i.e. not only in a text format, but also formatted in such a way that you can make sense of it), I want to stress that just getting us to a text-based format during development would still be a huge, huge win even if the data is still too complex and cumbersome to grok completely. Please let me share two examples that I spent a bunch of time on this morning:

(1) During development, files are frequently marked as modified when nothing in them has actually changed (from a version control perspective at least). I have no doubt that the file’s last-mod timestamp has been tickled, and maybe the file was regenerated, but nothing /really/ changed, and so marking them as changed in version control muddies up the object histories with a bunch of false entries. Worse, when this happens, you never really know what to do, and that’s the frustrating thing. You can just go ahead and commit the change, but that seems like a terrible idea (because you’re committing a change without really knowing what the change is).

Worse, a common side effect of this is that it creates spurious merge conflicts - if I’m working on one asset, and another developer is working on another asset, then in theory we shouldn’t ever really run into conflicts. Granted, UE4 does all sorts of magic so maybe a direct comparison to editing text-based files (such as a .cpp file) is not fair, but it’s very common for completely spurious changes to happen too.

This morning I modified a BP base class and after saving my changes, it was marked as modified (as expected) but all of the subclasses were marked as modified too. Seemed odd, but maybe it was legit?

Before committing my changes, I updated from version control and it aborted with conflicts, because another member of the team had modified one of those subclasses. So I reverted my local copies of the subclasses, stashed my change to the base class, pulled in the subclass changes from the other developer, and then applied my change to the base class.

When I fired up UE4, everything was fine and dandy - both my changes and the other developer’s changes were present and happily coexisting. Yay!

What surprised me, though, was that even after recompiling the BPs, the subclassed BPs were not marked as modified in version control. While that much is expected - because I had not modified them - it seems to prove that them being marked as modified earlier was incorrect. Either they should not have been marked as modified in the first place when I changed the parent class, or they should also be marked as modified after re-applying my changes and after taking the other developer’s changes. But for them to be marked modified in one case and not the other doesn’t seem right (not to mention being incredibly frustrating).
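One way to picture what "nothing really changed" could mean for a text format is a content fingerprint: hash a canonical form of the asset and skip the write when the fingerprint is unchanged. The Python sketch below assumes an asset can be represented as a plain dict and uses JSON as a stand-in canonical form; both are assumptions for illustration, not how the editor decides what to save.

```python
import hashlib
import json

def content_fingerprint(asset):
    """Hash a canonical text form: key order and whitespace are normalized."""
    canonical = json.dumps(asset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def needs_write(asset_in_memory, fingerprint_on_disk):
    """Only touch the file when the canonical content actually differs."""
    return content_fingerprint(asset_in_memory) != fingerprint_on_disk

on_disk = {"parent": "BP_Base", "vars": {"Health": 100}}
# Regenerated after a parent recompile: same content, different key order.
resaved = {"vars": {"Health": 100}, "parent": "BP_Base"}
# A genuine edit.
edited = {"parent": "BP_Base", "vars": {"Health": 150}}

disk_fp = content_fingerprint(on_disk)
print(needs_write(resaved, disk_fp))
print(needs_write(edited, disk_fp))
```

With a stable canonical form, a regenerated-but-identical subclass would produce the same bytes and never show up as modified in version control.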

(2) It’s ok if 100% robust and automagic merging of changes into assets is a long way off or not entirely achievable, but in reality it’s actually quite rare for two people to be modifying the exact same portion of the exact same file, and so there should be many cases where independent and unrelated changes can be automatically mergeable, but currently aren’t.

For example, if someone adds a function to a BP I’m working on, I should be able to merge that in fairly easily. If someone renames something that is completely apart from anything I’m working on, that should merge in cleanly. Or if someone moves around some nodes without changing their connections. Ditto for editing the internals of a function. Or in the case where I’m trying to inspect a diff, even just being able to spot check the scope of the change would be a huge win.

I guess I’m saying that I can think of some really tricky scenarios that would be tough to support, but if UE4 just threw up its hands in those cases and said “I can’t handle this”, we’d be no worse off than we are today. But apart from that there would be many, many scenarios in which things like merging could work well and be done automatically, and those cases are ones that we run into every single day.
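The "merge what is independent, give up on the rest" behaviour can be sketched as a three-way merge keyed by entry name. The Python below treats a Blueprint as a dict of function name to body text, which is a deliberate simplification and purely an assumption; `merge3` and `MergeConflict` are hypothetical names, not engine or version-control APIs.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same entry in different ways."""

_MISSING = object()  # sentinel for "entry not present on this side"

def merge3(base, mine, theirs):
    """Three-way merge of dicts keyed by entry name (e.g. function name)."""
    merged = {}
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b = base.get(key, _MISSING)
        m = mine.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if m == t:        # both sides agree (including both deleting it)
            result = m
        elif m == b:      # only they touched it
            result = t
        elif t == b:      # only I touched it
            result = m
        else:             # both touched it differently: throw up our hands
            raise MergeConflict(key)
        if result is not _MISSING:
            merged[key] = result
    return merged

base = {"BeginPlay": "v1", "Fire": "v1"}
mine = {"BeginPlay": "v2", "Fire": "v1"}                   # I edited BeginPlay
theirs = {"BeginPlay": "v1", "Fire": "v1", "Reload": "v1"}  # they added Reload

merged = merge3(base, mine, theirs)
print(merged)
```

The conflict case leaves you exactly where you are today (someone has to pick a side), while the unrelated edits go through without anyone coordinating in chat.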

The current method of basically having to lock each file feels really archaic (our group chat history is full of “working on XYZ unless anyone objects” followed later by “done with XYZ” for any blueprint-related stuff, and then that sort of thing is completely absent for any C++ changes). Worse, no matter how diligent we are, it’s also often insufficient because of other files being marked as modified even though we didn’t modify them.

Anyway, like I said, I’m just trying to add more data points here. Binary, opaque BPs cause real issues when working with them, so I hope the idea of some sort of text-based alternative during development isn’t treated as something to maybe experiment with or an academic exercise just for kicks - UE4 desperately needs to become version-control friendly.

P.S. Don’t even get me started on how nice it would be to be able to use source control branches again! :)

Just to note, a data-based approach helps to mitigate this issue, leaving you with fewer changes to merge in binary form. The effects are even greater when coupled with dynamic actor assembly and component-based logic. Greatness does not come without costs, though.

I just want to chime in here as someone who’s been developing in C/C++ for way too many years but is new to Blueprint development. A text-based format would be a HUGE help in so many ways. Just being able to grep for things. Where’s that variable I added the other day? And of course proper git diffing and merging… but I just really want to be able to edit in a proper tool. Refactorings, copying blueprint code to a new project… all these things would be much simpler and more bulletproof with a text-based format.

Apologies for resurrecting this, but did anything come of this research project? All of the points raised in this thread are still valid, and Git has become more popular than ever in the time since this was posted. I’m probably overly optimistic, but would it be foolish to cross my fingers for some kind of text-based BP representation in UE5? A major new engine version would certainly make it easier to justify a big, potentially breaking change like this.

It’s an ongoing research project: when or if it will ever be ready for use is still unknown. If it reaches that point, we’ll definitely yell it from the rooftops.

Michael Noland

Hi Michael,

Bumping this pretty important thread again for 2021.

In this day and age of remote work, dealing with conflicts / merge hell is more prevalent than ever.

And although Perforce was the game development industry standard 15 years ago, GitHub is now overwhelmingly more popular than Perforce. It’s difficult to have companies switch over their toolchain workflows just for Perforce.

Any update would be appreciated!

~The programmers at Joydrop Ltd