Questions about the inner workings of package loading

I have an asset structure which includes texture assets as subobjects in a package. This structure is less optimal than it could be; however, it affords us convenience in editor and asset workflows, so I would like to attempt a cook-time optimization to de-duplicate objects in these asset types. The rough sketch in my head goes something like this:

* In ModifyCook() enumerate all assets of this type

* For each asset, reflect on the content hash of the texture subobjects and invent a stable path based on the hash

* Duplicate each subobject to that path if it does not yet exist, then update the original asset to point at the new location
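As a minimal sketch of the bookkeeping behind the steps above, here is the content-hash de-duplication logic in plain C++. All names here (`FakeTexture`, `FakeAsset`, `MakeStablePath`, the `/Game/DedupTextures/` path scheme) are hypothetical stand-ins, not UE API; in the real thing the hash would come from the texture's source data rather than `std::hash` over a payload string.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for UE types: a "texture" is just its pixel payload,
// and an "asset" holds paths to the textures it references.
struct FakeTexture { std::string Payload; };
struct FakeAsset  { std::vector<std::string> TextureRefs; };

// Derive a stable, content-addressed path from the texture payload.
// (In UE this hash would come from the texture's source data; std::hash is
// purely illustrative.)
std::string MakeStablePath(const FakeTexture& Tex) {
    return "/Game/DedupTextures/T_" + std::to_string(std::hash<std::string>{}(Tex.Payload));
}

// De-duplicate: place each subobject texture at its content-addressed path,
// creating it only if it does not already exist, and repoint the asset.
void DeduplicateTextures(FakeAsset& Asset,
                         const std::vector<FakeTexture>& Subobjects,
                         std::map<std::string, FakeTexture>& SharedTextures) {
    Asset.TextureRefs.clear();
    for (const FakeTexture& Tex : Subobjects) {
        const std::string Path = MakeStablePath(Tex);
        SharedTextures.emplace(Path, Tex);   // no-op if the path already exists
        Asset.TextureRefs.push_back(Path);   // asset now points at the shared copy
    }
}
```

Two assets carrying byte-identical textures then resolve to the same shared path, which is the whole point of the exercise.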

As a further improvement, it would be easy to clean up old, stale assets at the start. Another consideration is incremental cooks (which we don't currently do), but we could easily skip the cleanup step in that case.

But even if I were to do this, I'm not exactly sure whether it solves the problem entirely, and I wanted to ask some general questions about asset serialization to make sure I'm taking the right approach - or whether the idea of de-duping is just too hard…

Questions:

I want to make sure I understand what actually happens when a package loads. From reading the code, it seems that when a UPackage loads, all objects inside it are loaded as well - correct? What if the subobjects were only referenced from the root asset via soft object pointers? From what I can tell, this wouldn't matter either, since the objects are in the package metadata used to build the export map - correct? If I'm right on both counts, then effectively the only way to not load a UObject is for it to not exist in the UPackage.

At cook time, even if my root asset were no longer referencing the subobjects in its package, would those subobjects still exist in the UPackage and therefore load anyway? Ideally I'm trying to treat these subobjects as editor-only data - however, I don't think a concept of an "editor-only object" exists. I could forcibly delete the subobjects, but I'm a little worried about the implications of this being a non-read-only mutation of the data.

Is there a runtime solution where I can forcibly unload the subobjects after I detect that one with a duplicate hash has already been loaded? If I were to keep a manifest of unique textures (this is effectively what I'm doing, as these go into a texture collection for a new bindless pipeline I'm standing up), would that prevent the whole UPackage and all of its other textures from unloading, or can the subobjects unload and leave the package in a partially loaded state? This is a little inefficient either way, since we would fully load all the objects only to remove them immediately - I'd prefer they never load at all.

Even if all this winds up being too complicated and I have to move away from subobjects, I'm still interested in having these questions answered so I can develop a deeper understanding of the asset system.

Thank you

Steps to Reproduce
N/A

LoadPackage in the editor does load all objects in the package. I believe the same happens at runtime, but I would need to double-check.

Editor-Only objects in editor packages are removed when they are cooked.

SavePackage does a graph search of the UObjects in the package, starting from the package's root set - the objects in the package that are marked RF_Public - and following every UObject reference to an object in the same package. Objects that are not RF_Public and are no longer referenced at save time will not be included in the cooked package. (Exception: the root set also includes archetype objects, which is only applicable to Blueprint objects; subobjects of a ClassDefaultObject are archetype objects and are included in the package's root set even if the ClassDefaultObject does not reference them.)
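The graph search described above can be modeled in a few lines of plain C++. This is a sketch under stated assumptions, not SavePackage's actual implementation: `FakeObject` and `CollectExports` are made-up names, the RF_Public flag is reduced to a bool, and archetype handling is omitted.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of SavePackage's export harvesting: each object has an
// RF_Public-style flag and a list of in-package references.
struct FakeObject {
    bool bPublic = false;
    std::vector<std::string> Refs;  // names of referenced objects in the same package
};

// Graph search from the package's root set (the public objects), following
// references; anything unreached is dropped from the saved package.
std::set<std::string> CollectExports(const std::map<std::string, FakeObject>& Package) {
    std::set<std::string> Exports;
    std::vector<std::string> Stack;
    for (const auto& [Name, Obj] : Package)
        if (Obj.bPublic) Stack.push_back(Name);
    while (!Stack.empty()) {
        std::string Name = Stack.back();
        Stack.pop_back();
        if (!Exports.insert(Name).second) continue;  // already visited
        for (const std::string& Ref : Package.at(Name).Refs)
            Stack.push_back(Ref);
    }
    return Exports;
}
```

A non-public texture that the root asset no longer references is simply never reached, so it never makes it into the cooked package - which is what makes the "fix up references in PreSave" approach work.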

Creating objects at cook time has some caveats in 5.6 and earlier: you should create them during PreSave from another object in the same package, and when creating them you need to call BeginCacheForCookedPlatformData on them and then call IsCachedCookedPlatformDataLoaded on them until it returns true. Without this, the textures you create will be missing the derived data that the runtime uses - the data representing the platform-specific compressed version of the texture's texels. We hope to remove the need for that BeginCache/IsCached handling in 5.7 and do it automatically, but we haven't started working on it yet. Doing it in 5.6 and earlier is guaranteed to work (Landscape uses it for Nanite static meshes in maps; see ULandscapeNaniteComponent::InitializePlatformForLandscape) and will continue to work in 5.7 and beyond until and unless we deprecate it and provide an alternative (for example, there are ways to mark UObjects as do-not-save-when-cooking even if they are referenced).
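The BeginCache/IsCached contract above amounts to: kick off an asynchronous build of the platform-specific derived data, then poll until it reports ready. Here is that contract modeled in self-contained C++ - `FakeCookedTexture` and the tick counter are invented for illustration; the real calls are the UObject virtuals named above, which also take an ITargetPlatform parameter.

```cpp
#include <cassert>

// Hypothetical model of the BeginCacheForCookedPlatformData /
// IsCachedCookedPlatformDataLoaded contract.
struct FakeCookedTexture {
    int TicksRemaining = -1;        // -1: async build not yet started
    bool bDerivedDataReady = false;

    void BeginCacheForCookedPlatformData() { TicksRemaining = 3; }  // start async build

    bool IsCachedCookedPlatformDataLoaded() {
        if (TicksRemaining > 0) --TicksRemaining;  // build progresses between polls
        if (TicksRemaining == 0) bDerivedDataReady = true;
        return bDerivedDataReady;
    }
};

// Without this wait, a texture created during cook would be saved before its
// compressed platform data exists.
void CacheAndWait(FakeCookedTexture& Tex) {
    Tex.BeginCacheForCookedPlatformData();
    while (!Tex.IsCachedCookedPlatformDataLoaded()) {
        // In UE you would sleep or pump async tasks here rather than spin.
    }
}
```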

From your explanation of what you want, I believe running your texture update during PreSave of your outer UObject in the package will work as desired. You will remove references to the stale textures, create the new up-to-date textures, add references to them from your outer UObject, and call BeginCache/IsCached on them. The stale textures will not be present in the cooked package, and the new textures will be.
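Putting the suggested PreSave flow in order, as a toy model: every type and name here (`FakeOuterAsset`, `PreSaveForCook`, `BeginCacheAndWait`, the texture names) is hypothetical; it only shows the sequencing - drop stale references, create the up-to-date textures, cache their cooked platform data, and keep them referenced so the SavePackage graph search exports them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: the outer asset and the textures it owns.
struct FakeTexture {
    std::string Name;
    bool bCached = false;
    void BeginCacheAndWait() { bCached = true; }  // stands in for the BeginCache/IsCached loop
};

struct FakeOuterAsset {
    std::vector<FakeTexture> Textures;

    // Models the suggested PreSave flow.
    void PreSaveForCook(const std::vector<std::string>& UpToDateNames) {
        Textures.clear();                 // 1) drop references to stale textures
        for (const std::string& Name : UpToDateNames) {
            FakeTexture Tex{Name};        // 2) create the new texture
            Tex.BeginCacheAndWait();      // 3) cache its cooked platform data
            Textures.push_back(Tex);      // 4) keep it referenced so it gets exported
        }
    }
};
```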

If that doesn’t work, let me know and we can get into more detail about SavePackage. Whatever implementation details we have to handle, I think the best high-level approach is to have the package saved by the cooker contain only the up-to-date objects.

Thank you for the information, I’m going to try this approach with the information provided and will post an update with what I find.