Size of a map

This question was created in reference to: [Audit Staged compressed [Content removed]

Hi,

I am trying to figure out how much a level/map actually costs on disk.

The Size Map tool does not show assets that we reference through our own systems, such as things placed via PCG and the like.

I am also running this:

UnrealPak.exe "\ProjectName-Windows.ucas" -List -csv=C:\output.csv

to generate a listing of what is in the packaged game. But there is no easy way to know the correct size of my map with all its dependencies. I know there are assets spawned by our own systems on the map, like a tree placed with PCG, but inside the CSV there isn't really a combined size for the map.

I also see all the _generated files, and I assume they are part of the size of a map.

What is in the .ubulk file, and what makes it larger or smaller?

So, are there any tools or command lines that allow me to collect everything in a level and calculate the actual disk size?

The reason is that we would like to understand how big a map is and then what is shared with other maps through PCG and other systems. From there we can extrapolate how many maps we can realistically create before we blow the arbitrary budget we have set for ourselves, so that the build doesn't force people to buy new hard drives… :slight_smile:

[Attachment Removed]


Hi [mention removed],

I have not been able to find a built-in tool that gives you the size of the map and all of its dependencies, but you can create your own tool for it. The AssetRegistry is the class you are looking for, as it lets you extract all the dependencies and all the assets referenced directly by the map, so you can build out the full dependency graph of the opened map. Since you already have the CSV file of the final .ucas, you can then parse that file, gather the sizes of the packages related to the map, and sum them all together.

With the following code you can get the dependencies of the package:

// Requires the AssetRegistry module, e.g.:
// #include "AssetRegistry/AssetRegistryModule.h"
// #include "AssetRegistry/IAssetRegistry.h"
static void GatherDependenciesRecursive(const FName& RootPackage, TSet<FName>& OutPackages)
{
    IAssetRegistry& AssetRegistry =
        FModuleManager::LoadModuleChecked<FAssetRegistryModule>("AssetRegistry").Get();
    
    // Iterative depth-first walk over the package dependency graph.
    TArray<FName> Stack;
    Stack.Add(RootPackage);
 
    while (!Stack.IsEmpty())
    {
        FName Current = Stack.Pop();
 
        if (OutPackages.Contains(Current))
            continue;
 
        OutPackages.Add(Current);
 
        // Only package-level dependencies matter for the disk footprint.
        TArray<FName> Dependencies;
        AssetRegistry.GetDependencies(
            Current,
            Dependencies,
            UE::AssetRegistry::EDependencyCategory::Package);
 
        for (const FName& Dep : Dependencies)
        {
            if (!OutPackages.Contains(Dep))
                Stack.Add(Dep);
        }
    }
}

The main idea is that you call this function with the package name of the level you want to inspect, so it gathers every package the map depends on. The next function then goes through the CSV file, matches the packages referenced by the map, and sums their sizes. By passing the full path of the CSV file, you should get the total size referenced by the map and its assets, and you can keep expanding the function for any extra functionality you need, since you have all the references of the map. Of course, these functions need to run in an editor environment.
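
One note before the code: EstimateCookedMapSizeFromCsv calls a helper named PackageNameToContentRelative that I haven't shown explicitly. A minimal sketch of it, assuming only /Game/ content matters for your budget, could be:

static bool PackageNameToContentRelative(const FString& LongPackageName, FString& OutRelPath)
{
    // Maps a long package name such as /Game/Maps/MyMap to the Content-relative
    // path Content/Maps/MyMap used for matching against the CSV rows below.
    const FString GameRoot = TEXT("/Game/");
    if (!LongPackageName.StartsWith(GameRoot))
    {
        // Engine/plugin packages are skipped in this sketch.
        return false;
    }

    OutRelPath = TEXT("Content/") + LongPackageName.RightChop(GameRoot.Len());
    return true;
}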

// Strips surrounding whitespace and optional quotes from a CSV field.
static FString TrimCsvField(const FString& In)
{
    FString Out = In;
    Out = Out.TrimStartAndEnd();
    if (Out.StartsWith(TEXT("\"")) && Out.EndsWith(TEXT("\"")) && Out.Len() >= 2)
    {
        Out = Out.Mid(1, Out.Len() - 2);
    }
    return Out;
}
 
// Editor-only. Typical includes: Editor.h, Misc/FileHelper.h, Misc/Paths.h
bool UMyOwnBlueprintFunctionLibrary::EstimateCookedMapSizeFromCsv(const FString& CsvFilePath)
{
    if (!GEditor)
        return false;
 
    UWorld* EditorWorld = GEditor->GetEditorWorldContext().World();
    if (!EditorWorld)
        return false;
 
    UPackage* MapPackage = EditorWorld->GetOutermost();
    if (!MapPackage)
        return false;
    
    const FName MapPackageName = MapPackage->GetFName();
 
    
    // Gather the map package plus everything it (transitively) depends on.
    TSet<FName> AllPackages;
    GatherDependenciesRecursive(MapPackageName, AllPackages);
    
    UE_LOG(LogTemp, Log, TEXT("Dependent packages: %d"), AllPackages.Num());
 
    if (AllPackages.Num() == 0)
    {
        UE_LOG(LogTemp, Warning, TEXT("No dependencies found for map."));
        return false;
    }
 
    // Convert long package names (/Game/...) into Content-relative paths for CSV matching.
    TSet<FString> RelevantRelPaths;
    for (const FName& PkgName : AllPackages)
    {
        const FString LongName = PkgName.ToString();
        FString RelPath;
        if (PackageNameToContentRelative(LongName, RelPath))
        {
            RelevantRelPaths.Add(RelPath);
        }
    }
 
    if (RelevantRelPaths.Num() == 0)
        return false;
    
 
    TArray<FString> Lines;
    if (!FFileHelper::LoadFileToStringArray(Lines, *CsvFilePath) || Lines.Num() < 2)
    {
        UE_LOG(LogTemp, Error, TEXT("Failed to read CSV: %s"), *CsvFilePath);
        return false;
    }
 
    TArray<FString> HeaderFields;
    Lines[0].ParseIntoArray(HeaderFields, TEXT(","), true);
 
    int32 FilenameCol = -1;
    int32 SizeCol = -1;
 
    for (int32 i = 0; i < HeaderFields.Num(); ++i)
    {
        const FString Header = HeaderFields[i].TrimStartAndEnd();
        if (Header.Equals(TEXT("Filename"), ESearchCase::IgnoreCase))
        {
            FilenameCol = i;
        }
        else if (Header.Equals(TEXT("CompressedSize"), ESearchCase::IgnoreCase) ||
                 Header.Equals(TEXT("Size"), ESearchCase::IgnoreCase) ||
                 Header.Equals(TEXT("FileSize"), ESearchCase::IgnoreCase))
        {
            if (SizeCol == -1)
            {
                SizeCol = i;
            }
        }
    }
 
    if (FilenameCol == -1 || SizeCol == -1)
    {
        UE_LOG(LogTemp, Error,
            TEXT("Could not find 'Filename' and size column in CSV header. (%s)"),
            *CsvFilePath);
        return false;
    }
 
    // Sum the size of every CSV row whose base path matches one of the map's packages.
    int64 TotalBytes = 0;
 
    for (int32 LineId = 1; LineId < Lines.Num(); ++LineId)
    {
        const FString& Line = Lines[LineId];
        if (Line.IsEmpty())
        {
            continue;
        }
 
        TArray<FString> Fields;
        Line.ParseIntoArray(Fields, TEXT(","), true);
 
        if (Fields.Num() <= FMath::Max(FilenameCol, SizeCol))
        {
            continue;
        }
 
        FString FilenameField = TrimCsvField(Fields[FilenameCol]);
        FString SizeField     = TrimCsvField(Fields[SizeCol]);
 
        if (FilenameField.IsEmpty() || SizeField.IsEmpty())
        {
            continue;
        }
 
        int64 ThisSize = 0;
        {
            TCHAR* End = nullptr;
            const int64 Parsed = FCString::Strtoi64(*SizeField, &End, 10);
            if (End == *SizeField)
            {
                continue;
            }
            ThisSize = Parsed;
        }
 
        int32 ContentIdx = FilenameField.Find(TEXT("Content/"), ESearchCase::IgnoreCase, ESearchDir::FromStart);
        if (ContentIdx == INDEX_NONE)
        {
            continue; 
        }
 
        FString RelWithExt = FilenameField.Mid(ContentIdx);        
        FString RelBase    = FPaths::GetPath(RelWithExt) + TEXT("/") + FPaths::GetBaseFilename(RelWithExt);
 
        if (RelevantRelPaths.Contains(RelBase))
        {
            TotalBytes += ThisSize;
        }
    }
    
    UE_LOG(LogTemp, Warning,
        TEXT("Map '%s' cooked size (from CSV) = %lld bytes (%.2f MB)"),
        *MapPackageName.ToString(),
        TotalBytes,
        (float)TotalBytes / (1024.f * 1024.f));
 
    return true;
}
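
For completeness, a possible header declaration, assuming you expose this through your own editor-only function library (the class and file names below just match the implementation above and are otherwise up to you), would be:

// Hypothetical header for the function above; adjust the file name and module
// to your project's conventions.
#include "Kismet/BlueprintFunctionLibrary.h"
#include "MyOwnBlueprintFunctionLibrary.generated.h"

UCLASS()
class UMyOwnBlueprintFunctionLibrary : public UBlueprintFunctionLibrary
{
    GENERATED_BODY()

public:
    // Sums the cooked size of the currently open map and its dependencies,
    // using a CSV produced by: UnrealPak.exe <container> -List -csv=<path>
    UFUNCTION(BlueprintCallable, Category = "Audit")
    static bool EstimateCookedMapSizeFromCsv(const FString& CsvFilePath);
};

Declared like this, you can call it from an Editor Utility Widget or Blueprint after generating the CSV.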

Let me know if it did the job.

Best,

Joan

[Attachment Removed]

Hi Joan

Thanks for the answer. I tested the code.

But the _generated files are not part of this, and neither is the .ubulk, so the result is not correct.

I would also like to understand which of all these assets are actually shared. Since we are making several levels with shared PCG content, it is not correct to count those assets for every level, as they are shared assets.

We really would like to know the size of a level without the shared assets that we spawn through PCG.

Any more good ideas?

[Attachment Removed]

Hi Joan,

“the _generated files are not directly extracted or listed in the CSV output.”

The _generated files ARE in the CSV, but the code you provided ignored them, and it ignored the .ubulk as well.

So the sizes from the code you provided can’t be correct.

Maybe you could also explain what is in the _generated files and what is in the .ubulk file?

kind regards

//

Martin

[Attachment Removed]

Hi Martin,

I’ll take a look at the function again. Adding the size of the _generated files to the map’s disk cost shouldn’t take much work. The _generated and .ubulk files contain information the engine needs in the build so that systems like World Partition or mesh instancing work correctly. The .ubulk files hold large chunks of binary data for game assets, and they go hand-in-hand with the .uexp and .uasset files, which provide the indexing for fast data access. Overall, this improves loading performance and content streaming for assets that are loaded at higher or lower resolution depending on the game’s needs.

For each asset, the .ubulk file contains the data that asset needs, but it does not store the same kind of information for every asset type. If you would like to debug what ends up in an asset, look at the serialization process of each asset class: check what gets saved and, most importantly, look for any cooking-specific branches, which are usually found in the Serialize function. For example, TArray has a BulkSerialize function that bulk-serializes structs or raw data directly into the final .ubulk data.

So, it really depends on the asset. With textures, for example, you can see that almost all the size is in the .ubulk file. Reducing the resolution or the number of mipmaps of the texture will directly reduce the size of this .ubulk file.
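
To make that concrete, here is a purely illustrative sketch (the class and member names are invented, not engine types) of how a large payload is routed through the bulk-data system in Serialize:

// Purely illustrative: UMyBulkDataAsset and RawPayload are made-up names.
// Typical includes: UObject/Object.h, Serialization/BulkData.h
UCLASS()
class UMyBulkDataAsset : public UObject
{
    GENERATED_BODY()

public:
    // Large binary payload stored through the bulk-data system.
    FByteBulkData RawPayload;

    virtual void Serialize(FArchive& Ar) override
    {
        Super::Serialize(Ar);

        // When the package is saved/cooked, this payload goes through the bulk-data
        // path; bulk-data flags (e.g. BULKDATA_PayloadInSeperateFile) decide whether
        // it ends up in the .ubulk file instead of inline with the export data.
        RawPayload.Serialize(Ar, this);
    }
};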

_generated files are primarily for instancing data and World Partition. They contain the information the engine needs to stream the instances and to handle each instance itself.

Some teammates told me that what they normally do to detect the raw disk size of the map is just to build that single map. Of course, you should ensure that all assets are correctly added to the build, but this method should give you the raw disk size.

I will take a look at the function again, as it would be useful to have a function like this in the engine. I will update you shortly.

Best,

Joan

[Attachment Removed]

“Some teammates told me that what they normally do to detect the raw disk size of the map is just to build that single map. Of course, you should ensure that all assets are correctly added to the build, but this method should give you the raw disk size.”

I am building and checking. The question is, as stated in the first post: “we would like to understand how big a map is and then what is shared with other maps through PCG and other systems”.

Or is it really the case that counting the _generated and .ubulk files gives the size of the level without shared assets?

kind regards

//

Martin

[Attachment Removed]

Replying so this ticket isn’t closed. Waiting for an update from Joan.

Kind regards

//

Martin

[Attachment Removed]

Ok, thanks for your help

Kind regards

//

Martin

[Attachment Removed]

Hi Martin,

Sorry for the delay, I’ve been out for a few days.

You are right: the _generated files are not directly extracted or listed in the CSV output. Normally, a map loads content based on how assets are referenced. Assets placed in the map and their hard references are automatically loaded by the engine when the map loads, along with any additional assets they reference.

PCG content is made up of the meshes and textures that are used, plus small instance-related data (such as transforms or metadata). The heavy data (vertex/index buffers, Nanite data, texture mips, etc.) comes from the underlying assets themselves and isn’t duplicated per instance.

I do think the Asset Registry should expose this information more clearly, but it’s not displayed by default. Building a tool to calculate the “total disk cost” for a map shouldn’t be too much work. I’ll do some research and check with the team to see if an existing tool or command already supports this. I’ll follow up shortly.

Best,

Joan

[Attachment Removed]

Hi Martin,

Yes, these files make up most of the map’s size. I’m still looking into the actual cooker to see which data is considered part of the map, so that we don’t miss any information.

There are also external tools like FModel that let you inspect the contents of .pak files, .utoc/.ucas containers, and more, so you can quickly check what is inside your maps. It is essentially a visual, easier-to-browse front end over information Unreal already exposes, such as the AssetRegistry. I would recommend taking a look at it, as it might also help you reduce the size of your project.

Those are the main files that make up the final size of the map, but I will keep looking into the cooker to give you a complete answer, as it does not look like there is anything in the engine that already does this.

I will keep you updated.

Best,

Joan

[Attachment Removed]

Hi [Content removed]

Sorry for the late reply. I’ve just returned from vacation. I hope you had great holidays as well.

After reviewing the whole case again, I think the best solution, since there is nothing built in that already solves this, is to cook the content of individual maps.

Using the Project Launcher, you can create separate profiles for each map or group of maps you want to test. Once you cook a specific level, you can then inspect the resulting .pak file directly. As you already know, you can list the contents of the .pak to see what was included.

[Image Removed]

Alternatively, you can extract the .pak file, which will give you a folder structure that mirrors how the assets are stored inside the Unreal project. This usually makes it much easier to see which individual assets are present and how much disk space each one occupies.

// List all the files
UnrealPak.exe "YourPathPak.pak" -List
UnrealPak.exe "YourPathPak.ucas" -List

// Export the content to a folder
UnrealPak.exe "YourPathPak.pak" -Extract "/FolderToExtract/"
UnrealPak.exe "YourPathPak.ucas" -Extract "/FolderToExtract/"

In practice, extracting the .pak and analyzing the files directly is simpler and more reliable than relying on Asset Registry queries in the editor, as those depend on what assets are loaded and may produce inconsistent results.

The next step would be to create a small script that reads the extracted data and sorts it by disk size. This can easily be done with a simple Python script, although you could also use Unreal’s Python API if you prefer to keep everything inside the engine.
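
If you prefer to keep it in C++ like the earlier snippets, a minimal sketch using Unreal's file utilities could look like this (the directory path and entry count are placeholders, and the function name is just for illustration):

#include "CoreMinimal.h"
#include "HAL/FileManager.h"

// Hedged sketch: sums and ranks the files in an extracted pak folder by size.
static void LogLargestExtractedFiles(const FString& ExtractedDir, int32 MaxEntries = 50)
{
    TArray<FString> Files;
    IFileManager::Get().FindFilesRecursive(Files, *ExtractedDir, TEXT("*.*"), /*Files*/ true, /*Directories*/ false);

    // Pair each file with its size on disk.
    TArray<TPair<FString, int64>> Sizes;
    int64 TotalBytes = 0;
    for (const FString& File : Files)
    {
        const int64 Size = IFileManager::Get().FileSize(*File);
        if (Size > 0)
        {
            Sizes.Emplace(File, Size);
            TotalBytes += Size;
        }
    }

    // Largest files first.
    Sizes.Sort([](const TPair<FString, int64>& A, const TPair<FString, int64>& B)
    {
        return A.Value > B.Value;
    });

    UE_LOG(LogTemp, Log, TEXT("Total: %.2f MB across %d files"), TotalBytes / (1024.0 * 1024.0), Sizes.Num());
    for (int32 i = 0; i < FMath::Min(MaxEntries, Sizes.Num()); ++i)
    {
        UE_LOG(LogTemp, Log, TEXT("%10.2f MB  %s"), Sizes[i].Value / (1024.0 * 1024.0), *Sizes[i].Key);
    }
}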

As a short recap of some of the concepts we discussed over the last few days:

  • Bulk files: These contain the raw binary data associated with a .uasset. Most asset classes that store bulk data use TBulkData internally, with FByteBulkData being one of the most commonly used types. This is where the cooked payload of the asset lives. For example, texture mipmaps or higher texture resolutions directly increase the size of the bulk data.
  • _generated: This contains data required by World Partition for instances placed in the world. It includes the information the engine needs to render, transform, and stream those instances at runtime.

If you want to better understand how this data is generated, you can look at the Serialize() function of the asset type you’re interested in. On the cooking side, UCookOnTheFlyServer::TickMainCookLoop() is a good entry point to see how assets are loaded, processed, and saved during the cook.

Overall, I believe this is the most reliable approach. All the information you need is already present in the cooked .pak output, and by cooking maps individually you can isolate and measure the disk size contribution of each level. I would personally recommend creating individual Project Launcher profiles so they can be reused in the future.

Best regards,

Joan

[Attachment Removed]

Happy to help Martin.

I’ll close the case for now, feel free to open it again if there is any reason related to this same topic.

Kind Regards,

Joan

[Attachment Removed]