Experimental WFH: Cloud DDC

Sep 17, 2020 | Knowledge

One of the major challenges of working over VPN is having to rebuild derived data after syncing new content (compiling shaders, compressing textures, and so on). When working on-site, we encourage the use of a shared derived-data cache (DDC) for storing this content: a network share that the editor can read from and write to in order to cache converted assets. See Derived Data Cache | Unreal Engine Documentation.

Access to the derived-data cache is typically highly latency-sensitive because it is on the critical path for many editor operations, and accessing a network share over VPN can cause large hitches and very slow startup times.

We experimented with a few ways to improve the situation when running the editor from home. First, we tried uploading individual files to a cloud storage provider. This performed worse than regenerating the data locally, due to the increased latency and the time spent establishing connections. Next, we tried archiving the shared DDC folder and putting it on a cloud-hosted drive. This performed poorly due to the sheer size of the DDC and its constant churn.

The solution we ultimately found success with is:

  1. We added a DDC backend that records any queries being made against the DDC to a plain text file.
  2. We added an automated test that loads a few commonly used maps in the editor and runs the play-in-editor command. The DDC recording backend mentioned above writes a text file to a network share.
  3. We added an AutomationTool command (UploadDDCToAWS), which scans all the text files, concatenates the derived data files into 100 MB bundles, and uploads them to Amazon S3. The script is run every 4 hours on our on-site build farm.
  4. We added a new DDC backend which can download and uncompress these bundles at editor startup time, removing the latency required to do so on demand. Even downloading tens of gigabytes over a home network connection provides reasonable performance, due to caching and the re-use of existing bundles between iterations of the UploadDDCToAWS command.
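The bundling idea in step 3 can be sketched roughly as follows. This is a minimal illustration in Python, not the logic of the actual UploadDDCToAWS command (which is written in C# and not reproduced in this article). The function name plan_bundles, the (path, size) input shape, and the simple first-fit grouping are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Target bundle size taken from the description above (about 100 MB).
BUNDLE_SIZE_LIMIT = 100 * 1024 * 1024


def plan_bundles(
    files: Iterable[Tuple[str, int]], limit: int = BUNDLE_SIZE_LIMIT
) -> List[List[str]]:
    """Group (path, size_in_bytes) pairs into bundles of at most `limit` bytes.

    Hypothetical helper: files are taken in order, and a file larger than
    `limit` ends up in a bundle of its own.
    """
    bundles: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        # Close the current bundle when adding this file would exceed the limit.
        if current and current_size + size > limit:
            bundles.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bundles.append(current)
    return bundles
```

Grouping many small cache files into a few large objects is what avoids the per-file connection overhead that made the first experiment slower than regenerating the data locally.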

1 - Setting Up the Access Recorder

To record DDC accesses, run the engine with a custom DDC backend configured in the game’s DefaultEngine.ini file. The configuration looks something like this:

[EnumerateForS3DDC]
MinimumDaysToKeepFile=7
Root=(Type=KeyLength, Length=120, Inner=AsyncPut)
AsyncPut=(Type=AsyncPut, Inner=Hierarchy)
Hierarchy=(Type=Hierarchical, Inner=Shared)
Shared=(Type=FileSystem, ReadOnly=false, Clean=false, Flush=false, DeleteUnused=true, UnusedFileAge=5, FoldersToClean=10, MaxFileChecksPerSec=1, Path=\path\to\your\regular\shared\ddc, EnvPathOverride=UE-SharedDataCachePath, WriteAccessLog="%GAMEDIR%Saved/Logs/DDCAccessLog.txt")

Note: The Shared node in the graph contains a Path parameter pointing to your normally configured network DDC, and a WriteAccessLog parameter specifying where to write the access log.

Support for the WriteAccessLog attribute was added in CL 12166433.

To enable this DDC backend when running the editor, pass the -DDC=EnumerateForS3DDC parameter on the command line.

2 - Automated Test

Implementation of this process is game-specific, and we do not currently have an example available. You could have someone periodically generate this data manually if necessary, as long as they run the editor with the recording backend enabled.

3 - Uploading Data to S3

The UploadDDCToAWS command is implemented in //UE4/Main/Engine/Source/Programs/AutomationTool/Scripts/UploadDDCToAWS.cs (see CL 12626541), and is typically run using the RunUAT batch file. The command takes the following arguments:

-Bucket=… Specifies the name of the S3 bucket to upload to

-CredentialsFile=… Specifies the path to a configuration file containing S3 credentials. See here for the format of this file.

-CredentialsKey=… Specifies the section name within the credentials file to take credentials from.

-CacheDir=… Path to the shared network DDC. Paths from the recorded DDC access logs will be resolved against this base directory.

-FilterDir=… Path to the directory containing recorded access logs.

-Days=… Number of days to retain files in FilterDir. Any files older than this will be removed.

-Manifest=… Specifies the path within the current workspace to store the URLs to the root manifest. This file will be checked into P4. See notes below about how bundle URLs, manifest URLs, and the root manifest interact.

-KeyPrefix=… Object prefix to use for everything uploaded to a bucket.

-Reset If set, existing bundles will not be reused.
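As a rough illustration of how these arguments fit together, the sketch below assembles a RunUAT command line in Python, for example for use from a build-farm script. The flag names come from the list above. The helper name build_upload_command, the choice of which arguments to treat as optional, and every value shown in the usage note are assumptions made for the example.

```python
from typing import List, Optional


def build_upload_command(
    bucket: str,
    credentials_file: str,
    credentials_key: str,
    cache_dir: str,
    filter_dir: str,
    days: int,
    manifest: str,
    key_prefix: Optional[str] = None,
    reset: bool = False,
    run_uat: str = "RunUAT.bat",
) -> List[str]:
    """Build the argument list for the UploadDDCToAWS AutomationTool command.

    Hypothetical helper. Treating -KeyPrefix and -Reset as optional is an
    assumption; the article does not say which arguments are required.
    """
    args = [
        run_uat,
        "UploadDDCToAWS",
        f"-Bucket={bucket}",
        f"-CredentialsFile={credentials_file}",
        f"-CredentialsKey={credentials_key}",
        f"-CacheDir={cache_dir}",
        f"-FilterDir={filter_dir}",
        f"-Days={days}",
        f"-Manifest={manifest}",
    ]
    if key_prefix is not None:
        args.append(f"-KeyPrefix={key_prefix}")
    if reset:
        # -Reset is a bare switch: existing bundles will not be reused.
        args.append("-Reset")
    return args
```

The returned list can be passed to subprocess.run(cmd, check=True). A call might look like build_upload_command("my-ddc-bucket", "aws-credentials.txt", "default", r"\\server\ddc", r"\\server\ddc-logs", 7, "Build/S3DDC.json"), where the bucket name, file names, and share paths are placeholders rather than values from this article.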

4 - Downloading Data at Runtime

Enabling the download of the data from S3 requires a new DDC graph to be configured, or modification of the default one.

To modify the default configuration to use the S3 backend rather than a network share, add a section as follows to your game’s DefaultEngine.ini file:

Shared=(Type=S3, Manifest="%GAMEDIR%Build/S3DDC.json", BaseUrl="https://foo.s3.us-east-1.amazonaws.com/", Region="us-east-1", AccessKey="abc123", SecretKey="def465")

Here, the Manifest argument specifies the path to the root manifest submitted to source control by the UploadDDCToAWS command, BaseUrl/Region specify the S3 bucket to download from, and AccessKey/SecretKey specify the credentials to download with.

The S3 derived data backend is implemented in //UE4/Main/Engine/Source/Developer/DerivedDataCache/Private/S3DerivedDataBackend.cpp and S3DerivedDataBackend.h.

The following changelists are required:

12149604 12149624 12155157 12156082 12158937 12195879 12459547 12468805

An option to disable the S3 DDC can be shown in the editor preferences panel by adding the following setting to DefaultEditor.ini:

[EditorSettings]
bShowEnableS3DDC=true

Some Implementation Details

  • Text files generated by the recorder (1) are removed from the network share by the UploadDDCToAWS command (3) after 7 days.
  • The S3 backend (4) is configured with a secret key and access key, but these credentials have to be submitted to source control. To improve security, bundles are given a random, unguessable URL and are indexed by a manifest, which is also uploaded with an unguessable URL. Paths to the last few days of manifests are stored in a configuration file in Perforce, and older paths are deleted. The bucket is configured to deny LIST requests, so even with the access key, secret key, and manifest URL, it is only possible to access the latest bundle data, and only for a few days.
  • Bundles are retained in the active manifest as long as at least 40% of their data is still referenced. Once a bundle is discarded, the data within it can be added to new bundles.
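The last two points above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code in UploadDDCToAWS.cs: the names should_retain and random_object_key are hypothetical, "data" is assumed to be measured in bytes, and the key length is an arbitrary choice.

```python
import secrets
from typing import Dict, Set

# Retention threshold from the description above: keep a bundle while at
# least 40% of its data is still referenced.
RETAIN_THRESHOLD = 0.40


def should_retain(
    bundle_entries: Dict[str, int],
    referenced: Set[str],
    threshold: float = RETAIN_THRESHOLD,
) -> bool:
    """Return True if enough of the bundle is still referenced.

    `bundle_entries` maps a cache key to its size in bytes (assumed unit);
    `referenced` is the set of keys seen in the recent access logs.
    """
    total = sum(bundle_entries.values())
    if total == 0:
        return False
    live = sum(size for key, size in bundle_entries.items() if key in referenced)
    return live / total >= threshold


def random_object_key(prefix: str = "") -> str:
    """Generate an unguessable object key using a cryptographic RNG."""
    # 32 random bytes, encoded as URL-safe text.
    return prefix + secrets.token_urlsafe(32)
```

Keeping mostly-live bundles in the manifest means users who already downloaded them do not have to fetch the same data again after each upload run, at the cost of carrying some unreferenced data until the bundle falls below the threshold.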

List of all CLs required to implement the Cloud DDC:

12149604 12149624 12155157 12156082 12158937 12166433 12195879 12459547 12468805 12626541

List of GitHub Commits for users who can’t access Perforce:

https://github.com/EpicGames/UnrealEngine/commit/7d6083104f873abac544d0a9b04ce4dffb9135bd

https://github.com/EpicGames/UnrealEngine/commit/b5fdbe9e35a91d2fa6b7e95603e2df15b9fc9dd4

https://github.com/EpicGames/UnrealEngine/commit/5393bfd155be9f81f337aa8dfeb46c7140c2d06e

https://github.com/EpicGames/UnrealEngine/commit/ff74b40543ac82cf110d7c8b1f84689947691d30

https://github.com/EpicGames/UnrealEngine/commit/4fbcf75b6e59670b4b184b1a48bbde7ea84e11aa

https://github.com/EpicGames/UnrealEngine/commit/31934e0cc13f020f974b1f6326a3f58a85bc24ad

https://github.com/EpicGames/UnrealEngine/commit/8b87619ad5134b7a094b4131829a7d55cf49457f

https://github.com/EpicGames/UnrealEngine/commit/604cd089682edb66ebf1bef76cea9134f54ea893

https://github.com/EpicGames/UnrealEngine/commit/e14b92bc18334ba2671969387f5c305c8f8985d2

https://github.com/EpicGames/UnrealEngine/commit/6f6fe556a9348d7426cfa3e161b61ea6fdc47623