Sep 17, 2020.Knowledge
One of the major challenges of working over VPN is having to rebuild derived data after syncing new content (compiling shaders, compressing textures, and so on). When working on-site, we encourage the use of a shared derived-data cache for storing this content: a network share that the editor can read from and write to in order to cache converted assets (see Derived Data Cache | Unreal Engine Documentation).
Access to the derived-data cache is typically highly latency sensitive due to being on the critical path for many editor operations, and accessing a network share over VPN can cause large hitches and very slow startup time.
We experimented with a few ways to improve the situation when running the editor from home. First, we tried uploading individual files to a cloud storage provider; this performed worse than regenerating the data locally, due to the increased latency and the time spent establishing connections. Next, we tried archiving the shared DDC folder and putting it on a cloud-hosted drive; this performed poorly due to the sheer size of the DDC and its constant churn.
The solution we ultimately found success with is:
1. We added a DDC backend that records any queries made against the DDC to a plain text file.
2. We added an automated test that loads a few commonly used maps in the editor and runs the play-in-editor command. The DDC recording backend from step 1 writes its text file to a network share.
3. We added an AutomationTool command (UploadDDCToAWS), which scans all the text files, concatenates the referenced derived data files into 100 MB bundles, and uploads them to Amazon S3. The script runs every 4 hours on our on-site build farm.
4. We added a new DDC backend which downloads and uncompresses these bundles at editor startup, removing the latency of fetching the data on demand. Even downloading tens of gigabytes over a home network connection provides reasonable performance, due to caching and the re-use of existing bundles between runs of the UploadDDCToAWS command.
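The bundling step performed by UploadDDCToAWS can be pictured roughly as follows. This is only a sketch under stated assumptions, not the actual C# implementation: it assumes each access log holds one cache-relative path per line (the real log format may differ), and the names `collect_paths` and `plan_bundles` are hypothetical.

```python
from typing import Dict, Iterable, List

BUNDLE_LIMIT = 100 * 1024 * 1024  # roughly the 100 MB bundle size described above


def collect_paths(log_texts: Iterable[str]) -> List[str]:
    """Return unique paths from access logs, preserving first-seen order.

    Assumes one cache-relative path per line (hypothetical log format).
    """
    seen = set()
    ordered = []
    for text in log_texts:
        for line in text.splitlines():
            path = line.strip()
            if path and path not in seen:
                seen.add(path)
                ordered.append(path)
    return ordered


def plan_bundles(sizes: Dict[str, int], paths: List[str],
                 limit: int = BUNDLE_LIMIT) -> List[List[str]]:
    """Greedily group paths into bundles whose total size does not exceed `limit`.

    A single file larger than `limit` ends up in a bundle of its own.
    Paths that are missing from `sizes` (not present in the cache) are skipped.
    """
    bundles = []
    current = []
    current_size = 0
    for path in paths:
        size = sizes.get(path)
        if size is None:
            continue
        if current and current_size + size > limit:
            bundles.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        bundles.append(current)
    return bundles
```

Deduplicating paths before packing matters because many recorded sessions will query the same derived data; only the union of the queries needs to be uploaded.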
DDC queries are recorded by running the engine with a custom DDC backend graph, configured in the game’s DefaultEngine.ini file, which looks something like this:
[EnumerateForS3DDC]
MinimumDaysToKeepFile=7
Root=(Type=KeyLength, Length=120, Inner=AsyncPut)
AsyncPut=(Type=AsyncPut, Inner=Hierarchy)
Hierarchy=(Type=Hierarchical, Inner=Shared)
Shared=(Type=FileSystem, ReadOnly=false, Clean=false, Flush=false, DeleteUnused=true, UnusedFileAge=5, FoldersToClean=10, MaxFileChecksPerSec=1, Path=\path\to\your\regular\shared\ddc, EnvPathOverride=UE-SharedDataCachePath, WriteAccessLog="%GAMEDIR%Saved/Logs/DDCAccessLog.txt")
Note: The Shared node in the graph contains a Path parameter pointing to your normally configured network DDC, and a WriteAccessLog parameter specifying where to output the access log to.
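To make the shape of this graph easier to see, the sketch below parses a shortened copy of the section and follows the Inner references from Root down to Shared. It is an illustration only: the parser is a naive one written for this example (it is not how the engine reads the config), it would not handle values containing commas or nodes with several Inner entries, and the Shared node is abbreviated.

```python
import re
from typing import Dict, List

# Shortened copy of the graph above; the Shared node is abbreviated.
GRAPH = """[EnumerateForS3DDC]
MinimumDaysToKeepFile=7
Root=(Type=KeyLength, Length=120, Inner=AsyncPut)
AsyncPut=(Type=AsyncPut, Inner=Hierarchy)
Hierarchy=(Type=Hierarchical, Inner=Shared)
Shared=(Type=FileSystem, ReadOnly=false, WriteAccessLog="%GAMEDIR%Saved/Logs/DDCAccessLog.txt")
"""

# Matches lines of the form Name=(key=value, key=value, ...)
NODE_RE = re.compile(r"^(\w+)=\((.*)\)$")


def parse_nodes(text: str) -> Dict[str, Dict[str, str]]:
    """Parse node lines into {node name: {parameter: value}}."""
    nodes = {}
    for line in text.splitlines():
        match = NODE_RE.match(line.strip())
        if not match:
            continue  # section headers and plain settings are ignored
        name, body = match.groups()
        params = {}
        for part in body.split(","):
            key, _, value = part.strip().partition("=")
            params[key] = value.strip('"')
        nodes[name] = params
    return nodes


def walk_chain(nodes: Dict[str, Dict[str, str]], start: str = "Root") -> List[str]:
    """Follow Inner references from `start`, stopping at a leaf or a cycle."""
    chain = []
    name = start
    while name in nodes and name not in chain:
        chain.append(name)
        name = nodes[name].get("Inner")
    return chain


nodes = parse_nodes(GRAPH)
print(" -> ".join(walk_chain(nodes)))
print("access log:", nodes["Shared"]["WriteAccessLog"])
```

Every request passes through the wrapper nodes before it reaches the file system backend, which is why the access log is configured on the Shared node at the end of the chain.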
Support for the WriteAccessLog attribute was added in CL 12166433.
To enable this DDC backend when running the editor, pass the -DDC=EnumerateForS3DDC parameter on the command line.
Implementation of the automated test that drives this recording is game-specific, and we do not currently have an example available. If necessary, someone could generate this data manually on a regular basis, as long as they run the editor with the recording backend enabled.
The UploadDDCToAWS command is implemented in //UE4/Main/Engine/Source/Programs/AutomationTool/Scripts/UploadDDCToAWS.cs (see CL 12626541), and is typically run using the RunUAT batch file. The command takes the following arguments:
-Bucket=… Specifies the name of the S3 bucket to upload to
-CredentialsFile=… Specifies the path to a configuration file containing S3 credentials. See here for the format of this file.
-CredentialsKey=… Specifies the section name within the credentials file to take credentials from.
-CacheDir=… Path to the shared network DDC. Paths from the recorded DDC access logs will be resolved against this base directory.
-FilterDir=… Path to the directory containing recorded access logs.
-Days=… Number of days to retain files in FilterDir. Any files older than this will be removed.
-Manifest=… Specifies the path within the current workspace to store the URLs to the root manifest. This file will be checked into P4. See notes below about how bundle URLs, manifest URLs, and the root manifest interact.
-KeyPrefix=… Object prefix to use for everything uploaded to a bucket.
-Reset If set, existing bundles will not be reused.
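As an illustration of how these arguments fit together, the following sketch assembles a RunUAT command line for a scheduled build-farm job. The argument names come from the list above; every value in the usage example (bucket name, paths, credentials section) is a placeholder, and the helper `upload_ddc_command` is hypothetical.

```python
import subprocess
from typing import List


def upload_ddc_command(run_uat: str, bucket: str, credentials_file: str,
                       credentials_key: str, cache_dir: str, filter_dir: str,
                       days: int, manifest: str, key_prefix: str,
                       reset: bool = False) -> List[str]:
    """Build the argument list for a RunUAT UploadDDCToAWS invocation."""
    args = [
        run_uat,
        "UploadDDCToAWS",
        f"-Bucket={bucket}",
        f"-CredentialsFile={credentials_file}",
        f"-CredentialsKey={credentials_key}",
        f"-CacheDir={cache_dir}",
        f"-FilterDir={filter_dir}",
        f"-Days={days}",
        f"-Manifest={manifest}",
        f"-KeyPrefix={key_prefix}",
    ]
    if reset:
        # -Reset is a bare flag: existing bundles will not be reused.
        args.append("-Reset")
    return args


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    command = upload_ddc_command(
        run_uat=r"Engine\Build\BatchFiles\RunUAT.bat",
        bucket="example-ddc-bucket",
        credentials_file=r"C:\secrets\aws-credentials.txt",
        credentials_key="default",
        cache_dir=r"\\server\share\ddc",
        filter_dir=r"\\server\share\ddc-access-logs",
        days=7,
        manifest="Build/S3DDC.json",
        key_prefix="ddc/",
    )
    print(" ".join(command))
    # Uncomment to run on a machine that has the engine and the shares available:
    # subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems with paths that contain spaces.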
Enabling the download of the data from S3 requires a new DDC graph to be configured, or modification of the default one.
To modify the default configuration to use the S3 backend rather than a network share, add a section as follows to your game’s DefaultEngine.ini file:
Shared=(Type=S3, Manifest="%GAMEDIR%Build/S3DDC.json", BaseUrl="https://foo.s3.us-east-1.amazonaws.com/", Region="us-east-1", AccessKey="abc123", SecretKey="def465")
Here, the Manifest argument specifies the path to the root manifest submitted to source control by the UploadDDCToAWS command, BaseUrl/Region specify the S3 bucket to download from, and AccessKey/SecretKey specify the credentials to download with.
The S3 derived data backend is implemented in //UE4/Main/Engine/Source/Developer/DerivedDataCache/Private/S3DerivedDataBackend.cpp and S3DerivedDataBackend.h.
The following changelists are required:
12149604 12149624 12155157 12156082 12158937 12195879 12459547 12468805
An option to disable the S3 DDC can be shown in the editor preferences panel by adding the following setting to DefaultEditor.ini:
- Text files generated by the recorder (1) are removed from the network share by the UploadDDCToAWS command (3) after 7 days.
- The S3 backend (4) is configured with a secret key and access key, but these credentials have to be submitted to source control. To improve security, bundles are given a random, unguessable URL, and are indexed by a manifest, which is also uploaded with an unguessable URL. Paths to the last few days of manifests are stored in a configuration file in Perforce; older manifests are deleted. The bucket is configured to deny LIST requests, so even having the access key, secret key and manifest URL will only provide access to the latest bundle data for a few days.
- Bundles are retained in the active manifest as long as at least 40% of their data is still referenced. Once a bundle is discarded, the data within it can be added to new bundles.
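The retention and naming rules in these notes can be sketched as follows. This is a hedged illustration rather than the logic in UploadDDCToAWS.cs: it assumes the 40% threshold is measured in bytes (it could equally be a count of entries), and the function names and the key layout are hypothetical.

```python
import secrets
from typing import Dict, List, Set, Tuple

KEEP_PERCENT = 40  # bundles stay active while at least this much is still referenced


def is_retained(entries: Dict[str, int], referenced: Set[str]) -> bool:
    """Return True if at least KEEP_PERCENT of the bundle's bytes are referenced.

    `entries` maps each DDC key in the bundle to its size in bytes.
    Integer arithmetic avoids floating point rounding at the threshold.
    """
    total = sum(entries.values())
    if total == 0:
        return False
    live = sum(size for key, size in entries.items() if key in referenced)
    return live * 100 >= total * KEEP_PERCENT


def split_bundles(bundles: Dict[str, Dict[str, int]],
                  referenced: Set[str]) -> Tuple[List[str], List[str]]:
    """Partition bundle names into (kept, discarded)."""
    kept, discarded = [], []
    for name, entries in bundles.items():
        if is_retained(entries, referenced):
            kept.append(name)
        else:
            discarded.append(name)
    return kept, discarded


def unguessable_key(prefix: str) -> str:
    """Return an object key with 160 bits of randomness after `prefix`.

    With LIST requests denied on the bucket, an object can only be fetched
    by someone who already knows its key.
    """
    return f"{prefix}{secrets.token_hex(20)}"
```

Data that is still referenced inside a discarded bundle is not lost; it becomes eligible for packing into a new bundle on the next run.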
All changelists referenced in this article: 12149604 12149624 12155157 12156082 12158937 12166433 12195879 12459547 12468805 12626541