GregOrigin - PSO Autopilot: Shader Compilation Warm-up System to Prevent Lag & Crashes

Watch it in action.

Read the manual.

A functional core OSS version is available on GitHub: https://github.com/gregorik/PSO-Autopilot. The Fab version is fully featured and includes all USPs described in the manual.

PSO Autopilot is a reliable, production-ready solution to Unreal Engine 5's notorious shader compilation stutter and lag.

The "Shader Compilation" plugins on various marketplaces attempt to solve stutter by brute-forcing the Engine: loading all of your project's assets into memory simultaneously. If you are building a massive 50GB open-world game, this simplistic approach guarantees catastrophic RAM spikes, completely frozen loading screens, and eventually, Out-Of-Memory (OOM) crashes on consoles or lower-end PCs.

PSO Autopilot is different: it was created from the ground up to orchestrate Unreal's native hardware pipeline asynchronously, managing memory, threads, and driver caches with architectural precision.

Core functionality deep-dive:

  1. Memory-Safe Chunking: The plugin uses FStreamableManager to asynchronously load constrained batches of assets (default 100). It unloads them and explicitly forces Garbage Collection (GEngine->ForceGarbageCollection) between batches, which effectively mitigates Out-Of-Memory (OOM) crashes on large projects.

  2. Time-Sliced Game Thread Yielding: The ProcessBatchTimeSliced() function respects the MaxProcessingTimeMsPerFrame budget (default 5ms), breaking up the workload and yielding the Game Thread to keep UI animations smooth.

  3. Smart Cache Skipping: The plugin generates an MD5 fingerprint to bypass redundant warmups on subsequent boots by reading/writing to GGameUserSettingsIni.

  4. Engine PSO Pacing: It checks FShaderPipelineCache::NumPrecompilesRemaining() to ensure it doesn't overrun the engine's built-in background shader compilation queue.
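
To illustrate the time-slicing idea from item 2, here is a minimal standalone sketch in plain C++. It is not the plugin's ProcessBatchTimeSliced() implementation: TimeSlicedQueue, Pending, and ProcessFrame are hypothetical names, and std::chrono stands in for the engine's timing facilities. The assumption is simply that work is drained from a queue until the per-frame budget is spent, with at least one item processed per call so a very small budget cannot stall progress.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a per-frame warmup work queue (not the plugin's type).
struct TimeSlicedQueue {
    using Clock = std::chrono::steady_clock;

    // One entry per unit of warmup work, e.g. one asset.
    std::deque<std::function<void()>> Pending;

    // Runs queued work until the budget is spent or the queue is empty.
    // Always runs at least one item so a tiny budget still makes progress.
    // Returns the number of items processed during this call.
    std::size_t ProcessFrame(std::chrono::microseconds Budget) {
        const Clock::time_point Start = Clock::now();
        std::size_t Processed = 0;
        while (!Pending.empty()) {
            std::function<void()> Work = std::move(Pending.front());
            Pending.pop_front();
            Work();
            ++Processed;
            if (Clock::now() - Start >= Budget) {
                break;  // budget spent: yield back to the caller until the next frame
            }
        }
        return Processed;
    }
};
```

A caller would invoke ProcessFrame once per frame with the configured budget (5 ms by default in the plugin, per the description above) and keep calling it until Pending is empty.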

Update:

[1.0.1] - 2026-03-08

Critical Fixes

  • Empty FName poisoning in asset queries - An empty DirectoriesToScan entry (Path="") in DefaultGame.ini silently injected an empty FName into FARFilter::PackagePaths, causing GetAssets() to return zero results with no error. Added an IsEmpty() guard in ScanForAssets() and removed the poisoned config line.
  • Asset registry race condition in -game mode - IsLoadingAssets() returned false before the background asset gatherer had even started, producing an empty scan. Added Registry.SearchAllAssets(true) to force synchronous registry population before querying.
  • Non-material assets silently skipped - ForceAssetWarmup() only handled UMaterialInterface directly. Meshes, particle systems, and other assets referencing materials were ignored. Replaced the raw property scan with FReferenceFinder to extract all referenced UMaterialInterface objects from any loaded asset.
  • BootLoader delegate leak - APSOAutopilotBootLoader had no EndPlay override. Subsystem delegates were never unbound on actor destruction, leading to dangling callbacks and potential crashes on level transitions. Added proper delegate unbinding in EndPlay.

High-Priority Fixes

  • Dead code removal - Removed ~50 lines of unreachable BuildWarmupFingerprint() declaration and implementation that had been superseded by the async thread-pool fingerprint computation.
  • Missing StopWarmup() method - There was no way to cancel an in-progress warmup. Added StopWarmup() as a BlueprintCallable method that resets the state machine, releases loaded assets, and broadcasts completion.
  • Fragile material extraction via raw property iteration - The original ForceAssetWarmup walked UObject properties manually to find material references, which was brittle across engine versions. Replaced with the engine's FReferenceFinder utility for robust, recursive reference discovery.
  • Untracked forced-streaming textures - Textures forced into VRAM via SetForceMipLevelsToBeResident() were never tracked. Added ForcedStreamingTextures tracking set so residency can be refreshed or released post-warmup (60-second extended residency applied on completion).

Medium-Priority Fixes

  • Fingerprint future timeout - The async fingerprint computation on the thread pool had no timeout guard. If the thread pool stalled, the subsystem would hang indefinitely in the Fingerprinting state. Added a 10-second timeout with fallback to force a full warmup on timeout.
  • CDO settings mutations stacking across PIE runs - APSOAutopilotDemoManager mutated the settings CDO (Class Default Object) directly without calling ReloadConfig() first. Settings overrides from previous runs accumulated. Added ReloadConfig() before applying overrides to start from clean defaults each run.
  • Custom HUDClass ignored during forced validation - When bForceValidationRun was true, the demo manager always spawned the default widget, ignoring any user-specified HUDClass. Fixed to respect the configured HUD class in all code paths.
  • StartWarmup re-entry from Finished state - The state guard only checked CurrentState != EPSOWarmupState::Idle, but a completed warmup left the state at Finished. Calling StartWarmup() again was silently rejected. Updated the guard to allow restart from both Idle and Finished states.
  • Deprecated GetUsedTextures 5-parameter API - UE 5.7 changed the GetUsedTextures signature, producing compiler warnings. Updated to the current 4-parameter overload.
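
The StartWarmup re-entry fix above comes down to a state guard. The sketch below shows that guard in isolation, in plain C++. EWarmupState and WarmupGuard are hypothetical names, and only three states are modeled here, whereas the plugin's EPSOWarmupState enum has ten.

```cpp
// Hypothetical, reduced set of warmup states used only for this illustration.
enum class EWarmupState { Idle, Running, Finished };

struct WarmupGuard {
    EWarmupState State = EWarmupState::Idle;

    // A new run may begin from Idle or from Finished, but not while one is in progress.
    bool CanStart() const {
        return State == EWarmupState::Idle || State == EWarmupState::Finished;
    }

    // Returns false (and changes nothing) if a warmup is already running.
    bool StartWarmup() {
        if (!CanStart()) {
            return false;
        }
        State = EWarmupState::Running;
        return true;
    }

    // Marks the current run as complete.
    void Finish() { State = EWarmupState::Finished; }
};
```

With a guard that only accepted Idle, the second StartWarmup() after Finish() would be rejected, which is the behavior the fix removes.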

Low-Priority Fixes

  • CanContainContent: true with empty Content directory - The .uplugin declared CanContainContent: true but the plugin ships no content assets. Set to false to avoid Fab validator warnings and unnecessary Content directory expectations.
  • Virtual Texture preheat incomplete - UVirtualTexture2D assets were discovered but UpdateResource() was never called during the streaming phase. Added explicit UpdateResource() calls and tracking via PendingVirtualTextures set to properly page virtual textures into GPU memory.
  • Flat spinner widget appearance - The spinner used WhiteSquareTexture which looked flat and dated. Replaced with a gradient-based circular spinner with rotation animation and pulsing opacity for a polished loading indicator.
  • Redundant BuildFallbackWidgetTree() calls - Both NativeConstruct implementations (DemoWidget and LoadingScreenWidget) called the fallback widget builder unconditionally, even when UMG bindings were present. Added checks to only build the C++ Slate tree when no designer-bound widgets exist.
  • Streaming telemetry message missing resource count - The "Waiting for texture streaming..." log message didn't include how many resources were pending. Added GetNumWantingResources() count to the status broadcast for better diagnostics.

Architecture

  • State machine: 10-state EPSOWarmupState enum drives all subsystem behavior through Tick().
  • Module loading phase: PreDefault to ensure availability before gameplay subsystems.
  • Async fingerprint: MD5 hash computed on thread pool (engine version + scan directories + asset paths + timestamps).
  • Time-sliced processing: Per-frame budget (default 5ms) prevents frame hitches during shader compilation.
  • Memory-safe batching: Configurable batch size (default 100) with explicit GC between batches.
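
As a rough illustration of how a warmup fingerprint can be built from the inputs listed above (engine version, scan directories, asset paths, timestamps), here is a standalone C++ sketch. It deliberately swaps in 64-bit FNV-1a for MD5, because the standard library ships no MD5. FingerprintInput and BuildFingerprint are hypothetical names, and sorting the inputs before hashing is an assumption made here so that discovery order does not change the result.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// 64-bit FNV-1a. A stand-in for MD5 in this sketch only, not what the plugin uses.
inline std::uint64_t Fnv1a(const std::string& Text,
                           std::uint64_t Hash = 0xcbf29ce484222325ULL) {
    for (unsigned char Byte : Text) {
        Hash ^= Byte;
        Hash *= 0x100000001b3ULL;
    }
    return Hash;
}

// Hypothetical container for everything that should invalidate the cache when it changes.
struct FingerprintInput {
    std::string EngineVersion;
    std::vector<std::string> ScanDirectories;
    std::vector<std::pair<std::string, std::int64_t>> Assets;  // (path, timestamp)
};

// Taken by value so the inputs can be sorted without touching the caller's copy.
inline std::uint64_t BuildFingerprint(FingerprintInput Input) {
    std::sort(Input.ScanDirectories.begin(), Input.ScanDirectories.end());
    std::sort(Input.Assets.begin(), Input.Assets.end());

    // Chain the hash through every field, with separators between entries.
    std::uint64_t Hash = Fnv1a(Input.EngineVersion + "\n");
    for (const std::string& Directory : Input.ScanDirectories) {
        Hash = Fnv1a(Directory + "\n", Hash);
    }
    for (const auto& Asset : Input.Assets) {
        Hash = Fnv1a(Asset.first + "@" + std::to_string(Asset.second) + "\n", Hash);
    }
    return Hash;
}
```

If the stored fingerprint matches the freshly computed one, the warmup can be skipped. Any change to an asset timestamp or to the engine version produces a different value and triggers a full run.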

2026-03-16
Fixed

Replaced shader-cache-only warming with real PSO precache paths for warmed assets.
Added asset-specific PSO precache for primitive assets through transient static mesh and skeletal mesh components.
Added explicit material PrecachePSOs(...) fallback with local, nanite, and particle vertex factory coverage.
Switched batch completion to plugin-owned PSO request and graph-event tracking instead of waiting on the global engine precompile backlog.
Logged async load failures, counted them against the run, and blocked fingerprint persistence after any warmup failure.
Sanitized scan settings before use so blank or duplicate directories and invalid class paths cannot collapse asset discovery.
Waited for pending virtual texture initialization and streaming before marking a batch complete.
Restored mutable settings after demo and boot overrides so changes do not leak across warmup runs.
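
The scan-settings sanitization mentioned above can be sketched in a few lines of standalone C++. This is an illustration under assumptions, not the plugin's code: Trim and SanitizeScanDirectories are hypothetical helpers, and std::string stands in for the engine's path types. The idea is to drop blank entries and duplicates before they reach the asset query, so that a single empty path cannot collapse discovery.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: strips leading and trailing whitespace.
inline std::string Trim(const std::string& Text) {
    const char* Whitespace = " \t\r\n";
    const std::size_t First = Text.find_first_not_of(Whitespace);
    if (First == std::string::npos) {
        return "";  // empty or whitespace-only
    }
    const std::size_t Last = Text.find_last_not_of(Whitespace);
    return Text.substr(First, Last - First + 1);
}

// Hypothetical helper: returns the directories with blanks and duplicates removed,
// keeping the first occurrence of each path in its original order.
inline std::vector<std::string> SanitizeScanDirectories(const std::vector<std::string>& Raw) {
    std::vector<std::string> Clean;
    std::unordered_set<std::string> Seen;
    for (const std::string& Entry : Raw) {
        std::string Path = Trim(Entry);
        if (Path.empty()) {
            continue;  // a blank entry would otherwise poison the query filter
        }
        if (Seen.insert(Path).second) {
            Clean.push_back(std::move(Path));
        }
    }
    return Clean;
}
```

For example, the input {"/Game/Maps", "", "/Game/Maps"} yields a single "/Game/Maps" entry.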

Changed

Warmup execution now uses a per-run settings snapshot instead of rereading mutable defaults mid-run.
Added the module dependencies required by the new PSO precache path.
Synced the same source changes into Plugins/PSOAutopilot, PluginBuild, and PluginStaging.

Update [1.0.2] - 2026-03-21

Fixes

  • Texture streaming infinite hang - The StreamingTextures state had no timeout, so a single texture that never finished streaming could stall the entire warmup indefinitely. Added a 30-second timeout that logs a warning and advances to the next batch.
  • Batch load stuck-state recovery - FStreamableManager::RequestAsyncLoad could silently never complete if an asset was corrupted or had cascading dependency issues. Added a 60-second watchdog timer in the LoadingBatch state. On timeout, the pending load is cancelled, the batch is marked as failed, and processing continues with the next batch. BatchLoadStartTime tracking added to Initialize(), StartWarmup(), and BeginLoadingBatch().
  • Texture mip residency too short - SetForceMipLevelsToBeResident() was called with a 30-second duration, but large warmups could take several minutes. Textures loaded in early batches would drop their high-res mips before the loading screen finished. Increased to 300 seconds (5 minutes).
  • SearchAllAssets hitching the Editor - The synchronous Registry.SearchAllAssets(true) call (added in v1.0.1 for -game mode) was also running in Editor builds, where it caused a multi-second hitch. Wrapped in an if (!GIsEditor) guard so it only runs in standalone game mode, where the async gatherer hasn't started yet.
  • Multi-fingerprint cache - The fingerprint system only stored a single LastCompletedFingerprint value, meaning different scan configurations (e.g., multiple Boot Loaders scanning different directories) would overwrite each other. Replaced with an LRU cache storing up to 8 completed fingerprints as a comma-separated CompletedFingerprints key. Legacy single-entry format is migrated automatically on first read.
  • Incomplete GC purge between batches - ForceGarbageCollection(false) only collected unreachable objects but didn't purge them from memory. Changed to ForceGarbageCollection(true) for a full purge, ensuring batch memory is actually reclaimed before the next load.
  • Transient mesh components parented to wrong outer - UStaticMeshComponent objects created for PSO precaching were parented to the subsystem's outer (the GameInstance), which prevented them from being garbage collected between batches. Changed to NewObject<UStaticMeshComponent>(GetTransientPackage()) so they live in the transient package and are properly cleaned up.
  • Boot Loader level transition crash - UGameplayStatics::OpenLevel() was called without validating that the target level exists. On a typo or missing map, this could crash or produce cryptic errors. Added FPackageName::SearchForPackageOnDisk() validation with a warning log if the level is not found.
  • Dead Outer parameter - After fixing transient components to use GetTransientPackage(), the UObject* Outer parameter on QueuePrimitiveAssetPSOPrecache() became unused dead code. Removed from both the function signature and all call sites.
  • Demo widget telemetry panel cut off - The bottom-left telemetry panel in UPSOAutopilotDemoWidget was sized too small (640x280) and positioned where it clipped off-screen. Enlarged to 920x420 and repositioned from (36,-36) to (24,-24).
  • Demo widget text overflow - Long status messages, cache comparison text, and metrics text overflowed the panel bounds. Added SetAutoWrapText(true) on MetricsText, CacheStatusText, and ComparisonSummaryText.
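
The multi-fingerprint cache described above can be approximated with a small LRU list that serializes to a comma-separated string. The following standalone C++ sketch is an assumption about the general shape of such a cache, not the plugin's code. FingerprintCache and its methods are hypothetical names, and the capacity defaults to 8 to match the changelog entry. A legacy value with no commas simply parses as a one-entry list.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical LRU cache of completed fingerprints; the most recently used entry is last.
class FingerprintCache {
public:
    explicit FingerprintCache(std::size_t InCapacity = 8) : Capacity(InCapacity) {}

    // Parses "a,b,c". A legacy single value without commas becomes a one-entry list.
    void Load(const std::string& Serialized) {
        Entries.clear();
        std::stringstream Stream(Serialized);
        std::string Item;
        while (std::getline(Stream, Item, ',')) {
            if (!Item.empty()) {
                Touch(Item);
            }
        }
    }

    bool Contains(const std::string& Fingerprint) const {
        return std::find(Entries.begin(), Entries.end(), Fingerprint) != Entries.end();
    }

    // Marks a fingerprint as most recently used, evicting the oldest entry when full.
    // Taken by value so a reference into Entries cannot be invalidated mid-call.
    void Touch(std::string Fingerprint) {
        Entries.erase(std::remove(Entries.begin(), Entries.end(), Fingerprint), Entries.end());
        Entries.push_back(std::move(Fingerprint));
        if (Entries.size() > Capacity) {
            Entries.erase(Entries.begin());
        }
    }

    // Joins the entries with commas, oldest first.
    std::string Serialize() const {
        std::string Out;
        for (const std::string& Entry : Entries) {
            if (!Out.empty()) {
                Out += ',';
            }
            Out += Entry;
        }
        return Out;
    }

private:
    std::size_t Capacity;
    std::vector<std::string> Entries;
};
```

In this sketch, a boot would call Load() on the stored key, skip the warmup if Contains() returns true for the current fingerprint, and otherwise call Touch() followed by Serialize() after a successful run.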

Documentation

  • Comprehensive manual rewrite - Expanded Docs/index.html from 595 lines to ~850 lines:
    • New State Machine section with visual diagram and per-state descriptions
    • New Fingerprint & Caching section covering MD5 hashing, editor timestamps, and multi-configuration LRU cache
    • New Best Practices section (targeting, batch tuning, dedicated loading levels, standalone testing)
    • Full Boot Loader property reference table
    • Configuration section rewritten with HTML tables organized by category (Targeting, Memory, Performance, Visual Continuity, Validation)
    • Troubleshooting FAQ expanded from 3 items to 22 items across 8 categories:
      • Startup & Scanning (4), Shader Compilation (2), Memory & Performance (3), Texture Streaming (2), Fingerprint & Cache (4), Boot Loader (3), Batch Processing (2), Integration & Platform (3), Logging & Debugging (3)
    • Sidebar navigation expanded with sub-entries for all new sections

It was already a great product, but your hard work is making it even better. It's truly amazing now. Thank you!
