Hi,
We’re working on optimizing GPU memory usage in our project (currently on UE 5.4), and we’ve encountered an issue with strand-based hair that also reproduces in UE 5.6. In close-up shots of a character’s face, we noticed a significant spike in video memory usage. Further investigation revealed that the hair system allocates transient buffers whose dimensions grow linearly with the screen-space bounds of the hair. In certain cases, this results in nearly full-screen buffers that can consume gigabytes of GPU memory.
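To put rough numbers on that scaling, here is a minimal back-of-the-envelope sketch. It assumes a buffer that stores a fixed number of fixed-size records per pixel of the hair's screen-space bounding rectangle; the function name `EstimateBufferBytes` and the values for samples per pixel and bytes per sample are hypothetical and are not taken from the engine.

```cpp
#include <cstdint>

// Hypothetical model, not engine code: a fixed number of fixed-size records
// for every pixel inside the hair's screen-space bounding rectangle.
constexpr std::uint64_t EstimateBufferBytes(std::uint32_t BoundsWidth,
                                            std::uint32_t BoundsHeight,
                                            std::uint32_t SamplesPerPixel,
                                            std::uint32_t BytesPerSample)
{
    return static_cast<std::uint64_t>(BoundsWidth) * BoundsHeight *
           SamplesPerPixel * BytesPerSample;
}

// Assumed values, for illustration only.
constexpr std::uint32_t AssumedSamplesPerPixel = 8;
constexpr std::uint32_t AssumedBytesPerSample = 16;

// Hair covering a 480x270 region versus a full 3840x2160 frame.
constexpr std::uint64_t DistantShotBytes =
    EstimateBufferBytes(480, 270, AssumedSamplesPerPixel, AssumedBytesPerSample);
constexpr std::uint64_t CloseUpBytes =
    EstimateBufferBytes(3840, 2160, AssumedSamplesPerPixel, AssumedBytesPerSample);

// The covered area is 64 times larger, and so is the buffer.
static_assert(CloseUpBytes == 64 * DistantShotBytes, "size scales with covered area");
```

Under these assumed values, the distant shot needs about 16 MB for one buffer and the full-screen close-up needs about 1 GB, so a few such buffers alive at once would reach the gigabyte range.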
We understand that these transient buffers (such as Hair.CompactNodeData and Hair.CompactNodeVelocity) are intended to be created once per frame (e.g., during AddHairMaterialPass). However, when pausing the game and inspecting the scene in the Render Resource Viewer, we often observe multiple instances of these buffers - sometimes dozens - even though the hair is rendered only once per frame. The number of buffers appears random between runs or pauses.
This can be reproduced consistently by moving the camera close enough to the character so that the hair occupies a large portion of the screen.
In the Render Resource Viewer, we understand that it displays all resources currently registered in GRHITrackedResources, which are added and removed through the constructor and destructor of FRHIResource. Based on this, it seems that some hair-related buffers are not being destroyed immediately and remain registered into the next frame. Could you clarify why these resources persist across frames? Is this expected behavior, or does it indicate a possible leak or delayed cleanup?
I’ve attached a minimal UE 5.6 sample project that reproduces the issue, along with a screenshot showing the multiple buffer instances in the Render Resource Viewer.
Any insights would be greatly appreciated. Thanks!
Hello,
Thank you for reaching out.
I’ve been assigned this issue, and we will be looking into the extra buffers for you.
Hello,
The transient FRDGBuffer “Hair.CompactNodeData” is deleted asynchronously. When an FRDGBuilder is deleted, remaining buffers, including this one, are handed to an async deleter to be deleted on a task thread.
Keep in mind that there is also one of these buffers created per-view, so, if you have multiple views, you will have multiple buffers.
Please make sure to profile a packaged build on the target hardware as a definitive test.
Please let us know if this helps.
Hello!
That actually highlights the core of the problem - since the transient heap is deleted with a delay (GRHITransientAllocatorGarbageCollectLatency), a number of inactive buffers remain in memory for some time. As seen in the attached screenshot (captured using Radeon Memory Visualizer from a development build of the sample project), some of these buffers share the same virtual memory address, while others have different ones.
Given this, can we assume that some of these buffers from previous frames are still in memory and not being used anymore due to the delayed cleanup?
For example, in a scenario where the camera is moving closer to the character’s head frame by frame, the hair occupies more screen space each time, so progressively larger transient buffers are allocated. It seems possible that a new transient heap might be created for some frames if the existing one cannot accommodate the current buffer - further increasing memory usage temporarily.
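The scenario described above can be illustrated with a small toy model. This is only a sketch of the hypothesis, not the engine's allocator: it assumes one buffer request per frame, a new heap sized exactly to the request whenever no existing heap is large enough, and heaps being released once they have been idle for more than a given number of frames. `FToyHeap`, `PeakCommittedBytes`, and all sizes are hypothetical.

```cpp
#include <cstdint>

// Toy model, not engine code. A Capacity of 0 marks an empty heap slot.
struct FToyHeap
{
    std::uint64_t Capacity = 0;
    std::uint32_t LastUsedFrame = 0;
};

// Returns the highest total heap capacity committed in any single frame.
constexpr std::uint64_t PeakCommittedBytes(const std::uint64_t* RequestBytes,
                                           std::uint32_t NumFrames,
                                           std::uint32_t GcLatency)
{
    FToyHeap Heaps[16] = {};
    std::uint64_t Peak = 0;

    for (std::uint32_t Frame = 0; Frame < NumFrames; ++Frame)
    {
        // Release heaps that have been idle for more than GcLatency frames.
        for (FToyHeap& Heap : Heaps)
        {
            if (Heap.Capacity != 0 && Frame - Heap.LastUsedFrame > GcLatency)
            {
                Heap.Capacity = 0;
            }
        }

        // Reuse the first live heap that is large enough for this frame's request.
        FToyHeap* Chosen = nullptr;
        for (FToyHeap& Heap : Heaps)
        {
            if (Heap.Capacity != 0 && Heap.Capacity >= RequestBytes[Frame])
            {
                Chosen = &Heap;
                break;
            }
        }

        // Otherwise commit a new heap, sized exactly to the request, in an empty slot.
        if (Chosen == nullptr)
        {
            for (FToyHeap& Heap : Heaps)
            {
                if (Heap.Capacity == 0)
                {
                    Heap.Capacity = RequestBytes[Frame];
                    Chosen = &Heap;
                    break;
                }
            }
        }
        if (Chosen != nullptr)
        {
            Chosen->LastUsedFrame = Frame;
        }

        // Track the total committed across all live heaps this frame.
        std::uint64_t Committed = 0;
        for (const FToyHeap& Heap : Heaps)
        {
            Committed += Heap.Capacity;
        }
        if (Committed > Peak)
        {
            Peak = Committed;
        }
    }
    return Peak;
}

// Arbitrary units: a buffer that grows every frame versus one that stays constant.
constexpr std::uint64_t GrowingRequests[] = {100, 200, 300, 400, 500};
constexpr std::uint64_t SteadyRequests[] = {500, 500, 500, 500, 500};

static_assert(PeakCommittedBytes(GrowingRequests, 5, 2) == 1200, "old heaps linger");
static_assert(PeakCommittedBytes(SteadyRequests, 5, 2) == 500, "one heap is reused");
```

In this model the growing sequence peaks at 1200 units even though the largest single request is 500, because the smaller heaps from the previous frames have not yet been collected. A steady request size reuses a single heap and stays at 500.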
Is there any best practice or recommended way to mitigate this kind of memory accumulation during such transitions?
[Image Removed]
Hello,
Keep in mind that the Transient Heap destruction controlled by the CVar “RHI.TransientAllocator.GarbageCollectLatency” is not the same thing as the async delete of resources from FRDGBuilder.
Parallel destruction from FRDGBuilder is controlled by the CVar “r.RDG.ParallelDestruction”.
Keep in mind that disabling parallel operations will come with a performance impact.
In our tests, we noticed that all the transient heaps tended to contain other transient resources as well, not just hair data. While it is conceivable that changes to the hair buffer sizes will require a new transient heap, freeing that buffer might not release the heap immediately, as other resources may still be using it.
Please let us know if these CVars help.
Disabling r.RDG.ParallelDestruction does seem to help a bit - we noticed that many of the hair buffers start sharing the same virtual memory address, which means they are likely being reused more often. But this still doesn’t fully solve the problem.
As you mentioned, the transient heap often includes other resources, not just hair. So if a hair buffer ends up in the same heap with another buffer, that heap will stay alive until all resources inside it are freed. This means the hair memory might stick around longer than needed.
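The dependency described here can be written down as a tiny model. It is a sketch under the assumption that a heap can only be released after its last resident resource has been freed; `FToyResource`, `HeapReleaseFrame`, and the frame numbers are hypothetical and do not correspond to engine types.

```cpp
#include <cstdint>

// Toy model, not engine code: each resource lives in one heap and is freed at a known frame.
struct FToyResource
{
    std::uint32_t HeapIndex;
    std::uint32_t FreedAtFrame;
};

// A heap can be released only after its last resident is freed,
// so its release frame is the maximum FreedAtFrame over its residents.
constexpr std::uint32_t HeapReleaseFrame(const FToyResource* Resources,
                                         std::uint32_t NumResources,
                                         std::uint32_t HeapIndex)
{
    std::uint32_t ReleaseFrame = 0;
    for (std::uint32_t Index = 0; Index < NumResources; ++Index)
    {
        if (Resources[Index].HeapIndex == HeapIndex &&
            Resources[Index].FreedAtFrame > ReleaseFrame)
        {
            ReleaseFrame = Resources[Index].FreedAtFrame;
        }
    }
    return ReleaseFrame;
}

// Shared layout: the hair buffer (freed at frame 1) and another resource
// (freed at frame 10) both live in heap 0.
constexpr FToyResource SharedLayout[] = {{0, 1}, {0, 10}};

// Dedicated layout: the hair buffer has heap 0 to itself.
constexpr FToyResource DedicatedLayout[] = {{0, 1}, {1, 10}};

static_assert(HeapReleaseFrame(SharedLayout, 2, 0) == 10, "held by the other resource");
static_assert(HeapReleaseFrame(DedicatedLayout, 2, 0) == 1, "released with the hair buffer");
```

In the shared layout the heap holding the hair buffer cannot be released until frame 10, while in the dedicated layout it becomes releasable at frame 1, which is the motivation for the question that follows.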
Is there any way to allocate hair buffers in a separate heap, so that other resources don’t keep the heap alive longer than necessary? Thanks!
Hello,
RDG is designed to manage the lifetimes of its own buffers, such as “Hair.CompactNodeData”.
If you want to manually manage that memory, you will need to make engine modifications. You will need to manually create and destroy the FRDGPooledBuffer, and use the function “FRDGBuilder::RegisterExternalBuffer(…)” to use it in the RDG passes.
You can see an example of the allocation and use of a FRDGPooledBuffer in the function “UpdateSkyIrradianceGpuBuffer(…)” from ReflectionEnvironment.cpp.
For engine modifications like this, we can offer information about existing functionality, and we leave the details of the implementation to you.
Please let us know if this helps.