Hey Tim, I’ll look into Superluminal to see if it can be leveraged with our player who has been actively communicating with us, in case the situation we’re running into is different from the callstack Dan has provided. I’ll try dropping the sample rate down low; the hitches are usually long enough that even sampling twice a second should still give us multiple confirmed samples with the same callstack on the locked resource.
Tim, I can use your internal file sharing service. Should the slp file have everything bundled in? Do I need to add extra PDBs?
While checking the Insights traces, I do end up with multiple origins for CreateCommittedResource. Here is Jordan’s initial callstack mentioned in the original post. [Image Removed] The Superluminal capture has some extras; the one in the screenshot was the longest hitch.
Here is a public link to our OneDrive. [https://netorg12424036-my.sharepoint.com/:u:/g/personal/dan_redroverinteractive_com/Ecs19Mlkk5RMhorkxC3JxhIBzHob7M1Qz8BepCb4kWbJCg?e=06Zuxb]
Couldn’t log in to Box.com, and the free accounts don’t allow 2 GB files.
You are right. I did manage to find one forgotten MetaHuman 8K texture that shows up.
However, I did find other cases as well that, based on the extra instrumentation I added, point to small allocations. [Image Removed]
It’s okay. No urgency on this.
Hey Kenzo!
I do have an Insights trace of a session on a system with high resource usage, and the issue would be there almost every frame.
[Image Removed]
As for the normal cases with normal resource usage:
Since we nuked all the 8K textures used in the project, we hardly see the spikes anymore.
But they are still there; it’s just harder to repro on demand now. These ones are from today’s playtest on my 9070 XT. VRAM was ~10 GB out of 16 GB.
[Image Removed]
I’ll return soon with findings from trying the recommended changes to CleanUpAllocations.
Cheers,
Dan
Yes - the CreateCommittedResource path can be slow and should be avoided on the critical path.
Is the problem fixed if you increase the pool size to 32 MB and also the max allocation size of the big block (and extend the frame lag a lot as well)? Perhaps it would be wise to always keep at least one (or a few) blocks of maximum pool size around.
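To illustrate the "keep a few maximum-size blocks around" idea, here is a minimal sketch of that retention policy in isolation. This is not the engine’s actual pool allocator or its CleanUpAllocations code; the Pool struct and all names below are made up for illustration.

#include <cstdint>
#include <vector>
#include <algorithm>

struct Pool
{
    uint64_t SizeInBytes = 0;
    uint64_t UsedBytes = 0;
    bool IsEmpty() const { return UsedBytes == 0; }
};

// Release empty pools, but always keep MinRetained empty pools of MaxPoolSize
// bytes around as a warm reserve, so the next large allocation does not have to
// pay for creating a fresh heap on the critical path.
void CleanUpEmptyPools(std::vector<Pool>& Pools, uint64_t MaxPoolSize, int MinRetained = 1)
{
    int Retained = 0;
    Pools.erase(std::remove_if(Pools.begin(), Pools.end(),
        [&](const Pool& P)
        {
            if (!P.IsEmpty())
            {
                return false;   // still in use: keep
            }
            if (P.SizeInBytes == MaxPoolSize && Retained < MinRetained)
            {
                ++Retained;     // keep as warm reserve
                return false;
            }
            return true;        // empty and not reserved: release
        }),
        Pools.end());
}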
We’re not using Nanite and yet we see this absolutely everywhere. We’re also getting other weird things like FRHIPoolAllocator::Defrag (523.7 ms), STAT_FRHICommandListImmediate_DeleteExtendedLifetimeResources (1.4 s), FD3D12UploadHeapAllocator::AllocUploadResource (349.5 ms), and FD3D12PoolAllocator::CreateNewPool (234.1 ms).
Hi,
when checking your trace file, it looks like it’s slow every time a heap is allocated, but not when a committed resource is allocated for the pool. Committed resources are used for the pool when allocations can be suballocated from them (non-placed resources which don’t require individual state tracking).
It would be interesting to add a profile tag around VERIFYD3D12RESULT(Adapter->GetD3DDevice()->CreateHeap(&Desc, IID_PPV_ARGS(&Heap))); in FD3D12MemoryPool::Init(), and also to output the heap sizes to see what’s going on.
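Something along these lines, as a rough sketch; the Desc/Adapter/Heap locals come from the line above, but exact names can differ between engine versions, and the timing/log lines are just one way to surface the size:

// Rough sketch of the suggested instrumentation in FD3D12MemoryPool::Init().
// Requires #include "ProfilingDebugging/CpuProfilerTrace.h" at the top of the file.
{
    // Named scope so slow heap creation shows up directly in Unreal Insights.
    TRACE_CPUPROFILER_EVENT_SCOPE(FD3D12MemoryPool_CreateHeap);

    const double StartTime = FPlatformTime::Seconds();
    VERIFYD3D12RESULT(Adapter->GetD3DDevice()->CreateHeap(&Desc, IID_PPV_ARGS(&Heap)));
    const double ElapsedMs = (FPlatformTime::Seconds() - StartTime) * 1000.0;

    // Log the requested heap size next to the time so large/slow heaps stand out.
    UE_LOG(LogD3D12RHI, Log, TEXT("CreateHeap: %.2f MB took %.3f ms"),
        Desc.SizeInBytes / (1024.0 * 1024.0), ElapsedMs);
}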
I haven’t seen this pattern before.
Kind regards,
Kenzo
I managed to do some more testing this week on this.
Increasing the pool size and the frame lag did seem to improve things a bit for Nanite. But I also see stalls elsewhere, as others mentioned.
For example here:
[Image Removed]
So I guess that by modifying the engine a bit to make sure there are enough big-block pools available, it should mostly fix the Nanite streaming stalls. For the other stalls, I have not looked at those code paths yet.
To make the problem happen on that system, VRAM usage needs to be high. I have another application (Photoshop in that case) using a few GB of VRAM. The game itself is well under 10 GB.
[Image Removed]