Hello, I’m currently researching memory consumption under the `ChaosAcceleration` tag and found some odd behavior connected to `Chaos::TAABBTree`.
PS5 Memory Analyzer shows that `ChaosAcceleration` consumes ~800 MB, with call stacks pointing to `Chaos::TAABBTree` array allocations. I stumbled upon the `list sq` console command, which outputs spatial acceleration stats, and ran it. Below is the output:
`----- Begin SQ Listing ----------------------------------------
----- World SQ ----------------------------------------
SQ Data for world [REDACTED]
Bucket 0 (2 entries):
Entry 0
Contains 486982 elements
Max depth is 0
Avg depth is 0.000
Dirty element count is 0
Payload container size is 618286 elements
Payload container capacity is 691196 elements
Allocated size of payload container is 16588704 bytes (34 per tree element)
Dirty Tree:
Contains 38655 elements
Max depth is 0
Avg depth is 0.000
Dirty element count is 0
Payload container size is 0 elements
Payload container capacity is 0 elements
Allocated size of payload container is 0 bytes (0 per tree element)
Entry 1
Contains 13900 elements
Max depth is 28
Avg depth is 17.997
Dirty element count is 0
Payload container size is 618338 elements
Payload container capacity is 840803 elements
Allocated size of payload container is 20179272 bytes (1451 per tree element)
Dirty Tree:
Contains 0 elements
Max depth is 0
Avg depth is 0.000
Dirty element count is 0
Payload container size is 0 elements
Payload container capacity is 0 elements
Allocated size of payload container is 0 bytes (0 per tree element)
Bucket 1 (0 entries):
Bucket 2 (0 entries):
Bucket 3 (0 entries):
Bucket 4 (0 entries):
Bucket 5 (0 entries):
Bucket 6 (0 entries):
Bucket 7 (0 entries):
----- Cluster Union SQ ----------------------------------------
----- End SQ Listing ----------------------------------------`

I have a few questions related to this:
- What are those buckets used for? Is it OK that only one is active? Why are there multiple entries under the first bucket?
- Entry 0 shows that all 487K entries are attached to the root node (Max depth is 0), thus the tree structure doesn’t work at all here. Why could this happen?
- Only ~16 MB of allocations are shown under Entry 0, which seems far from what it has actually allocated. There are additional arrays (e.g. `WorkPool`); is there any way to check memory stats on those?
Thank you in advance!
Hi Evgeniy!
- Yep, the bucket numbers are fine here. The index relates to this enum located in SpatialAccelerationCollection.h: 4 are named, with 4 able to be used for custom things.
[Image Removed]
Bucket-wise, 0 relates to the static objects while 1 relates to the dynamic elements, looking at the debugger output.
2. + 3. I’ll take a deeper look into these and get back to you. Initially, 2 looks like it means the objects are all in leaves, but not in a hierarchy.
For 2: this is a false positive. The debug output here is looking for a parent pointer which doesn’t exist in the static tree (the generation code doesn’t set the parent; I’m not 100% sure here and have asked for clarification from the dev side). Looking at the code, though, the parent does have children as expected, and the tree can be traced from top to bottom. I’m also going to find out whether we have a better query for this situation, since that may be something we haven’t maintained (which can happen a lot in codebases this large!)
The dynamic version does set the parent; it seems to go through a different code path which then calls the code to set a parent.
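To spell out why that shows up as a depth of 0, here is my own reconstruction of what the stat is likely doing, not the actual debug code: if depth is measured by walking parent pointers upward, a tree whose nodes never had their parent set will report every element as sitting at the root.

```cpp
#include "CoreMinimal.h"

// Hypothetical illustration only: a depth query that walks parent pointers.
// If the static tree never fills in the parent index (it stays at INDEX_NONE),
// the walk stops immediately and every element reports depth 0, even though
// the top-down child links still form a real hierarchy.
int32 ComputeDepthViaParents(const TArray<int32>& ParentIndices, int32 NodeIndex)
{
	int32 Depth = 0;
	while (ParentIndices.IsValidIndex(NodeIndex) && ParentIndices[NodeIndex] != INDEX_NONE)
	{
		NodeIndex = ParentIndices[NodeIndex];
		++Depth;
	}
	return Depth;
}
```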
There is, however, a situation where this algorithm can fail (and where you don’t end up with a tree): when the algorithm can’t split up the objects. The most common case is a load of geometry all on top of each other with the same bounds, which is a pretty synthetic scenario though.
To complete point 2: `list sq` was added fairly recently for an internal product release. For static meshes it won’t work, as there is no scenario where we need a parent pointer (this tree is expected to be pretty much immutable; if it does change, the whole tree is rebuilt). All accesses are top down.
I need to check into point 3 a little bit, so will likely be able to get back to you next week on that!
Geoff
Hi Evgeniy, can I check if you’ve used this yet to look at where the allocations are happening?
https://dev.epicgames.com/documentation/en-us/unreal-engine/memory-insights-in-unreal-engine
Geoff
Hi Geoff, I’m using PS5 Memory Analyzer with a few fixes (out of the box it yields duplicate allocations, and some allocations don’t pass through LLM and are therefore missing).
I wrote code to track `TAABBTreeLeafArray` memory consumption and found that we have a total of 535K arrays with ~2.5M live elements, while ~5.4M are reserved. Thus it consumes 540 MB, with 285 MB lost due to reservation (most arrays reserve 18 elements while actually containing far fewer).
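For reference, the per-array tracking boils down to something along these lines (a simplified sketch; `Elems` and the stats struct are my own naming, and the real code has to be wired into the engine to reach the leaf arrays):

```cpp
#include "CoreMinimal.h"

// Accumulated per-leaf-array statistics: live vs. reserved elements and bytes.
struct FLeafArrayStats
{
	uint64 NumArrays = 0;
	uint64 LiveElements = 0;
	uint64 ReservedElements = 0;
	uint64 AllocatedBytes = 0;
};

// Assumes TLeafArray exposes an `Elems` TArray; names are illustrative.
template <typename TLeafArray>
void AccumulateLeafStats(const TLeafArray& Leaf, FLeafArrayStats& Stats)
{
	Stats.NumArrays        += 1;
	Stats.LiveElements     += Leaf.Elems.Num();              // elements in use
	Stats.ReservedElements += Leaf.Elems.Max();              // reserved capacity
	Stats.AllocatedBytes   += Leaf.Elems.GetAllocatedSize();  // bytes actually held
}
```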
It looks like I will need to adjust the code for less aggressive array reservation to save some memory.
Am I correct that the elements here represent physical bodies plus their AABBs? What does the number of arrays depend on?
Hi Eugene,
That is a very interesting find. It looks like the growth of our arrays is causing a higher memory cost than we’d expect, but I can see how it is happening: there are always tradeoffs between how much memory is reserved and how often we need to reseat an array in memory (to find a large enough area for it to be located in). Although we don’t do it, for the things which are pretty static, calling reserve() on the array would be really good, since it skips the initial reseating of memory and gets an exact answer straight away (which also means less fragmentation).
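To illustrate that with a minimal sketch (UE’s TArray API, but the function and names here are just for demonstration, not engine code):

```cpp
#include "CoreMinimal.h"

// When the final element count is known up front, a single Reserve() avoids
// both the repeated reseating (reallocation + copy) during growth and the
// geometric slack that growth leaves behind.
void FillWithKnownCount(TArray<int32>& Out, int32 KnownCount)
{
	Out.Reserve(KnownCount);        // one allocation, sized exactly
	for (int32 Index = 0; Index < KnownCount; ++Index)
	{
		Out.Add(Index);             // no reallocations, no leftover slack
	}
}
```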
In this scenario, elements may refer to shapes (but as proxies, i.e. the bare minimum of information needed for the AABB), and the arrays are the objects/bodies. I am going to double-check this though, just to be sure.
Best
Geoff
Hi Eugene, just as well I checked...
Elements are the actual bodies in this case, and the arrays are the leaves of the tree (we tend to have ~8 bodies per leaf)
Geoff
Another suggestion would be to trim the memory of the static trees once they are generated. If you call Leaves.Shrink() and Nodes.Shrink() at the end of GenerateTree, you will save some memory. For Leaves, though, you will need to fix up the template like so in AABB.h:
[Image Removed]
This is something which is now on our radar, so it should hopefully be improved in a later release!
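Since that screenshot is no longer available, here is a rough sketch of the kind of change meant above; the cut-down struct and member names are assumptions on my part, not the actual AABB.h code:

```cpp
#include "CoreMinimal.h"

// Simplified stand-in for the leaf type; only the idea matters here.
template <typename TElement>
struct TAABBTreeLeafArray
{
	TArray<TElement> Elems; // per-leaf payload storage (name assumed)

	// Hypothetical addition: let the tree trim per-leaf slack after build.
	void Shrink()
	{
		Elems.Shrink();
	}
};

// Then, at the (assumed) end of GenerateTree:
//   for (auto& Leaf : Leaves) { Leaf.Shrink(); }
//   Leaves.Shrink();
//   Nodes.Shrink();
```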
Geoff
We are running into this same situation: the spatial acceleration structure is taking a lot of memory. The way we build our maps might contribute to this; we have large maps built from small meshes, so lots and lots of instances get pushed into the acceleration structure.
Could you give some tips on where we could leverage reserve() to get some of this back?
Also, the way we partition our maps and move through them means that we stream cells in and out very rarely, so maybe we could regain memory via calls to shrink/trim on some containers. Do you have suggestions on which containers to try this on?
You can definitely try eliminating reservations and checking how much RAM you can get from that.
To do that, add a custom allocator to `TAABBTreeLeafArray::Elems` and update `TAABBTreeLeafArray::Reset()` to use `Elems.Empty();` instead of `Elems.Reset();`.
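A rough sketch of what that could look like (the inline allocator size and the cut-down struct are illustrative assumptions, not the actual engine diff):

```cpp
#include "CoreMinimal.h"

// Cut-down stand-in for TAABBTreeLeafArray showing just the two changes.
template <typename TElement>
struct TAABBTreeLeafArray
{
	// Custom allocator: e.g. an inline allocator sized to the typical leaf
	// occupancy (~8 bodies per leaf, per the discussion above), so small
	// leaves never touch the heap and larger ones allocate only on demand.
	TArray<TElement, TInlineAllocator<8>> Elems;

	void Reset()
	{
		// Empty() releases any heap allocation outright, whereas Reset()
		// would keep the existing capacity around as slack.
		Elems.Empty();
	}
};
```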