My game leaks ~8MB of memory on every level transition. This is especially bad because I host it on a small Linux VPS.
Some properties of the leak I know of:
- It occurs on Linux and WindowsServer
- I’ve seen the issue since 4.9, am on 4.12.2 now
- It is specific to dedicated server. While I see smaller “leaks” on standalone configurations, that could just be other state that goes along with long running UE processes, and is nowhere near 8MB
- It is independent of user interaction. I have resorted to automatically traveling between levels without any user connected, and confirmed that the leak still occurs
- I use the DebugGame build flavor. I doubt this is relevant but I have noticed it is an uncommon configuration
- The method I’m using to detect the leak is the high-level working set / commit size of the process (what you see in Task Manager on Windows or top on Linux, and “Current Memory” in memreport)
- I only do one manual memory allocation outside of UE4, which adds up to two 128x128x32 byte grids. I’ve verified they aren’t leaking, and the math wouldn’t check out anyway.
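As a sanity check on that last point, the arithmetic can be sketched as follows. This assumes each grid is 128x128 cells at 32 bytes per cell, which is one reading of the sizes above, and the constant names are made up for illustration.

```python
# Assumption: each grid is 128 x 128 cells, 32 bytes per cell.
CELLS_PER_SIDE = 128
BYTES_PER_CELL = 32
GRID_COUNT = 2

# Observed growth per level transition, taken here as 8.6 MiB.
LEAK_PER_TRANSITION_BYTES = 8.6 * 1024 * 1024

grid_bytes = CELLS_PER_SIDE * CELLS_PER_SIDE * BYTES_PER_CELL
total_bytes = GRID_COUNT * grid_bytes  # 1,048,576 bytes (1 MiB)

# Even if both grids leaked on every transition, they would only
# account for a small share of the observed growth.
share = total_bytes / LEAK_PER_TRANSITION_BYTES

print(f"both grids: {total_bytes} bytes ({total_bytes / (1024 * 1024):.1f} MiB)")
print(f"share of per-transition growth: {share:.1%}")
```

So under that reading the grids total about 1 MiB, well short of the ~8MB that appears per transition.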
As I mention above, I have tried to use memreport -full to debug the issue, but it hasn’t been very fruitful. Do memreports only cover the current world, or the entire process? Comparing memreports, the only meaningful deltas are:
- Current Memory grows by 8.6MB each level
- “StaticMesh Total Memory - STAT_StaticMeshTotalMemory - STATGROUP_Memory - STATCAT_Advanced” is listed with a negative value that reliably decreases by 15280.
- For the first-to-second level transition only, memory “allocated in pools” grows by 320KB
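To cross-check the “Current Memory” growth from outside the engine on Linux, something like the sketch below could be used. It reads the `VmRSS` line from `/proc/<pid>/status`, which roughly corresponds to the RES column in top. The function names are hypothetical, and the sampling point (once per level, at the same moment the memreport is taken) is an assumption about how you would use it.

```python
import re
from typing import List, Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Extract VmRSS (resident set size, in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def per_transition_deltas(samples_kb: List[int]) -> List[int]:
    """Difference between consecutive samples, one sample per level."""
    return [later - earlier for earlier, later in zip(samples_kb, samples_kb[1:])]
```

Calling `read_rss_kb(server_pid)` once per level and feeding the collected values to `per_transition_deltas` gives one number per transition. A steady positive delta that never returns to baseline is what I would expect a real leak to look like, as opposed to memory that is freed a level or two later.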
I attached four samples of memreport -full, each run ~5 seconds into a level while there is no activity; the travel to the next level occurs ~5 seconds after that. I froze the seed of my procedural generation so it’s the same level over and over. The most promising lead seems to be that something is wrong with StaticMeshes. What do the negative values mean?
I’ve also tried the instructions for debugging leaks found at https://www.unrealengine.com/blog/dealing-with-memory-leaks-in-ue4, but they don’t appear to be well suited to leaks that span level transitions. With profiling enabled, loading levels takes a very long time (I gave up after 5 minutes). Even if it worked, I’d have a tough time teasing out what is a leak versus what was legitimately loaded for a level and not freed yet, as I’m not sure which part of the game’s lifecycle is leaking.
At this point I’m considering other manual techniques like attaching AppVerifier and manually debugging heaps, but that’s a really big and expensive hammer when it’s likely that something relatively simple is happening, e.g., the entire old World sticking around forever. Is anyone aware of a technique or UE tool I can use to debug this? It’s preventing me from releasing an alpha for people to try, because my servers have a habit of dying in the middle of sessions after running out of memory.
Update 19-Jun:
By A/B testing (removing “stuff” and seeing what happened), it looks like the leak has to do with the Map itself. I constructed a totally new map that looks pretty much the same and started using it. After doing this, the leak dropped from 8.6MB to around 600KB per transition, which is way better. Interestingly, the leaks are on the order of the maps’ on-disk sizes: the original map is 10MB, the new map is 677KB. Does anyone know of any gotchas with server travel and Maps that can keep the map you traveled from in memory forever?
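To put numbers on that correlation, here is a quick sketch of the leak-to-map-size ratio for the two maps, using only the figures above. The labels are just for illustration, and two data points obviously don’t prove anything.

```python
# (label, map size on disk, leak per transition), both in the same unit per row.
observations = [
    ("original map", 10.0, 8.6),    # MB
    ("new map", 677.0, 600.0),      # KB
]

# Ratio of leaked memory per transition to the map's size on disk.
ratios = {label: leak / map_size for label, map_size, leak in observations}

for label, ratio in ratios.items():
    print(f"{label}: leak is {ratio:.0%} of the map's on-disk size")
```

Both ratios land a little under 90%, which is at least consistent with most of the previous map staying resident after travel.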