What's the intended way to track memory leaks?

I’m trying to get a good automated test up and running to catch memory leaks. I’m using MallocLeakDetection and running into a few issues. I’m wondering what the intended way to catch leaks is, or what the solutions to our issues are.

The way we’re trying to catch leaks, roughly, is that we load a level FOO (load #1), load it again (load #2), load it a third time (load #3), and then print out any leftover memory allocations from load #2. We don’t print allocations from load #1 because we don’t care about things done on initial load, and we can’t print allocations from load #3 because, of course, they’re still allocated. (I did have to modify FMallocLeakDetection::FCallstackTrack operator== to also compare frame numbers so I can track which load they’re from.)
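To make the three-load idea concrete, here is a minimal standalone sketch. It does not use FMallocLeakDetection at all: `LoadPhaseTracker`, `TrackedAlloc`, and the single-string "callstack" are hypothetical stand-ins, and the only assumption is that each live allocation can be tagged with the load that was active when it was made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record for one live allocation (not an engine type).
struct TrackedAlloc {
    std::string Callstack; // simplified: one string instead of a real callstack
    int LoadIndex;         // which load (1, 2, 3, ...) was active at allocation time
    std::size_t Size;
};

// Hypothetical tracker that tags every allocation with the current load.
class LoadPhaseTracker {
public:
    void BeginLoad(int LoadIndex) { CurrentLoad = LoadIndex; }

    void OnAlloc(std::uintptr_t Ptr, std::string Callstack, std::size_t Size) {
        assert(Ptr != 0); // null pointers are never tracked
        Live[Ptr] = TrackedAlloc{std::move(Callstack), CurrentLoad, Size};
    }

    void OnFree(std::uintptr_t Ptr) { Live.erase(Ptr); }

    // Allocations made during the given load that are still live now.
    std::vector<TrackedAlloc> LiveFromLoad(int LoadIndex) const {
        std::vector<TrackedAlloc> Result;
        for (const auto& Entry : Live) {
            if (Entry.second.LoadIndex == LoadIndex) {
                Result.push_back(Entry.second);
            }
        }
        return Result;
    }

private:
    std::unordered_map<std::uintptr_t, TrackedAlloc> Live;
    int CurrentLoad = 0;
};
```

Calling `LiveFromLoad(2)` after load #3 has finished returns only allocations that were made during load #2 and never freed, which is the set we treat as leak candidates.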

When we did that we of course got some spurious “leaks”. For instance, an array in GObjectHash could resize during load #2, and although the object was later destroyed, the memory showed allocated during that time. We addressed that by removing all GObjectHash allocations from the MallocLeakDetection system right before dumping the callstacks.
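The filtering step can be sketched roughly like this. It is a standalone illustration, not what we actually run inside the engine: `FilterIgnoredCallstacks` is a hypothetical helper, and callstacks are assumed to be available as plain strings containing symbol names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: drop every callstack that mentions an ignored symbol.
std::vector<std::string> FilterIgnoredCallstacks(
    const std::vector<std::string>& Callstacks,
    const std::vector<std::string>& IgnoredSymbols) {
    std::vector<std::string> Kept;
    for (const std::string& Stack : Callstacks) {
        const bool bIgnored = std::any_of(
            IgnoredSymbols.begin(), IgnoredSymbols.end(),
            [&Stack](const std::string& Symbol) {
                // An empty symbol would match every callstack, so reject it.
                assert(!Symbol.empty());
                return Stack.find(Symbol) != std::string::npos;
            });
        if (!bIgnored) {
            Kept.push_back(Stack);
        }
    }
    return Kept;
}
```

Passing an ignore list containing "GObjectHash" removes the hash-resize callstacks and leaves everything else for manual inspection.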

That still left us with a few apparently spurious leaks.

In FShader::RegisterSerializedResource we load the shaders for the level, unless they’re already loaded. We’d expect that either loading FOO 3 times in a row would only load the shaders the first time (and catch that they were used in the new level before unloading them), or would load the shaders all 3 times (unloading them for each level before loading the next). In practice, it seems to be arbitrary depending on timing, so sometimes shaders are loaded in load #2 and kept around through load #3, thus showing as a “leak” although they are not.

Also, depending on when the scene renderer is shut down, we sometimes have FD3D11UniformBuffer allocations left over from load #2 (even if we wait until load #4 to print them out).

This doesn’t make the leak detection useless to us - I can still manually look for anything else in the callstacks, but it would be nice if we could get a “pristine” setup so that we could treat any apparent leak as an error (and so I wouldn’t even have to track callstacks to spot them initially - which definitely slows down leak detection).

So…

  1. What’s the intended way to track down leaks?
  2. Is there a way to get the FShaders to either stay loaded long enough for the new level to see which ones it also needs or to go away reliably early and just reload them for the new level?
  3. Is there a good way to sync up the render thread such that I can either reliably keep the FD3D11UniformBuffer allocations around between levels or reliably free them?

Hi,

Sorry for the delay in responding. MallocLeakDetection is a good way of looking for memory leaks in the engine. Another option would be to use the malloc profiler to get a mprof capture of your game over a few runs, with snapshots in appropriate places to mark your reference points. (By the way, let me know if you haven’t come across this system before and I’ll give you some more instructions). The MemoryAnalyser tool then allows you to filter the views to specific snapshot ranges which will give you more or less the same info as the leak detection system, but more functionality to interrogate the current allocation states and find more information.

A lot of systems within the engine allocate global resources on their first use, so you do tend to see them popping up in callstacks which are a little misleading. In an ideal world we would force a lot of these things to be explicitly allocated much earlier on so that it was easier to understand the logs, but for now you’ll just have to ignore them.

I’m including one of our rendering guys here to answer your questions about the shader resource allocation timelines.

Cheers,

Graeme

I actually haven’t looked at the malloc profiler - I’ll investigate that as well - thanks for that lead.

Hi Alex,

1. As you and Graeme have found, we have a few tools for finding leaks with various tradeoffs. MallocLeakDetection is an easy-to-use way to find leaks, but it loses a lot of data over time because it accumulates all the callstacks together. The frame number stored is just the earliest allocation seen. Unless ALL allocations from a given callstack have been freed, the frame number will not change. See: FMallocLeakDetection::AddCallstack(). So you should probably take that frame number with a grain of salt.

Malloc Profiler is another tool, though it is rather cumbersome. You get data over time, but it generates huge amounts of data that can be hard to work with.

You can also use the ‘obj’ command to mark and dump marked objects for finding leaked UObjects at a high level, which is often what you really want to do.

2. If you are doing synchronous loads via LoadMap() I would really expect this sequence of commands to clear everything out reliably.
// Clean up the previous level out of memory.
CollectGarbage( GARBAGE_COLLECTION_KEEPFLAGS, true );

// For platforms which manage GPU memory directly we must Enqueue a flush, and wait for it to be processed
// so that any pending frees that depend on the GPU will be processed.  Otherwise a whole map's worth of GPU memory
// may be unavailable to load the next one.
ENQUEUE_UNIQUE_RENDER_COMMAND(FlushCommand, 
	{
		GRHICommandList.GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThreadFlushResources);
		RHIFlushResources();
		GRHICommandList.GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThreadFlushResources);
	}
);
FlushRenderingCommands();	  

But if you’re doing seamless travel then there’s definitely going to be some timing variance involved as FShaders will share shader bytecode where they can.

3. FlushRenderingCommands() should sync up the rendering thread, but there’s a lot of buffering going on with deletes which the code I referenced above should take care of as well.

Per your point #1
I previously modified MallocLeakDetection to treat allocations on different frame numbers as having non-matching callstacks (even if everything else matches) specifically so that we can trust the frame #, for exactly the reasons you describe.
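A toy model of the difference, for anyone following along. This is not the engine’s AddCallstack() code: `CallstackTable`, `AggregatedEntry`, and the two key types are hypothetical, and the only assumption is that entries are merged per key and remember the frame of the first allocation seen.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical aggregated record for all live allocations sharing one key.
struct AggregatedEntry {
    int LiveCount = 0;
    std::uint64_t FirstFrame = 0;
};

// Hypothetical table: entries are merged per key, keeping the earliest frame.
template <typename Key>
class CallstackTable {
public:
    void OnAlloc(const Key& K, std::uint64_t Frame) {
        AggregatedEntry& Entry = Entries[K];
        if (Entry.LiveCount == 0) {
            Entry.FirstFrame = Frame; // only the first live allocation sets the frame
        }
        ++Entry.LiveCount;
    }

    void OnFree(const Key& K) {
        auto It = Entries.find(K);
        assert(It != Entries.end()); // a free must match an earlier allocation
        if (--It->second.LiveCount == 0) {
            Entries.erase(It);
        }
    }

    const std::map<Key, AggregatedEntry>& GetEntries() const { return Entries; }

private:
    std::map<Key, AggregatedEntry> Entries;
};

// Keyed by callstack hash only: all frames collapse into one entry.
using CallstackOnlyKey = std::uint32_t;
// Keyed by (callstack hash, frame): each frame gets its own entry.
using CallstackAndFrameKey = std::pair<std::uint32_t, std::uint64_t>;
```

With `CallstackOnlyKey`, an allocation at frame 10 and another at frame 200 from the same callstack share one entry that reports frame 10. With `CallstackAndFrameKey` they stay separate, at the cost of many more entries.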

Per point #2
I was doing the garbage collection already (using RF_NoFlags, but also not in the editor, so it should be the same), but I’ll try the EnqueueRenderCommand/FlushRenderingCommands. We’ll see how that goes, thanks.

It does still seem like FShaders should wait until the new level is loaded before dumping themselves - feels like they sometimes reload when they don’t need to (independent of this memory tracking).

It looks like our SetClientTravel ultimately does call LoadMap, which already calls the garbage collection code you pointed me to - but it isn’t sufficient?

If you changed the frame number accumulation I bet you get a LOT more data. Surprised you don’t run out of memory honestly but it’ll work. :)

FShaders are just a ref-counted rendering resource like anything else. When the old level is unloaded their ref count could drop to 0 and they start the releasing process.
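As a rough illustration of why the order matters, here is a standalone ref-counting sketch. It is not FShader code: `ShaderCache`, `CountLoads`, and the shader names are hypothetical, and it assumes a resource is loaded whenever its ref count goes from 0 to 1.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical ref-counted cache; a "load" happens on the 0 -> 1 transition.
class ShaderCache {
public:
    void AddRef(const std::string& Name) {
        int& Count = RefCounts[Name];
        if (Count == 0) {
            ++LoadCount;
        }
        ++Count;
    }

    void Release(const std::string& Name) {
        auto It = RefCounts.find(Name);
        assert(It != RefCounts.end()); // releasing an unknown shader is a bug
        if (--It->second == 0) {
            RefCounts.erase(It); // last reference gone: resource is dropped
        }
    }

    int GetLoadCount() const { return LoadCount; }

private:
    std::map<std::string, int> RefCounts;
    int LoadCount = 0;
};

// Loads the same level twice and returns how many shader loads occurred.
int CountLoads(bool bReleaseOldLevelFirst) {
    const std::vector<std::string> LevelShaders = {"BasePass", "Shadow"};
    ShaderCache Cache;
    for (const std::string& Name : LevelShaders) Cache.AddRef(Name); // first load

    if (bReleaseOldLevelFirst) {
        for (const std::string& Name : LevelShaders) Cache.Release(Name);
        for (const std::string& Name : LevelShaders) Cache.AddRef(Name);
    } else {
        for (const std::string& Name : LevelShaders) Cache.AddRef(Name);
        for (const std::string& Name : LevelShaders) Cache.Release(Name);
    }
    return Cache.GetLoadCount();
}
```

In this model `CountLoads(true)` returns 4 (every shader is reloaded) and `CountLoads(false)` returns 2 (the shaders are shared across the transition), so which behavior you see depends only on whether the old references are released before or after the new ones are taken.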

It does look like given the way the FPendingCleanupObjects is working it may actually require two calls to FlushRenderingCommands to pump it sufficiently. If that doesn’t do the trick we can have someone try to repro and see what’s up but it would probably be a while.
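Here is a toy model of why a single flush might not be enough. It is only a sketch under a stated assumption, namely that a flush snapshots the pending cleanup list before it runs the queued commands; `ToyRenderer` and its members are hypothetical and are not the engine’s FPendingCleanupObjects implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a command queue with deferred resource cleanup.
class ToyRenderer {
public:
    void EnqueueCommand(std::function<void()> Cmd) {
        assert(static_cast<bool>(Cmd)); // an empty command cannot be executed
        Commands.push_back(std::move(Cmd));
    }

    // Called from commands: the resource is queued, not freed immediately.
    void DeferCleanup(int ResourceId) { PendingCleanup.push_back(ResourceId); }

    void Flush() {
        // Assumption: cleanup objects are snapshotted BEFORE commands run.
        std::vector<int> Snapshot;
        Snapshot.swap(PendingCleanup);

        std::vector<std::function<void()>> ToRun;
        ToRun.swap(Commands);
        for (auto& Cmd : ToRun) {
            Cmd(); // may call DeferCleanup, which lands in the next snapshot
        }

        FreedCount += Snapshot.size(); // only the snapshot is freed this flush
    }

    std::size_t GetFreedCount() const { return FreedCount; }
    std::size_t GetPendingCount() const { return PendingCleanup.size(); }

private:
    std::vector<std::function<void()>> Commands;
    std::vector<int> PendingCleanup;
    std::size_t FreedCount = 0;
};
```

If a queued command defers a cleanup, the first `Flush()` runs the command but frees nothing, and the resource only goes away on the second `Flush()`.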

I’ll try the double cleanup. Maybe that will consistently clean up shaders as well.
I’m ok with dumping shader references between levels (keeping them is nice but also requires higher peak memory). It’s the timing-dependent inconsistency that was bothering me, since it gives us the drawbacks of both worlds (same peak memory, but also forced reloads). If the double cleanup forces it one way that would help. I’ll let you know whether that does it.

Thanks.

Hi Alex,

We are marking this report as Resolved for tracking purposes. If you would like to continue investigating the issue, just post a comment to reopen the report.

Thanks,

TJ