How to measure memory allocations within a block of code

I am writing an automation test, and I want to measure the number of memory allocations between two points for benchmarking. For example:

    {
        LLM_SCOPE_BYTAG(RedpointAsync_ValueTask);
        for (int k = 0; k < MacroIter; k++)
        {
            uint64 ValueDuration = 0;
            {
                const uint64 StartTime = FPlatformTime::Cycles64();
                for (int32 i = 0; i < MicroIter; i++)
                {
                    co_await ReturnValue<T>(GetValue);
                }
                ValueDuration = FPlatformTime::Cycles64() - StartTime;
            }
            ValueTotalDuration += ValueDuration;
        }
    }

However, there doesn’t appear to be a way to read “total allocations” from the low-level memory API at runtime. Unreal Insights does show me allocations over time, but not in any sort of useful aggregate way for the short lifetime of this block of code.

GetTagsNamesWithAmount seems like it could get the currently allocated size, but since this benchmark is expected to release any memory it allocates, that doesn’t give me what I need - I want to know how many heap allocations happened.

There also used to be a working API by calling:

    UE::Private::GMalloc->GetAllocatorStats(MemoryStats);

and reading the “Malloc calls” stat, but that doesn’t seem to be working these days. It looks like LocalTotalMallocCalls used to be initialized to FMalloc::TotalMallocCalls.Load(EMemoryOrder::Relaxed) way back in 4.23, but that this code was broken in 4.24 and never fixed (LocalTotalMallocCalls and related variables are now always initialised to 0). FMalloc::TotalMallocCalls doesn’t exist any more, but it is effectively what I want for this benchmark.

What is the correct way to benchmark heap allocations between two blocks of code these days?

Hi,

I would add a TRACE_CPUPROFILER_EVENT_SCOPE or TRACE_BEGIN_REGION/TRACE_END_REGION and keep your code as-is. The goal is to get a start/end time on the Unreal Insights timeline. Then, once you have the time range of your scope, you can query the allocations during that scope in Unreal Insights, filter by LLM tag, and you should get the allocations (and the deallocations if you need them).

    {
        TRACE_CPUPROFILER_EVENT_SCOPE(MeasureMemoryUsage);
        LLM_SCOPE_BYTAG(RedpointAsync_ValueTask);
        for (int k = 0; k < MacroIter; k++)
        {
            uint64 ValueDuration = 0;
            {
                const uint64 StartTime = FPlatformTime::Cycles64();
                for (int32 i = 0; i < MicroIter; i++)
                {
                    co_await ReturnValue<T>(GetValue);
                }
                ValueDuration = FPlatformTime::Cycles64() - StartTime;
            }
            ValueTotalDuration += ValueDuration;
        }
    }

It could look like this:

In the allocation view, you can add the StartTime/EndTime/Duration columns, so you will also be able to see whether an allocation was deallocated during the scope. If you have overlapping scopes, i.e. the code you are monitoring can run concurrently, you can add the allocThread/freeThread columns to see which thread does the work. I think you should be able to see the allocations within the scope with that. Let me know if that doesn’t work.

Regards,

Patrick

Just chiming in here with a quick note on allocation tracking in Insights:

Recent allocations are submitted to Insights in batches, which only happens every few milliseconds or so.

That means you won’t have individual timestamps on the allocations in a batch, which can make it very tricky to filter for them properly if your scope of interest is very short:

You will have individual call stacks that you can filter for the function you’re interested in, but the timestamps will likely be misleading in such cases.

Cheers,

Sebastian

Hi,

Sebastian is right, I forgot about this detail of the memory event time granularity; finding your memory events might be a bit tricky if you have a small scope or events at the edge of the scope. So you are better off selecting a larger time range in Insights (see my screenshot). For the runtime side, I attached a modified version of LowLevelMemoryTracker.h/.cpp with a test case in LaunchEngineLoop.cpp. You can possibly reuse that for your case. See the attached zip.

Here are the logs output by the test case. They show the allocation count/total allocated/total freed within the scope. I think this is what you wanted. I added a reset function, and I reset the stats every 10 loops.

In Insights, it would look like this. This shows my region, my time range query (larger than the scope region), and my filter to only get what I wanted.

In my test code, I also added counters with TRACE_MEMORY_VALUE, and if you run the game with the ‘counter’ channel enabled, e.g. ‘-trace=default,frame,cpu,bookmarks,memory,counter -tracehost=127.0.0.1’, you will also see your counters in Insights:

I would suggest trying this. I only made a very simple test, so you should verify with a real-life case that the runtime tracking is working by checking in Insights that everything matches; then you can add assertions, errors, or something similar to trigger if you get a regression within your scope.

Regards,

Patrick

CL_51886251.zip(123 KB)

Ah, I was hoping to get this information at runtime, so that the automation test could assert the number of allocations. The intent is to ensure that we don’t regress in the number of allocations that this code is doing in the future (or at least not without us being aware of it).

We temporarily patched the engine locally to record the number of total allocations so we could at least see what is happening, but ideally we’d want a solution that doesn’t involve interactively using Unreal Insights.