Long-term allocator strategy

This is not a problem, per se. I am just soliciting advice about how to bucket allocations that are known to have usefully distinct lifetimes. It’s a memory management technique we’ve been carrying forward through many games now. When we moved to UE4, it was a bad fit, but we kludged something together. Now, under UE5, that kludge is costing more and more to maintain, and we’re seeking a new approach.

The basic idea is you have allocations that are deemed “short term” -- things that are likely to go away pretty quickly, and those that are “long term” -- likely to hang around a long time, potentially the entire lifetime of the game. For example: Render commands come and go within a frame, while FName blocks or Config data stick around forever. By sending those to different memory pages, we increase the likelihood of being able to hand some of those short term pages back to the OS, reducing memory fragmentation.

Before virtual memory became the norm, we used to spatialize the long-term allocations, pushing them to top-down memory pages. The allocations were physically separate, making a clear “largest contiguous block” visible in our memory mapping tools. With UE4 and a new hardware generation, virtual memory became the pattern for everything, and that spatialization wasn’t worth it as much, but keeping the different lifetimes on different pages was still worthwhile.

Under UE4, with GMalloc, we got a little stuck. We couldn’t see a good way to route these lifetime decisions through GMalloc, so we made a freestanding little allocator, a dumb clone of FMallocBinned2, and sent requests through it by means of custom allocator types on TArray, TMap and such (TLongTermArray, TLongTermMap, etc.). When we saw allocations sticking around between matches, we changed their types to the long-term types. It was a beast to implement -- this was before Unreal had allocator templates on those container types -- but it did work.

Right now the cost centers for maintaining this are the high number of engine edits needed to support the custom types, which makes taking new engine versions harder, and the dumb clone of FMallocBinned2, which has changed shape a _lot_ between UE4 and UE5.

Has anyone else tried a similar thing with a better approach? Or is the idea just a non-starter?

Hello Alexander,

Allocation policies based on anticipated lifetime are not something we have leveraged. We researched introducing an arena-like heap scheme, which could have allowed for more surgical management, but concluded that the complexity and overhead placed on licensees were too great.

I have heard of some success injecting into the `GMalloc` chain (via `UnrealMemory.cpp`). It is a minimal engine divergence to maintain, and something like thread-local scopes can then be used to add the context needed to distribute allocations between heaps. Prefixing a header to each allocation, or making use of process address space, can be used to recover that distribution later. By dropping down a layer, as it were, one can avoid container template parameters which, as you’ve found, can quickly cascade.

-Ridgers.

You’re talking about, like, making a class that pokes a scoped thread-local variable to say “this is a long-term allocation”? Raising that in the container allocator class to signal a divergence? And making an FMalloc descendant class that knows to look for the variable and choose a sub-heap allocator based on it?

I think I’m with you that far. I didn’t follow the part about how to route the realloc/free calls.

Yes, along those sorts of lines. I was thinking that you may be able to do something like this to avoid the need to diverge/modify container templates:

// NeatherRealm/MemoryNr.cpp;
    static thread_local int32 LongTerm = 0;
 
    struct FLongTermScope {
        FLongTermScope() { LongTerm++; }
        ~FLongTermScope() { LongTerm--; }
    };
 
    // Route to the long-term heap whenever a FLongTermScope is live on this thread.
    void* FMemoryNr::Malloc(SIZE_T Size, uint32 Alignment) {
        if (LongTerm > 0)
            return LongTermHeap->Malloc(Size, Alignment);
        return PrevHeap->Malloc(Size, Alignment);
    }
 
// UE/UnrealMemory.cpp;
    GMalloc = new FMemoryNr(GMalloc);

You’d then be able to do the following:

void FGame::EntryPoint() {
  FLongTermScope Scope; // no "()": that would declare a function instead of constructing the scope
  GameArray.Reserve(1024);
  UnmodifiedUEClass.ThisCalleeAlsoGetsLongTermAllocs();
}

It is admittedly a low-fidelity solution, so it may not cover all the cases you envisage, but it has the benefit of keeping engine divergence to a minimum.

The tricky part is when it comes to Free() calls (and, by extension, Realloc() too). One probably can’t rely on scopes there, because the scope that was active at allocation time is usually long gone by the time the pointer is freed. Thus the hurdle to solve is: given a pointer, how do I know if it is a long-term or short-term allocation? There are many ways to solve that: over-allocate and store a small header that can be used to regain context, or leverage address space and intersect the pointer against known ranges to recover the heap that owns it, etc. It rather depends on your context and target platforms really.
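
As an illustration of the header option only, here is a minimal standalone sketch in plain C++ rather than engine code. Everything in it is hypothetical (`TaggedMalloc`, `TaggedRealloc`, `TaggedFree`, `FAllocHeader`, `EHeapTag`), `std::malloc` stands in for both sub-heaps, and two counters stand in for per-heap bookkeeping. It assumes default alignment only; over-aligned requests would need more care.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical tags; a real FMalloc wrapper would map these to sub-heaps.
enum class EHeapTag : std::uint8_t { ShortTerm = 0, LongTerm = 1 };

// Per-thread nesting counter, mirroring the LongTerm counter above.
static thread_local int GLongTermDepth = 0;

struct FLongTermScopeSketch {
    FLongTermScopeSketch()  { ++GLongTermDepth; }
    ~FLongTermScopeSketch() { --GLongTermDepth; }
};

// Padded to max_align_t so the pointer handed to the caller stays aligned.
// That padding is the per-allocation cost of this approach.
struct alignas(std::max_align_t) FAllocHeader {
    EHeapTag Tag;
};

// Stand-ins for per-heap bookkeeping, so the routing is observable.
static int GLiveShortTerm = 0;
static int GLiveLongTerm = 0;

static int& LiveCounter(EHeapTag Tag) {
    return Tag == EHeapTag::LongTerm ? GLiveLongTerm : GLiveShortTerm;
}

void* TaggedMalloc(std::size_t Size) {
    // The scope is consulted only here, at allocation time.
    const EHeapTag Tag = GLongTermDepth > 0 ? EHeapTag::LongTerm : EHeapTag::ShortTerm;
    void* Raw = std::malloc(sizeof(FAllocHeader) + Size);
    if (!Raw) return nullptr;
    FAllocHeader* Header = new (Raw) FAllocHeader{Tag};
    ++LiveCounter(Tag);
    return Header + 1;
}

// Recover the owning heap from the pointer alone; no scope is involved.
EHeapTag OwningHeap(const void* Ptr) {
    assert(Ptr != nullptr);
    return (static_cast<const FAllocHeader*>(Ptr) - 1)->Tag;
}

void* TaggedRealloc(void* Ptr, std::size_t NewSize) {
    if (!Ptr) return TaggedMalloc(NewSize);
    // The block stays with the heap named in its header, whatever scope is
    // active now. A real wrapper would call that heap's Realloc here.
    FAllocHeader* Header = static_cast<FAllocHeader*>(Ptr) - 1;
    void* Raw = std::realloc(Header, sizeof(FAllocHeader) + NewSize);
    return Raw ? static_cast<FAllocHeader*>(Raw) + 1 : nullptr;
}

void TaggedFree(void* Ptr) {
    if (!Ptr) return;
    FAllocHeader* Header = static_cast<FAllocHeader*>(Ptr) - 1;
    --LiveCounter(Header->Tag);
    std::free(Header);
}
```

Note that `sizeof(FAllocHeader)` is a full alignment unit even though only one byte of it is used, which is the overhead to weigh against the address-space option.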

-Ridgers.

Yeah, ok, I had a more accurate picture of it than I thought. I don’t love that approach. The trouble on that back side is if my allocator has to make its own markup headers, rather than piggybacking on ones already being made by the underlying allocator, we’d be doubling up on alignment waste. The intersect thing is an interesting idea, but it seems like it would only work if I had a contiguous address range, which I’m sure I can’t guarantee.

My reply was descriptive rather than prescriptive - a couple of off-the-cuff ideas among many as food for thought (I’d probably discount the header approach too). As for intersections, contiguity isn’t a must - it is analogous to spatial partitioning in 2D, where a point is tested against a set of separate regions rather than one large one. The real answer depends on your constraints, requirements, and any leverageable guarantees, and perhaps on your target OSes and their mmap-style APIs. Redirecting GMalloc may prove a useful place to investigate.
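
To make the non-contiguous case concrete, here is a rough sketch of an ownership lookup over several disjoint address ranges, using an ordered map keyed by each region's base address. The names (`FRegionRegistry`, `AddRegion`, `FindOwner`, `EHeapId`) are made up for illustration, regions are assumed not to overlap, and there is no locking, which a real allocator would need (or a lock-free structure instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical heap identifiers for the sketch.
enum class EHeapId { Unknown, ShortTerm, LongTerm };

class FRegionRegistry {
    struct FRegion {
        std::uintptr_t End;   // one past the last byte of the region
        EHeapId        Heap;  // which heap reserved this region
    };

    // Keyed by base address, so regions are kept sorted by where they start.
    std::map<std::uintptr_t, FRegion> Regions;

public:
    // Record that [Base, Base + Size) belongs to Heap. Regions must not overlap.
    void AddRegion(std::uintptr_t Base, std::size_t Size, EHeapId Heap) {
        Regions[Base] = FRegion{Base + Size, Heap};
    }

    // Forget a region, e.g. when its pages are returned to the OS.
    void RemoveRegion(std::uintptr_t Base) {
        Regions.erase(Base);
    }

    // O(log n) lookup: find the last region starting at or below Address,
    // then check that Address falls before that region's end.
    EHeapId FindOwner(std::uintptr_t Address) const {
        auto It = Regions.upper_bound(Address);  // first region with Base > Address
        if (It == Regions.begin()) return EHeapId::Unknown;
        --It;                                    // last region with Base <= Address
        return Address < It->second.End ? It->second.Heap : EHeapId::Unknown;
    }
};
```

A Free() or Realloc() in the wrapper could then call something like `FindOwner(reinterpret_cast<std::uintptr_t>(Ptr))` and forward to the matching heap, with no per-allocation header. The cost moves to keeping the registry in sync whenever a heap reserves or releases pages.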

-Ridgers.