How to track down memory corruption?

Hi!

I’ve been having a stability problem lately. The game crashes unexpectedly (on various iOS devices) and everything points to memory corruption.

First of all, all crashes are caused by a segmentation fault:

Exception Type:  EXC_CRASH (SIGSEGV)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note:  EXC_CORPSE_NOTIFY

Then I tracked down where it crashes (both via crash logs and via Xcode debugging) and found the two most frequent crash sites:

In Engine\Source\Runtime\CoreUObject\Private\Serialization\AsyncLoading.cpp, around line 133:

for (UObject* Obj : ReferencedObjects)
{
	check(Obj);
	Obj->AtomicallyClearFlags(AsyncFlags);
	check(!Obj->HasAnyFlags(AsyncFlags));
}

The crash happens when executing Obj->AtomicallyClearFlags(AsyncFlags).

And in Engine\Source\Runtime\Engine\Private\SkeletalRenderGPUSkin.cpp, around line 1032:

if( Anim.VertAnim != NULL && 
	AnimAbsWeight >= MinVertexAnimBlendWeight &&
	AnimAbsWeight <= MaxVertexAnimBlendWeight &&
	Anim.VertAnim->HasDataForLOD(LODIndex) ) 
{
	NumWeightedActiveVertexAnims++;
}

The crash happens when executing Anim.VertAnim->HasDataForLOD(LODIndex).

In both cases the UObjects exist but seem to be corrupted. As a workaround I check them with IsValidLowLevelFast(), but that is not a real solution.

I’m currently working on version 4.10.2 of the Engine and I’m aware that there might be fixes for this, but I had these kinds of issues back in 4.6 and 4.7, and now again in 4.10. Looking at the 4.12 source code I can find comments (in AsyncLoading.cpp) like this one:

// Temporary fatal messages instead of checks to find the cause for a one-time crash in shipping config

So I assume the problem still exists.

My question is: can you suggest a tool, a strategy, anything that can help trace memory corruption back to its origin in the source? How do I find the place where the corruption actually happens? What is your approach to dealing with these kinds of crashes?

Thank you in advance for your help.

Regards.

you are checking for null, but that is not enough. just because an object pointer is not null, does not imply that the object is valid. that’s what IsValid checks are for.

Yes, I know this. IsValid() checks whether a shared pointer is valid, but in this case I have a raw pointer to a UObject-derived class. IsValidLowLevel() and IsValidLowLevelFast() are useful for checking whether the UObject is initialized properly or correctly aligned in memory, but this situation like above should never happen. Memory is being overwritten somewhere, and that needs to be tracked down somehow :confused:

Or… the UObject is in the process of being deleted. But I don’t think that would cause a segmentation fault…

“but this situation like above should never happen.” why? if you don’t check if something is valid, and you just check if it is a null pointer, and then you try to dereference it, you can get seg faults, because non-null pointers can still point to garbage data.

so what reason do you have to believe that this situation should never happen?
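To illustrate the point about non-null pointers, here is a minimal sketch in plain C++. It is not Unreal code: `LiveObjectRegistry`, `TrackedObject` and `StalePointerPassesNullCheckOnly` are hypothetical names invented for this example, and the registry only stands in for whatever bookkeeping a real validity check relies on. It shows a pointer that still passes a null check after its object has been destroyed, while a liveness lookup rejects it.

```cpp
#include <cassert>
#include <new>
#include <unordered_set>

// Hypothetical registry of live objects; a stand-in for engine-side
// bookkeeping, not an Unreal API.
class LiveObjectRegistry
{
public:
    void Add(const void* Obj)    { Live.insert(Obj); }
    void Remove(const void* Obj) { Live.erase(Obj); }

    // True only for non-null addresses that currently hold a registered object.
    bool IsLive(const void* Obj) const
    {
        return Obj != nullptr && Live.count(Obj) != 0;
    }

private:
    std::unordered_set<const void*> Live;
};

// Hypothetical tracked type that registers itself for its whole lifetime.
struct TrackedObject
{
    explicit TrackedObject(LiveObjectRegistry& InRegistry)
        : Registry(InRegistry)
    {
        Registry.Add(this);
    }

    ~TrackedObject() { Registry.Remove(this); }

    LiveObjectRegistry& Registry;
};

// Returns true when a stale pointer passes the null check
// but fails the liveness check.
inline bool StalePointerPassesNullCheckOnly()
{
    LiveObjectRegistry Registry;

    // The storage outlives the object, so the address itself stays usable.
    alignas(TrackedObject) unsigned char Storage[sizeof(TrackedObject)];

    TrackedObject* Obj = new (Storage) TrackedObject(Registry);
    const bool bLiveBefore = Registry.IsLive(Obj);

    Obj->~TrackedObject();  // object destroyed; Obj still holds the old address

    const bool bNonNullAfter = (Obj != nullptr);
    const bool bLiveAfter = Registry.IsLive(Obj);

    return bLiveBefore && bNonNullAfter && !bLiveAfter;
}
```

Calling `StalePointerPassesNullCheckOnly()` from a test or from `main` returns true. A check like this only detects stale pointers; it does not say who destroyed or overwrote the object, which is the part that needs a dedicated tool.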

Because if those objects are not valid at this point, that is simply poor memory management inside the Engine. I don’t want to say that the programmers at Epic write poor code, because they did an awesome job, but if they forgot to validate objects here, well… that would be an epic fail. That’s why I think the problem is somewhere else.

it’s your raw pointer causing the crash, why not use smart pointers?

and it’s your job to validate objects; don’t blame your crashes on a system that works fine without your code breaking it.

No, it’s not my code. The crash sites I pasted are in Engine code, in AsyncLoading.cpp and SkeletalRenderGPUSkin.cpp :confused:

If the crash is a consequence of something I did, then the actual problem is somewhere else, and it’s still difficult to find without a proper tool, which is what I’m looking for right now.

I will edit the original question to point out which files I’m talking about.

if it’s not your code causing the crash, why isn’t the engine crashing on my machine?

whose raw pointer is being dereferenced? is it epic’s or yours?

why are you not using a smart pointer, so it plays nicely with garbage collection?

is your pointer a UPROPERTY?

It’s Epic’s code where the raw pointer is dereferenced.

I’m using smart pointers where I can inside the game code. Not so much for UObject pointers, because I can’t make a SharedPtr of a UProperty. Anyway…

Why does the game crash on my machine? There might be many reasons. I could have set up something really bad inside the uassets or inside a UClass. There might be a 3rd party library that corrupts memory. There might be a custom runtime module or plugin that does something bad. Anyway, this looks like a very elusive bug to me. I was just asking what kind of tools the guys at Epic use for tracking down such issues. On Linux there is Valgrind, on the iOS Simulator there is Guard Malloc, but I can’t find anything that suits me well when testing the game on an iOS device.

i don’t know much about tracking down memory errors, but maybe this will help:

i don’t know if there is a one size fits all way to find memory errors, those are usually the most difficult problems to track down.

can you describe more about when it crashes? does it take a while like a memory leak, or does it sometimes happen immediately when you run the game? if you remove your skeletal meshes from the game, does it still crash? if not, maybe some data in your skeletal meshes’ LODs is corrupted.

Cool! I will check it out. Thanks :slight_smile:

OK, this stomp allocator helped a lot. It took a while, but we’ve found a few elusive bugs in the game code which were causing crashes in strange places. Anyway, updating to 4.12 seems to be a necessity; the crash fix list is enormous!

Thanks for the advice; the topic can be considered closed :slight_smile:
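For anyone wondering why this kind of allocator catches such bugs, below is a rough sketch of the general guard-page idea. It is not Unreal’s implementation: `GuardedBlock`, `GuardedAlloc`, `GuardedFree` and `PageSize` are hypothetical names, and the code assumes a POSIX system with `mmap` and `mprotect`. Each allocation is placed so that it ends exactly where an inaccessible page begins, so a write past the end faults at the offending instruction instead of silently corrupting a neighbouring object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical bookkeeping for one guarded allocation (illustration only).
struct GuardedBlock
{
    void*       Base;     // start of the whole mapping
    std::size_t MapSize;  // total mapped bytes, including the guard page
    void*       User;     // pointer handed to the caller
};

inline std::size_t PageSize()
{
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Maps enough pages for Size bytes plus one trailing guard page, then
// positions the user pointer so the allocation ends at the guard page.
// Note: the returned pointer is only as aligned as Size allows.
inline GuardedBlock GuardedAlloc(std::size_t Size)
{
    const GuardedBlock Failed = { nullptr, 0, nullptr };
    if (Size == 0)
    {
        return Failed;
    }

    const std::size_t Page = PageSize();
    const std::size_t DataPages = (Size + Page - 1) / Page;  // round up
    const std::size_t MapSize = (DataPages + 1) * Page;

    void* Base = mmap(nullptr, MapSize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (Base == MAP_FAILED)
    {
        return Failed;
    }

    unsigned char* Guard = static_cast<unsigned char*>(Base) + DataPages * Page;

    // Any access to the guard page now raises a fault immediately.
    if (mprotect(Guard, Page, PROT_NONE) != 0)
    {
        munmap(Base, MapSize);
        return Failed;
    }

    const GuardedBlock Block = { Base, MapSize, Guard - Size };
    return Block;
}

inline void GuardedFree(const GuardedBlock& Block)
{
    if (Block.Base != nullptr)
    {
        munmap(Block.Base, Block.MapSize);
    }
}
```

With `GuardedBlock B = GuardedAlloc(100);`, writing to bytes 0..99 of `B.User` works normally, while touching byte 100 lands in the guard page and crashes right at the faulty access. The price is at least two pages per allocation, which is why this approach is practical for debugging sessions rather than shipping builds.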