USkeletalMeshComponent::InitializeAnimScriptInstance crash on invalid Skeletal Mesh (5.5.4)

I’ve got a crash that I think is universal, but requires perhaps a very specific set of circumstances.

Naively, there’s some very apparent poor null ptr protection.

A shipping build crashes with roughly this at the end of the callstack.

> UnrealEditor-Engine.dll!USkeletalMeshComponent::InitializeAnimScriptInstance(bool bForceReinit, bool bInDeferRootNodeInitialization) Line 1220 C++

UnrealEditor-Engine.dll!USkeletalMeshComponent::SetOverridePostProcessAnimBP(TSubclassOf<UAnimInstance> InPostProcessAnimBlueprint, bool ReinitAnimInstances) Line 4678 C++

UnrealEditor-Engine.dll!USkeletalMeshComponent::execSetOverridePostProcessAnimBP(UObject * Context, FFrame & Stack, void * const Z_Param__Result) Line 5850 C++

UnrealEditor-CoreUObject.dll!UFunction::Invoke(UObject * Obj, FFrame & Stack, void * const Z_Param__Result) Line 7192 C++

UnrealEditor-CoreUObject.dll!UObject::CallFunction(FFrame & Stack, void * const Z_Param__Result, UFunction * Function) Line 1147 C++

There’s a check at 1144 that it hits in a dev build, and 1170 null checks, because apparently that was something someone thought to check for… But then 1220, it hits this crash.

As you can see in the repro steps, there’s a lot of specifics going on here.

I have a 5.6.0 project locally that seems to show the same unsafe code.

Steps to Reproduce
As far as my repro… It’s a multiplayer scenario at the edge of net cull distance.

  1. Have 2 players in a large space with a scenario that applies in a Post Process ABP on event
  2. Difficult - Have one player roughly on the edge of net culling distance for players away from the trigger.
  3. Difficult - Have the other player run out of range and then back in and quickly trigger the change in post process ABP.

This will crash the stationary player if the timing is right.

I used a combination of DisplayAll Character and an object with a button prompt in a test level to help myself find a local repro.

I suspect it might also be related to the default character being adjusted with customization options that we set up to async load in for the remote player based on replicated information.

Note: in Visual Studio, I can see that the player character is valid. It has flags for initialization and begun play and explicitly does not have the flag for destroy in progress.

Note: I have found that I can prevent crashing by removing the check and adding a null check into 1220.

I have proven that my change made a different by adding an ensureAlways inside the condition for SkelMesh != nullptr.

I typically can repro 20% of attempts.

However, I am confident this will lead to an artifact where the post process animation blueprint we requested to apply in will not be applied. For this particular scenario, missing that is low stakes, and probably worth it for this rare scenario.

I would like to know if a different solution would be recommended by epic.

Hi, yeah the suggested fix that you had with the null check on the mesh asset is the right way to go here. We should be dealing with this in the same way that we do in InitAnim since both of those can be called from BP at the moment. Similarly, InitializeAnimScriptInstance needs to handle the possibility that the parallel animation evaluation tasks are running when it’s called (like InitAnim does). The fixed-up function should look something like:

bool USkeletalMeshComponent::InitializeAnimScriptInstance(bool bForceReinit, bool bInDeferRootNodeInitialization)
{
	bool bInitializedMainInstance = false;
	bool bInitializedPostInstance = false;
 
	USkeletalMesh* SkelMesh = GetSkeletalMeshAsset();
	if (SkelMesh && IsRegistered())
	{
		// We may be doing parallel evaluation on the current anim instance
		// Calling this here with true will block this init till that thread completes
		// and it is safe to continue
		const bool bBlockOnTask = true; // wait on evaluation task so it is safe to continue with Init
		const bool bPerformPostAnimEvaluation = true; // That will swap buffer back to ComponentTransform, and finish evaluate. This is required - otherwise, we won't have a buffer.
		HandleExistingParallelEvaluationTask(bBlockOnTask, bPerformPostAnimEvaluation);
 
		if (NeedToSpawnAnimScriptInstance())
		{

Makes sense to me! I have implemented this into my test scenario, and placed an ensure to confirm that I would have crashed without this code, and I’ve hit that ensure a couple times now, and continued past and the game continued to function well.

Related question about line 1220...

if(!NewMeshInstanceClass || NewMeshInstanceClass == *AnimClass || (PostProcessAnimInstance && PostProcessAnimInstance->CurrentSkeleton != GetSkeletalMeshAsset()->GetSkeleton()) || !GetSkeletalMeshAsset()->GetSkeleton())Those GetSkeletalMeshAsset calls. Is there any reason for them to not use the SkelMesh acquired at the beginning of the function? To my knowledge, there’s nothing within spawning the Anim Script Instance that can alter that, but I’m uncertain if there’s an esoteric case I’m unaware of, or it’s just a vestige of when it was just a naive getter returning the pointer directly.

There is a potential for blueprint between because of…

AnimScriptInstance->NativeBeginPlay();
AnimScriptInstance->BlueprintBeginPlay();

So I guess we can’t really make promises… But I feel like that would mess up a lot of stuff.

Yeah, I wouldn’t want to make that change to use the cached ptr for the mesh because it might destabilize some use case in a project. But it would be quite odd to set the mesh asset from BeginPlay on an anim blueprint.

The previous change that I mentioned has been reviewed, I’m going to get it into the 5.7 release.