PhysicalAnimation Crash when EditableComponentSpaceTransforms gets Erroneously Emptied from a Parallel Thread

Hello!

After migrating our project from version 5.5 to 5.6, we’re seeing a 100% crash when we set our character’s SkeletalMesh physics blend to zero, then back up to a positive number, via USkeletalMeshComponent::SetAllBodiesBelowPhysicsBlendWeight.

We haven’t been able to achieve a minimal repro outside of our game’s player character, so I’m unable to provide the exact repro steps.

As far as we could tell, the root cause in our case was USkeletalMeshComponent::SwapEvaluationContextBuffers being called from a parallel task, which cleared the contents of EditableComponentSpaceTransforms.

I should note that this did not occur for us in 5.3, 5.4 or 5.5 with the same setup, yet we were unable to find any consequential engine changes in this area between 5.5 and 5.6.

We are currently using the following engine modification in UPhysicalAnimationComponent::UpdateTargetActors to mitigate the crash to unblock our workflow:

void UPhysicalAnimationComponent::UpdateTargetActors(ETeleportType TeleportType)
{
  UPhysicsAsset* PhysAsset = SkeletalMeshComponent ? SkeletalMeshComponent->GetPhysicsAsset() : nullptr;
  if (PhysAsset && SkeletalMeshComponent->GetSkeletalMeshAsset())
  {
    const FReferenceSkeleton& RefSkeleton = SkeletalMeshComponent->GetSkeletalMeshAsset()->GetRefSkeleton();

    // Note we use GetEditableComponentSpaceTransforms because we need to update target actors in the midst of the 
    // various anim ticks, before buffers are flipped (which happens in the skel mesh component's post-physics tick)
    const TArray<FTransform>& SpaceBases = SkeletalMeshComponent->GetEditableComponentSpaceTransforms();

    // @third party code - BEGIN Early out to avoid physical anim crash until Epic can fix it
    if (DriveData.Num() && !SpaceBases.Num())
    {
      return;
    }
    // @third party code - END Early out to avoid physical anim crash until Epic can fix it

    FPhysicsCommand::ExecuteWrite(SkeletalMeshComponent, [&]()
...

We’d also like to know whether you see the mitigation above as a potential trap for other issues.

We understand that without a sharable repro, direction will be challenging. We’d be interested to hear though, of any suggestions for what we may be doing incorrectly in our project to create this out-of-order parallel issue.

Here are some links to other licensees facing similar issues that we found added helpful context:

[Crash in [Content removed]

[Crash when applying physical animation [Content removed]

Many thanks!

Hello - I had a look at trying to reproduce this in a simple setup. Initially, I failed too, but then found success when I set the skeletal mesh component to tick during physics (a simple repro is likely to have it set to pre-physics, as that is the default).

I don’t know why this might have changed since 5.5. And of course, UE shouldn’t crash anyway! However, can you check when your SKM is set to update, because in order for physical animation to work correctly, I would expect it to have to happen pre-physics.

Hi Danny,

Thanks for your response - I’m glad that you’ve found a repro. Unfortunately in our case, our skeletal meshes are already set to tick in Pre-Physics. We do have some static mesh attachments on the skeletal mesh which do have a During Physics tick, however. Changing the attached static meshes to Pre-Physics also didn’t change the outcome though.

It may also be worth mentioning that we’re using an animation leader component to bind apparel to the character (the following skeletal meshes in this case are also set to tick in Pre-Physics).

My repro was a bit bogus - setting the skeletal mesh component to tick during physics, with the PAC ticking pre-physics, is “obviously” wrong (shouldn’t crash, but isn’t likely to be representative of a “real” problem).

However, after some experimentation, I was able to repro with both components set to pre-physics. I set up a BP to toggle the physics alpha between 0 and 1 a couple of times per second, and this crashes (or hits your early out), just occasionally (it can sometimes take a few minutes!), but only if physics is set to update async.

I’m trying to figure out why, but could you please let me know if you have physics set to async?

Hello - I just wanted to confirm that I believe (after much investigation!) that your change is safe, and in fact is probably the best/simplest way of solving the problem. I’ve just submitted changes that do essentially the same.

In case you’re interested, the underlying problem was this:

  • The Skeletal Mesh Component has a primary tick that updates the animation graph.
  • This is assigned to the PrePhysics tick group (normally) BUT there is an optimisation where, if there is no physics on the character, it is allowed to run PostPhysics. So when the physics blend is zero, its tick function sets the EndTickGroup to PostPhysics.
  • The PAC tick group is PrePhysics, but it has a prerequisite of the SKM tick. That means the PAC tick “inherits” this possibility to run PostPhysics, but of course only when the physics blend is zero (so it shouldn’t matter, right?)
  • However, the Blueprint graph runs after the components, in the PrePhysics tick group. When this sets physics blend = 1, it adds a new SKM secondary tick function to run during EndPhysics, that will process post-physics animation and write it back to the skeletal mesh.
  • So sometimes (depending on timing etc - happens about 1% of the time in my repros!) during the EndPhysics group:
    • The SKM secondary tick function runs. This initiates a parallel task to process the animation, and as part of this, the GetEditableComponentSpaceTransforms() array is transferred from the SKM to the animation context. This tick function then finishes (it will wait for the parallel animation processing to complete)
    • Now the PAC tick gets processed - and now in UpdateTargetActors it finds an empty array (as you spotted!)

The underlying problem is that tick functions are being registered/changed during the tick processing itself. It’s hard to deal with that, so the early out is the simplest and safest option.
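To make the sequence above concrete, here is a minimal standalone sketch of the hand-off and of why the early out is harmless. It is a toy model, not engine code: MeshModel, BeginParallelEvaluation, CompleteParallelEvaluation and UpdateTargetActorsModel are hypothetical names made up for illustration, and a plain std::vector stands in for the component space transform array.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for FTransform; not an engine type.
struct Transform { float X = 0.f, Y = 0.f, Z = 0.f; };

// Hypothetical model of the skeletal mesh component's buffers.
struct MeshModel
{
    std::vector<Transform> EditableTransforms; // models GetEditableComponentSpaceTransforms()
    std::vector<Transform> EvaluationContext;  // models the array owned by the parallel task

    // Models the secondary tick handing the array over to the animation context.
    void BeginParallelEvaluation()
    {
        EvaluationContext = std::move(EditableTransforms);
        EditableTransforms.clear(); // moved-from state is unspecified, so empty it explicitly
    }

    // Models the results being written back once the parallel work completes.
    void CompleteParallelEvaluation()
    {
        EditableTransforms = std::move(EvaluationContext);
        EvaluationContext.clear();
    }
};

// Models the guarded UpdateTargetActors: returns false when it early-outs.
bool UpdateTargetActorsModel(const MeshModel& Mesh, const std::vector<std::size_t>& DriveBoneIndices)
{
    const std::vector<Transform>& SpaceBases = Mesh.EditableTransforms;

    // Same shape as the early out in the original post: drives exist but no transforms.
    if (!DriveBoneIndices.empty() && SpaceBases.empty())
    {
        return false;
    }

    for (std::size_t BoneIndex : DriveBoneIndices)
    {
        // Without the guard, this read is where the empty array would bite.
        const Transform& Target = SpaceBases.at(BoneIndex);
        (void)Target;
    }
    return true;
}
```

In this model, calling UpdateTargetActorsModel between BeginParallelEvaluation and CompleteParallelEvaluation skips the update for that frame instead of indexing into an empty array, and the next call after the write-back proceeds normally.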

The early-out is needed, but another solution is to set the cvar “tick.AnimationDelaysEndGroup=0”. This disables the optimisation that allows the SKM primary tick to run after PrePhysics.
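As a rough illustration of what the cvar changes, the sketch below models the end tick group decision as described in this thread. It is a simplified assumption, not the engine's actual logic: TickGroup and ChoosePrimaryEndTickGroup are hypothetical names, and the real check for "no physics on the character" is likely more involved than comparing a blend weight to zero.

```cpp
#include <cassert>

// Toy subset of the tick groups discussed here; not the engine enum.
enum class TickGroup { PrePhysics, PostPhysics };

// Hypothetical helper modelling the optimisation described in this thread:
// the SKM primary tick may finish as late as PostPhysics only when the
// cvar is enabled and no physics is being blended.
TickGroup ChoosePrimaryEndTickGroup(float PhysicsBlendWeight, int AnimationDelaysEndGroup)
{
    const bool bBlendsPhysics = PhysicsBlendWeight > 0.0f;
    const bool bMayDelay = (AnimationDelaysEndGroup > 0) && !bBlendsPhysics;
    return bMayDelay ? TickGroup::PostPhysics : TickGroup::PrePhysics;
}
```

With the cvar at 0 the model always returns PrePhysics, so a tick that has the SKM tick as a prerequisite can no longer be pushed past the point where the blend weight changes mid-frame.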

Thanks for finding and reporting this problem!

Hello Danny!

Thanks for continuing the investigation - apologies that there’s been a delay to me responding to your past two messages.

If it’s still helpful - we are not currently ticking physics asynchronously.

That’s great news that you’re happy with the fix. Sounds like it’s been a particularly gnarly issue to track down. This makes sense to me, thank you for explaining in such detail! We’ll keep in mind the cvar approach too so that we can remove our engine modification.

I’ll take a look at USkeletalMeshComponent::TickComponent to see how this works, and I’ll log that state to confirm when the EndTickGroup changes.

Thanks again