Hi,
My familiarity with the Chaos code is limited, so if anything seems wrong or needs more context, please don’t hesitate to ask.
Recently, in a scene we were working on, a large number of skeletal meshes were being registered and unregistered over multiple frames. While investigating performance afterwards, we discovered that FPhysScene_Chaos::OnStartFrame could take as much as 5ms to complete. Digging into the associated Superluminal trace, we found that most of that time was spent in Chaos::FChaosMarshallingManager::AddDirtyProxy (called from Chaos::FRigidBodyHandle_External::SetR inside the Chaos::PhysicsParallelFor), and that most of it was pure wait time.
From there, we looked at the invalidation pattern around the SkeletalMeshComponent in FPhysScene_Chaos::UpdateKinematicsOnDeferredSkelMeshes, and it seems that all of the proxies are already invalidated before the PhysicsParallelFor (see ProxiesToDirty). We then looked at AddDirtyProxy and realized that it does nothing if the dirty index is already set. We inferred that, since everything had been invalidated beforehand, no work would actually be done there. We also assumed that, since the Proxy is passed in as a parameter, its dirty index can be checked outside the scope lock on MarshallingManagerLock. By moving that check outside the write lock, we were able to shave a couple of milliseconds (2-3ms in our local tests) off each OnStartFrame.
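To illustrate the idea, here is a minimal standalone sketch of the pattern. It is not the actual Chaos code: FakeProxy, FakeMarshallingManager, kNoIndex and DemoDirtyCount are hypothetical names, and a plain std::mutex stands in for MarshallingManagerLock. The sketch stores the dirty index in a std::atomic so that the unlocked read is well defined, and it checks again under the lock before writing. Whether the real dirty index can safely be read outside the lock is exactly the assumption described above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical sentinel meaning "not in the dirty list yet".
constexpr int kNoIndex = -1;

// Hypothetical stand-in for a physics proxy; not the Chaos type.
struct FakeProxy {
    // Atomic so that reading it without the lock is not a data race.
    std::atomic<int> DirtyIdx{kNoIndex};
};

// Hypothetical stand-in for the marshalling manager.
class FakeMarshallingManager {
public:
    void AddDirtyProxy(FakeProxy& Proxy) {
        // Fast path: the proxy is already dirty, so skip the lock entirely.
        if (Proxy.DirtyIdx.load(std::memory_order_acquire) != kNoIndex) {
            return;
        }
        std::lock_guard<std::mutex> Guard(Lock);
        // Check again under the lock: another thread may have added the
        // proxy between the first check and acquiring the lock.
        if (Proxy.DirtyIdx.load(std::memory_order_relaxed) != kNoIndex) {
            return;
        }
        Proxy.DirtyIdx.store(static_cast<int>(Dirty.size()),
                             std::memory_order_release);
        Dirty.push_back(&Proxy);
    }

    std::size_t NumDirty() {
        std::lock_guard<std::mutex> Guard(Lock);
        return Dirty.size();
    }

private:
    std::mutex Lock;                // stand-in for MarshallingManagerLock
    std::vector<FakeProxy*> Dirty;  // proxies registered as dirty
};

// Dirties every proxy up front, then adds each one again. The second
// pass returns on the fast path without ever taking the lock.
std::size_t DemoDirtyCount() {
    FakeMarshallingManager Manager;
    std::vector<FakeProxy> Proxies(8);
    for (FakeProxy& P : Proxies) { Manager.AddDirtyProxy(P); }
    for (FakeProxy& P : Proxies) { Manager.AddDirtyProxy(P); }
    assert(Manager.NumDirty() == Proxies.size());
    return Manager.NumDirty();
}
```

In this sketch the second pass mirrors what we believe happens in the PhysicsParallelFor once ProxiesToDirty has been processed: every call finds the index already set and returns without contending on the lock.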
Sadly, I am unable at this time to provide a repro of this issue. I included two screenshots showing the results we are seeing locally.
Hope this will help improve performance of the Chaos frame!
[Attachment Removed]