I am building a custom ML Deformer model whose runtime part is identical to the VertexDeltaModel, so I will frame my question in terms of that model's source. The function FVertexDeltaGraphDataProviderProxy::GatherDispatchData reads as follows:
    void FVertexDeltaGraphDataProviderProxy::GatherDispatchData(FDispatchData const& InDispatchData)
    {
        const FSkeletalMeshRenderData& SkeletalMeshRenderData = SkeletalMeshObject->GetSkeletalMeshRenderData();
        const FSkeletalMeshLODRenderData* LodRenderData = SkeletalMeshRenderData.GetPendingFirstLOD(0);
        const TStridedView<FVertexDeltaGraphDataInterfaceParameters> ParameterArray = MakeStridedParameterView<FVertexDeltaGraphDataInterfaceParameters>(InDispatchData);
        for (int32 InvocationIndex = 0; InvocationIndex < ParameterArray.Num(); ++InvocationIndex)
        {
            const FSkelMeshRenderSection& RenderSection = LodRenderData->RenderSections[InvocationIndex];
            FVertexDeltaGraphDataInterfaceParameters& Parameters = ParameterArray[InvocationIndex];
            Parameters.NumVertices = InDispatchData.bUnifiedDispatch ? LodRenderData->GetNumVertices() : RenderSection.GetNumVertices();
            Parameters.InputStreamStart = InDispatchData.bUnifiedDispatch ? 0 : RenderSection.BaseVertexIndex;
            Parameters.Weight = Weight;
            Parameters.PositionDeltaBuffer = BufferSRV;
            Parameters.VertexMapBuffer = VertexMapBufferSRV;
        }
    }
The problem is with the bUnifiedDispatch parameter. I have found that, in this case, this flag is always true. The reason is that this data interface has a secondary binding to the MLDeformer component in the deformer graph (because the skeletal mesh component is the primary binding). According to FComputeGraphTaskWorker::SubmitWork:
    // 1. Data interfaces sharing the same binding (primary) as the kernel should present its data in a way that
    // matches the kernel dispatch method, which can be either unified(full buffer) or non-unified (per invocation window into the full buffer)
    // 2. Data interfaces not sharing the same binding (secondary) should always provide a full view to its data (unified)
    // Note: In case of non-unified kernel, extra work maybe needed to read from secondary buffers.
    // When kernel is non-unified, index = 0...section.max for each invocation/section,
    // so user may want to consider using a dummy buffer that maps section index to the indices of secondary buffers
    // for example, given a non-unified kernel, primary and secondary components sharing the same vertex count, we might want to create a buffer
    // in the primary group that is simply [0,1,2...,NumVerts-1], which we can then index into to map section vert index to the global vert index
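To make the mapping in that comment concrete, here is a minimal standalone sketch of the identity buffer it describes. It does not use any engine types: MakeIdentityIndexBuffer and SectionToGlobalIndex are hypothetical helpers, and the "window" is emulated with a plain base offset, which is only my reading of how a primary-group buffer would be presented to a non-unified kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using int32 = std::int32_t;

// Hypothetical helper: builds the buffer [0, 1, 2, ..., NumVerts-1]
// mentioned in the SubmitWork comment.
std::vector<int32> MakeIdentityIndexBuffer(int32 NumVerts)
{
    assert(NumVerts >= 0);
    std::vector<int32> Buffer(static_cast<std::size_t>(NumVerts));
    std::iota(Buffer.begin(), Buffer.end(), 0);
    return Buffer;
}

// Hypothetical helper: emulates a per-invocation window into the identity
// buffer. The window starts at the section's base vertex, so a section-local
// index (0..section.max) reads back the global vertex index, which can then
// be used to address a full (unified) secondary buffer.
int32 SectionToGlobalIndex(const std::vector<int32>& IdentityBuffer,
                           int32 SectionBaseVertexIndex,
                           int32 LocalIndex)
{
    return IdentityBuffer.at(static_cast<std::size_t>(SectionBaseVertexIndex + LocalIndex));
}
```

With sections starting at vertices 0, 100 and 150, local index 0 of the second section maps to global index 100, which is the offset a secondary buffer read would need.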
My understanding is that FVertexDeltaGraphDataProviderProxy::GatherDispatchData should therefore prepare a single unified invocation for the whole skeletal mesh, even if the mesh has more than one section. The function's code does exactly that, but the result is incorrect. When the skeletal mesh has more than one section:
- InDispatchData.NumInvocations is not 1, but rather the number of sections in the skeletal mesh, as determined by UMLDeformerComponentSource::GetDefaultNumInvocations.
- Each invocation will process a single section of the mesh, with its render vertex indexing starting at 0.
This means that FVertexDeltaGraphDataProviderProxy::GatherDispatchData incorrectly sets Parameters.InputStreamStart to 0 for every invocation, which produces incorrect results for all but the first section (see VertexDeltaModel.ush). My solution has been to define the following:
const bool bIsUnified = InDispatchData.bUnifiedDispatch && InDispatchData.NumInvocations <= 1;
I then use bIsUnified instead of InDispatchData.bUnifiedDispatch in the loop, which effectively treats the dispatch as non-unified for meshes with multiple sections. This seems to work.
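To show the difference outside the engine, here is a minimal standalone sketch of the per-invocation logic with and without the extra NumInvocations check. FMockSection, FMockParams and GatherParams are hypothetical stand-ins that model only the two fields discussed above; they are not the Unreal types, and the section layout in the note below is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using int32 = std::int32_t;

// Hypothetical stand-ins for FSkelMeshRenderSection and the data interface parameters.
struct FMockSection { int32 BaseVertexIndex; int32 NumVertices; };
struct FMockParams  { int32 NumVertices; int32 InputStreamStart; };

// Mirrors the NumVertices / InputStreamStart selection in GatherDispatchData.
// bApplyWorkaround switches between the original flag and the bIsUnified check.
std::vector<FMockParams> GatherParams(const std::vector<FMockSection>& Sections,
                                      int32 NumInvocations,
                                      bool bUnifiedDispatch,
                                      bool bApplyWorkaround)
{
    assert(NumInvocations >= 1 && static_cast<std::size_t>(NumInvocations) <= Sections.size());

    // Equivalent of LodRenderData->GetNumVertices() for this mock.
    int32 TotalVertices = 0;
    for (const FMockSection& Section : Sections)
    {
        TotalVertices += Section.NumVertices;
    }

    const bool bIsUnified = bApplyWorkaround
        ? (bUnifiedDispatch && NumInvocations <= 1)
        : bUnifiedDispatch;

    std::vector<FMockParams> Result;
    for (int32 InvocationIndex = 0; InvocationIndex < NumInvocations; ++InvocationIndex)
    {
        const FMockSection& Section = Sections[static_cast<std::size_t>(InvocationIndex)];
        FMockParams Params;
        Params.NumVertices = bIsUnified ? TotalVertices : Section.NumVertices;
        Params.InputStreamStart = bIsUnified ? 0 : Section.BaseVertexIndex;
        Result.push_back(Params);
    }
    return Result;
}
```

For three sections starting at vertices 0, 100 and 150, with bUnifiedDispatch set to true and three invocations, the original logic gives every invocation an InputStreamStart of 0, while the workaround gives 0, 100 and 150. With a single invocation both variants agree.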
Is this a bug, or am I misunderstanding something? Is it possible to have unified dispatches with more than one invocation? Should the fix be different?