Niagara GPU sim crashing after update to 5.6

The crash occurs because the PerInstanceData_RenderThread member of the DI proxy hasn’t been set up. That happens because IsUsedWithGPUScript() never gets set on the underlying DI, and *that* happens because the new tick code (added in 5.6) in FNiagaraSystemInstanceController::GetMaterialStreamingInfo() can cause FNiagaraComputeExecutionContext::CombinedParamStore to be updated without the corresponding call to FNiagaraSystemInstance::InitDataInterfaces() that would normally take care of setting IsUsedWithGPUScript() where needed.

I’m currently testing this change as a fix, but would appreciate advice/suggestions if there’s a better way -

void FNiagaraSystemInstanceController::GetMaterialStreamingInfo(FNiagaraMaterialAndScaleArray& OutMaterialAndScales) const
{
	if (!SystemInstance.IsValid())
	{
		return;
	}

	if (OverrideParameters)
	{
		OverrideParameters->Tick();
	}

	const bool bInterfacesWereDirty = SystemInstance->GetInstanceParameters().GetInterfacesDirty(); // <- ADDED
	SystemInstance->GetInstanceParameters().Tick();
	if (bInterfacesWereDirty) // <- ADDED
	{
		SystemInstance->Reset(FNiagaraSystemInstance::EResetMode::ReInit); // <- ADDED
		if (USceneComponent* Component = SystemInstance->GetAttachComponent()) // <- ADDED
		{
			Component->MarkRenderStateDirty(); // <- ADDED
		}
	}

	for (const FNiagaraEmitterInstanceRef& EmitterInst : SystemInstance->GetEmitters())
	{
		EmitterInst->GetRendererBoundVariables().Tick();



Steps to Reproduce
I can’t yet describe definitively what content setup will repro the problem, but it looks like anything that exercises the new ticks added in FNiagaraSystemInstanceController::GetMaterialStreamingInfo() with a GPU sim can potentially trigger the crash.

Hi Jon,

Thanks for the report. I looked over our crash reports and we have a very low number of crashes that are possibly related.

Could you provide the callstack where you are seeing this? The only one I’ve come across is…

INiagaraRenderableMeshArrayInterface::ForEachMesh [Function.h:748]
UNiagaraMeshRendererProperties::GetUsedMaterials(FNiagaraEmitterInstance const*, TArray<UMaterialInterface*, TSizedDefaultAllocator<32>>&) [Function.h:748]
FNiagaraSystemInstanceController::GetMaterialStreamingInfo(TArray<FNiagaraMaterialAndScale, TSizedInlineAllocator<16u, 32, TSizedDefaultAllocator<32>>>&) [NiagaraSystemInstanceController.cpp:180]

And could you elaborate on what “the new tick code” refers to?

Thanks,

Stu

Thanks for providing the information.

Do you have a project that can repro this issue at all?

Thanks,

Stu

No but I’ll see if I can get one together today

> No but I’ll see if I can get one together today

That didn’t work out, unfortunately: the most minimal repro I can construct within our project still references a significant quantity of custom code, and it seemingly needs the exact sequence of events and triggers to successfully reproduce the problem. (Worth mentioning: none of that custom code is Niagara customization; it’s all for level authoring and game logic.)

I tried migrating the most minimal repro to a vanilla 5.6 project but of course that failed because of the missing customizations.

So I don’t think there’s a realistic prospect of a repro in this case.

I changed it around a little to read like this; when it comes from concurrent threads we should be able to ignore the flushing.

if ( IsInGameThread() )
{
	FNiagaraParameterStore& InstanceParameters = SystemInstance->GetInstanceParameters();
	const bool bInstancesDirty = InstanceParameters.GetInterfacesDirty() || InstanceParameters.GetUObjectsDirty();
	const bool bOverridesDirty = OverrideParameters && (OverrideParameters->GetInterfacesDirty() || OverrideParameters->GetUObjectsDirty());
	if (bInstancesDirty || bOverridesDirty)
	{
		SystemInstance->WaitForConcurrentTickDoNotFinalize();
		if (bOverridesDirty)
		{
			OverrideParameters->Tick();
		}
		if (bInstancesDirty)
		{
			InstanceParameters.Tick();
		}
	}
}

The fact you see it from FRenderAssetInstanceState::PreAddComponentIgnoreBounds is interesting; I got the concurrent call from EOF updates, where we know the sim has finished so it can safely be ignored. If it was flushing from this task while the sim was async, that could certainly explain what you are seeing.

We are also tracking down a stale data interface inside the parameter store, although we aren’t sure if it’s related to some reworking we’ve been doing locally or not right now. Hopefully I’ll know more soon.

Thanks,

Stu

Let me know how you get along with that change, i.e. does it fix it or not.

We were chatting about this the other day as a team, and we were wondering about adding GC awareness for the instance parameters to avoid this kind of problem. I seem to recall we tried this a few years back but it introduced a whole new class of issues, although how GC works has changed a little since then, so it could be worth experimenting with.

Thanks,

Stu

Hi Stu,

> Could you provide the callstack where you are seeing this?

The bug appears to have been introduced in FNiagaraSystemInstanceController::GetMaterialStreamingInfo() in 5.6 but the crash that occurs as a consequence of the bug is elsewhere. For some reason I can no longer see the callstack on the details for this question, but here it is:

[Inline Frame] TMapBase<unsigned __int64,FNDIArrayInstanceData_RenderThread<FLinearColor>,FDefaultSetAllocator,TDefaultMapHashableKeyFuncs<unsigned __int64,FNDIArrayInstanceData_RenderThread<FLinearColor>,0>>::FindChecked(const unsigned __int64) Line 716	C++
FNDIArrayProxyImpl<FLinearColor,UNiagaraDataInterfaceArrayColor>::SetShaderParameters(INDIArrayProxyBase::FShaderParameters * ShaderParameters, unsigned __int64 SystemInstanceID) Line 1053	C++
FNiagaraGpuComputeDispatch::SetDataInterfaceParameters(FRDGBuilder & GraphBuilder, const FNiagaraGPUSystemTick & Tick, const FNiagaraComputeInstanceData & InstanceData, const TShaderRefBase<FNiagaraShader,FNiagaraShaderMapPointerTable> & ComputeShader, const FNiagaraSimStageData & SimStageData, const FNiagaraShaderScriptParametersMetadata & NiagaraShaderParametersMetadata, unsigned char * ParametersStructure) Line 2514	C++
FNiagaraGpuComputeDispatch::DispatchStage(FRDGBuilder & GraphBuilder, const FNiagaraGPUSystemTick & Tick, const FNiagaraComputeInstanceData & InstanceData, const FNiagaraSimStageData & SimStageData) Line 1755	C++
FNiagaraGpuComputeDispatch::ExecuteTicks(FRDGBuilder & GraphBuilder, TStridedView<FSceneView const ,int> Views, ENiagaraGpuComputeTickStage::Type TickStage) Line 1293	C++
FNiagaraGpuComputeDispatch::PreInitViews(FRDGBuilder & GraphBuilder, bool bAllowGPUParticleUpdate, const TArrayView<FSceneViewFamily const *,int> & ViewFamilies, const FSceneViewFamily * CurrentFamily) Line 2062	C++
FNiagaraGpuComputeDispatch::ProcessPendingTicksFlush(FRHICommandListImmediate & RHICmdList, bool bForceFlush) Line 495	C++
[External Code]	
[Inline Frame] UE::Core::Private::Function::TFunctionRefBase<UE::Core::Private::Function::TFunctionStorage<1>,void __cdecl(FRHICommandListImmediate &)>::operator()(FRHICommandListImmediate &) Line 471	C++
FRenderThreadCommandPipe::EnqueueAndLaunch::__l5::<lambda_1>::operator()() Line 1547	C++
[Inline Frame] UE::Core::Private::Function::TFunctionRefBase<UE::Core::Private::Function::TFunctionStorage<1>,void __cdecl(void)>::operator()() Line 471	C++
[Inline Frame] TFunctionGraphTaskImpl<void __cdecl(void),1>::DoTaskImpl(TUniqueFunction<void __cdecl(void)> & Function, ENamedThreads::Type) Line 1111	C++
[Inline Frame] TFunctionGraphTaskImpl<void __cdecl(void),1>::DoTask(ENamedThreads::Type) Line 1104	C++
TGraphTask<TFunctionGraphTaskImpl<void __cdecl(void),1>>::ExecuteTask() Line 706	C++
UE::Tasks::Private::FTaskBase::TryExecuteTask() Line 527	C++
[Inline Frame] FBaseGraphTask::Execute(TArray<FBaseGraphTask *,TSizedDefaultAllocator<32>> &) Line 505	C++
FNamedTaskThread::ProcessTasksNamedThread(int QueueIndex, bool bAllowStall) Line 779	C++
FNamedTaskThread::ProcessTasksUntilQuit(int QueueIndex) Line 668	C++
RenderingThreadMain(FEvent * TaskGraphBoundSyncEvent) Line 318	C++
FRenderingThread::Run() Line 478	C++
FRunnableThreadWin::Run() Line 159	C++
FRunnableThreadWin::GuardedRun() Line 71	C++
[External Code]	

> And could you elaborate on what “the new tick code” refers to?

Sure - here is the 5.5 code:

void FNiagaraSystemInstanceController::GetMaterialStreamingInfo(FNiagaraMaterialAndScaleArray& OutMaterialAndScales) const
{
	if (!SystemInstance.IsValid())
	{
		return;
	}
 
	for (const FNiagaraEmitterInstanceRef& EmitterInst : SystemInstance->GetEmitters())
	{
		EmitterInst->ForEachEnabledRenderer(
			[&](UNiagaraRendererProperties* Properties)
			{

Here is the same code in 5.6 with the added Tick() calls:

void FNiagaraSystemInstanceController::GetMaterialStreamingInfo(FNiagaraMaterialAndScaleArray& OutMaterialAndScales) const
{
	if (!SystemInstance.IsValid())
	{
		return;
	}
 
	if (OverrideParameters)
	{
		OverrideParameters->Tick();
	}
	SystemInstance->GetInstanceParameters().Tick();
 
	for (const FNiagaraEmitterInstanceRef& EmitterInst : SystemInstance->GetEmitters())
	{
		EmitterInst->GetRendererBoundVariables().Tick();
 
		EmitterInst->ForEachEnabledRenderer(
			[&](UNiagaraRendererProperties* Properties)
			{

Hey,

Without a repro I’m not confident about the order that is going wrong here, but I have a couple of thoughts.

1 - We might be missing a sync with concurrent work, i.e. we should add SystemInstance->WaitForConcurrentTickAndFinalize(false) on entry to that function.

2 - We might want to avoid pushing data around the stores unless SystemInstance->GetAreDataInterfacesInitialized() is true.

Not sure if you’re able to test out those theories at all?

Thanks,

Stu

> We might be missing a sync with concurrent work, i.e. we should add SystemInstance->WaitForConcurrentTickAndFinalize(false) on entry to that function.

I think we can rule that one out; my own workaround just hit a failure case because WaitForConcurrentTickAndFinalize() got called from a non-game thread, with this callstack -

FWindowsErrorOutputDevice::Serialize() [.\Engine\Source\Runtime\Core\Private\Windows\WindowsErrorOutputDevice.cpp:84]
FWindowsErrorOutputDevice::Serialize() [.\Engine\Source\Runtime\Core\Private\Windows\WindowsErrorOutputDevice.cpp:84]
FOutputDevice::LogfImpl() [.\Engine\Source\Runtime\Core\Private\Misc\OutputDevice.cpp:81]
AssertFailedImplV() [.\Engine\Source\Runtime\Core\Private\Misc\AssertionMacros.cpp:169]
FDebug::CheckVerifyFailedImpl2V() [.\Engine\Source\Runtime\Core\Private\Misc\AssertionMacros.cpp:705]
FDebug::CheckVerifyFailedImpl2() [.\Engine\Source\Runtime\Core\Private\Misc\AssertionMacros.cpp:728]
FNiagaraSystemInstance::WaitForConcurrentTickDoNotFinalize() [.\Engine\Plugins\FX\Niagara\Source\Niagara\Private\NiagaraSystemInstance.cpp:2520]
FNiagaraSystemInstance::InitDataInterfaces() [.\Engine\Plugins\FX\Niagara\Source\Niagara\Private\NiagaraSystemInstance.cpp:1570]
FNiagaraSystemInstanceController::GetMaterialStreamingInfo() [.\Engine\Plugins\FX\Niagara\Source\Niagara\Private\NiagaraSystemInstanceController.cpp:194]
UNiagaraComponent::GetStreamingRenderAssetInfo() [.\Engine\Plugins\FX\Niagara\Source\Niagara\Private\NiagaraComponent.cpp:2446]
UPrimitiveComponent::GetStreamingRenderAssetInfoWithNULLRemoval() [.\Engine\Source\Runtime\Engine\Private\Components\PrimitiveComponent.cpp:624]
FRenderAssetInstanceState::PreAddComponentIgnoreBounds() [.\Engine\Source\Runtime\Engine\Private\Streaming\TextureInstanceState.cpp:528]
`FDynamicRenderAssetInstanceManager::IncrementalUpdate'::`25'::<lambda_1>::operator()() [.\Engine\Source\Runtime\Engine\Private\Streaming\DynamicTextureInstanceManager.cpp:129]
`ParallelForImpl::ParallelForInternal<TFunctionRef<void __cdecl(int)>,`ParallelFor'::`2'::<lambda_1>,std::nullptr_t>'::`2'::FParallelExecutor::operator()() [.\Engine\Source\Runtime\Core\Public\Async\ParallelFor.h:117]
LowLevelTasks::TTaskDelegate<LowLevelTasks::FTask * __cdecl(bool),48>::TTaskDelegateImpl<`LowLevelTasks::FTask::Init<`ParallelForImpl::ParallelForInternal<TFunctionRef<void __cdecl(int)>,`ParallelFor'::`2'::<lambda_1>,std::nullptr_t>'::`2'::FParallelExecutor>'::`13'::<lambda_1>,0>::CallAndMove() [.\Engine\Source\Runtime\Core\Public\Async\Fundamental\TaskDelegate.h:171]
LowLevelTasks::FTask::ExecuteTask() [.\Engine\Source\Runtime\Core\Public\Async\Fundamental\Task.h:627]
LowLevelTasks::FScheduler::ExecuteTask() [.\Engine\Source\Runtime\Core\Private\Async\Fundamental\Scheduler.cpp:364]
LowLevelTasks::FScheduler::WorkerLoop() [.\Engine\Source\Runtime\Core\Private\Async\Fundamental\Scheduler.cpp:724]
`LowLevelTasks::FScheduler::CreateWorker'::`2'::<lambda_1>::operator()() [.\Engine\Source\Runtime\Core\Private\Async\Fundamental\Scheduler.cpp:188]
FThreadImpl::Run() [.\Engine\Source\Runtime\Core\Private\HAL\Thread.cpp:69]
FRunnableThreadWin::Run() [.\Engine\Source\Runtime\Core\Private\Windows\WindowsRunnableThread.cpp:159]
FRunnableThreadWin::GuardedRun() [.\Engine\Source\Runtime\Core\Private\Windows\WindowsRunnableThread.cpp:79]

So I expect calling WaitForConcurrentTickAndFinalize() directly from FNiagaraSystemInstanceController::GetMaterialStreamingInfo() would run into the same problem.

Ah OK, well in the short term that IsInGameThread() check should help my immediate issue (with WaitForConcurrentTickAndFinalize() objecting to being called off the game thread) - thanks.

> We are also tracking down a stale data interface inside the parameter store, although we aren’t sure if it’s related to some reworking we’ve been doing locally or not right now. Hopefully I’ll know more soon.

FWIW, “data interface updates getting skipped as a result of parameter store changes, resulting in render thread crashes” is pretty much how I’d summarize the original issue behind this support question.

So far, gating the new sequence of Tick() calls (the ones added in 5.6) with IsInGameThread() seems to be preventing the newest crash. As for the original problem, we’re still running with a version of my original change that resets the system instance if the ticking made the interfaces dirty. I had to narrow that down to calling SystemInstance->ResetDataInterfaces() (as opposed to a full Reset() on the system instance) because the full reset failed on unrelated prerequisites in some cases.

As of today, I’m not aware of any outstanding issues for us running with our changes.