RWBuffer and Buffer in compute shaders when both are bound to the same resource

Abstract Situation:

I have two compute shaders in two different passes, and one Buffer resource. I need to write to the Buffer in Pass 1 and then read it in Pass 2.

Practical Engine code:

TConeTraceScreenGridObjectOcclusionCS is a FGlobalShader, which is defined in UnrealEngine\Engine\Source\Runtime\Renderer\Private\DistanceFieldScreenGridLighting.cpp.

LAYOUT_FIELD(FRWShaderParameter, ScreenGridConeVisibility) shows us a parameter in that compute shader, at line 193, DistanceFieldScreenGridLighting.cpp.

But I find that it is called RWScreenGridConeVisibility in the actual shader file UnrealEngine\Engine\Shaders\Private\DistanceFieldScreenGridLighting.usf. What’s more, I get no results in the CPU code when I search for the keyword “RWScreenGridVisibility”.

“Pass 2”, FCombineConeVisibilityCS in this case, reads the Buffer and declares LAYOUT_FIELD(FShaderResourceParameter, ScreenGridConeVisibility). There, the actual variable in the shader file is just ScreenGridConeVisibility.

This naming mismatch confuses me…

BTW, when should I use FShaderParameter and when should I use FShaderResourceParameter?

Thanks a lot for reading this chaotic description.

Here are some code snippets that define a Compute Shader which writes into a RW-Buffer.

Define the Compute Shader in your .h like this:

class FWriteBufferCS : public FGlobalShader
{
	DECLARE_GLOBAL_SHADER( FWriteBufferCS );
	SHADER_USE_PARAMETER_STRUCT( FWriteBufferCS , FGlobalShader );

	BEGIN_SHADER_PARAMETER_STRUCT( FParameters, )
		SHADER_PARAMETER( int, BufferSize )
		SHADER_PARAMETER_UAV( RWStructuredBuffer<float>, BufferTarget )
	END_SHADER_PARAMETER_STRUCT()

public:
	static bool ShouldCompilePermutation( const FGlobalShaderPermutationParameters& Parameters ) {
		return IsFeatureLevelSupported( Parameters.Platform, ERHIFeatureLevel::SM5 );
	}

	static void ModifyCompilationEnvironment( const FGlobalShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment ) {
		OutEnvironment.SetDefine( TEXT( "SOMECONSTANT" ), 123 );
	}
};

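Don't forget to “implement” the shader type in a .cpp so it is linked to the shader file (the entry-point name must match the function in the .usf exactly, without stray spaces):

IMPLEMENT_SHADER_TYPE( , FWriteBufferCS, TEXT( "/Plugin/YourPlugin/WriteBuffer.usf" ), TEXT( "MainCS" ), SF_Compute );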

Set the Shader Parameters and dispatch the Compute Shader in the .cpp (on the render thread, where the RHICmdList is available):

FIntVector ThreadGroupAmount;
ThreadGroupAmount.X = FMath::DivideAndRoundUp( BufferSize, 64 /*ThreadGroupSize*/ ); // BufferSize is a plain int here
ThreadGroupAmount.Y = 1;
ThreadGroupAmount.Z = 1;

FWriteBufferCS::FParameters ShaderParameters;
ShaderParameters.BufferTarget = BufferTarget;
ShaderParameters.BufferSize = BufferSize;

TShaderMapRef< FWriteBufferCS> ComputeShader( GetGlobalShaderMap( GMaxRHIFeatureLevel ) );
FComputeShaderUtils::Dispatch( RHICmdList, ComputeShader, ShaderParameters, ThreadGroupAmount );

In the .usf:

// Must match SHADER_PARAMETER( int, BufferSize ) in FParameters
int BufferSize;
RWStructuredBuffer<float> BufferTarget;

[ numthreads( 64, 1, 1 ) ]
void MainCS( uint ThreadId : SV_DispatchThreadID )
{
	if( ThreadId >= BufferSize )
		return;

	BufferTarget[ ThreadId ] = sin( (float)ThreadId * 0.123f );
}
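
The dispatch above rounds the group count up, so the last thread group can contain threads whose index is past the end of the buffer. That is why MainCS starts with a bounds check. Here is a minimal plain-C++ sketch of that arithmetic. It is only a CPU stand-in for illustration: DivideAndRoundUp is re-implemented locally for positive integers, and SimulateDispatch is a hypothetical helper, not engine API.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Local re-implementation of integer round-up division (positive values only).
constexpr uint32_t DivideAndRoundUp(uint32_t Dividend, uint32_t Divisor)
{
	return (Dividend + Divisor - 1) / Divisor;
}

// Hypothetical CPU stand-in for the dispatch: runs GroupCount * GroupSize
// "threads" and applies the same bounds check as MainCS.
std::vector<float> SimulateDispatch(uint32_t BufferSize, uint32_t GroupSize)
{
	assert(GroupSize > 0);

	std::vector<float> BufferTarget(BufferSize, 0.0f);
	const uint32_t GroupCount = DivideAndRoundUp(BufferSize, GroupSize);

	for (uint32_t ThreadId = 0; ThreadId < GroupCount * GroupSize; ++ThreadId)
	{
		// Surplus thread in the last group: it must not write.
		if (ThreadId >= BufferSize)
			continue;

		BufferTarget[ThreadId] = std::sin(static_cast<float>(ThreadId) * 0.123f);
	}
	return BufferTarget;
}
```

For example, 100 elements with 64 threads per group need 2 groups, so 128 threads run and the last 28 hit the early-out.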

To create a Buffer/UAV you can do something like this (I think this also works on the Game-Thread). Setting the Resource-Array is optional, since you write the buffer on the GPU anyway; if you do need initial CPU data, a MemCpy into the Resource-Array is best:

FRHIResourceCreateInfo CreateInfo( TEXT( "WriteBuffer" ) );

TResourceArray< float > ResourceArray;
ResourceArray.SetNumZeroed( BufferSize );
CreateInfo.ResourceArray = &ResourceArray;

FStructuredBufferRHIRef BufferSource = RHICreateStructuredBuffer(
	sizeof( float ),
	sizeof( float ) * BufferSize ,
	BUF_UnorderedAccess | BUF_ShaderResource,
	CreateInfo
);

FUnorderedAccessViewRHIRef BufferTarget = RHICreateUnorderedAccessView( BufferSource , false, false );

When you only read data from the GPU buffer, you can use SHADER_PARAMETER_SRV( StructuredBuffer<float>, BufferSource ) instead, dropping the ‘RW’ prefix, and pass a shader resource view created from BufferSource into the shader parameters.

In any case, you may need to call something like TransitionResource, WaitForDispatch, or FlushRenderingCommands so that the write has finished before the buffer is read.

hope that helps, sorry for any typos xD

That helps a lot. Thank you~

I do think this is a terrible “skipping”. Do you know how it works at a low level?

That's nice!
To read the buffer on the GPU, it takes just these minor changes.

Where you define the Compute-Shader, use:

SHADER_PARAMETER_SRV( StructuredBuffer<float>, BufferSource )

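When setting the resource in the Shader-Parameters, pass a shader resource view created from the FStructuredBuffer instead of the FUnorderedAccessView (an SRV parameter takes a view, not the buffer itself):

FShaderResourceViewRHIRef BufferSourceSRV = RHICreateShaderResourceView( BufferSource );
ShaderParameters.BufferSource = BufferSourceSRV;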

In the .usf:

StructuredBuffer<float> BufferSource;

Besides this, you can also read from the RWBuffer directly anyway^^

Thank you again~~

Here I have a question:

Since we create an FRDGBufferDesc and an FRDGBufferRef every frame (in the pass body), how can data sync between frames be guaranteed? RDG resources are bound to the GraphBuilder lifetime, so we cannot store them anywhere.

Imagine that I need to do some calculations based on the existing data in some buffer.

After searching for information for a while, I think I got it. Ty~

Hopefully I got your point. I am familiar with these two approaches:

Blocking the Game-Thread:

void AMyClass::Tick( float DeltaTime )
{
	ENQUEUE_RENDER_COMMAND( DoDispatch ) ( [this] ( FRHICommandListImmediate& RHICmdList )
	{
		// ... RHI Compute-Shader Dispatch Stuff ...
	} );

	FlushRenderingCommands();
}

Not blocking the Game-Thread:

FThreadSafeBool bAsyncWorking = false;

void AMyClass::Tick( float DeltaTime )
{
	if( bAsyncWorking == false )
	{
		bAsyncWorking = true;

		ENQUEUE_RENDER_COMMAND( DoDispatch ) ( [this] ( FRHICommandListImmediate& RHICmdList )
		{
			// ... RHI Compute-Shader Dispatch Stuff ...

			bAsyncWorking = false;
		} );
	}
}
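
The non-blocking variant relies on one idea: skip enqueueing while the previous command is still in flight, and clear the flag at the end of the command. Below is a minimal plain-C++ sketch of that idea. Everything in it is a hypothetical stand-in (FFakeRenderQueue, FFakeActor); it is single-threaded and only models the ordering, it is not engine code. It also uses std::atomic<bool>::exchange so that the test and the set happen in one step, which is a small variation on the separate compare-then-assign above.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for the render command queue; not engine API.
struct FFakeRenderQueue
{
	std::queue<std::function<void()>> Commands;

	void Enqueue(std::function<void()> Command)
	{
		Commands.push(std::move(Command));
	}

	// Models the render thread catching up on queued work.
	void ExecuteAll()
	{
		while (!Commands.empty())
		{
			Commands.front()();
			Commands.pop();
		}
	}
};

// Hypothetical stand-in for the actor from the snippet above.
struct FFakeActor
{
	std::atomic<bool> bAsyncWorking{false};
	int DispatchCount = 0;

	void Tick(FFakeRenderQueue& Queue)
	{
		// exchange() returns the old value and sets the flag atomically.
		if (bAsyncWorking.exchange(true))
			return; // previous dispatch still in flight: skip this frame

		Queue.Enqueue([this]
		{
			++DispatchCount;        // ... the dispatch work would go here ...
			bAsyncWorking = false;  // allow the next Tick to enqueue again
		});
	}
};

// Three Ticks before the queue drains produce a single dispatch.
void ExampleUsage()
{
	FFakeRenderQueue Queue;
	FFakeActor Actor;

	Actor.Tick(Queue);
	Actor.Tick(Queue);
	Actor.Tick(Queue);
	Queue.ExecuteAll();
	assert(Actor.DispatchCount == 1);

	Actor.Tick(Queue);
	Queue.ExecuteAll();
	assert(Actor.DispatchCount == 2);
}
```

The trade-off is that some frames do no work at all, so the result can lag behind the Game-Thread by one or more frames.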

I am not sure about the benefits of the RDG (Render Dependency Graph) for basic compute-shader dispatching, though (for UE5 I just adapted some code within the rendering pipeline to match the RDG scheme).
@agreatworld would you recommend using the RDG system for Compute-Shader dispatches when resources depend on each other?

@ThD_ManiaC

In my case, I don’t have AActor class and Tick().

Invoke GraphBuilder.QueueBufferExtraction(Source, Destination) to keep a TRefCountPtr<FRDGPooledBuffer>. This pointer is independent of the GraphBuilder between frames (as I understand it, each frame has its own GraphBuilder).

When using the kept Buffer in later frames, invoke GraphBuilder.RegisterExternalBuffer() to re-establish the connection between the GraphBuilder and the RDG resource.

In my test, this works and I think it is an elegant way to sync resources.
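
To make the lifetime idea concrete, here is a small plain-C++ model of the extract / re-register pattern. It does not use the engine: FFakeGraphBuilder, FPooledBuffer and FPersistentState are hypothetical stand-ins, std::shared_ptr plays the role of TRefCountPtr, and only the method names are borrowed from the description above. It assumes the extraction is resolved when the graph executes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pooled GPU buffer.
using FPooledBuffer = std::vector<float>;
using FPooledBufferRef = std::shared_ptr<FPooledBuffer>;

// Hypothetical stand-in for a per-frame graph builder: created and
// destroyed every frame, so nothing it owns survives on its own.
class FFakeGraphBuilder
{
public:
	// Graph-lifetime buffer, gone after this frame unless extracted.
	FPooledBufferRef CreateBuffer(std::size_t NumElements)
	{
		return std::make_shared<FPooledBuffer>(NumElements, 0.0f);
	}

	// Re-attaches a buffer that was kept alive from an earlier frame.
	FPooledBufferRef RegisterExternalBuffer(const FPooledBufferRef& Pooled)
	{
		assert(Pooled != nullptr);
		return Pooled;
	}

	// Remembers where to store a reference once the graph has run.
	void QueueBufferExtraction(const FPooledBufferRef& Buffer, FPooledBufferRef* OutPooled)
	{
		Extractions.emplace_back(Buffer, OutPooled);
	}

	// Resolves the queued extractions, modelling graph execution.
	void Execute()
	{
		for (auto& Extraction : Extractions)
			*Extraction.second = Extraction.first;
		Extractions.clear();
	}

private:
	std::vector<std::pair<FPooledBufferRef, FPooledBufferRef*>> Extractions;
};

// Lives outside any graph, so it persists between frames.
struct FPersistentState
{
	FPooledBufferRef History;
};

void RenderFrame(FPersistentState& State)
{
	FFakeGraphBuilder GraphBuilder; // a fresh builder each frame

	FPooledBufferRef Buffer = State.History
		? GraphBuilder.RegisterExternalBuffer(State.History)
		: GraphBuilder.CreateBuffer(4);

	// Stand-in for a compute pass that builds on last frame's data.
	for (float& Value : *Buffer)
		Value += 1.0f;

	GraphBuilder.QueueBufferExtraction(Buffer, &State.History);
	GraphBuilder.Execute();
}
```

After three calls to RenderFrame on the same FPersistentState, every element holds 3.0f, which shows the data surviving across three separate builders.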

But here comes another problem: how can I initialize an FRDGBuffer with custom data such as an array? I didn’t find any way to do it in the existing engine code.

In the non-RDG example above, instead of calling ResourceArray.SetNumZeroed( BufferSize ) you can fill this array with your CPU data to initialize the buffer with it.
Or use RHILockStructuredBuffer() and RHIUnlockStructuredBuffer() and copy your data between the two calls, from CPU to GPU or vice versa.

I converted all the code to RDG style and initialize some buffers with a single pass, which is invoked exactly once.

It’s a stupid way, but I did not find any API for FRDGBufferRef to fill it with some initial data.

I found this in RenderGraphUtils.h for initializing a buffer:

/** Creates a vertex buffer with initial data by creating an upload pass. */
RENDERCORE_API FRDGBufferRef CreateVertexBuffer(
	FRDGBuilder& GraphBuilder,
	const TCHAR* Name,
	const FRDGBufferDesc& Desc,
	const void* InitialData,
	uint64 InitialDataSize,
	ERDGInitialDataFlags InitialDataFlags = ERDGInitialDataFlags::None);

It works well.
