Implementing new render pass questions

Hi all,

I am hoping that some members of the community have suggestions to help me get a new render pass implemented. I’m new to Unreal, but I have many years of experience as a rendering engineer working on bespoke engines. I’ve been able to find a number of resources online which touch on some important points; however, I’m finding it very slow to understand how all the pieces actually fit together.

I’m attempting to implement a new render pass which precedes the base pass that does a visibility buffer style render technique and can blend multiple materials together into the GBuffer. At this stage I’m just trying to get a prototype put together for the standard main view (I realize that I will have more work to do in time to support Lumen etc.). I’ve gotten pretty far with my technique - I implemented a new EMeshPass, a new mesh processor, and all the necessary global shaders to generate all the data I need to apply my visibility buffer to the fullscreen passes to blend the various materials together.
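
Roughly, the registration side of that looks like this (hypothetical names, following the pattern in DepthRendering.cpp; the creator-function signature varies a bit between engine versions):

    // Creator for the new pass's mesh processor (FMyVisibilityMeshProcessor is mine).
    FMeshPassProcessor* CreateMyVisibilityPassProcessor(
        ERHIFeatureLevel::Type FeatureLevel,
        const FScene* Scene,
        const FSceneView* InViewIfDynamicMeshCommand,
        FMeshPassDrawListContext* InDrawListContext)
    {
        return new FMyVisibilityMeshProcessor(Scene, FeatureLevel, InViewIfDynamicMeshCommand, InDrawListContext);
    }

    // Hook the processor up to the EMeshPass entry added in MeshPassProcessor.h.
    FRegisterPassProcessorCreateFunction RegisterMyVisibilityPass(
        &CreateMyVisibilityPassProcessor,
        EShadingPath::Deferred,
        EMeshPass::MyVisibilityPass,
        EMeshPassFlags::MainView);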

The trouble is in the final pass that invokes the material shaders. I really don’t want to make any changes to BasePassPixelShader.usf if I can avoid it. Instead, I’m trying to write a new .usf that acts as a trampoline into the base pass pixel shader: it decodes my visibility buffer on entry and applies the appropriate weighting to the base pass results in FPixelShaderOut. On paper it all makes sense, but this is proving really difficult, as the shader code and bindings feel deeply intertwined with the flow of FBasePassMeshProcessor. In my case the material application is just a fullscreen pass, so I’m trying to bootstrap something that is more of an “immediate” style of rendering execution.

Effectively, what I’m trying to write is:
GraphBuilder.AddPass(…,
    […](FRHICommandList& RHICmdList)
    {
        // lambda pseudo code…
        FGraphicsPipelineStateInitializer PSO; // ← set this guy up
        SetGraphicsPipelineState(RHICmdList, PSO);
        SetShaderBindings(); // ?
        FPixelShaderUtils::DrawFullscreenTriangle(RHICmdList);
    });

It feels so straightforward that I should just be able to “extract” my shader bindings from the material and be good to go…
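
To be concrete, the “extract” I have in mind is roughly this (untested sketch; FMyShaderPS is my shader type, and GetShader<…> is the older-style lookup - newer versions go through FMaterialShaderTypes / TryGetShaders):

    const FMaterialRenderProxy* MaterialProxy = MaterialInterface->GetRenderProxy();
    const FMaterialRenderProxy* FallbackProxy = nullptr;
    const FMaterial& Material = MaterialProxy->GetMaterialWithFallback(View.GetFeatureLevel(), FallbackProxy);
    // Having to name a vertex factory type at all is already suspicious for a fullscreen pass.
    TShaderRef<FMyShaderPS> PixelShader = Material.GetShader<FMyShaderPS>(&FLocalVertexFactory::StaticType);
    // ...but turning that TShaderRef into bound parameters is where FMeshDrawShaderBindings
    // and GetShaderBindings() expect mesh-draw-command context I don't have.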

I run into problems, however, when I attempt to replicate the logic inside FMeshDrawCommand: if I just put an FMeshDrawShaderBindings on the stack and attempt to call GetShaderBindings(), I immediately hit the fact that there is no render proxy, because again I just want to draw a fullscreen pass.

I’ve also tried to re-implement this by making my pixel shader an FMaterialShader instead of an FMeshMaterialShader; however, that doesn’t work either, as again some of the bindings population is tightly coupled to FMeshDrawCommand.

Another attempt was to pattern match how the post process materials work, but because I am using the BasePassPixelShader I run into problems where I don’t have a vertex factory and I can’t compile my shaders.

My technique is very much like how Nanite does its GBuffer population, so I took a look at that code, but I got a little scared away there as it is a huge system and I’m just trying to learn how to appropriately set up my shader bindings.

Again, I’m still struggling to see all the connections in the render pipeline so that I can chart an appropriate course, and I’m hoping somebody here might have a suggestion for how to proceed. Some high level questions:

Should my pixel shader be an FMeshMaterialShader? Does this mean that I need to go full Nanite, pre-build draw commands, and somehow invent render proxies? Again, this feels like overkill just to draw a pass…

Should my pixel shader be an FMaterialShader? If so, how do I appropriately get shader bindings? How do I resolve the issue with the vertex factory?

Is there a simpler example than Nanite that I can pattern match against which kind of does what I want: use a fullscreen pass with the base pass pixel shader?

Bump, I am curious about this as well! :)

I spent the whole day taking notes on how things work, but I’m no closer to a solution. I really, really wish some of this was documented.

I feel like there are only two choices here, both of which seem pretty unpleasant to me:

  1. Fully embrace FMeshMaterialShader, have my own vertex factory, and do the draws as part of an FMeshDrawCommand instead of rolling my own fullscreen pass. I guess the natural evolution here is to do exactly what Nanite is doing: have my vertex factory decode the visibility buffer and add a few special hooks in BasePassPixelShader just like IS_NANITE_PASS. I kind of hate that, but it feels like the least of all evils? I also don’t really know where I want to store my mesh draw commands. Nanite keeps them in the scene: when a Nanite-enabled proxy enters the scene, it registers the material pass. I realize that ultimately I would need something like that to cache my commands, but man… I just want a prototype of a single draw on the screen… This feels like another week of work just to see a single screen pass… Ugh.

  2. Abandon the idea of trying to re-use BasePassPixelShader.usf. I hate this idea because I don’t want to try to keep a custom shader implementation at parity with the core material shader that Unreal uses. Especially since that shader is a kitchen sink of everything, it just feels like a fail.

Here are my notes, trying to understand things from the bottom up. I have a few unanswered questions in here. Maybe others in the community can tell me where I got things wrong:
Let’s figure out how the system fits together, starting from the bottom up - using FMeshDrawCommand as the guide.

How does the RHI address shader bindings?

Looks like it’s a DX11-style API where resources are bound to a specific slot, not a DX12 approach where it’s a potential root signature slot + descriptor table offset. This might exist under the hood in the DX12 RHI; I don’t think it’s important for this research to learn how the root signature is organized and constructed.

What resource types does the RHI interact with?

  • Uniform Buffer (constant buffer)
  • Sampler state
  • SRV. There is a distinction here between “FRHIShaderResourceView” and “FRHITexture,” yet both are managed in the same bucket in the FReadOnlyMeshDrawSingleShaderBindings as “SRVs.”

Where are the actual resources stored?

FShaderBindingState manages some arrays of pointers which are contextual based on what the shader needs. Wow, this thing is huge.

How do we know what slot things should be bound at?

FReadOnlyMeshDrawSingleShaderBindings manages this. I don’t know yet how to make this thing, but it understands how to distribute bindings, presumably based on what the shader dictates. It has homogeneous sections for uniform buffers, sampler states, SRVs, and “loose parameters.” Looks like a loose parameter is a constant buffer value which can overlay onto the other memory for a uniform buffer? Is the intention of this to support root constants in DX12? This seems like a really weird approach.

Why do loose parameters exist?

Looks like the DX12 implementation is a constant buffer update, not a root constant. What is a reason to actually do this? This seems terrible.

How is FReadOnlyMeshDrawSingleShaderBindings filled out?

This is just a wrapper of FMeshDrawShaderBindingsLayout to make sure you don’t modify data.

How is FMeshDrawShaderBindingsLayout filled out?

These exist in FMeshDrawShaderBindings and are initialized inside FMeshDrawShaderBindings::Initialize(), given a specific FMeshProcessorShaders. Looks like the ctor is passed an FShader and it initializes FShaderParameterMapInfo based on FShader::ParameterMapInfo.

How is FShader::ParameterMapInfo filled out?

FShader::BuildParameterMapInfo() - this appears to happen as the final part of shader compilation. The map which is passed in comes from CompiledShaderInitializerType::ParameterMap.GetParameterMap().

How is CompiledShaderInitializerType’s Parameter map filled out?

The constructor gets it from FShaderCompilerOutput.ParameterMap

How is FShaderCompilerOutput.ParameterMap filled out?

These come from the backend shader compiler, so D3DShaderCompiler.inl has ExtractParameterMapFromD3DShader()

Wait, so how does that relate to FShader? How are param bindings on the C++ side actually matched against the shader compiler?

So, I guess that’s in FMeshDrawCommandStateCache? So FMeshDrawShaderBindings has a data pointer, and it looks like that is cast to the appropriate thing based on the ParameterMapInfo.

So, how does FMeshDrawShaderBindings::GetData() work?

FMeshDrawShaderBindings::Finalize() does validation, so it must be before that. It’s inside FMeshMaterialShader::GetShaderBindings(). The calls to ShaderBindings.Add() are what write to the data pointer.
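
For my own notes, the shape of that write path in existing FMeshMaterialShader subclasses is roughly this (hypothetical FMyElementData / MyValueParam; the exact parameter list shifts a little between engine versions):

    // Member of the shader class. Each ShaderBindings.Add() writes straight into the
    // FMeshDrawShaderBindings data blob at the offsets described by the layout.
    void GetShaderBindings(
        const FScene* Scene,
        ERHIFeatureLevel::Type FeatureLevel,
        const FPrimitiveSceneProxy* PrimitiveSceneProxy,
        const FMaterialRenderProxy& MaterialRenderProxy,
        const FMaterial& Material,
        const FMeshPassProcessorRenderState& DrawRenderState,
        const FMyElementData& ShaderElementData,
        FMeshDrawSingleShaderBindings& ShaderBindings) const
    {
        // The base class binds the material / primitive / pass uniform buffers.
        FMeshMaterialShader::GetShaderBindings(Scene, FeatureLevel, PrimitiveSceneProxy,
            MaterialRenderProxy, Material, DrawRenderState, ShaderElementData, ShaderBindings);

        // Pass-specific values come from the element data the mesh processor filled in.
        ShaderBindings.Add(MyValueParam, ShaderElementData.MyValue);
    }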

So, what is the point of the global param structures that we pass into the graph builder?

Looks like the static uniform buffers are pulled from the params which are pulled in… Nothing else!

What defines a static uniform buffer, non-static one, or global?

Can FMeshMaterialShader::GetShaderBindings() fill in static uniform buffers?

Don’t think so… why is the shader not declarative over this?

Some of the stuff filled in by FMeshMaterialShader::GetShaderBindings() needs things like FMeshMaterialShaderElementData; where does that get filled out?

Those are just put on the stack and filled out based on what the mesh processor wants. See FEditorPrimitivesBasePassMeshProcessor::ProcessDeferredShadingPath() where it just makes a TBasePassShaderElementData ShaderElementData(nullptr); on the stack.
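
In sketch form (hypothetical FMyElementData / PassShaders, using the usual Process() locals; argument lists vary slightly by engine version), the element data only lives long enough for BuildMeshDrawCommands() to bake everything into the draw command via each shader's GetShaderBindings():

    FMyElementData ShaderElementData;
    ShaderElementData.InitializeMeshMaterialData(ViewIfDynamicMeshCommand, PrimitiveSceneProxy, MeshBatch, StaticMeshId, true);

    BuildMeshDrawCommands(
        MeshBatch, BatchElementMask, PrimitiveSceneProxy,
        MaterialRenderProxy, Material, DrawRenderState, PassShaders,
        MeshFillMode, MeshCullMode, FMeshDrawCommandSortKey::Default,
        EMeshPassFeatures::Default, ShaderElementData);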

How is FShaderBindingState filled out?

Why is LAYOUT_FIELD() so pervasive, what does it actually do?

Makes an FFieldLayoutDesc, which is a linked list of metadata for that struct.
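
Typical usage looks like this (sketch; DECLARE_SHADER_TYPE boilerplate omitted): LAYOUT_FIELD declares the member to the type-layout system so the shader can be frozen/serialized into the shader map, and the constructor binds it by name against the compiled parameter map:

    class FMyExamplePS : public FMeshMaterialShader
    {
    public:
        LAYOUT_FIELD(FShaderParameter, MyValueParam);

        FMyExamplePS() = default;
        FMyExamplePS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
            : FMeshMaterialShader(Initializer)
        {
            // Name must match the parameter declared in the .usf.
            MyValueParam.Bind(Initializer.ParameterMap, TEXT("MyValueParam"));
        }
    };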

Why is the FFieldLayoutDesc useful?

I spent more time reading how Nanite does its GBuffer application pass and decided to give up on that approach. My intuition is that it’s just the wrong approach. I’ve gone back to what I strongly feel should just be a super simple operation, yet I am still missing something…

My current code looks like this:

BEGIN_SHADER_PARAMETER_STRUCT(FMyShaderPassParameters, )
   SHADER_PARAMETER_STRUCT_INCLUDE(FViewShaderParameters, View)
   SHADER_PARAMETER_RDG_UNIFORM_BUFFER(FOpaqueBasePassUniformParameters, BasePass)
   ...
   RENDER_TARGET_BINDING_SLOTS()
END_SHADER_PARAMETER_STRUCT()

class FMyShaderPS : public FMeshMaterialShader
{
   DECLARE_SHADER_TYPE(FMyShaderPS, MeshMaterial);
   using FParameters = FMyShaderPassParameters;
   SHADER_USE_PARAMETER_STRUCT(FMyShaderPS, FMeshMaterialShader);
};

Then later, when I get to the RDG part of things, I attempt to fill out an FMyShaderPassParameters and then set that onto the command list in the lambda. Seems straightforward, right?
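
An untested sketch of that attempt (hypothetical names; SceneColorTexture, the base pass uniform buffer, and PixelShader are created/retrieved elsewhere). SetShaderParameters here is the ShaderParameterStruct.h helper, and this is roughly where the failures described below show up for me:

    FMyShaderPassParameters* PassParameters = GraphBuilder.AllocParameters<FMyShaderPassParameters>();
    PassParameters->View = View.GetShaderParameters();
    PassParameters->BasePass = OpaqueBasePassUniformBuffer; // from CreateOpaqueBasePassUniformBuffer(...)
    PassParameters->RenderTargets[0] = FRenderTargetBinding(SceneColorTexture, ERenderTargetLoadAction::ELoad);

    GraphBuilder.AddPass(
        RDG_EVENT_NAME("MyVisibilityApply"),
        PassParameters,
        ERDGPassFlags::Raster,
        [PassParameters, PixelShader](FRHICommandList& RHICmdList)
        {
            // PSO setup elided.
            SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), *PassParameters);
            FPixelShaderUtils::DrawFullscreenTriangle(RHICmdList);
        });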

The problem I’m currently hitting is that I’m being told that a uniform buffer named “Material” doesn’t exist in my pass parameters. I can only guess this is introduced in FMaterial::SetupMaterialEnvironment where it does this:

	// Add the material uniform buffer definition.
	FShaderUniformBufferParameter::ModifyCompilationEnvironment(TEXT("Material"), InUniformBufferStruct, Platform, OutEnvironment);

Which naturally injects that into what the shader will compile for the material parameters. That makes sense. What I don’t see, however, on the runtime side is how I should avoid this particular failure. In FMaterial::SetParameters() I see that it’ll appropriately bind the “Material” uniform buffer, but the fact that I am failing where I am implies to me that I’m just doing something fundamentally wrong.

It feels like there are different styles of binding shader parameters; if I take BasePassRendering as an example, it only sets some parameters up through RDG, and some get baked as part of the mesh draw command. I also notice that the FMeshMaterialShader subclasses used by BasePassRendering don’t actually use the SHADER_USE_PARAMETER_STRUCT() macro; how do they fill out their FShaderParameterBindings?

Again, it’d be great if any of this were documented, as it’s really difficult to read this code when so much is done in macros - worse, some of which stringify names together so you can’t even search the code to find what you’re looking for!

Okay, next roadblock… It sure feels like this is impossible.

As I see it, I have two options for my PS:

  • FMeshMaterialShader - this doesn’t seem possible to use, because if I use DECLARE_SHADER_TYPE(…, MeshMaterial); then it never actually gets to FShaderParameterBindings::BindForLegacyShaderParameters() and I’m left with a shader that has no parameter bindings. In that situation I am pretty much broken, because either:

    a) I use the SHADER_USE_PARAMETER_STRUCT() macro and the “Material” entry isn’t filled out, and I can’t get past startup

    b) I don’t use SHADER_USE_PARAMETER_STRUCT() and instead attempt to just fill that out manually by allocating it off the RDG and then bind it in the submission lambda with SetParameters(), which then fails in ValidateShaderParameters() (ShaderParameterStruct.cpp) because the shader bindings haven’t been populated yet (due to not using the appropriate shader type).

  • FMaterialShader - this won’t work because there is a deep assumption that if you have a vertex factory then you must be an FMeshMaterialShader, and I need a vertex factory in order to use BasePassPixelShader.usf. (For contrast, the plain global-shader fullscreen path is sketched below.)
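
For contrast, the “just draw a fullscreen pass” version does exist when the pixel shader is a plain FGlobalShader using SHADER_USE_PARAMETER_STRUCT(), which is exactly what a material shader isn’t. Untested sketch with a hypothetical FMyFullscreenPS:

    TShaderMapRef<FMyFullscreenPS> PixelShader(View.ShaderMap);
    FMyFullscreenPS::FParameters* PassParameters = GraphBuilder.AllocParameters<FMyFullscreenPS::FParameters>();
    // ...fill PassParameters...
    FPixelShaderUtils::AddFullscreenPass(
        GraphBuilder,
        View.ShaderMap,
        RDG_EVENT_NAME("MyFullscreenPass"),
        PixelShader,
        PassParameters,
        View.ViewRect);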

Hey, I’m currently working on something similar and obviously all of it is confusing as hell and not at all behaving like you’d expect.

I’ve spent two weeks, day and night, on this, and the only thing I’ve been able to do is expand the GBuffer and do my own thing in SetGBufferForShadingModel within BasePassPixelShader.usf. That was okay since I didn’t do anything regarding depth, but translucency screwed me over and nothing has made sense ever since.

You definitely did more research than me, but I can see it only led to desperation. Now with Strata on the way it all gets way worse, so I’ve kept to CustomOutput nodes and the base pass to get all the values I need from the editor, at least for opaque/masked materials.

With translucency it all goes out the window, since the bindings make no sense, outputs get overwritten, the GBuffer is only partially used, and god forbid you try to do blending on your own.

I probably feel the same way you did a few months ago, so I just wanted to let you know about my “solution,” which is basically to give up and keep extending those monoliths even further.

Is that because translucency is a forward pass?

Back here again, still not finding much info on custom render passes. There’s essentially no info at all regarding the forward renderer either.

I assume everyone else here is modifying the engine directly, but I figured I would share a tip I stumbled upon (I made a separate post for it). Essentially, you can make a local copy of the engine shaders in a plugin and override them via virtual file paths. So you wouldn’t necessarily need to keep up with engine changes, since you could localize at least this much.
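
For reference, the plumbing that trick builds on is the shader source directory mapping, registered from the plugin module (sketch; the plugin name and paths are hypothetical):

    // In the plugin module's StartupModule(): map a virtual shader directory to the
    // plugin's Shaders folder so the copied .usf/.ush files can live in the plugin.
    void FMyShaderOverridePluginModule::StartupModule()
    {
        const FString PluginShaderDir = FPaths::Combine(
            IPluginManager::Get().FindPlugin(TEXT("MyShaderOverridePlugin"))->GetBaseDir(),
            TEXT("Shaders"));
        AddShaderSourceDirectoryMapping(TEXT("/Plugin/MyShaderOverridePlugin"), PluginShaderDir);
    }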

So, I’m here looking at custom render passes because I have no way to mask boat hulls out of the water on mobile LDR. We all know how expensive render targets still are. I don’t have access to distance fields, custom depth, post process, etc. Scene View Extensions look interesting, but they lack what I would need in most places. Plus, I really only want to gather specific items into the render pass to make the mask.

This is essentially what a render target (or custom depth) would be good for, but it probably isn’t a good move till I can use something like this:
https://github.com/EpicGames/UnrealEngine/commit/af32d8267357ff69c8856fcbcc8b2683dd654406

5.5 is still a mess and doesn’t have a Meta fork version yet, so I’ve got to wait and see.

Did anyone get any closer to custom render passes or getting more control over the renderer in 5.4?

You may not have access to global or mesh distance fields, but that doesn’t mean you can’t use custom distance fields. You can create a distance field mathematically or by creating a volume texture that represents the hull of your boat. The latter, for example, can be sampled in the water material as a volume texture sampler, using a material parameter collection to pass the boat’s position and rotation along for alignment. The downside to this technique is that it can only handle a predefined number of boats and its cost scales with how many there are, since each mask needs its own sampler / set of instructions, compared to a single global distance field read.

That’s one boat; I have several things that will need this treatment at once.
Also, volume textures don’t run well on mobile / standalone VR. I can do a flat SDF for my shorelines, but otherwise I would need something more like global distance fields, a custom depth buffer, or my own render pass / render target at screen resolution… which is quite high.

It should be able to handle as many boats as you can pass through the MPC, as long as you don’t need to sample more than one volume in the same bounding box. The global distance field literally is just one giant low-res volume texture, so the cost of sampling a manually generated volume texture should be the same (probably lower). Distance field volume textures can be extremely low resolution, since distance can be reliably interpolated from sparse data.

There is a hard limit of 1024 parameters, and I assume each boat will need two minimum: one for position, one for rotation - so there’s plenty to go around.

In other words, you can reuse the same sampler with multiple coordinate systems simultaneously as long as you don’t try to make one position in space have two coordinates at the same time. You constrain each boat’s locally aligned coordinate system (UVW for a volume texture) to the bounding box of its hull. Then you can add the coordinate systems for every boat together into a single UVW to feed into the volume.

Although if you have only a few boats, it may be cheap enough just to use a separate sampler for each.

There’s also the mathematical route. If you can create a close enough approximation of your hull with math, then you don’t need a texture at all. Here are a bunch of primitive and simple shapes with their associated functions.
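
For example, a hull-ish rounded box is only a few lines of math (shown here in C++; the HLSL for a Custom node is nearly identical):

    // Classic signed-distance function for a rounded box: P is the sample point in the
    // boat's local space, B the half-extents, R the corner radius. Negative = inside.
    float SignedDistanceRoundedBox(const FVector3f& P, const FVector3f& B, float R)
    {
        const FVector3f Q = P.GetAbs() - B;
        const float Outside = Q.ComponentMax(FVector3f::ZeroVector).Size();
        const float Inside = FMath::Min(FMath::Max(Q.X, FMath::Max(Q.Y, Q.Z)), 0.0f);
        return Outside + Inside - R;
    }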

Here are multiple (mathematically generated, not texture based) fully ray marched distance fields running on an 8-year-old Samsung phone. Pretty sure modern mobile hardware can handle what I’m proposing just fine, though admittedly a sphere is the simplest distance field.

We still can’t ray march on Meta Quest; it’s toxic for some reason. I’ve tried doing two metaballs (or whatever they’re called) in an attempt to make a slime before.

I can understand using an MPC; I could also use a dynamic material since it’s one water mesh. What I don’t understand is how I could possibly have that many individual samplers for volume textures. Plus, you’re talking about sizably increasing the complexity of the shader this way.

This type of thing makes more sense as a form of live updated buffer to me.

You only need one sampler per texture regardless of how many boats are present. If they each have unique hull shapes it gets tricky though - see below.

I made a 2D version to demonstrate.

Here is the end result: two coordinate systems sampling the same texture with one sample, at different scales and positions simultaneously (the entire screen depicts just the one material applied to a plane).
You do need logic per ship to construct and mask the coordinates, which has a fixed cost per boat, but it would very likely be better than sampling numerous times. In my 2D example, each “boat” cost 26 instructions.

This will work the same in 3D using a volume texture, if needed. But as shown, this will just instance the same texture in each spot. If we want unique volumes it gets trickier, as mentioned.

Technically you can have multiple hull shapes in one texture sample if you create an atlas volume. In 2D this can be done easily with a texture 2D array, but unfortunately there is no texture 3D array, so you would need to manually atlas the volumes into a single 3D atlas texture. It would increase the memory cost of the volume texture, but it would allow you to sample more hulls with one texture by also giving each ship a parameter to select the correct atlas index. But if your hull shapes are close enough to reuse the same texture (which can be rescaled), then this can be skipped entirely or kept to a minimum.
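
The index-to-UVW remap for such an atlas is simple if the hulls are stacked along one axis (sketch):

    // Remap a hull-local UVW (components in [0,1]) into an atlas that stacks NumSlots
    // hulls along W; AtlasIndex picks the slot. Padding between slots is ignored here.
    FVector3f HullUVWToAtlasUVW(const FVector3f& LocalUVW, int32 AtlasIndex, int32 NumSlots)
    {
        const float SlotSize = 1.0f / float(NumSlots);
        return FVector3f(LocalUVW.X, LocalUVW.Y, (float(AtlasIndex) + LocalUVW.Z) * SlotSize);
    }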

I’m not proposing that you do. The point was to demonstrate multiple distance fields on ancient mobile hardware. This needs only a simple distance comparison for an alpha test.

Obviously I don’t know the scope of your project, but this will work out of the box with no investment in engine modification, and it should be easy enough to implement that, if it turns out to be too expensive, you would probably have spent less time and effort finding that out than exploring the forum.

This was what I was referring to.

Perhaps there is a way with geometry script to construct such an atlas.

Either way this is more setup, and likely more costly, than a single pass that gathers the depth of specific objects. You can do this with a render target, but the slowdown is writing to the texture. There are also the hidden costs of the pass that haven’t been stripped properly, which I’ve seen others complain about. Writing an extra render pass seems like a very steep mountain. If only I could use the custom depth buffer or had more wiggle room in the scene view extensions.

I’ve played with another alternative: hijacking the single layer water shader directly. Since it is two-pass, anything with the same material domain invalidates whatever is in front of it. I could technically mark a hull as single layer water, make it invisible, do some extra HLSL injection, and the object should be considered not underwater. The problem here is that any external object within those bounds is still rendered as underwater. So it inevitably comes back to some sort of masking technique, though I do think there might still be a way to hijack single layer water by messing with the result of the depth pass.

Probably. I would do it offline, in engine or outside. Here’s a blog about generating them.

That alternative is really interesting and promising. I don’t have any issues with external objects appearing as though they are underwater. But if the surface of the blocking mesh isn’t pixel-perfect with the top of the water, then you can get into the gap between them and see a sliver of the water.
Here’s the material I’m using for the hull’s blocking volume.


And the result:

Because SLW writes to scene depth, it messes with the velocity buffer, though. Maybe not a problem for mobile, but it would make this unusable for desktop applications with TAA, motion blur, etc.

I don’t use any of those effects. Even on desktop I try to keep things as stripped down as possible for the cleanest render. I can’t exactly duplicate my entire ocean as that would be a ton of geometry. I’m constrained to about 350 draw calls and 400k tris max at any given point due to platform limitations. I’m aiming for open world too, so it’s a TIGHT space to work in.

You also have to consider that I’m using Gerstner waves, so the mesh is moving quite a bit. Try viewing things from inside the box as a third-person controller, not just from the top.

You could also try setting the “masking plane” you’re using to render only in the depth buffer and not the main pass.

I have a feeling I could make an invisible mesh that reports a fake depth that the single layer water would consider a “no water” space.


Can you show me the rest of your material settings? Your setup is not working for me.

There is no duplicate ocean in my example.

  1. There is an ocean mesh (the blue cube obviously). Nothing special here.
  2. The sphere inside. Nothing special here.
  3. The hull mesh - simply a smaller cube here (not a plane, but I suppose it could be), with the materials shown.

The proxy mesh takes up the interior volume of the hull up to the water level.

One solution is to have the blocker mesh sample the wave function so it can displace its own verts to align with the water. This would require it to have the same polygon density as the water for the top surface though, since if it had fewer polygons then tiny gaps would still appear.
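
For reference, a single Gerstner term is cheap to mirror on the blocker (sketch; parameter conventions vary, this assumes the usual sum-of-Gerstner form, so it has to match whatever the water material actually uses):

    // Displacement of one Gerstner wave at 2D position PosXY. K is the wave vector
    // (direction * 2*pi / wavelength), A the amplitude, Q the steepness in [0,1],
    // Phase = speed * time. Sum several of these and feed the same parameters to
    // both the water and the blocker so their surfaces line up.
    FVector3f GerstnerOffset(const FVector2f& PosXY, const FVector2f& K, float A, float Q, float Phase)
    {
        const float Theta = FVector2f::DotProduct(K, PosXY) - Phase;
        const FVector2f Horizontal = K.GetSafeNormal() * (Q * A * FMath::Cos(Theta));
        return FVector3f(Horizontal.X, Horizontal.Y, A * FMath::Sin(Theta));
    }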

Here is the gap, exaggerated a bit here. The camera is clipping into the surface of both the water mesh and the hull proxy so that the top surface of the water is visible:


Here is the gap removed with pixel perfect alignment:

Essentially it needs to be just far enough above the surface that it doesn’t z-fight with the water, but close enough that there is no visible gap when your camera’s clipping plane intersects both surfaces. There’s probably a better way to solve that gap, but that’s more work than I’m able to spend on this.

Although now that I think of it, if your water material is two sided, you need a second layer just below the surface to block it when looking up from inside.

It would be better to swap to a one sided version of the ocean material when the player is on the ship, as they can presumably never be on the ship and underwater at the same time anyway.

There is literally nothing else to this material. Other than what is shown, and being set to SLW, it is completely default.
It should make whatever is using it totally invisible due to the “color scale behind water” setting at 1. When placed in front of another single layer water material that is farther away, it will overwrite it and make it disappear.