So a blurred glass material is impossible in Unreal Engine 4?

It’s a bit hard to tell what you’re doing as the screenshot has been scaled down, but I assume you’re using a material or a render target. In that case what’s even more effective than sampling with an offset like that is to use the lower res mipmap if it’s available. Try that and you should incur no speed penalty, since if they exist they’re being created regardless.
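For a render target that has mips, something like this in a Custom node would do it. This is only an untested sketch: it assumes a Texture Object input named Tex (UE pairs it with a TexSampler) and that mips actually exist for that texture.


float MipLevel = 3; // made-up value: higher = blurrier
// One sample from a lower mip instead of many offset taps
return Texture2DSampleLevel(Tex, TexSampler, UV, MipLevel);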

By the way, for a LONG time I’ve wanted to be able to access whatever mips of the post process materials exist; I wish Epic would open them up for us.

@HungryDoodles: very cool result! And thanks for posting this.
Do you have a higher resolution screenshot? I think I could guess and reconstruct it, but it would be easier to study it directly.
How could you make this blur depend on depth?

@Antidamage: does the engine generate mips for the post process materials? I’ve asked this question before and never got a clear answer.

You know what? I think we could fake it with a render target in the same location as the camera doing a low-res snap of the scene. With some settings switched off it’ll be much faster than doing a 10x10 blur iteration on the full image.

I’ve written the material in HLSL; that wasn’t difficult at all, I should mention.
As before, the image needs to be processed in two separate passes: vertical and horizontal. (It does make a cool effect if the two passes are applied simultaneously, though.)
For the vertical pass (the horizontal pass is commented out with /* */):


int TexIndex = 14; // Can be either 13... I think
bool bFiltered = true; // Can be false
float3 blur; //= SceneTextureLookup(UV, TexIndex, bFiltered);

//Vertical pass
for (int i = 0; i < cycles; i++)
{
  float c = 1.0f / (3.14159f * amount);
  float e = -(i * i) / (amount);
  float falloff = (c * exp(e));
  blur += SceneTextureLookup(UV + float2(i * rx, 0), TexIndex, bFiltered) * falloff;
  blur += SceneTextureLookup(UV - float2(i * rx, 0), TexIndex, bFiltered) * falloff;

}
//Horizontal pass
/*for (int j = 0; j < cycles; j++)
{ 
  float c = 1.0f / (3.14159f * amount);
  float e = -(j * j) / (amount);
  float falloff = (c * exp(e));
  blur += SceneTextureLookup(UV + float2(0, j * ry), TexIndex, bFiltered) * falloff;
  blur += SceneTextureLookup(UV - float2(0, j * ry), TexIndex, bFiltered) * falloff;

}*/

//blur /= 2 * cycles + 1;
return blur;

And material nodes:


I also used a Gaussian falloff (normal distribution), which is defined by the following function:


float gauss(float x, float amount) {
    double c = 1.0 / (2.0 * 3.14159265359 * amount);
    double e = -(x * x) / (2.0 * amount);
    return (float) (c * exp(e));
}

Where x is the loop iteration number and amount is the exponential amount (the smaller this number, the less blurriness).
Wikipedia: Normal Distribution
Talking about the physical interpretation, this is mostly how a blurry material distributes light, so it can look very believable.
There is a little hack with image brightness, because the integral of the normal distribution is… well…
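(One way around that, as a minimal sketch reusing the gauss() function above: accumulate the weights alongside the samples and divide at the end, so the kernel always sums to 1 no matter what amount is.)


float3 blur = 0;
float totalWeight = 0;
for (int i = 0; i < cycles; i++)
{
  float falloff = gauss(i, amount);
  blur += SceneTextureLookup(UV + float2(i * rx, 0), TexIndex, bFiltered) * falloff;
  blur += SceneTextureLookup(UV - float2(i * rx, 0), TexIndex, bFiltered) * falloff;
  totalWeight += 2 * falloff; // two taps per iteration
}
return blur / totalWeight; // brightness stays stable regardless of amount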
Result:


NOTE: If you want to use this with a transparent material, then you need to use Texture2DSample(Tex, TexSampler, UV) instead of SceneTextureLookup… or something that contains the image behind the object and works with one of these samplers.
Needs optimization! The input resolution needs to be downscaled by 2 or 4 times, which would greatly improve performance, because 50 iterations currently takes 5-6 ms at 1080p, and that’s bad.
I wish this were CUDA, because then I could precalculate the normal distribution once and use it as a const array, which would cut 90% of the calculation time (exponentiation is a very expensive operation). Are there any buffers or something like that?

Actually THE BEST optimization would be a Fast Fourier Transform, but I don’t see any way to do that here.

We can easily make the blur depend on depth by “regulating” the “cycles” variable with depth. The other question is how to do it.
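For example, something like this at the top of the Custom node. Untested sketch: CalcSceneDepth should be usable in post process materials, and the 5000-unit range below is a made-up tuning value.


float depth = CalcSceneDepth(UV); // scene depth in world units at this pixel
float depthFactor = saturate(depth / 5000.0f); // 0 near the camera, 1 at 5000 units and beyond
int depthCycles = (int)(cycles * depthFactor); // fewer blur iterations for near pixels

Alternatively, scale the per-step offset (rx/ry) by depthFactor instead of the iteration count.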

@Antidamage You know, we could even make blur under UMG using a render target! That’s a cool idea! I’m going to make it, but a little bit later; university takes a lot of time.

Material is free to use and edit :o


You can cut the number of samples in half with this bilinear sampling trick.

Another smaller optimization (page 27):


exp(x) is implemented as exp2(x * 1.442695)

So you can save one multiply per iteration by hoisting that constant into a new constant that is calculated outside the loop (or at compile time).


float inverseAmountWithExp2Trick = 1.442695 / amount;
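Inside the loop that would look roughly like this (untested):


float falloff = c * exp2(-(i * i) * inverseAmountWithExp2Trick); // same value as c * exp(-(i * i) / amount)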

Untested code for taking advantage of bilinear sampling and halving the sample count:



float3 blur = 0.0; // Always remember to initialize all variables.
//Vertical pass
for (int i = 0; i < cycles; i += 2)
{
  float c = 1.0f / (3.14159f * amount);
  float e = -(i * i) / (amount);
  float falloff = (c * exp(e));
  float e2 = -((i+1) * (i+1)) / (amount);
  float falloff2 = (c * exp(e2));
  
  float combinedFalloff = falloff + falloff2;
  float offset = falloff2 / combinedFalloff; // place the tap between the two texels so bilinear filtering blends them with the right ratio
  blur += SceneTextureLookup(UV + float2((i + offset) * rx, 0), TexIndex, bFiltered) * combinedFalloff;
  blur += SceneTextureLookup(UV - float2((i + offset) * rx, 0), TexIndex, bFiltered) * combinedFalloff;

}


Tested it; there is absolutely no visual difference (the end user will not suspect anything):


And I’ve got 45 FPS instead of 35 (54 without blur). That’s a very cool optimization!

But we still need to scale the resolution down by 2 or 4 times, because that will significantly increase performance.
And I’m thinking about which is better:
Using another post process to fake the resolution downscale, which adds an overhead of 4 texture samples per pixel, but lets the blur work on only half or a quarter of the resolution?
Or using a render target to actually render at half resolution, which increases draw calls and makes the image cost more dependent on the triangle count?

Can you share the updated code and material?
So the cost of the post process blur is now: 1000 ms / 45 FPS - 1000 ms / 54 FPS ≈ 3.7 ms. This is too cheap to justify the overhead of an extra pass, so don’t try to use a render target with an extra render pass. You could try to lower the sample count even further by skipping samples using spatial and temporal dithering.

One small optimization is to pull constant math out of the Custom node to the material level, so the material shader compiler can precompute those values on the CPU. Example: if you calculate the inverse of Amount in the material and then feed that into the Custom node, you save a division (which is more expensive than a multiplication). The material compiler can’t do any math folding for custom nodes, and because Amount is a Parameter it isn’t a constant for the shader compiler.

Edit: Notice that c is constant, so you can move it outside the loop and just multiply the blur value once after the loop, roughly as sketched below.
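Roughly like this, as a sketch. InverseAmount here is a hypothetical extra Custom node input, computed as 1/Amount on the material graph:


float3 blur = 0;
for (int i = 0; i < cycles; i++)
{
  // no division inside the loop: InverseAmount = 1/Amount comes precomputed from the material
  float falloff = exp(-(i * i) * InverseAmount);
  blur += SceneTextureLookup(UV + float2(i * rx, 0), TexIndex, bFiltered) * falloff;
  blur += SceneTextureLookup(UV - float2(i * rx, 0), TexIndex, bFiltered) * falloff;
}
// c = 1/(pi * Amount) is constant, so apply it once after the loop
return blur * (InverseAmount / 3.14159f);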

Why not use a Gaussian two-pass filter with precalculated weights as a vector? No need to do any math except calculating the UVs and blending.
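Something like this would be the sketch; the weights below are the commonly quoted 9-tap Gaussian weights (already normalized), so the blur radius is fixed rather than driven by an Amount parameter:


// Precomputed, normalized Gaussian weights: no exp() in the shader at all
static const int TapCount = 5;
static const float Weights[5] = { 0.227027f, 0.194594f, 0.121621f, 0.054054f, 0.016216f };

float3 blur = SceneTextureLookup(UV, TexIndex, bFiltered) * Weights[0];
for (int i = 1; i < TapCount; i++)
{
  blur += SceneTextureLookup(UV + float2(i * rx, 0), TexIndex, bFiltered) * Weights[i];
  blur += SceneTextureLookup(UV - float2(i * rx, 0), TexIndex, bFiltered) * Weights[i];
}
return blur;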

New material for vertical pass:


Updates:
Loop iterations halved.
c is now outside the loop (-0.5 ms of calculation time). But I assume it still needs to be dragged up to the node level…
Notes:
Temporal AA could cut the calculation time in half, but the material looks very, very strange if there is e.g. FXAA instead. But there is some variable to handle both Temporal AA and FXAA, right? The most I’ve done with a GPU is progressively drawing the Mandelbrot set with CUDA; I barely know the shader pipeline…
It’s not only constants that can be calculated on the CPU, I wrote about it before:

Not 90%, maybe only 20%, but nevertheless, if it could somehow be done that would be cool.
I’m creating a HUD for my game, and mostly I’m thinking about how to create blur behind the HUD elements. But I have no ideas…

Can you test making the Blurriness parameter a constant? Because your loop iteration count is constant, the HLSL compiler will unroll the loop. If the amount is also constant, the compiler can precalculate those expensive exp calls in your unrolled loop. Adding the attribute


UNROLL

directly before the loop should add the


[unroll]

attribute on platforms where it’s supported. Does that make it faster at all? If it doesn’t, then the only choice is to try to reduce the number of texture samples.
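In the code above that would just be:


UNROLL
for (int i = 0; i < cycles; i += 2)
{
  // ...loop body unchanged...
}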

Unrolled the loops; now I’m getting 47-49 FPS (jumping around for some reason), but this is already a great success!
Just checked whether the post process execution time is shown anywhere… Maybe this:


And it’s probably true, because I dropped the graphics settings from Epic to Low and now both blur passes take around 1-1.5 ms, which is already pretty nice, considering that there are 25 iterations.

Nice. After unrolling there isn’t much more you can do. Now one iteration just calculates the UVs (2x MAD) and the blur with falloff (2x3 MAD). So it’s 8 ALU ops and 2 TEX samples per iteration. It’s all about texture bandwidth now.

I did not know about the unroll directive until now. Nice!

I’m using the example above as a blur in a post processing layer with blur offsets of around a thousand. It looks good even then, but I added a couple more samples in the corners to help avoid the horizontal and vertical streaking. Overall the effect only has about a 10% fps penalty. I also tried to handle the lighting variations. It’s not perfect, but it’s fairly stable for blur amounts between 2 and 1000.

I also merged as many different statements as I could into one single execution line. Epic have said that there’s a small speedup on the GPU from doing this.


const int Cycles = 25; // integer constant, so the loop bound is known at compile time and can be unrolled
float c = 1.0f / (3.14159f * BlurAmount);
float4 Blur = float4(0, 0, 0, 1);
if (BlurAmount < 2) return SceneTextureLookup(UV, floor(BufferIndex), true);
[unroll(Cycles)]
for (int i = 0; i < Cycles; ++i) {
    float e = -(i * i) / (BlurAmount);
    float falloff = (c * exp(e));
    float e2 = -((i + 1) * (i + 1)) / (BlurAmount);
    float falloff2 = (c * exp(e2));
    float combinedFalloff = falloff + falloff2;
    float Offset = falloff2 / combinedFalloff;


    float2 UVOffset = PixelSize * (i + Offset);
    Blur += 
        (SceneTextureLookup(UV + float2(UVOffset.x, 0), BufferIndex, true)
        + SceneTextureLookup(UV + float2(0, UVOffset.y), BufferIndex, true)
        + SceneTextureLookup(UV + float2(UVOffset.x, UVOffset.y), BufferIndex, true)
        + SceneTextureLookup(UV + float2(UVOffset.x, -UVOffset.y), BufferIndex, true)) * combinedFalloff;
}
return (Blur * clamp(13 - BlurAmount, 2, 12) * BlurAmount) / (32 + BlurAmount / Cycles);

I am very much interested in your video tutorials for setting up custom blurred glass materials. Thank you.

Hey everyone!
So I’m trying to use this code with a material set up exactly like you pictured (I had no luck putting it into a material function, it complains about the SceneTexture node). What I get as a result is that the editor crashes when I enter a post process volume with this material assigned. Can anyone help?

Bump for greater justice! We need a blurred/privacy glass solution!

Just wear pants

Is there a YouTube tutorial where this is explained?

Cheers.
