Ray Marching a Volume Texture

Heya,

I pulled in a pull request against the release branch that adds support for volume DDS textures, and I am trying to ray march through them for some cloud shading. This is what I got:

Edit 2:

I removed the old code and images to keep the clutter low since I’ve updated the code since making the thread.

I just realized I wasn’t using StepLarge and StepSmall properly: they need to be vectors, not scalars (i.e. StepLarge * Direction, as opposed to just StepLarge). Fixing the code yields this:


float Opacity = 0;

float3 Ray = Direction;
float3 Step = StepLarge * Direction;

for(int i = 0; i < 100; i++)
{

	if(Opacity >= 1 || Ray.z >= HeightMax || Ray.z <= HeightMin) 
	{
		return Opacity;
	}

	float NormalizedZ = clamp((Ray.z - HeightMin) / (HeightMax - HeightMin), 0.0, 1.0);
	float3 CloudUVW = float3(Ray.x * Tiling.x, Ray.y * Tiling.y, NormalizedZ * 128 * Tiling.z);

	float4 SampleAtPoint = CloudTexture.SampleLevel(CloudTextureSampler, CloudUVW, 0);

	float HeightSample = HeightTexture.SampleLevel(HeightTextureSampler, float2(0.0, NormalizedZ), 0).x;

	float OpacityAtPoint = SampleAtPoint.x * Multiplier * HeightSample;

	if(OpacityAtPoint > 0.0) 
	{
		if(Opacity == 0.0) 
		{
			Ray -= Step;
		}

		Step = StepSmall * Direction;

		
		Opacity += OpacityAtPoint;
	}
	else 
	{
		Step = StepLarge * Direction;
	}

	
	Ray += Step;
}

return Opacity;

This funnily enough results in this:

It’s inverted (only the bottom hemisphere shows anything) and instead of clouds it’s kind of a fuzzy mist…

Edit 3:

The inversion happened because CameraVectorWS points from pixel to camera rather than from camera to pixel; negating it fixed that issue. And it’s not a fuzzy mist, it’s just stupidly high tiling. Dropping the tiling vector to 0.0001 on X and Y resulted in this:

Definitely a step forward! There’s still some weird banding, and removing the multiply by HeightSample removes the insanely high tiling, which makes no sense: HeightSample should just act as a Z mask, not modify the actual look of the clouds.

Any ideas are greatly appreciated.

It could be an issue with the volume sampling method you are using. Internally it may be doing a single sample rather than correctly interpolating between all of the neighbors to form a single point.

I.e., to sample at a UVW value of (0, 0, 0.5), you don’t simply round that W to either 0 or 1; you have to sample both the 0 and the 1 cell and then blend between them using the phase of 0.5. The same goes for U and V if you want better filtering quality for the noise.
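In code, that manual blend along W might look something like this (a sketch; `UV` and `W` stand for whatever scaled coordinates you feed the lookup):

```hlsl
// Manual filtering along W: sample the slice below and the slice above,
// then blend by the fractional slice position.
float4 Below = CloudTexture.SampleLevel(CloudTextureSampler, float3(UV, floor(W)), 0);
float4 Above = CloudTexture.SampleLevel(CloudTextureSampler, float3(UV, ceil(W)), 0);
float4 Filtered = lerp(Below, Above, frac(W));
```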

It also could be related to your viewing ray sampling method.

You are not dividing by local Z, so the glancing-angle rays just miss the next slice intersection. And this happens because you are taking individual slices rather than sampling a truly continuous noise field like you would get using the noise node.

So maybe something like this for the divide by Z:
float3 Ray = Direction;
Ray.xy /= Ray.z;

Doing that may help hide some of the issues with sampling the volume texture incorrectly.

Thanks for the quick reply. I’ve updated my code to (bolded changes):


float Opacity = 0;

float3 Ray = Direction;
**Ray.xy = Ray.xy / Ray.z;**

float3 Step = StepLarge * Direction;

for(int i = 0; i < 100; i++)
{

	if(Opacity >= 1 || Ray.z >= HeightMax || Ray.z <= HeightMin) 
	{
		return Opacity;
	}



	float NormalizedZ = clamp((Ray.z - HeightMin) / (HeightMax - HeightMin), 0.0, 1.0);
	**float W = NormalizedZ * 128 * Tiling.z;

	float3 CloudUVWBelow = float3(Ray.x * Tiling.x, Ray.y * Tiling.y, floor(W));
	float3 CloudUVWAbove = float3(Ray.x * Tiling.x, Ray.y * Tiling.y, ceil(W));

	float4 SampleAtPointBelow = CloudTexture.SampleLevel(CloudTextureSampler, CloudUVWBelow, 0);
	float4 SampleAtPointAbove = CloudTexture.SampleLevel(CloudTextureSampler, CloudUVWAbove, 0);


	float OpacitySample = lerp(SampleAtPointBelow.x, SampleAtPointAbove.x, frac(W));**

	float HeightSample = HeightTexture.SampleLevel(HeightTextureSampler, float2(0, 1 - NormalizedZ), 0).x;

	float OpacityAtPoint = OpacitySample * Multiplier * HeightSample;

	if(OpacityAtPoint > 0.0) 
	{
		if(Opacity == 0.0) 
		{
			Ray -= Step;
		}

		Step = StepSmall * Direction;

		
		Opacity += OpacityAtPoint;
	}
	else 
	{
		Step = StepLarge * Direction;
	}

	
	Ray += Step;
}

return Opacity;

However, the banding is still there. The multiplication by 128 in the W calculation is there because my volume texture has 128 depth layers and the W isn’t normalized.

Result is largely the same:

Edit:

Since I am normalizing the camera → pixel direction vector, don’t I have to take into account the camera position at some point in the calculation too?

Edit 2:

I’ve added the CameraPosition to the ray before the loop. I can now actually move through the clouds and enter them, but they’re woefully low. Moving them high up makes them disappear, most likely because with my small loop count the rays don’t reach them.


float3 Ray = Direction;
Ray += CameraPosition;
Ray.xy = Ray.xy / Ray.z;

Hmm, at that point I suspect the volume texture code you pulled.

I would try to debug a single Z slice at a time for now, and I would probably hook the Z of that slice up to the “Time with Speed Variable” function so you can watch the phase slowly increase. Maybe that will reveal a seam or a pop as the Z crosses each integer threshold, indicating some problem there. If not, it would indicate the volume lookup is correct.
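A minimal debug setup for that, sketched as a Custom node body (assuming `Time` and `Speed` are wired in as scalar inputs and `UV` comes from a TexCoord):

```hlsl
// Scrub through the volume one slice at a time; watch for seams or pops
// as W crosses each integer slice boundary.
float W = Time * Speed;
float4 Slice = CloudTexture.SampleLevel(CloudTextureSampler, float3(UV, W), 0);
return Slice.x;
```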

It’s weird, but this is the same artifact I got when I used the viewing angle to take fewer steps at glancing angles without accounting for the gradient of the noise.

A missing camera position won’t cause any artifacts, but it also won’t give you parallax when the camera moves.

If you are moving your clouds far away then you probably also want to offset the starting position to avoid tracing through empty space, or possibly consider using some kind of low-poly shell geometry.
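A rough sketch of that starting offset, using the variable names from the code above: instead of marching from the camera, intersect the ray with the bottom of the cloud layer analytically and start there.

```hlsl
// Solve CameraPosition.z + t * Direction.z == HeightMin for t, then
// start the march at that point instead of at the camera. max() keeps
// the start at the camera if we are already past the layer bottom.
float t = (HeightMin - CameraPosition.z) / Direction.z;
float3 Ray = CameraPosition + max(t, 0.0) * Direction;
```

This assumes Direction.z is non-zero and the ray actually points toward the layer; rays pointing away can be rejected early.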

When I first pulled the code I created a function that just took a UVW parameter (W appended to TexCoord) and simply output:

VolumeTex.SampleLevel(VolumeTexSampler, UVW, 0);

The slices displayed properly as I scrubbed the W scalar.

Could it be that I have to do the divide by Z each step?

I am out of the office now so can’t test.

No, the divide by Z shouldn’t really be necessary, but it might have exposed the lack of filtering between the samples. Technically you probably want to do it anyway, since for clouds you are trying to trace a certain Z height within the upper atmosphere. Without the divide by Z, your clouds along the horizon will not sample the whole atmosphere, so they will kind of thin out.

I’d try removing that large/small step logic and just go with a single step size. Also try reducing the step size. Most likely the artifact is simply from the step size, but I wouldn’t expect it to be so crisp when sampling a continuous gradient function.

As for the plus pattern: maybe your noise simply does not tile?

There is indeed an issue with tiling, but I discarded that as the culprit because it doesn’t shift with UV scaling. I’ll redo the noise tomorrow to make sure.

I also want to increase the amount of steps but I had “loop cannot unroll” errors trying to do that.

Really? It should work to simply make the max step count a param, like this:

for(int i = 0; i < MaxSteps; i++)

then just make MaxSteps a variable input with a scalar param. That’s how POM does it.

Which pull request did you get?

I only found this one, but Texture2DArrays are mipmapped in 2D, and the filtering for the W component is always nearest rather than trilinear. Texture3D support would fix this issue.

That’s the one.

Hmm, I thought about that initially as well, which is why I suggested the manual Z phase blend. It looks like you already tried exactly what I suggested though and it did not help, with this line:

float OpacitySample = lerp(SampleAtPointBelow.x, SampleAtPointAbove.x, frac(W));

So that should have added in the missing blending on the Z axis, but of course I cannot verify that it was implemented 100% correctly when you consider the un-normalized height etc. Could there still be something in there that isn’t fully debugged?

The best suggestion I can come up with at this point is to completely separate the Z value for your height gradient texture from the 3D texture UVs. It is a bit confusing since you are normalizing things just for the height gradient and mixing that together with the 3D coordinates.

If you guys want, I can upload the engine source with the pull request implemented, together with the example project.

Either way, I will screw around with the step size and number of steps etc. to see if it helps tomorrow morning.

RyanB - now that I think of it, I might be doing something wrong with the sampling, considering W isn’t normalized…

Hey guys,

Turns out the banding was because of StepLarge after all:


Unfortunately I can’t abandon that because otherwise it takes tons of steps to reach the clouds. The remaining lines are due to issues with tiling the texture. However, I think this was the biggest hurdle.

Here’s my latest WIP. Before I start adding more details and shaping the clouds (and fixing the noise tiling) I want to get rid of the noise swimming seen at the end of the video. For some reason the noise “shifts” as I move through it, most likely due to my large step size: rays catch some noise at weird angles. However, for whatever reason I can’t get the math right on “start from HeightMin, do large steps until you hit something, then step back and retrace the path with small steps”.
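For what it’s worth, one way to structure that logic, sketched with the names from the earlier code (`SampleCloud` is a hypothetical stand-in for whatever density lookup you use; untested):

```hlsl
float Opacity = 0;
float3 Ray = StartPosition;              // e.g. the camera advanced to HeightMin
float3 Step = StepLarge * Direction;
bool Refining = false;

for (int i = 0; i < MaxSteps; i++)
{
	float Density = SampleCloud(Ray);    // hypothetical density lookup
	if (Density > 0.0 && !Refining)
	{
		// First hit: back up the large step we overshot with,
		// then continue from there in small steps.
		Ray -= Step;
		Step = StepSmall * Direction;
		Refining = true;
		continue;                        // re-sample from the backed-up position
	}
	Opacity += Density;
	Ray += Step;
	if (Opacity >= 1.0 || Ray.z >= HeightMax || Ray.z <= HeightMin) break;
}
return saturate(Opacity);
```

The key detail is discarding the overshooting sample: the first positive sample only triggers the back-up and is never accumulated, so the surface position comes entirely from the small-step re-trace.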

hey,

I solved that problem for the ‘protostar’ demo by using shell geometry so that the ray trace could start very close to the surface; otherwise the trace had to start at the camera plane for every pixel.

You could start with a giant plane mesh at the correct elevation and eventually upgrade to a sky dome that is actually spherical to get some nice perspective. Check out the Zero Dawn cloud paper if you have not. Keep in mind that even with ridiculously awesome optimizations, they still had to use a special buffer that only rendered 1/16 of the pixels each frame and reprojected old frames for the rest, so their sky was animating very slowly, a few pixels at a time. It is very unlikely that anybody will make a fullscreen 3D volumetric cloud shader that runs very fast without special optimizations like that, or at the very least severely downsized translucency.

And maybe you can look into using variable step size based on viewing angle. That should be fairly simple and you may find that it only takes ~3-4 steps at shallow angles and more like 64 or so at steep angles.
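A sketch of that variable step count (assuming `Direction` is the normalized view ray and `MinSteps`/`MaxSteps` are scalar params):

```hlsl
// Fewer steps at shallow angles, the full count when looking steeply
// through the layer.
int Steps = (int)lerp(MinSteps, MaxSteps, abs(Direction.z));
for (int i = 0; i < Steps; i++)
{
	// ...march as before...
}
```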

Hey Ryan,

This is all actually inspired by the Zero Dawn paper, and I’ve briefly exchanged emails with Andrew Schneider regarding the texture array creation (the tiling problem was that Houdini’s Worley noise doesn’t tile, unlike its Perlin; it looks like I won’t be able to get around that without wrapping the UV sampling manually).

The current material is applied to the default SkySphere. When you say to use shell geometry, do you mean actual geometry or just define the shape via coordinates in the ray trace code? I am not sure I am wrapping my head around this correctly.

But I had no idea Zero Dawn was using the “every 16th pixel” optimization. That’s a shame; it probably means that this won’t be a viable solution for my project unless I go the same road as TrueSky and implement the whole thing as a pre-compiled plugin.

This means of course that I’ll have to pursue alternatives, which brings me to a tangent: you mentioned in a Twitch stream that you’d eventually be able to talk a bit about the sky and clouds you did for Paragon? Any chance you could share some of it?

Best regards,
Damir H.

I did some cloud shaders for the announce cinematic that were basically raytraced heightfield materials that traced towards the light. Since they were still basically cards, the number of traces didn’t have to be that high to look good. I even got them receiving distance field shadows.

For use in-game we were actually taking those and using the in-editor merge tool to bake them down to simple unlit shaders with the lighting baked into the textures.

For that last step, we had to briefly change the cloud materials from emissive to regular lit, and hook the Emissive up to BaseColor and the Opacity to Specular so that the merge tool would capture them (it doesn’t support translucency properly yet). Then of course we had to reverse that step and convert the merged material back to translucency.

One part of it that DID work out quite well, though, is that even though the clouds were individual blueprints per cloud, the merge tool still combined the selected ones into a single draw call just fine.

And I will add that the work I started doing for the sky in the actual level was very early, and I have been working on other projects the last few months. Some other env artists have done work in there, so I am not 100% sure what they are using currently, to be honest. I heard it was a mix of the raytraced clouds and some simple cards, and of course a mix of photo source and hand-painted tweaking for the actual HDRI behind the translucent clouds.

Ah, well, that actor merging won’t help me much unfortunately since we have a dynamic time of day. So it looks like I’m back at square one, hah.

I must admit that I’ve tried creating card clouds several times but every time I fall short right near the start as I have no idea how to even begin. Papers seem weirdly scarce on the subject.

When I tried this a couple of months ago, I made a tiling 3D Worley noise material that saves 256 layers of 256x256 to a 4K CanvasRenderTarget2D. I pre-unrolled the loops so that it doesn’t take forever to compile when you make changes.



// Custom node body. Inputs: x (3D sample position), nc, tilesize, ti;
// ncti = nc/ti below is the wrap period that makes the noise tile.
// The 3x3x3 cell neighborhood is manually unrolled (27 blocks) to keep compile times down.
float scale=nc/tilesize;
float3 sx=x*scale;
float3 p = floor( sx );
float3 f = frac( sx );
float ncti=nc/ti;
float id = 0.;
float2 res = float2( 100.0,100.0 );
float3 dd=float3(1.0,57.0,113.0);
float3 b=float3(-1.,-1.,-1.);
float3 pb=p+b;
float3 pbm=pb-ncti*floor(pb/ncti);
float3 pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
float3 r=float3(b)-f+pbmh;
float d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,-1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,-1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,0.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,0.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,0.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,1.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(-1.,1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,-1.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,-1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,-1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,0.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,0.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,0.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,1.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(0.,1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,-1.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,-1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,-1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,0.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,0.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,0.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,1.,-1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,1.,0.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
b=float3(1.,1.,1.);
pb=p+b;
pbm=pb-ncti*floor(pb/ncti);
pbmh=frac(sin(float3( dot(pbm,float3(127.1,311.7, \
74.7)),dot(pbm,float3(269.5,183.3,246.1)),dot(pbm,float3(113.5,271.9,\
124.6))))*43758.5453123);
r=float3(b)-f+pbmh;
d=dot(r,r);
id=lerp(id,dot(p+b,dd),float(d<res.x));
res=lerp(res,float2(d,res.x),float(d<res.x));
res=lerp(res,float2(res.x,d),float(d<res.y));
return float3( sqrt( res ), abs(id) );
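As an aside, packing 256 slices of 256x256 into a 4096x4096 target implies a 16x16 grid of tiles; reading a slice back out of that atlas could look like this (a sketch; `NoiseAtlas` and the row-major tile layout are assumptions about how the canvas was written):

```hlsl
// Map a slice index (0..255) to its tile in a 16x16 atlas, then
// sample with the in-tile UV wrapped via frac() so the noise tiles.
float SliceIndex = floor(W * 256.0);
float2 Tile = float2(fmod(SliceIndex, 16.0), floor(SliceIndex / 16.0));
float2 AtlasUV = (Tile + frac(UV)) / 16.0;
float4 NoiseSample = NoiseAtlas.SampleLevel(NoiseAtlasSampler, AtlasUV, 0);
```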

As far as optimization goes, I had some success raymarching to multiple render targets, rendering a full view sphere instead of just the visible pixels, and updating distant steps at lower resolution and less often. Still can’t seem to get rid of some tiling artifacts though.

Gonna dig through that code as soon as I can, thanks a lot!

For what it’s worth in my case it only took ages to compile the first time around.

A raymarching loop is only one level deep, but a 3D Worley noise loop is three deep. It was taking 20 minutes every time. I based it off these two shaders: 1] 2]