I was able to save 10 instructions from the main loop by removing the need for the ‘lastoffset’ variable, and also by multiplying ‘stepsize’ by ‘UVDist’ outside of the custom node. The product is what gets plugged into the UVDist input; stepsize is still needed as its own input for the z offset.
It may be possible to save a few more by simplifying how the ‘yintersect’ variable is passed. That variable is only needed for PixelDepthOffset (without it, the pixel depth offset has harsh step seams), but so far I cannot figure out a way to simplify that without adding an unwanted branch to the function. Any more inputs and it becomes dauntingly unusable IMO.
Updated code for the parallax-only node:
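float rayheight=1;
float oldray=1;
float2 offset=0;
float oldtex=1;
float texatray;
float yintersect;
int i=0;
while (i<MaxSteps+1)
{
    // Height under the current ray position, read from the selected channel.
    texatray=dot(HeightMapChannel, Tex.SampleGrad(TexSampler,UV+offset,InDDX,InDDY));
    if (rayheight < texatray)
    {
        // The ray dropped below the surface during this step.
        // xintersect is the fraction of the step to move back to the crossing point.
        float xintersect = (oldray-oldtex)+(texatray-rayheight);
        xintersect=(texatray-rayheight)/xintersect;
        // Ray height at the crossing point, used for PixelDepthOffset.
        yintersect=(oldray*(xintersect))+(rayheight*(1-xintersect));
        offset-=((xintersect)*UVDist);
        break;
    }
    // No hit yet: remember this sample and advance the ray one step.
    oldray=rayheight;
    rayheight-=stepsize;
    offset+=UVDist;
    oldtex=texatray;
    i++;
}
float3 output;
output.xy=offset;
output.z=yintersect;
return output;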
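For anyone who wants to check the interpolation inside the `if` block without opening the material editor, here is a minimal CPU-side sketch in C of just that arithmetic. The function names `intersect_fraction` and `intersect_height` are made up for this illustration and are not part of the node. The sketch treats both the ray and the height profile as straight lines between the previous and the current sample.

```c
/* Fraction of the last step, measured back from the current sample,
   at which the ray crosses the height profile. Mirrors xintersect. */
static float intersect_fraction(float oldray, float oldtex,
                                float rayheight, float texatray)
{
    float denom = (oldray - oldtex) + (texatray - rayheight);
    return (texatray - rayheight) / denom;
}

/* Ray height at the crossing point. Mirrors yintersect. */
static float intersect_height(float oldray, float rayheight, float xintersect)
{
    return oldray * xintersect + rayheight * (1.0f - xintersect);
}
```

As a hand-checkable case: if the ray goes from 0.75 down to 0.5 while the height goes from 0.5 up to 0.75, the two lines cross halfway through the step, so the fraction is 0.5 and the crossing height is 0.625.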
Also, it would be nice to get this working for the rayheight<=texatray case, since that would be a whole step faster for fully white heightmap pixels.
So far, when I try to get that working, the first layer either gets invalid results (it attempts to divide by 0) or, if I add an offset to keep it from dividing by 0, its offset doesn’t match up with the next layer. If anybody can help figure that out, that would be awesome and would make it another tiny bit faster.
A separate if statement works, but the branching overhead is probably not worth it unless there is a significant number of white pixels.
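The divide by 0 in the `<=` case can be reproduced on the CPU. On the very first iteration oldray, oldtex and rayheight are all 1, so a white pixel (texatray of 1) makes both the numerator and the denominator 0. The C sketch below is a 1D stand-in for the loop with `<=` in place of `<`. It is only a sketch under assumptions: `sample_height` is a hypothetical flat height field replacing the texture lookup, UVDist is a single float instead of a float2, and the `oldtex_init` parameter exists only so two starting values can be compared. One idea it lets you try is starting oldtex at 0 instead of 1. Since oldtex is overwritten at the end of every iteration, its starting value only matters on the first one, where the denominator then becomes texatray rather than 0. I have not checked this in an actual material or counted instructions, so treat it as something to try rather than a confirmed fix.

```c
#include <math.h>

typedef struct { float offset; float yintersect; } Hit;

/* Hypothetical stand-in for the texture lookup: a flat height field. */
static float sample_height(float flat_height, float u)
{
    (void)u;  /* position is ignored because the field is flat */
    return flat_height;
}

/* 1D version of the loop using <= instead of <.
   oldtex_init is the starting value of oldtex (the node uses 1). */
static Hit march(float flat_height, float stepsize, float uvdist,
                 int max_steps, float oldtex_init)
{
    float rayheight = 1.0f, oldray = 1.0f, offset = 0.0f;
    float oldtex = oldtex_init;
    Hit hit = { 0.0f, 0.0f };
    for (int i = 0; i < max_steps + 1; i++) {
        float texatray = sample_height(flat_height, offset);
        if (rayheight <= texatray) {
            float xintersect = (oldray - oldtex) + (texatray - rayheight);
            xintersect = (texatray - rayheight) / xintersect;
            hit.yintersect = oldray * xintersect
                           + rayheight * (1.0f - xintersect);
            offset -= xintersect * uvdist;
            break;
        }
        oldray = rayheight;
        rayheight -= stepsize;
        offset += uvdist;
        oldtex = texatray;
    }
    hit.offset = offset;
    return hit;
}
```

With a fully white field, `march(1.0f, 0.125f, 0.25f, 8, 1.0f)` produces a NaN offset, which is the invalid first layer. The same call with `oldtex_init` set to 0 returns an offset of 0 and a yintersect of 1. A field at 0.9375 hits on the second sample and returns an offset of 0.125 for either starting value, and that offset shrinks toward 0 as the height approaches 1, so in this 1D stand-in the white layer lines up with the one below it.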