float4 vs float, which is faster?

Cathco3 · March 19, 2024, 1:16pm

Hi guys. I’m just curious about an HLSL performance question. Which is faster: operations on float4 variables or on float variables? Or are they almost the same?

e.g.
float4 * float4 vs float * float
pow(float4, float4) vs pow(float, float)
lerp(float4, float4, float4) vs lerp(float, float, float)
sin(float4) vs sin(float)

This may help us write better materials. Does anyone know the underlying mechanism? Thanks.

3dRaven · March 19, 2024, 2:02pm

You understand that a float4 has four built-in floats (the underlying structure is a four-component vector), so:
float4.x, float4.y, float4.z, float4.w

It’s not purely a matter of speed: float and float4 are different kinds of type, one value versus four. It’s not like int32 vs int64, which both hold a single value and differ only in their min/max range.

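A float4 operation may be able to process its four components in parallel on GPUs that have vector instructions, which helps if you need to send many values to the GPU in one batch. On GPUs that run each thread with scalar instructions, it is typically compiled into four scalar operations instead, so it costs roughly as much as four float operations.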
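
To make the component-wise point concrete, here is a minimal HLSL sketch. The function names are hypothetical and not taken from any engine or material graph. It only shows that a float4 multiply produces the same values as four float multiplies. Whether the hardware runs it as one vector instruction or as four scalar ones depends on the GPU and the shader compiler, so treat it as an illustration rather than a benchmark.

```hlsl
// Hypothetical helper names, for illustration only.

// One float4 multiply, written the usual way.
// The * operator on vectors is component-wise.
float4 MulVector(float4 a, float4 b)
{
    return a * b; // (a.x*b.x, a.y*b.y, a.z*b.z, a.w*b.w)
}

// The same result spelled out as four scalar multiplies.
float4 MulScalars(float4 a, float4 b)
{
    return float4(a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w);
}

// If only one channel is actually needed, do the work on a float
// so the other three components are never computed.
float MulOneChannel(float4 a, float4 b)
{
    return a.x * b.x;
}
```

The same reasoning applies to pow, lerp, and sin: with float4 arguments they are evaluated per component, so when a material only uses one channel of the result, masking down to a float before the expensive operation is the safer choice.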