Community Tutorial: Ocean Simulation

Do you also need to reduce the BUTTERFLY_COUNT to 5 for the CPU logic?

Edit: I believe the answer is yes. The butterfly algorithm takes 5 passes to cover the 32 inputs to the PingPongArray, since log2(32) = 5.
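As a quick sanity check on the pass count, here is a minimal sketch that assumes the butterfly is radix-2 (each pass halves the sub-problem size). The function name `butterfly_pass_count` is made up for illustration and is not from the tutorial.

```python
def butterfly_pass_count(size: int) -> int:
    """Number of radix-2 butterfly passes needed for a power-of-two input size."""
    if size < 2 or size & (size - 1):
        raise ValueError("size must be a power of two and at least 2")
    # For a power of two, bit_length() - 1 equals log2(size).
    return size.bit_length() - 1


print(butterfly_pass_count(32))  # 5
```

If the CPU side works on a different input size than the shader, the same calculation gives the pass count to use there.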

Also, beware the GroupMemoryBarrierWithGroupSync call in the shader. The CPU logic needs to emulate this behavior: the barrier ensures the PingPongArray is fully initialized before the butterfly passes begin, and it also ensures all earlier butterfly passes have completed before the final butterfly pass on the 4 cascades.
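One way to get the same ordering guarantee on the CPU is simply to finish each stage completely before starting the next one. The sketch below is an assumption-heavy illustration, not the tutorial's code: it uses a Stockham-style radix-2 FFT with two buffers (`ping` and `pong`, hypothetical names standing in for the PingPongArray), and the exact butterfly indexing in the shader may differ. The points where the shader would need a barrier are marked in comments.

```python
import cmath


def stockham_fft(data):
    """Forward FFT of a power-of-two-length sequence using ping-pong buffers."""
    n = len(data)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    passes = n.bit_length() - 1  # 5 passes for 32 inputs

    # Stage 0: initialize the read buffer.
    ping = [complex(v) for v in data]
    pong = [0j] * n
    # Barrier point: every element must be written before any pass reads it.
    # A single-threaded loop gets this for free; a multi-threaded CPU version
    # would need an explicit join or barrier here.

    length, stride = n, 1
    for _ in range(passes):
        half = length // 2
        for p in range(half):
            twiddle = cmath.exp(-2j * cmath.pi * p / length)
            for q in range(stride):
                a = ping[q + stride * p]
                b = ping[q + stride * (p + half)]
                pong[q + stride * (2 * p)] = a + b
                pong[q + stride * (2 * p + 1)] = (a - b) * twiddle
        # Barrier point: the whole pass must finish before the buffers swap,
        # otherwise the next pass could read half-written values.
        ping, pong = pong, ping
        length, stride = half, stride * 2

    return ping
```

Because each pass reads only from one buffer and writes only to the other, the swap after the inner loops plays the role the barrier plays between passes in the shader. If the passes are split across worker threads, an explicit barrier is needed at both marked points.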