Community Tutorial: Ocean Simulation

mjegen · December 12, 2023, 5:47am

CPU iFFT Calculation UE 5.3 (requires C++)

I have implemented the CPU logic that matches the GPU logic in this tutorial. This allows the user to sample the surface displacement at a specific location and can be used to implement things like buoyancy calculations.

This is purely in C++, and I haven’t implemented any hooks to allow it to be called from Blueprints, but it shouldn’t be hard for someone to add them.

First, I’d like to thank DeathreyCG for their effort on the ocean tutorial. It’s a great piece of work and very well laid out. The addition of the links to the background material and theory behind the implementation was very useful.

There were 4 main hurdles to address when building the CPU logic:

Synchronizing random number generation between GPU and CPU.
Matching displacement magnitude at different grid sizes.
Achieving an acceptable accuracy of displacement between GPU and CPU.
Achieving an acceptable level of performance on the game thread.

I’ll break down each of these issues in separate sections.

Random Number Generation

The Random node used in the ocean tutorial is non-deterministic. I replaced it with the deterministic version named “Seeded Float Random”. This requires using an identical seed on both the GPU and the CPU. The DispatchThreadId on the GPU and the iteration indices on the CPU are used to generate the same random value.

It’s worth noting that the “Seeded Float Random” also relies on a 4th seed implemented as a static variable in the shader which is incremented each time a random number is generated. This behavior was replicated in the CPU logic, as well.

I left Seed 2 and Seed 3 set to 0. I’m not sure if this will affect the entropy of the random numbers generated but I didn’t get a chance to dig into that.
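
To illustrate the idea of keeping the two sides in sync, here is a minimal sketch of a seeded generator with a per-call counter standing in for the static 4th seed. The SeededRandom struct, its Mix function and the CallCounter name are all hypothetical, and the mixing function is a generic integer hash, not the one Niagara uses. It only shows why identical seeds plus an identical call order give identical values.

```cpp
#include <cstdint>

// Hypothetical stand-in for a seeded random generator. The mixing function
// below is NOT Niagara's hash; it only illustrates the structure.
struct SeededRandom {
    // Plays the role of the static "4th seed" that is incremented per call.
    std::uint32_t CallCounter = 0;

    // Generic 32-bit integer mixing function (assumption, for illustration).
    static std::uint32_t Mix(std::uint32_t X) {
        X ^= X >> 16;
        X *= 0x7feb352dU;
        X ^= X >> 15;
        X *= 0x846ca68bU;
        X ^= X >> 16;
        return X;
    }

    // Returns a value in [0, 1) and advances the call counter.
    float Next(std::uint32_t Seed1, std::uint32_t Seed2, std::uint32_t Seed3) {
        std::uint32_t H = Mix(Seed1);
        H = Mix(H ^ Seed2);
        H = Mix(H ^ Seed3);
        H = Mix(H ^ CallCounter++);
        // Keep the top 24 bits so the result is exactly representable as a float.
        return static_cast<float>(H >> 8) * (1.0f / 16777216.0f);
    }
};
```

Two instances that are called with the same seeds in the same order return the same sequence. If one side makes an extra call, the counters drift apart and every later value differs, which is why the increment has to be replicated on the CPU.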

Matching Displacement Magnitude

The Niagara System will generate different displacement magnitudes for different grid sizes. For example, dropping the grid size from 256 to 64 results in the following:

Dividing the displacements by 16 results in values that match the 256 grid size. I believe this is because the Phillips spectrum model used has a 1/k^4 term, where k is the length of the wave vector used to populate the spectrum. The k value depends on the grid size.

I added a displacement factor to the Niagara System as a user parameter and modified the GPU code to divide the displacement by this factor. Alternatively, you could adjust the Amplitude values when modifying the grid size instead but I found this made it easier to play with different grid sizes. Here is the grid size at 64 with the displacement factor of 16 and the original parameters from the tutorial:
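
As a rough sketch of what applying such a factor looks like on the CPU side: the Displacement struct and both function names below are hypothetical. GuessDisplacementFactor is only an extrapolation from the single observation above (a factor of 16 for 64 relative to 256) and has not been checked against other grid sizes, so treat it as a starting point for tuning rather than a rule.

```cpp
// Hypothetical displacement type: X and Y horizontal, Z vertical.
struct Displacement {
    float X;
    float Y;
    float Z;
};

// Divide a raw displacement by the user-tuned displacement factor.
inline Displacement ApplyDisplacementFactor(Displacement Raw, float Factor) {
    return { Raw.X / Factor, Raw.Y / Factor, Raw.Z / Factor };
}

// ASSUMPTION: squares the grid ratio, which reproduces the one data point
// from the post (64 vs 256 -> 16). Unverified for any other grid size.
inline float GuessDisplacementFactor(int GridSize, int ReferenceGridSize = 256) {
    const float Ratio = static_cast<float>(ReferenceGridSize) / static_cast<float>(GridSize);
    return Ratio * Ratio;
}
```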

Displacement Accuracy

Originally, I was following the advice of the tutorial and running a grid size of 32 on the CPU and 256 on the GPU. It became obvious that the displacements calculated at the same location differed materially between the two. I was seeing differences as large as a couple of meters using the parameter values from the tutorial.

For use cases with large floating objects this might be acceptable, but my use case requires calculating the surface of the water to allow a player to swim in high seas. So this wasn’t going to work for me.

In order to achieve an accurate displacement at a specific location the GPU and CPU were going to have to run with matching grid sizes.

It is also worth keeping in mind that getting the actual height of the surface at a specific point is not trivial. My implementation will generate the displacement (horizontal and vertical) at a given location. Additional logic will be required to find the height of the surface at a specific location. This video describes the issue at around 13:40.
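
One common way to handle this is a fixed-point search: start at the target location, see where the displaced point lands, and shift the sample position by the horizontal error until it lands close enough. The sketch below is not part of the project. FindSurfaceHeight, DisplacementSampler, the iteration count and the tolerance are all assumptions, and the sampler stands in for whatever function returns the CPU displacement at an undisplaced grid position.

```cpp
#include <cmath>
#include <functional>

struct Vec2 { float X; float Y; };
struct Vec3 { float X; float Y; float Z; };

// Hypothetical sampler: takes an undisplaced horizontal position and returns
// the displacement there (X and Y horizontal, Z vertical).
using DisplacementSampler = std::function<Vec3(Vec2)>;

// Finds the surface height above Target by searching for the undisplaced
// position whose horizontally displaced point lands on Target.
inline float FindSurfaceHeight(const DisplacementSampler& SampleDisplacement,
                               Vec2 Target,
                               int MaxIterations = 8,
                               float Tolerance = 0.01f) {
    Vec2 Guess = Target;
    Vec3 D = SampleDisplacement(Guess);
    for (int Iteration = 0; Iteration < MaxIterations; ++Iteration) {
        // How far the displaced point is from where we want it to be.
        const float ErrorX = (Guess.X + D.X) - Target.X;
        const float ErrorY = (Guess.Y + D.Y) - Target.Y;
        if (std::fabs(ErrorX) < Tolerance && std::fabs(ErrorY) < Tolerance) {
            break;
        }
        // Shift the sample position to cancel the error and resample.
        Guess.X -= ErrorX;
        Guess.Y -= ErrorY;
        D = SampleDisplacement(Guess);
    }
    return D.Z;
}
```

This tends to converge in a few iterations when the horizontal displacement changes slowly from point to point. With very choppy waves it may not settle, which is why the iteration count is capped.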

Performance

This brings us to the meat of the implementation. I tried increasing the grid size on the CPU to 64 without any optimizations. The performance was awful. It was consuming about 20ms on the game thread.

Addressing this required leveraging both task parallelism and vector parallelism. Unreal has mechanisms that make this fairly trivial. A ParallelFor handled the task parallelism and ISPC handled the vector parallelism.

The OceanCalculate counter measures the impact to the game thread. It runs at about 0.7ms on my AMD Ryzen 9 5900X 12-Core processor. The VectorRowPass and VectorColPass iterations run on multiple threads so their total time is higher but it is a concurrent workload.
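
To show why the work splits cleanly across threads, here is a minimal, serial sketch of a 2D inverse transform done as a row pass followed by a column pass. It is not the project's code: it uses a naive O(N^2) inverse DFT per line instead of FFT butterflies, it assumes a square grid, and the names InverseDft1D and InverseDft2D are hypothetical. The point is only that each row, and then each column, can be processed without touching any other, so each outer loop body is the kind of work that could be handed to a ParallelFor.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;
using Grid = std::vector<std::vector<Complex>>; // Grid[Row][Col], assumed square

// Naive unnormalized inverse DFT of a single line. A real FFT would replace
// this with log2(N) butterfly stages.
inline std::vector<Complex> InverseDft1D(const std::vector<Complex>& In) {
    const std::size_t N = In.size();
    const float TwoPi = 6.28318530718f;
    std::vector<Complex> Out(N);
    for (std::size_t n = 0; n < N; ++n) {
        Complex Sum(0.0f, 0.0f);
        for (std::size_t k = 0; k < N; ++k) {
            const float Angle = TwoPi * static_cast<float>(k * n) / static_cast<float>(N);
            Sum += In[k] * Complex(std::cos(Angle), std::sin(Angle));
        }
        Out[n] = Sum;
    }
    return Out;
}

inline void InverseDft2D(Grid& Data) {
    const std::size_t N = Data.size();

    // Row pass: every row is independent of the others.
    for (std::size_t Row = 0; Row < N; ++Row) {
        Data[Row] = InverseDft1D(Data[Row]);
    }

    // Column pass: every column is independent once all rows are done.
    for (std::size_t Col = 0; Col < N; ++Col) {
        std::vector<Complex> Column(N);
        for (std::size_t Row = 0; Row < N; ++Row) {
            Column[Row] = Data[Row][Col];
        }
        Column = InverseDft1D(Column);
        for (std::size_t Row = 0; Row < N; ++Row) {
            Data[Row][Col] = Column[Row];
        }
    }
}
```

The column pass must not start until the whole row pass has finished, so in a parallel version the two loops stay as two separate dispatches.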

Using the Project

In order to change the grid size of the GPU logic, you need to:

Modify the GridSize and HalfGrid Size in the FX_OceanWater_SetInitials module
Modify the size of all the render targets in the FX Ocean Water Set Render Targets section of the WaterSim Emitter
Modify the Dispatch parameters in the RowPass and ColPass stages of the WaterSim Emitter
Modify the LENGTH and BUTTERFLY_COUNT defines (BUTTERFLY_COUNT should be equal to log2(LENGTH)) in OceanWaterFFT.ush

In order to change the grid size of the CPU logic, you need to:

Modify the GPU_GRID_SIZE, GRID_SIZE and BUTTERFLY_COUNT defines in OceanFFTData.h
Modify the GRID_SIZE define in OceanFFTCalculator.ispc

The GPU_GRID_SIZE define needs to match the GridSize used in the GPU calculations. This allows the random numbers generated on the CPU to match those on the GPU.
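
Since these defines have to be kept consistent by hand, a couple of compile-time checks can catch a mismatch early. This is a sketch only: the values below are placeholders rather than the ones in OceanFFTData.h, the helper functions IsPowerOfTwo and Log2Floor are hypothetical, and the power-of-two check assumes a radix-2 FFT.

```cpp
// Placeholder values for illustration; the real ones live in OceanFFTData.h.
#define GPU_GRID_SIZE 64
#define GRID_SIZE 64
#define BUTTERFLY_COUNT 6

// True when Value has exactly one bit set.
constexpr bool IsPowerOfTwo(unsigned Value) {
    return Value != 0 && (Value & (Value - 1)) == 0;
}

// Integer log2, rounded down.
constexpr unsigned Log2Floor(unsigned Value) {
    return Value <= 1 ? 0 : 1 + Log2Floor(Value / 2);
}

static_assert(IsPowerOfTwo(GPU_GRID_SIZE), "GPU_GRID_SIZE should be a power of two");
static_assert(IsPowerOfTwo(GRID_SIZE), "GRID_SIZE should be a power of two");
static_assert(BUTTERFLY_COUNT == Log2Floor(GRID_SIZE),
              "BUTTERFLY_COUNT should equal log2(GRID_SIZE)");
```

With checks like these, changing GRID_SIZE to 128 without also changing BUTTERFLY_COUNT to 7 fails the build instead of producing wrong displacements at runtime. They cannot check that GPU_GRID_SIZE matches the Niagara side, so that still has to be verified manually.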

The project contains a reference to the Unreal Water plugin. It’s not technically required, but it was an easy way to get a surface on which to apply the Ocean Material.

The project was created based on the empty game template in the Unreal Editor. I replaced the default map with a new map. It is a C++ project and was set up to use VSCode as the IDE. You can use the OceanSampleEditor (DebugGame) configuration to launch the editor.

The ispc compile step generates some warnings about performance that can be ignored.

I added a shader directory to the primary game module in order to allow relative references to the shader files in the Niagara System.

Console Commands

Enter the “ocean.ShowDisplacement 1” console command to show the surface points calculated on the CPU in the editor.

Enter the “stat ocean” console command to show the performance counters associated with the CPU calculations.

Here is a link to the project files:
OceanSample.zip (3.9 MB)

@Deathrey - Feel free to use this to supplement your tutorial if you feel it would add value.