Real-Time Marching Cubes by CUDA on Unreal Engine 4

Kakushi_52 · November 3, 2017, 5:09am

I managed to use NVIDIA CUDA and the Thrust library with Unreal Engine 4, so I implemented real-time marching cubes in CUDA and visualized the result in Unreal Engine 4.
https://youtube.com/watch?v=SVlsjbilW3w

My brain MRA (Magnetic Resonance Angiography): 512 x 512 x 184 = 48,234,496 voxels

My thoracic CT (Computed Tomography): 512 x 512 x 463 = 121,372,672 voxels

Volumes of about 120 million voxels, and more than 8 million meshes, can be handled in real time.
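For readers new to the algorithm, the per-cell classification step at the heart of marching cubes can be sketched in plain C++. This is only a CPU illustration under stated assumptions, not the CUDA code from the video: the `Volume` struct, the function names, and the corner-to-bit ordering are all made up for the example, and a real triangle lookup table expects its own specific corner order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar volume stored x-fastest: index = x + nx * (y + ny * z).
// HYPOTHETICAL layout; the post does not describe the author's data layout.
struct Volume {
    std::size_t nx, ny, nz;
    std::vector<float> values;

    float at(std::size_t x, std::size_t y, std::size_t z) const {
        return values[x + nx * (y + ny * z)];
    }
};

// Build the 8-bit case index of the cell whose lowest corner is (x, y, z).
// Bit (dx + 2*dy + 4*dz) is set when that corner lies below the isolevel.
// ASSUMPTION: this bit order is a simplification chosen for the sketch.
std::uint8_t cellCaseIndex(const Volume& v, std::size_t x, std::size_t y,
                           std::size_t z, float isolevel) {
    std::uint8_t index = 0;
    for (std::size_t dz = 0; dz < 2; ++dz)
        for (std::size_t dy = 0; dy < 2; ++dy)
            for (std::size_t dx = 0; dx < 2; ++dx)
                if (v.at(x + dx, y + dy, z + dz) < isolevel)
                    index |= static_cast<std::uint8_t>(1u << (dx + 2 * dy + 4 * dz));
    return index;
}

// Count the cells the isosurface passes through. A cell whose index is
// 0 or 255 has all corners on one side and produces no triangles.
std::size_t countActiveCells(const Volume& v, float isolevel) {
    if (v.nx < 2 || v.ny < 2 || v.nz < 2) return 0;
    std::size_t active = 0;
    for (std::size_t z = 0; z + 1 < v.nz; ++z)
        for (std::size_t y = 0; y + 1 < v.ny; ++y)
            for (std::size_t x = 0; x + 1 < v.nx; ++x) {
                const std::uint8_t idx = cellCaseIndex(v, x, y, z, isolevel);
                if (idx != 0 && idx != 255) ++active;
            }
    return active;
}
```

Each cell is classified independently of the others, which is why this step suits a GPU: one thread per cell can compute the index. Whether the implementation shown in the video is organized this way is not stated in the post.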

The bottleneck is the data transfer between GPU and CPU, as mentioned below.
https://forums.unrealengine.com/community/work-in-progress/114470-physics-forests-a-new-real-time-fluid-solver?p=1358202#post1358202
I am facing exactly the same problem.

After calculating the vertices and triangles on the GPU, the data has to be transferred from the GPU to the CPU so that it can be passed to the Procedural Mesh Component.
And (I think) the Procedural Mesh Component transfers the data back to the GPU at some point.

Data transfer between the GPU and the CPU is quite slow: a one-way transfer often takes more than 10 msec, which means more than 20 msec for the GPU-to-CPU-to-GPU round trip.
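To see why a transfer of this size eats into the frame budget, here is a rough back-of-the-envelope sketch in C++. It rests on several assumptions that are not from the post: that the "8 million" figure counts triangles, that the mesh is an unindexed triangle list holding positions only (3 vertices of 3 floats each), and that the bandwidth passed in is a placeholder value rather than a measurement. The function names are hypothetical.

```cpp
#include <cstdint>

// Bytes needed for an unindexed, position-only triangle list:
// 3 vertices per triangle, 3 floats per vertex.
// ASSUMPTION: no normals, UVs, or index buffer are included.
constexpr std::uint64_t triangleListBytes(std::uint64_t triangleCount) {
    return triangleCount * 3u * 3u * sizeof(float);
}

// Idealized one-way transfer time in milliseconds at a sustained bandwidth
// given in bytes per second. Latency and driver overhead are ignored.
constexpr double transferMilliseconds(std::uint64_t bytes, double bytesPerSecond) {
    return 1000.0 * static_cast<double>(bytes) / bytesPerSecond;
}

// Time available per frame at a target frame rate, for comparison.
constexpr double frameBudgetMilliseconds(double framesPerSecond) {
    return 1000.0 / framesPerSecond;
}
```

With 4-byte floats, `triangleListBytes(8000000)` is 288,000,000 bytes. Plugging in a placeholder bandwidth of 12e9 bytes per second gives roughly 24 msec one way, which is already more than the 16.7 msec budget of a 60 fps frame. The real number depends on the hardware and on whether the host buffer is page-locked.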

The best solution might be to analyze the source code of the Procedural Mesh Component or the Runtime Mesh Component and build a new mesh component, something like a “CUDA Mesh Component”, which does not need a TArray for the mesh data but only GPU memory pointers allocated with cudaMalloc.
However, that is probably too tough for me to implement…

At the very least, it would save some time if the raw data pointer of a TArray could be set directly, something like TArray::GetData() = a (CPU memory) pointer,
but that seems impossible, just as the pointer returned by std::vector::data() cannot be reassigned.

I would like to “move” or “reference” the data in page-locked memory allocated by cudaMallocHost into a TArray instead of “copying” it.
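The difference between "copy" and "reference" can be sketched in standard C++. In this illustration std::vector stands in for TArray and an ordinary array stands in for the page-locked buffer; `BufferView`, `copyIntoVector`, and `referenceBuffer` are hypothetical names made up for the example and are not Unreal Engine or CUDA APIs.

```cpp
#include <cstddef>
#include <vector>

// Non-owning view: remembers a pointer and a length, and never allocates
// or frees. HYPOTHETICAL type used only for this sketch.
template <typename T>
struct BufferView {
    T* data;
    std::size_t size;
    T& operator[](std::size_t i) const { return data[i]; }
};

// "Copy": the container allocates its own storage and duplicates every
// element, which is the cost the post wants to avoid.
template <typename T>
std::vector<T> copyIntoVector(const T* source, std::size_t count) {
    return std::vector<T>(source, source + count);
}

// "Reference": wrap the existing buffer. No element is duplicated, but the
// view is only valid while the underlying buffer stays alive.
template <typename T>
BufferView<T> referenceBuffer(T* source, std::size_t count) {
    return BufferView<T>{source, count};
}
```

The view costs nothing to create regardless of buffer size, while the copy scales with the number of elements. The catch is the one described above: a view only helps if the consuming API accepts a pointer and a length rather than an owning container.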

Perhaps the next step is to implement direct volume rendering, as in the following thread?
https://forums.unrealengine.com/development-discussion/rendering/91596-your-thoughts-on-and-comments-to-volume-rendering-in-unreal-engine-4

Anyway, the combination of CUDA’s performance and the strong power of Unreal Engine 4 has resulted in amazing beauty!

Vin · November 5, 2017, 10:28pm

…looks great. Will you be posting a guide or tutorial?

Kakushi_52 · November 6, 2017, 11:23am

Thank you so much for your comment!

I’ve already written an introductory tutorial on how to use CUDA kernel functions from Unreal Engine 4, but it is currently only in Japanese…
http://www.sciement.com/tech-blog/c/cuda_in_ue4/

I’ve attached several images in English, and also pasted some basic code.
I hope it will help you!

anonymous_user_3375a5e9 · November 6, 2017, 2:30pm

Regarding that GPU-CPU-GPU transfer:
It would be nice if it were possible this way: https://answers.unrealengine.com/que…-from-gpu.html
But apparently that solution writes into the rendering buffers while the other thread is rendering from them, so it flickers (maybe once per second on average).
Does anyone have any idea how to lock the buffer?

Vin · November 8, 2017, 7:24pm

Thank you for taking the time to post.

mittense · November 8, 2017, 9:05pm

Was it relatively painless to get CUDA integrated?

Kakushi_52 · November 10, 2017, 9:59am

On Windows, if you are familiar with static link libraries, it’s not so difficult, but if you also want to use the Thrust library, it is rather more difficult.
In fact, there were some compile errors when I tried to use Thrust from Unreal Engine, because some macros in Unreal Engine and Thrust had exactly the same names.
So I had to customize the Thrust library a little.
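For anyone hitting a similar name clash, one commonly used alternative to editing the library is to save and restore the conflicting macro around the include with `#pragma push_macro` / `#pragma pop_macro` (supported by MSVC, GCC, and Clang). This is a different technique from the one described above, and the sketch below is only an illustration: the post does not say which macros collided, so `verify_value` and the `third_party` namespace are hypothetical stand-ins.

```cpp
// Stand-in for an engine header that defines a short, common macro name.
// HYPOTHETICAL: the post does not say which macros collided.
#define verify_value(x) ((x) + 1000)

// Save the engine's definition, then remove it before the library header.
#pragma push_macro("verify_value")
#undef verify_value

// Stand-in for a third-party header that uses the same name as a function.
// In a real project this would be the #include of the library header.
namespace third_party {
inline int verify_value(int x) { return x * 2; }
inline int library_entry(int x) { return verify_value(x); }
}  // namespace third_party

// Restore the engine's macro for the rest of the translation unit.
#pragma pop_macro("verify_value")

// Engine-side code sees the macro again after the pop.
inline int engine_side(int x) { return verify_value(x); }
```

Here `third_party::library_entry` calls the library's own function, while `engine_side` expands the restored macro. This only helps when the clash is in the header text itself; code written after the pop still cannot name the library's function directly without the macro getting in the way.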

mittense · November 11, 2017, 4:52am

That’s not bad at all; thanks. CUDA is so huge that I wasn’t sure if it would be as easy as what I’ve integrated thus far.

Kakushi_52 · November 28, 2017, 11:32pm

This project has been selected as one of the NVIDIA Edge Program Winners - November 2017 (NVIDIA Edge Program: November 2017 winners announced! - Unreal Engine)

Thank you so much!