Yeah, so I ended up writing the data to a render target.
There’s a fair amount of optimization you can do to make writing to the render target faster, but the gist of the issue is:
the render target has to be written to from the render thread / game thread (if I remember correctly),
but we want to do as little computation as possible on the render thread / game thread. Ideally we’d do everything on a background thread.
So what we do is we create a new memory buffer that’s the same size as the render target, and we perform all computations on a background thread.
Then on the render thread it’s just a matter of locking the render target, memcpy, and then unlocking.
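Since I don’t have the original code, here is a rough sketch of that pattern in plain standard C++, not actual engine code. Everything in it is made up for illustration: `FakeRenderTarget` and its `Lock`/`Unlock` stand in for whatever lock/map call your engine gives you, and `ComputePixels`, `UploadToTarget` and `UpdateTarget` are hypothetical names. It also assumes a tightly packed RGBA8 layout; a real locked surface may have a row pitch larger than width * 4, in which case you’d copy row by row instead of doing one memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical stand-in for a render target. In a real engine the pointer
// would come from the engine's own lock/map call, not from a std::vector.
struct FakeRenderTarget {
    int width;
    int height;
    std::vector<std::uint8_t> pixels; // assumed RGBA8, 4 bytes per pixel

    FakeRenderTarget(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * h * 4) {}

    std::uint8_t* Lock() { return pixels.data(); }
    void Unlock() {}
};

// The expensive part. Runs on a background thread and only touches its own
// CPU-side buffer, which has the same size and layout as the render target.
std::vector<std::uint8_t> ComputePixels(int width, int height) {
    std::vector<std::uint8_t> staging(static_cast<std::size_t>(width) * height * 4);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = (static_cast<std::size_t>(y) * width + x) * 4;
            staging[i + 0] = static_cast<std::uint8_t>(x & 0xFF); // R
            staging[i + 1] = static_cast<std::uint8_t>(y & 0xFF); // G
            staging[i + 2] = 0;                                   // B
            staging[i + 3] = 255;                                 // A
        }
    }
    return staging;
}

// The cheap part. This is all the render/game thread has to do:
// lock, memcpy, unlock.
void UploadToTarget(FakeRenderTarget& target,
                    const std::vector<std::uint8_t>& staging) {
    assert(staging.size() == target.pixels.size());
    std::uint8_t* dst = target.Lock();
    std::memcpy(dst, staging.data(), staging.size());
    target.Unlock();
}

// Ties the two together. get() blocks here to keep the sketch short; in a
// real frame loop you would poll the future and upload once it is ready.
void UpdateTarget(FakeRenderTarget& target) {
    std::future<std::vector<std::uint8_t>> job = std::async(
        std::launch::async, ComputePixels, target.width, target.height);
    const std::vector<std::uint8_t> staging = job.get();
    UploadToTarget(target, staging);
}
```

Calling `UpdateTarget(target)` on a `FakeRenderTarget target(256, 256)` fills it with the test gradient. The point of the split is that the thread that owns the render target only ever pays for a single memcpy, no matter how expensive `ComputePixels` gets.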
I don’t have any of my old code, so hopefully this helps enough. Feel free to ask any questions.