I think so, yes. Your dense array would be filled with pointers to static voxel types (or just read them from disk, as suggested earlier). To be more specific, the leaf should probably contain:
physical node bounds
parent reference
sibling references
reference to voxel array
reference to loose object array
voxel group supertype (filled, empty, modified)
etc.
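If it helps to see that as code, here's a minimal sketch of such a leaf as a plain struct. All names and types here are placeholders, not from any existing codebase:

#include "CoreMinimal.h"

class AActor;
struct FVoxelBranch; // the branch/parent node type, defined elsewhere

// Illustrative supertype flag for a whole leaf's worth of voxels.
enum class EVoxelGroupType : uint8 { Empty, Filled, Modified };

// Rough sketch of a leaf node; purely illustrative.
struct FVoxelLeaf
{
    FBox              Bounds;        // physical node bounds in world space
    FVoxelBranch*     Parent;        // parent reference
    FVoxelLeaf*       Siblings[7];   // references to the parent's other children
    TArray<uint16>*   VoxelData;     // reference to the dense voxel array (null if empty/unloaded)
    TArray<AActor*>*  LooseObjects;  // reference to loose objects contained in this node
    EVoxelGroupType   GroupType;     // filled / empty / modified supertype
};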
Ryvar's Minecraft link mentioned that some of the optimisations we've already talked about don't have any effect often enough to worry about, but I'd still leave that door open in case the scene you're creating can benefit from them.
You might find that it's not an ideal implementation for what you're doing and it could be worth creating your own scheme using more optimal types, etc., but it'll still be good to look at.
Hopefully not off topic, but I can vouch for the one flat array and creating only the outer visible faces.
It takes me 2.6 seconds to create 81 64x64x64 chunks of sine wave meshes, with the game running at 50 fps and never exceeding 20 ms: http://puu.sh/jssa0/f17b9a1bc3.jpg (on my GT 730M).
Doing a simple dictionary check on chunks (not voxels) brings it up to 3.4 sec…
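For anyone wondering what generating only the outer visible faces looks like, it usually boils down to a neighbour test like this rough sketch (IsSolid and AddQuad are stand-ins for your own chunk data and mesh builder, not code from that project):

#include "CoreMinimal.h"

// Sketch of visible-face extraction over one cubic chunk.
void BuildVisibleFaces(
    int32 ChunkSize,
    TFunctionRef<bool(int32 X, int32 Y, int32 Z)> IsSolid,
    TFunctionRef<void(int32 X, int32 Y, int32 Z, int32 Direction)> AddQuad)
{
    static const FIntVector Directions[6] =
    {
        FIntVector(1,0,0), FIntVector(-1,0,0),
        FIntVector(0,1,0), FIntVector(0,-1,0),
        FIntVector(0,0,1), FIntVector(0,0,-1)
    };

    for(int32 Z = 0; Z < ChunkSize; ++Z)
    for(int32 Y = 0; Y < ChunkSize; ++Y)
    for(int32 X = 0; X < ChunkSize; ++X)
    {
        if(!IsSolid(X, Y, Z))
        {
            continue;
        }
        for(int32 Dir = 0; Dir < 6; ++Dir)
        {
            const FIntVector N = FIntVector(X, Y, Z) + Directions[Dir];
            // Only emit a quad where the neighbouring voxel is empty (or outside the chunk).
            if(!IsSolid(N.X, N.Y, N.Z))
            {
                AddQuad(X, Y, Z, Dir);
            }
        }
    }
}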
Also I would suggest not creating the physics on the meshes. There are ways to fake it quite effectively.
Just having a LOD so that only the closest chunks have physics would be OK, I think.
But what I did in the Blender game engine was to spawn real physical cubes inside the visual mesh.
By doing it in the 3x3x3 space around the player, the cube count is very low.
Account for the velocity vector and you are done.
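As a rough sketch of that idea in UE terms (VoxelSize, IsSolid and SpawnCollisionCube are placeholders, not real APIs):

#include "CoreMinimal.h"

// Sketch only: spawn (or recycle) physical cubes for the solid voxels in the
// 3x3x3 neighbourhood around the player.
void UpdateNearbyCollision(
    const FVector& PlayerLocation,
    float VoxelSize,
    TFunctionRef<bool(const FIntVector&)> IsSolid,
    TFunctionRef<void(const FIntVector&)> SpawnCollisionCube)
{
    // The voxel cell the player is currently standing in.
    const FIntVector PlayerCell(
        FMath::FloorToInt(PlayerLocation.X / VoxelSize),
        FMath::FloorToInt(PlayerLocation.Y / VoxelSize),
        FMath::FloorToInt(PlayerLocation.Z / VoxelSize));

    for(int32 DZ = -1; DZ <= 1; ++DZ)
    for(int32 DY = -1; DY <= 1; ++DY)
    for(int32 DX = -1; DX <= 1; ++DX)
    {
        const FIntVector Cell = PlayerCell + FIntVector(DX, DY, DZ);
        if(IsSolid(Cell))
        {
            SpawnCollisionCube(Cell);
        }
    }

    // To account for velocity, offset PlayerCell along the player's velocity
    // (or widen the neighbourhood) so fast movement never outruns the collision.
}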
TArray<uint8> LocalBrickMaterials;
LocalBrickMaterials.SetNumUninitialized(BricksPerRegion.X * BricksPerRegion.Y * BricksPerRegion.Z);
// ...
FGraphEventArray XYStackCompletionEvents;
for(int32 LocalY = 0;LocalY < BricksPerRegion.Y;++LocalY)
{
    for(int32 LocalX = 0;LocalX < BricksPerRegion.X;++LocalX)
    {
        // Create a task for each stack of constant XY.
        XYStackCompletionEvents.Add(FFunctionGraphTask::CreateAndDispatchWhenReady([&,LocalX,LocalY]()
        {
            // ... do some work here that is common to each stack of constant XY; evaluate 2D heightmap noise, etc.
            for(int32 LocalZ = 0;LocalZ < BricksPerRegion.Z;++LocalZ)
            {
                // ... compute the desired MaterialIndex here.
                LocalBrickMaterials[((LocalY * BricksPerRegion.X) + LocalX) * BricksPerRegion.Z + LocalZ] = MaterialIndex;
            }
        }, TStatId(), NULL));
    }
}
// Wait for all the XYStackCompletionEvents tasks to complete.
FTaskGraphInterface::Get().WaitUntilTasksComplete(XYStackCompletionEvents,ENamedThreads::GameThread);
Grid->SetBrickMaterialArray(MinRegionBrickCoordinates,MaxRegionBrickCoordinates,LocalBrickMaterials);
These tasks are probably smaller than optimal; regions are pretty large, so it’s creating a lot more tasks than hardware threads here, when it’s probably more efficient to create some small multiple of the number of hardware threads. However, this approach (one task per stack of constant XY) was a decent win.
This is a complicated tradeoff. You want the actors to be small enough that you can generate them on demand without causing a hitch. You also want the actors to be large enough that the per-actor rendering overhead isn’t substantial. When I added task parallelism to the BrickGame terrain generation and chunk setup, it allowed me to make the terrain generation code more complicated, generating more different materials (e.g. different stone types, metal strands, etc). However, that dramatically increased the number of draw calls needed, as each unique material within a chunk adds a draw call. Fortunately, with the task parallelism I had saved enough time that I could make the chunks much larger without causing hitches, so I was able to more than compensate for that increase in draw calls through having fewer, larger chunks.
A side note is that you probably want to have a single actor/component that stores the brick data, even though you’ll be rendering it with multiple actors/components. That lets you decouple the granularity you want to render at from saving/loading, generation, visibility, collision, etc.
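Not the actual BrickGame classes, but a sketch of how that decoupling could look: one component owns the data and exposes Get/SetBrickMaterialArray-style volume accessors, while separate chunk components only know which sub-volume they render (all class names here are made up):

#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "Components/PrimitiveComponent.h"
#include "BrickComponents.generated.h" // hypothetical header name

// One component owns all of the brick data.
UCLASS()
class UBrickDataComponent : public UActorComponent
{
    GENERATED_BODY()
public:
    // Saving/loading, generation and queries all go through this single owner.
    // (Implementations elided in this sketch.)
    void GetBrickMaterialArray(const FIntVector& MinBrick, const FIntVector& MaxBrick, TArray<uint8>& OutMaterials) const;
    void SetBrickMaterialArray(const FIntVector& MinBrick, const FIntVector& MaxBrick, const TArray<uint8>& Materials);
};

// Any number of render components each draw one sub-volume of it.
UCLASS()
class UBrickChunkRenderComponent : public UPrimitiveComponent
{
    GENERATED_BODY()
public:
    UPROPERTY()
    UBrickDataComponent* Data;

    // The brick volume this chunk renders; chunk granularity is independent of the data layout.
    FIntVector MinBrickCoordinates;
    FIntVector MaxBrickCoordinates;
};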
Yeah, I don’t mean to say you shouldn’t have any sparsity, just that having per-voxel sparsity is pretty wasteful. In BrickGame, I have per-region sparsity, which is something like a 128x128x256 grid of voxels. In the code I posted above, notice how it uses a linear local array, and then passes it to SetBrickMaterialArray: in this case, the SetBrickMaterialArray should end up basically just doing a memcpy, but it also handles writes to volumes that overlap multiple regions. That case is handled not as a single TMap lookup per voxel, but as effectively one TMap lookup for each region that the volume overlaps. Same thing with GetBrickMaterialArray in the rendering and collision setup code: it costs a memcpy, but means that code doesn’t need to worry about sparsity.
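To illustrate the idea (this is not the real BrickGrid code, just a sketch with made-up names): a TMap keyed by region coordinates, a dense per-region array, and the TMap only touched once per region rather than once per voxel:

#include "CoreMinimal.h"

// Per-region sparsity: one dense array per region, stored in a TMap.
struct FRegionData
{
    TArray<uint8> BrickMaterials; // dense RegionSize.X * RegionSize.Y * RegionSize.Z array
};

static const FIntVector RegionSize(128, 128, 256);
static TMap<FIntVector, FRegionData> Regions; // absent key == empty region

void SetBrickMaterial(const FIntVector& BrickCoords, uint8 MaterialIndex)
{
    // Which region does this brick fall in? (FloorToInt so negative coordinates work too.)
    const FIntVector RegionCoords(
        FMath::FloorToInt((float)BrickCoords.X / RegionSize.X),
        FMath::FloorToInt((float)BrickCoords.Y / RegionSize.Y),
        FMath::FloorToInt((float)BrickCoords.Z / RegionSize.Z));

    // The only sparse lookup: one TMap access per call. A batched volume setter like
    // SetBrickMaterialArray would instead do one lookup per overlapped region and then
    // copy whole spans into the dense array, which is close to a memcpy in the common case.
    FRegionData& Region = Regions.FindOrAdd(RegionCoords);
    if(Region.BrickMaterials.Num() == 0)
    {
        Region.BrickMaterials.SetNumZeroed(RegionSize.X * RegionSize.Y * RegionSize.Z);
    }

    // Brick coordinates local to the region, then the same (Y,X,Z) linear layout as above.
    const FIntVector Local(
        BrickCoords.X - RegionCoords.X * RegionSize.X,
        BrickCoords.Y - RegionCoords.Y * RegionSize.Y,
        BrickCoords.Z - RegionCoords.Z * RegionSize.Z);
    Region.BrickMaterials[((Local.Y * RegionSize.X) + Local.X) * RegionSize.Z + Local.Z] = MaterialIndex;
}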
This is more or less what BrickGame does, but with a large enough radius that it doesn’t need to worry about velocity. It was constrained by a maximum number of collision boxes supported by PhysX, so it can’t create collision boxes nearly as far as you can render the voxels.
My approach to terrain generation/mesh building is somewhat different than a bunch of the stuff outlined thus far. The actual generation of terrain-type values - as opposed to calculation of final mesh in a semi-uniform hull system - is fairly rapid and can just use a timesliced queue in the game thread. So for general ease of use for designers I actually perform all terrain data generation in Blueprint, and then feed the generated Region’s data into the mesh builder threadpool.
One of the more interesting things this approach enables is tinkering with your terrain generation algorithm in Blueprint, and then when the compilation delegate for that Blueprint fires, flushing the voxel system and rebuilding (this assumes you’ve built your system to work in-editor, not just PIE/in-game). End result: you can immediately see the resulting output from your new terrain algorithm within a second or two of clicking compile on the Blueprint, allowing for extremely rapid iteration with a multi-monitor setup. You do need to be careful about throwing as many early-opt-out opportunities up front as possible, though.
Anyone interested in this approach may find the following snippet from my module .cpp useful (mostly this is ConstructorHelpers::FClassFinder, outside a constructor - it works fine but there’s probably a better way). Mostly I’m posting this because it was a frustrating PITA that I couldn’t find anything like an answer or guide to hooking onto a specific Blueprint’s compile event anywhere.
Hey guys! I've made a bit of progress on the Octree and I would like a “review” and an answer to a small question.
There is UVoxelNode (Header/Source), which is kind of an interface (in OOP terms, an abstract class) with basic functionality. Then there is UVoxelBranch (Header/Source), which is a branch in the tree (duh). And finally there's UVoxelLeaf (Header/Source), which stores the voxel data in a dense array of 16^3 uint16.
The idea is that empty branches (where all voxel data equals 0) should be null so I save a bit of RAM. Am I doing things right? I know it's simplistic right now, but I will be improving upon this, so I kind of need it to be relatively solid.
My question is, after reading UVoxelLeaf’s code, how would I convert the Position (in World coordinates) to an array index in the function SetVoxelAt?
P.S.: I'm using UObjects as base classes for these because I read somewhere on the forums that UObjects have “negligible overhead”. Can somebody confirm?
We're reaching the limit of what I know now, as I'm still figuring UE and C++ out. I would have asked the same thing about UObjects though.
At worst, you could use a UObject to manage the octree as a whole but not segregate it into sub-UObjects for each branch. If you set it up right you should never need to see the branches and nodes from the blueprint graph, only the various ways of getting results from the index. Keep in mind that branches are just data structures inside other data structures. They’re literally eight float values and an array of child branches. They don’t need anything UObject can offer and never need to be exposed other than as a stop-point during a search.
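As a sketch of that split (names made up): plain structs for the nodes, and a single UObject that owns the root and only exposes query functions:

#include "CoreMinimal.h"
#include "UObject/Object.h"
#include "VoxelOctree.generated.h" // hypothetical header name

// Plain struct: no UObject overhead per node. Only the manager below is a UObject.
struct FOctreeNode
{
    FBox Bounds;                                  // node bounds (min/max corners)
    TArray<TUniquePtr<FOctreeNode>> Children;     // up to eight children; empty for leaves
    TArray<uint16> VoxelData;                     // dense voxel data, only populated on leaves
};

// One UObject manages the whole tree and exposes just the queries;
// branches and leaves never need to be visible to Blueprint.
UCLASS()
class UVoxelOctree : public UObject
{
    GENERATED_BODY()

public:
    UFUNCTION(BlueprintCallable, Category = "Voxel")
    int32 GetVoxelAt(const FVector& WorldPosition) const;

    UFUNCTION(BlueprintCallable, Category = "Voxel")
    void SetVoxelAt(const FVector& WorldPosition, int32 Value);

private:
    TUniquePtr<FOctreeNode> Root; // the whole tree lives behind this one object
};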
I could totally be wrong about this, but doesn’t “uint16 Data[16][16][16]” give you a 16x3 multidimensional array, as opposed to 16^3? Someone correct me if I’m off the mark, because I came from a PHP background and I had to unlearn all their bad ways and this kind of notation was one of them.
Anyway, traversing the octree gets you TO the leaf, which is only different in that it’s the smallest iteration of sorting and compartmentalization we’ve done on all of our objects. It has nothing special going on other than ITS children should be a list of voxels or actors.
From there, depending on what you’re doing, you can:
For non-world-voxel things, walk the contents of the leaf testing everything
For world voxels, take into account that you’re working on a regular grid and do some math to find the elements you want.
I know how I'd do that in JavaScript, and I suspect there's a more optimal way, but here's how I'd do it in theory:
Take an axis from your world positional data
Divide it by the size of your voxels in world positional units.
Floor it
You have an int pointing to an axis point, which we’ll call x. Repeat for y and z.
Now, depending on whether you’re using a multidimensional array or just a linear one, you then either:
Get Voxels[int][int][int]
Get Voxels[x + y * widthOfChunk + z * widthOfChunk * heightOfChunk] // Grids start at zero!
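In C++ that whole conversion might look roughly like this (VoxelSize, ChunkOrigin and the chunk dimensions are placeholders for whatever your leaf actually stores):

#include "CoreMinimal.h"

// Sketch only: world position -> flat voxel array index.
int32 WorldPositionToIndex(
    const FVector& WorldPosition,
    const FVector& ChunkOrigin,
    float VoxelSize,
    int32 WidthOfChunk,
    int32 HeightOfChunk)
{
    // Work in chunk-local space, then divide by the voxel size and floor each axis.
    const FVector Local = WorldPosition - ChunkOrigin;
    const int32 X = FMath::FloorToInt(Local.X / VoxelSize);
    const int32 Y = FMath::FloorToInt(Local.Y / VoxelSize);
    const int32 Z = FMath::FloorToInt(Local.Z / VoxelSize);

    // Callers should check 0 <= X < WidthOfChunk (and likewise for Y and Z) before indexing.
    return X + Y * WidthOfChunk + Z * WidthOfChunk * HeightOfChunk;
}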
Hey there guys! Just posting my progress and asking a question here.
With the help of Andrew, I managed to make my octree work and generate some noise! It's even procedurally generating geometry! Screenshot: