Generate Procedural Mesh

SiggiG · July 8, 2015, 1:00pm

[=J. J. Franzen;326605]
how does it work? It’s just an array of integers, so I have to assume that it relies on the order of the indices in the array to define a triangle, so the first three indices define the first triangle in the mesh
[/]

And the order you define the triangle dictates which way the surface normal of that triangle faces A-B-C vs C-B-A

J.J.Franzen · July 8, 2015, 6:00pm

Hello . Thanks for the explanation. I do understand that forcing unique vertices makes it more efficient from an engine standpoint. I just wish there was a way to extrapolate the triangles into their own entity that I could operate on so that’s easier for me to understand similar to how the old custom mesh operated. Would you be willing/able to share an example of said rolling landscape? That’s essentially what I’m trying to do, but based on hex tiles which I then want to dynamically subdivide depending on the surface features (hills get more tris, plains get less, etc.). I can build the hex tile easily enough but trying to wrap my head around the subdivision algorithm using this vertex buffer approach is making my brain hurt. I guess that’s why I’m not a programmer… I was able to get it working using the old Custom Mesh since I could just recursively operate on each triangle in the mesh array. Maybe if I can see it working for someone else it would help me get in the proper mindspace. Cheers,

J^2

Dig_Squid · July 8, 2015, 7:59pm

For what you are doing, maybe you want to keep using a triangle struct, subdivide it all down, and then turn it into a vertex/index buffer at the end!

Ryvar · July 9, 2015, 6:02am

: with Scheidecker’s directional culling pull request accepted ( https://.com/EpicGames/UnrealEngine/pull/1279#issuecomment-119207227 ) - have you considered adding support for this to ProceduralMeshComponent?

For both the uniform and semi-uniform hull voxel implementations I maintain, plus an abandoned one, this has consistently dropped draw time to ~60-65% of base (or: 60% rendering performance boost) going back to 4.3 (and holy cow it’ll be nice to not have to patch it in with every new point release going forward).

Might make more sense in a dedicated subclass (ProceduralUniformMeshComponent ?), but given the level of interest in voxel tech, tightness of the performance/design constraints feedback loop in voxel-based games, and sheer magnitude of the increase… seems like it might be worthwhile.

Edit: changed performance number to use reciprocal for sake of clarity

Dig_Squid · July 9, 2015, 10:13am

[=Ryvar;327760]
: with Scheidecker’s directional culling pull request accepted ( https://.com/EpicGames/UnrealEngine/pull/1279#issuecomment-119207227 ) - have you considered adding support for this to ProceduralMeshComponent?
[/]

I’m not actually familiar with that code, would you consider making a pull request that adds support? I’m not sure how much work would be required. Is this basically doing frustum culling on each section?

Xilverback · July 9, 2015, 7:41pm

One thing I’ve been trying to figure out, Is it possible to do this mesh generation outside the game thread? There, the performance is less of an issue and lower priority meshes can be generated asynchronously. I’m using a runnable to do all the possible calculations in a separate thread but trying to call the the CreateMeshSection() from outside the gamethread will cause an assertion. I know with CUDA multiple threads can access the GPU but is it even possible in UE4?

Ryvar · July 9, 2015, 8:46pm

Is this basically doing frustum culling on each section?

It allows you to pre-emptively cull backfacing mesh CPU-side, by creating six VertexFactories for each SceneProxy (+/- X/Y/Z).

In any Minecraft-style cube world like BrickGrid the performance boost is change-your-design-constraints huge. My primary project is more semi-uniform (cubes topped with 5x5 vert heightmaps), but since camera and primary shadowcaster are guaranteed +Z & -Pitch it yields equivalent benefits in practice.

I don’t know whether off-axis geometry would see a benefit from attempting a combinatoric approach, and I’m fairly certain there’s nothing here for marching cubes/isosurface implementations.

The secret sauce is in the VertexFactory, looks like this (just going to paste lightly modified BrickRenderComponent.cpp):


	void Init(const FExampleVertexBuffer& VertexBuffer,const FPrimitiveSceneProxy* InPrimitiveSceneProxy,uint8 InFaceIndex)
	{
		PrimitiveSceneProxy = InPrimitiveSceneProxy;
		FaceIndex = InFaceIndex;

		// Bog-standard VertexFactory Init code here
	}

	// Elsewhere in VertexFactory...
#if UE4_HAS_IMPROVED_MESHBATCH_ELEMENT_VISIBILITY
	virtual uint64 GetStaticBatchElementVisibility(const class FSceneView& View, const struct FMeshBatch* Batch) const override
	{
		return IsStaticBatchVisible(View.ViewMatrices.ViewOrigin,Batch) ? 1 : 0;
	}
	virtual uint64 GetStaticBatchElementShadowVisibility(const class FSceneView& View, const FLightSceneProxy* LightSceneProxy, const struct FMeshBatch* Batch) const override
	{
		return IsStaticBatchVisible(LightSceneProxy->GetPosition(),Batch) ? 1 : 0;
	}
#endif //UE4_HAS_IMPROVED_MESHBATCH_ELEMENT_VISIBILITY

private:
	const FPrimitiveSceneProxy* PrimitiveSceneProxy;
	uint8 FaceIndex;

	bool IsStaticBatchVisible(const FVector4& ViewPosition,const struct FMeshBatch* Batch) const
	{
		const uint8 FaceIndex = Batch->Elements[0].UserIndex;
		const FBox BoundingBox = PrimitiveSceneProxy->GetBounds().GetBox();
		const FVector MinRelativePosition = ViewPosition - BoundingBox.Min * ViewPosition.W;
		const FVector MaxRelativePosition = ViewPosition - BoundingBox.Max * ViewPosition.W;
		switch(FaceIndex)
		{
		case 0:	return MaxRelativePosition.X < 0.0f;
		case 1:	return MinRelativePosition.X > 0.0f;
		case 2:	return MaxRelativePosition.Y < 0.0f;
		case 3:	return MinRelativePosition.Y > 0.0f;
		case 4:	return MaxRelativePosition.Z < 0.0f;
		case 5:	return MinRelativePosition.Z > 0.0f;
		default: return false;
		}
	}

Some other changes for Static relevance and per-facing mesh batching are required SceneProxy-side to support, but it’s pretty straightforward stuff. Ctrl-F “void InitMeshBatch(” and “struct FFaceBatch” in BrickRenderComponent.cpp.

I’m not sure how much work would be required.

The actual code would be a matter of a few hours, but the primary issues I can see would be:

This is specific to static SceneProxies (ProceduralMeshComponent is bDynamicRelevance = true), and
requiring the user to submit sections with an axis-facing enum, or something similar.

The former may introduce complications I’m not aware of (Static works just fine for all of my projects, but I adhere to a fixed world grid), and for the latter I’m not sure how comfortable you are with exposing something so potentially fragile to Blueprint, which was part of the reason I asked.

I’m not actually familiar with that code, would you consider making a pull request that adds support?

I’m honestly more of a Tech Designer, but I’d be happy to take a stab at it once 's pull request is in a Release build, if you think it would be worthwhile in light of the caveats.

J.J.Franzen · July 10, 2015, 2:16am

So, I’ve been able to work just about everything out, including triangle subdivision and even displacement via a height map, but it is SLOW. With a simple recursive algorithm, just doing 3 iterations of the subdivision process can take a full second. Just to generate 1152 triangles. It’s not a super complex blueprint so I’m rather surprised how slow it is. Is this a known problem or am I doing something wrong? It also seems highly variable. I have 3 instances of the same BP actor in my scene, with the sub div steps set to 2, 3, and 4 respectively and the sub div function prints out how long it took to do the subdivision. I’m seeing times from 0.984 seconds for 2 steps, to 0.984 for 3 steps, and 0.69 for 4 steps. It doesn’t seem directly related to the number of steps, which I am surprised by. I’d had hoped that I would be able to do a few more iterrations then 3 before seeing a major impact. As it stands now, generating a normal sized hex map with 1000 tiles would take 500-1000 seconds. That’s a rather lengthy load time!

Also, interacting with these actors in the editor seems laggy. It takes a few seconds after trying to move one of these actors before it does and it’s very slow to react. And it’s slower when interacting with the actors that have more steps. For example, I can move the 2-step mesh easily enough, but it feels laggy. Interacting with the 3-step mesh takes several seconds to respond. It’s not regenerating the mesh each frame is it? By contrast, moving a regular static mesh in the same scene seems normal. Even with the subdivisions turned off, interacting with the actor is slower then with a regular static mesh. Oddly enough, turning off both the subdivision and the displacement seems to make the interaction normal. However, since I’m doing all this in the construction script, that shouldn’t matter. Should it? Is there something unique to using the construction script that affects interactivity? This is my first attempt at creating an actor using the construction script so if there’s a trick do it, please let me know.

Also on the subject of Editor interactions, when I hit the F key to frame one of my actors, it zooms to the extent of the scene. It does not frame the mesh as expected. This would imply to me that the actor’s bounding box isn’t being set properly. I don’t see any way to do that. Is that no possible yet?

Now, before anyone goes nuts, yes I know this is an experimental feature. But the whole point of an experiment is to gather data. Which is what I am providing here. I’m not complaining or anything, just letting it be known what my experiences are and what I would hope them to be in the future. I’d really like to see this feature, and many more procedurally generated content oriented features, really get the polish they need to shine. I earnest believe that the future of gaming will depend on procedurally generated content and would love to see UE’s support of it grow. It’s tools like that which will enable us all to create more with less which is a win for IMOHO. Cheers,

J^2

P. S. I’ll follow up with some images tomorrow.

Euden · July 10, 2015, 1:05pm

I was wondering if you could possibly provide an example of a procedural terrain / Planet generation using this code? Not sure how difficult it would be to do so but I would be interested in seeing it.

Thanks!

Euden

J.J.Franzen · July 10, 2015, 6:11pm

As promised, 4K words worth…

So, here are some images of the procedural hex tile mesh I’ve created. The idea is to use this system with 's Map Gen system to make a Civ V type game world with everything procedurally generated (except the texture maps, for now…). These show 6 instances of the same BP actor. Bottom left is with subdivisions and displacements off. The rest have displacements on, and go from top left to bottom right going from 2 steps of subdivision to 6 steps. The next step will be to combine the displacement and subdivision algorithms so the only the triangles that have a large amount of details in the displacement map will be subdivided.

The time it takes to do the subdivisions seems to be logarithmic though. Sort of. Which makes sense as the more steps, the more each step has to process. But it’s not exactly predictable. For example, in the situation below, 2 subdivision steps takes 0.9 seconds on avg. 3 steps takes less then 0.4 seconds, 4 steps takes 0.4 seconds, 5 steps takes 1.4 seconds, and 6 steps takes over 17 seconds. So, WAY too long for usage in a real world example. If anyone has advice on how I might optimize how I’m doing the subdivision, I’d love to hear it.

I’m going to time the displacement algorithm as well and see what impact that has, and then try to optimize them both before I start in on the feature based subdivision but I am hopeful that once it’s complete it will be much quicker then my current solution.

For reference, the tri counts are:

2 steps = 288
3 steps = 1152
4 steps = 4608
5 steps = 18432
6 steps = 73728

Lit:

Wireframe:

Here’s what the Construction Script looks like:

And here’s the Subdivision Script:

As you can see, it’s none too complicated. I still need to organize/optimize things a bit, including spinning the displacement to its own function. I also am using pre-baked values for the initial hex tile, which could also be generated on the fly, but I don’t think it would necessarily gain me anything. I don’t have any experience with profiling BPs so this has been and will be educational. If anyone has suggestions/feedback or questions, please let me know. I realize I’m probably ****ing in the wind trying to use this experimental feature to do anything with, but didn’t every feature start out as experimental? Cheers,

J^2

J.J.Franzen · July 10, 2015, 8:50pm

To update the update of my update, I’ve found that the displacement function is taking more then twice as long as the subdivisions, which seems weird since it’s doing a lookup and then updating a value in the vertex array. I’ve tracked it back to’s function that allows you to read a texture file and access the pixels in the image directly as color values. Not sure why that function seems to slow things down so much but hopefully can help me figure that one out. I really wish it were possible to natively access a 2DTexture object in a blueprint and read the individual pixel values within. Would stream line things a bit I would expect it would run faster as well. Ah well, maybe some day. Again, if anyone has some tips/questions/feedback/etc. please let me know. Cheers,

J^2

The_Shane · July 12, 2015, 8:21pm

[=J. J. Franzen;329278]
To update the update of my update, I’ve found that the displacement function is taking more then twice as long as the subdivisions, which seems weird since it’s doing a lookup and then updating a value in the vertex array. I’ve tracked it back to’s function that allows you to read a texture file and access the pixels in the image directly as color values. Not sure why that function seems to slow things down so much but hopefully can help me figure that one out. I really wish it were possible to natively access a 2DTexture object in a blueprint and read the individual pixel values within. Would stream line things a bit I would expect it would run faster as well. Ah well, maybe some day. Again, if anyone has some tips/questions/feedback/etc. please let me know. Cheers,

J^2
[/]

One thing I’ve been doing w/ 's load image from file node is loading the image in the construction script and storing the pixels into a pixel array variable. Then I just have to drop the actor into a level and I can set it as the actor default (or you can just copy the array and then paste its contents into your base blueprint). That way you don’t have to go through the process of loading everything up everytime. Not sure if this will help or not.

J.J.Franzen · July 14, 2015, 10:24pm

Well, I am doing the same thing, dropping the pixels loaded into an array variable and then referencing them from there. I was able to isolate the performance issue to the fact I was using a 512x512 texture which was taking much longer to load then strictly necessary so I dropped it to a 64x64 res image and it’s moving along better now. Still not fast enough to be considered usable, but better.

I’m moving on to the next logical step which is building a full grid of hexes. However, I’m finding that the Add Unique function for vector arrays is not 100% reliable. Just making a simple 2x2 grid shows 3 vertex array entries with the same value. Since I used AddUnique for each and every add to that array, I was confused as to how that is happening. I even tried using find first to check for existing matches and that also fails to find these identical vertices. I then tried comparing each of these identical entries and they match at a precision level of 0.0001, but do NOT at a precision level of 0.0001. Since the addUnique and find functions don’t have any kind of controls for tuning the precision how can I get this to work? This is causing breakups in my mesh when subdividing since instead of using an existing vertex, it adds a new one, which then can get displaced differently, etc. Has anyone else had this issue? Cheers,

J^2

J.J.Franzen · July 15, 2015, 12:14am

To throw some more questions out there, I decided to see what kind of performance I would get making a grid of simple 6 triangle hexes, with each hex being a separate element in the proc mesh so I could assign different materials to each. It can take a bit of time to generate the grid, once it’s done the performance seems good. A simple 10x10 grid gets well over 80 fps. However, as that grid size grows, performance plummets. At 50x50, I’m seeing less then 5 FPS. I know it can’t be the triangle count since a single hex with 6 steps of subdivisions is over 70K tris and I still get over 120 FPS with one of those in the scene. Since a 50x50 grid of 6 tri hexes is only 15K tris, I have to assume it must be the number of disparate elements. So, what is considered an optimal number of elements in a mesh? 2? 8? 32? I’m going to go out on a limb and guess 2500 is non-optimal. So how would I approach this? Do I have to keep each hex as its own actor? Do I break it down by materials, and have all the hexes with a given material be in the same element? How would you tackle this issue?

And speaking of meshes, I’m assuming these procedural meshes are not being treated as static, which I would expect would also negatively impact their performance. Is that something that is planned as an option? I’m assuming there are all number of optimizations are still left to be done on this system, and would love to hear more about what might be planned for this feature. I know it has not been obvious, but I’m rather enthused with the idea of this feature and am eager to see it developer further. Cheers,

J^2

SRombauts · July 15, 2015, 5:47am

Hi JJ, yes you are probably limited by the high number of ‘draw calls’ (based on the number of meshes assuming they use only one material each). On a medium PC a thousand draw calls is a good target.

But before talking of optimization, you have to do profiling to be entirely sure (seedocs.unrealengine.com/latest/INT/Engine/Performance/index.html) and do this on the final game, not in a the Editor.

Assuming your performance problem is on the cpu side (Draw thread) , you should look into ‘batching’, that is grouping meshes using the same material into one unique mesh.
Unreal Engine has a ‘Instanced Static Mesh’ feature to do that dynamically for static meshes, but what I understand is that it is not intended for dynamic Custom/ProceduralMesh

Please let us know what you find!

Codeweav3r · July 15, 2015, 1:05pm

Here is a update on what I used procedural meshes for, hope you don’t mind the video
Thanks again for this thread, helped a lot to get started. (this is built in C++ btw)

https://.com/watch?v=yTRrv4lBVyc

Dig_Squid · July 17, 2015, 11:06am

Ryvar: Thanks for the info on 's pull request. That does seem a bit specialized for the ProceduralMeshComponent though, which is more of a general purpose tool. I think if you are building a voxel engine within UE4, you would want to write your own component/scene proxy.

J. J. Franzen: Blueprints are unfortunately a bit slow for big processing of 2D and 3D data, what you are doing would probably be a LOT faster in C++.

Zeustiak · July 18, 2015, 12:08pm

[=JamesG;334535]
Ryvar: Thanks for the info on 's pull request. That does seem a bit specialized for the ProceduralMeshComponent though, which is more of a general purpose tool. I think if you are building a voxel engine within UE4, you would want to write your own component/scene proxy.

J. J. Franzen: Blueprints are unfortunately a bit slow for big processing of 2D and 3D data, what you are doing would probably be a LOT faster in C++.
[/]

Would it be possible to get a few nodes made that could compile a lot of that work down to make it faster?

Ryvar · July 19, 2015, 12:44pm

[=JamesG;334535]
Ryvar: Thanks for the info on 's pull request. That does seem a bit specialized for the ProceduralMeshComponent though, which is more of a general purpose tool. I think if you are building a voxel engine within UE4, you would want to write your own component/scene proxy.

J. J. Franzen: Blueprints are unfortunately a bit slow for big processing of 2D and 3D data, what you are doing would probably be a LOT faster in C++.
[/]

Word. “The tale grew in the telling” - the more I wrote the more I got that sinking realization that what I was doing was super-specialized. Still, anyone reading this who happens to be generating lots of axis-aligned geometry should really, really check it out.

Ryvar · July 19, 2015, 1:11pm

J.J. re: your AddUnique issue - you’re probably going to have zero success using floating point values this way. Personally my mesh resolution needs are fairly low so I store all vert locations as a struct of 3 uint8s and have my VertexFactory submit them with VET_UByte4N. This has a nice little side benefit of allowing me to use vertex position as the key for a TMap of vertex data structs.

If you’re willing to delve into C++ you might try using the FInt3 struct from BrickGrid here: https://.com/AndrewScheidecker/BrickGame/blob/master/Plugins/BrickGrid/Source/BrickGrid/Classes/BrickGridComponent.h, and flooring your FVector positions into those, and storing/comparing them that way (you’d need to convert them back to FVectors for submission to the ProceduralMeshComponent function, of course). It may be possible to pull off something roughly comparable in pure Blueprint, but I’ve never tried.