I note with excitement that Nvidia Flow is starting to get more of an official mention on the Nvidia site now, e.g. it's actually linked from the Visual FX menu on the GameWorks site, with mention of a beta in July and UE4 integration in Q3.
Flow seems to be already integrated. Just have a look at the new Nvidia VR app: ://store.steampowered/app/468700/?snr=1_7_7_230_150_1 (it's built with UE4).
Hi azoellner,
Thank you so much for all of the information, I really appreciate it! I've started testing right away, and I'm already seeing better results! I just wanted to clarify two things: how do I activate local space sim? I don't seem to see it anywhere in the node settings in Maya. And for the skirting on the wushu model, were all the overlapping cloth pieces of the skirt (4 in total, it looked like) together on the same node, and you just activated self collision to keep them apart? Thank you again for your help, this is huge!
I'm working on a project that is a mostly-static "experience" sort of thing, i.e. the user has freedom to move around but we only rarely update lighting/geometry/materials. VXGI has been amazing, and we are absolutely shipping with it (we control the hardware). My understanding of VXGI is far from complete, but I'm hoping to increase performance and had a couple of ideas...
Since we only rarely update things that affect lighting, it seems like we could decouple some of the heavy lifting. Page 26 mentions using arbitrary shaders for building lightmaps. Has anyone experimented with this in UE4? Rather than baking with Lightmass, we could do a fairly quick re-bake whenever we want to at runtime.
But what about something simpler: could we split the VXGI pipeline so that we only run parts of it when lighting needs to be updated? Would this even give a meaningful performance improvement?
My background is mostly computer vision and CUDA, so a lot of this is really unfamiliar territory for me.
Sounds like Nvidia Flow is going to appear in the 4.13 preview.
None of the other Nvidia GameWorks stuff has been integrated into the standard official UE4 branch or launcher version of UE4 before now. So without specific, explicit confirmation that it will be different this time, I'd assume that the Nvidia Flow release schedule doesn't have much to do with UE4's own release schedule at all, and we will still have to compile our own version from source.
Having said that, I think Nvidia have said they will release the VR Funhouse UE4 project to encourage people to use UE4. But again, unless I hear otherwise, I'm going to assume that will also involve a custom branch of UE4 being used, e.g. one with Flow and FleX included, or whatever GameWorks modules were actually used in Funhouse.
It's called Nvidia PhysX. It has been part of the Unreal engines for a very long time now. Nvidia and Epic Games have a closer relationship than one may think. The question is what Nvidia is really thinking, because if Nvidia played according to the rules of Epic Games, all of the GameWorks stuff could end up being a core part of UE4.
UE4 is the perfect engine for all that GameWorks stuff. And that's also good for Epic, because they won't have to implement everything all over again. Epic Games is very rich. Maybe they should go shopping? Micro$oft likes to go shopping, a lot...
I just can't help but wonder about CUDA, though, because I think Flow requires GP-GPU grade acceleration. Meaning if one (Epic Games) wanted to run it on AMD cards, one would need to support OpenCL.
Hi, local space I believe is a checkbox in UE4, and I think it is defaulted off. You will have to check to confirm. In the DCC plugins, local space is on by default and not exposed; you use the Inertia Blend control to blend in world motion influence. And the 4 flaps on the skirting were all in the same asset. I gave them some breathing room in the bind pose as well. If they are too close and the self collision radius is larger than the separation of the cloth segments, it may not behave so well, so you have to strike a balance there. I hope this helps.
Perfect! Thank you so much! This helps a ton, and I will get to testing it right away. Thank you!
Even though PhysX now comes under their GameWorks brand, it's in a different league of integration with many engines, not like all the other GameWorks stuff that this thread is about. And I'm not sure the GPU-accelerated version of PhysX (as opposed to the CPU version) has been so successfully integrated into engines like UE4?
Some of the GameWorks modules only work on Nvidia hardware, e.g. FleX, so I can see why those aren't going to be integrated into the main branches of UE4. As far as I can tell from the Nvidia site, Flow is not like that and will work on other manufacturers' hardware. So for Flow at least, I think the issues preventing main branch integration are often something other than purely technical.
You are right. Since we don't have Batman Arkham series grade soft bodies and dynamic smoke, UE4 is CPU-only then. It seems destructibles and cloth (which I never used before) can go without GPU acceleration.
Bleh, my fault. I forgot that PhysX is CPU-only in Unreal, or at least not fully supported in the official branch. Well, Flow will likely take over the dynamic smoke part of the main PhysX module. Yeah, ultimately it comes down to CUDA, because for a moment I thought like a game dev, and not like somebody who would use GP-GPU accelerated APIs to eradicate the competition via a UE4 GitHub branch... -.-
Meaning, while AMD still thinks that open source is going to save their sorry human behind...
My popcorn is ready...
Personally I've found that by far the largest impact on performance is the stack levels, but only if you know how they work. Essentially it's similar to cascaded shadow maps: the number of stacks is similar to the number of cascades, and the range increases the distance of the effect but lowers the resolution. So if you have one stack level and a range of 2000, you will have long-range GI, but it will be low resolution. If you want the highest quality possible, you need a low range and high stacks.

But if you want performance, it starts acting strange. If you change the stack levels from 1 to 2, you will notice a massive dip in performance, supposedly because the system calculating the stacks is disabled entirely when set to 1. But if you increase the stack levels from 2 to 3, you will not see a very noticeable performance hit; from there it scales worse and worse. Also, at a low stack count (1-3), range has almost no effect on performance. The final piece to this is that the fewer stack levels you have, the smaller the performance impact of increasing the map size. So if you only have 2 stack levels, increasing the map size to 256 is a viable option.
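To make the stack/range/map-size tradeoff concrete, here is a rough back-of-envelope sketch. It assumes that "range" is the world-space extent covered by the finest stack level, and that each extra level doubles that extent at the same map size; both are my assumptions about the clip-map, not confirmed VXGI internals:

    #include "CoreMinimal.h"

    // Approximate voxel edge length (in unreal units) at a given stack level.
    float ApproxVoxelSize(float Range, int32 MapSize, int32 StackLevel)
    {
        return Range * FMath::Pow(2.0f, float(StackLevel)) / float(MapSize);
    }

    // e.g. Range=2000, MapSize=128, 1 stack  -> ~15.6 uu voxels everywhere
    //      Range=500,  MapSize=128, 3 stacks -> ~3.9 uu up close, ~15.6 uu far out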
So, this is what I've found works best for optimizing VXGI.
First, set your stack levels down to one, and increase the range until the voxelization reaches as far as you need it to.
Once you get the range you want, start whittling down the size of the voxels until you are happy with the results. First mess with the map size. The default is 128, so while you are at 1 stack with a high range, try increasing the map size to 256 and see if that is good enough. If it's still not high enough resolution, set the map size back down to 128 and add a stack level. With the extra stack level, you can lower the range a little more to shrink the voxels. If you still aren't at high enough quality, try increasing the map size to 256 again, then try lowering it and adding a stack again. Keep repeating this process until you reach a good balance between quality and performance.
What I've found from doing this is that the sweet spot is usually 2 stack levels, a map size of 256, and a range of 1200 to 1600 on my 980 Ti. Lowering the map size to 128 and keeping those settings the same scales all the way down to a 770, and possibly a bit further. Keep in mind that this is also with multi-bounce and specular tracing enabled; turning those off can give an even further boost and still hold moderately high quality. There are of course many, many more settings that you can use to further optimize VXGI, but this is what I've found to be the fastest and most effective method.
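If you want to apply that sweet spot from code instead of typing console commands every time, the values can be pushed through the console manager. The cvar names below (r.VXGI.StackLevels, r.VXGI.MapSize, r.VXGI.Range) are my best guess at the VXGI branch's spelling, so tab-complete "r.VXGI" in the console to confirm them for your build:

    #include "HAL/IConsoleManager.h"

    static void ApplyVXGISweetSpot()
    {
        // Values from the tuning above: 2 stacks, 256 map size, ~1600 range (980 Ti class).
        struct { const TCHAR* Name; int32 Value; } CVars[] =
        {
            { TEXT("r.VXGI.StackLevels"), 2    },
            { TEXT("r.VXGI.MapSize"),     256  },
            { TEXT("r.VXGI.Range"),       1600 },
        };

        for (const auto& CVar : CVars)
        {
            if (IConsoleVariable* Var = IConsoleManager::Get().FindConsoleVariable(CVar.Name))
            {
                Var->Set(CVar.Value);
            }
        }
    }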
Yeah, that would probably be a smart idea. I've been following the development of VXGI in Unreal for at least a year now, and finding what you need in this thread can be hell. My trick for optimizing VXGI is just something that I figured out after messing with it as a hobby for months on end, and personally what I've found is that VXGI has the potential to either run as fast as SVOTI or mimic the quality of baked lighting. For better or worse, the default settings aim towards the latter, and the only way to fix that is to do some digging into the console commands. Don't let anyone fool you; VXGI is game ready. But without proper documentation, finding simple tricks like the one I posted comes down to having tons of time to kill, and a little bit of luck.

That being said, another trick/bug I found a few months ago is that the in-editor performance can be significantly worse than the standalone performance. I've found that even in simple scenes, running a standalone version of your project can shave off 2ms, and in some projects that seem unplayable in editor (the reflection subway demo is a fantastic example), the difference can be far more drastic.
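For anyone who hasn't tried it: you don't need to package the project to compare. Launching the editor executable with the -game flag runs the project standalone (the project name here is just a placeholder):

    UE4Editor.exe YourProject.uproject -game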
I have set up a new thread for the VXGI integration, as it has been requested numerous times now. You can find it here: https://forums.unrealengine/showthread.php?117817-NVIDIA-VXGI-Integration&goto=newpost
I have also notified them of this thread, so hopefully new updates/responses will go there.
This is one reason why I use AMD cards: they don't exercise "North Korea"-like control, and don't block the use of open drivers through signing.
The VXGI lightmap baking option is purely hypothetical at this point, as far as I know. This is a known request, but the implementation in UE4 is probably nontrivial, although VXGI has the facilities required to implement it. Doing that at runtime is even more complicated because lightmaps are static in UE.
If your scene is mostly static, the VXGI integration can be modified to skip most of the scene voxelization on every frame. For simplicity, the current UE integration configures VXGI to invalidate and update everything on every frame, but that's not absolutely necessary. For a static scene with no multi-bounce lighting, both the opacity and emittance textures can be preserved between frames, with only incremental updates caused by camera movement. Implementing this mode requires building a list of objects and lights that have been modified since the previous frame, and applying finer culling to the voxelization passes.
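To illustrate what "building a list of objects and lights that have been modified" could look like on the UE4 side, here is a minimal dirty-tracking sketch. ReVoxelizeRegion() and the hook points are hypothetical, not actual VXGI or engine API:

    #include "Components/PrimitiveComponent.h"

    void ReVoxelizeRegion(const FBox& Region); // hypothetical wrapper around the voxelization passes

    // World-space boxes touched since the previous frame.
    static TArray<FBox> DirtyRegions;

    void NotifyPrimitiveChanged(const UPrimitiveComponent* Prim)
    {
        // Call this from whatever hooks you add for transform/material/light changes.
        DirtyRegions.Add(Prim->Bounds.GetBox());
    }

    void VoxelizeFrame()
    {
        // Instead of invalidating the whole clip-map, re-voxelize only the regions
        // that actually changed (plus whatever the clip-map scrolled over).
        for (const FBox& Region : DirtyRegions)
        {
            ReVoxelizeRegion(Region);
        }
        DirtyRegions.Reset();
    }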
Regarding the performance improvement: if you observe that the "VXGI WS" time in the "stat unit" output is a significant part of the frame, then yes, using incremental voxelization should give a noticeable improvement.
So, with multibounce enabled, it is required to update everything on every frame?
Why does camera movement necessitate updates? Is it any movement, or only under some conditions?
Thanks for your help, I really appreciate it!
Multibounce lighting is achieved by voxelizing all geometry and adding indirect lighting from the previous frame to direct lighting. So, it only works correctly when you voxelize all geometry on every frame, and each frame adds a bounce. If you make incremental updates, there will be noticeable changes in lighting: some areas will have more bounces than others. And by areas I mean axis-aligned boxes in voxel space, so there will be sharp boundaries between areas updated more or fewer times.
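In code, the idea for a mostly-static scene could look roughly like this: after a lighting change, run full voxelization for a few frames (one bounce per frame), then switch back to incremental updates. The helpers and the frame count are hypothetical, not the actual integration code:

    void RunFullVoxelization();        // hypothetical: invalidate and rebuild everything
    void RunIncrementalVoxelization(); // hypothetical: only clip-map movement updates

    static int32 SettleFramesRemaining = 0;

    void OnLightingChanged()
    {
        // Each full voxelization pass adds one bounce; ~4 frames is a guess at
        // when a typical interior looks visually converged.
        SettleFramesRemaining = 4;
    }

    void TickVXGI()
    {
        if (SettleFramesRemaining > 0)
        {
            RunFullVoxelization();
            --SettleFramesRemaining;
        }
        else
        {
            RunIncrementalVoxelization();
        }
    }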
Not every camera movement requires updates. The reason why updates are required at all is that the voxel clip-map covers a limited range of world space. When you move the camera (or rather, the anchor point which is normally somewhere in front of the camera) enough to change the position of the clip-map, regions in world space that were not covered before will need to be filled by voxelization. Moreover, there are multiple levels of detail in the clip-map, each covering a different range of world space, and they move together, so a single change in clip-map position triggers updates to all the levels.
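To illustrate the "not every movement" point: conceptually, the clip-map only scrolls when the anchor crosses a voxel boundary at some level, something like this simplified check (not the actual implementation):

    #include "CoreMinimal.h"

    // VoxelSize is per level; the finest level has the smallest voxels, so it is
    // crossed most often, and since the levels move together, a crossing there
    // ends up touching all of them.
    bool AnchorCrossedVoxelBoundary(const FVector& OldAnchor, const FVector& NewAnchor, float VoxelSize)
    {
        for (int32 Axis = 0; Axis < 3; ++Axis)
        {
            if (FMath::FloorToInt(OldAnchor[Axis] / VoxelSize) !=
                FMath::FloorToInt(NewAnchor[Axis] / VoxelSize))
            {
                return true; // newly exposed slabs along this axis must be voxelized
            }
        }
        return false;
    }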
Hello, I have tried with no luck to get your merged branch v4.9.2 working. Everything compiles correctly, but when I go to launch the editor I get this error:
Access violation - code c0000005 (first/second chance not available)
""
nvcuda
nvcuda
nvcuda
nvcuda
APEX_TurbulenceFSPROFILE_x64
APEX_TurbulenceFSPROFILE_x64
APEX_TurbulenceFSPROFILE_x64
APEX_TurbulenceFSPROFILE_x64
APEXFrameworkPROFILE_x64
APEXFrameworkPROFILE_x64
UE4Editor_Engine!FPhysScene::InitPhysScene() [d:\unreal_n\engine\source\runtime\engine\private\physicsengine\physscene.cpp:1688]
UE4Editor_Engine!FPhysScene::FPhysScene() [d:\unreal_n\engine\source\runtime\engine\private\physicsengine\physscene.cpp:153]
UE4Editor_Engine!UWorld::CreatePhysicsScene() [d:\unreal_n\engine\source\runtime\engine\private\world.cpp:3357]
UE4Editor_Engine!UWorld::InitWorld() [d:\unreal_n\engine\source\runtime\engine\private\world.cpp:866]
UE4Editor_Engine!UWorld::InitializeNewWorld() [d:\unreal_n\engine\source\runtime\engine\private\world.cpp:1068]
UE4Editor_Engine!UWorld::CreateWorld() [d:\unreal_n\engine\source\runtime\engine\private\world.cpp:1144]
UE4Editor_Engine!UEngine::Init() [d:\unreal_n\engine\source\runtime\engine\private\unrealengine.cpp:799]
UE4Editor_UnrealEd!UEditorEngine::InitEditor() [d:\unreal_n\engine\source\editor\unrealed\private\editorengine.cpp:431]
UE4Editor_UnrealEd!UEditorEngine::Init() [d:\unreal_n\engine\source\editor\unrealed\private\editorengine.cpp:586]
UE4Editor_UnrealEd!UUnrealEdEngine::Init() [d:\unreal_n\engine\source\editor\unrealed\private\unrealedengine.cpp:49]
UE4Editor!FEngineLoop::Init() [d:\unreal_n\engine\source\runtime\launch\private\launchengineloop.cpp:2101]
UE4Editor_UnrealEd!EditorInit() [d:\unreal_n\engine\source\editor\unrealed\private\unrealed.cpp:63]
UE4Editor!GuardedMain() [d:\unreal_n\engine\source\runtime\launch\private\launch.cpp:133]
UE4Editor!GuardedMainWrapper() [d:\unreal_n\engine\source\runtime\launch\private\windows\launchwindows.cpp:126]
UE4Editor!WinMain() [d:\unreal_n\engine\source\runtime\launch\private\windows\launchwindows.cpp:200]
Any help would be appreciated.
Thank you
Okay, this makes a lot of sense and explains the effect we saw when turning multibounce on. So, when we make lighting changes, we could simply run a few updates of the scene and then stop, to get multibounce?
There is only one possible anchor point per scene, right? We place a VXGI anchor in the middle of the scene, which I presume overrides the normal camera-located anchor? Our scenes are interior/room-scale, so this is workable for us.
I'm still interested in lightmaps as well. I have some strange ideas about things to do with that, and I think long-term it's the best route for our project. Hopefully I can get the go-ahead to invest some time into it.