Yeah, sure.
MyBlueprintFunctionLibrary.cpp (4.5 KB)
MyBlueprintFunctionLibrary.h (974 Bytes)
ProjectileStruct.cpp (117 Bytes)
ProjectileStruct.h (511 Bytes)
Awesome, thank you. I’ll share my BP-only solution once I’ve finished it. I think I’ve narrowed down what in the ForEach macro is causing the performance loss.
The only way I could get it more performant, though still nowhere near C++, was with a timer that doesn’t tick very often. Performance in BP is just horrible. It’s really strange.
At any rate I think I’m going to go back to my particle-based collision system despite what the staff said in my other topic. It was extremely performant, with the only drain being on the CPU, and even then the drain was fine. Having 50,000 projectiles with collisions that report back to BP only took 30% CPU on an AMD Ryzen 9 3900X 12-Core Processor. I will never even have 10,000 projectiles on screen so this is fine.
Lots of good findings here. If anyone else finds out how to improve this looping in BP only please share your findings!
Just curious, have you also tried working with the ProjectileComponent? It does everything out of the box, but I haven’t seen a performance comparison.
Yes. I was not able to match the performance of managing structs… and I mean not even close.
The cpp function is there for anyone to try. Would be awesome if someone could compare or share a better alternative.
[quote=“Krileon, post:9, topic:555761”]
Some things look odd.
- Image 1: Are you keeping track of the Attacks array length, and are values properly added and removed, or is it adding into infinity, which is what it looks like?
- Image 3: Your “Attack” location seems to move per tick without multiplying its forward vector by delta seconds, which means that right now they will move faster or slower depending on framerate, not at a constant speed.
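To illustrate the delta seconds point, here is a minimal sketch in plain C++ rather than Blueprint. `Vec3`, `Advance`, and the parameter names are made up for the example and are not engine types; it assumes the forward vector is unit length.

```cpp
#include <cassert>

// Hypothetical stand-in for an engine vector type (not FVector).
struct Vec3 {
    float x, y, z;
};

// Moves a position along a unit-length forward vector.
// Scaling by deltaSeconds makes the distance per second independent of frame rate.
inline Vec3 Advance(Vec3 position, Vec3 forward, float speedUnitsPerSecond, float deltaSeconds) {
    assert(deltaSeconds >= 0.0f);
    const float step = speedUnitsPerSecond * deltaSeconds; // distance covered this frame
    return Vec3{position.x + forward.x * step,
                position.y + forward.y * step,
                position.z + forward.z * step};
}
```

Calling `Advance` 30 times with a delta of 1/30 s or 60 times with a delta of 1/60 s covers roughly the same distance, which is the constant speed behaviour the per-tick version lacks.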
To give some context, here is a quick test.
Spawning 64 actors vs. traces in the same frame every 0.15s. Projectiles / traces are destroyed after 2 seconds:
- Average active traces: 832
- FPS: 60~
- Average active projectile actors: 896
- FPS: 58~
Spawning 64 in the same frame and maintaining projectiles. Projectiles / traces are not destroyed and are spawned every 1s:
- Active traces: 4,800
- FPS: 30~
- Active projectile actors: 2,432
- FPS: 30~
Seems like there is no difference between spawning / destroying actors and structs, but structs seem cheaper if you want to maintain a larger active number.
Thanks for testing
I don’t need to keep track of the attacks array length. I just need to keep track of the attacks array. Yes, new projectiles are added to it based off a cooldown as it’s an infinitely pulsing attack.
The problem isn’t the array, but BP’s behavior when looping over an array. BP array handling is incredibly slow.
The delta isn’t in that screenshot, but it is now. Regardless, that doesn’t really matter in the context of the problem we’re trying to solve.
The only alternative I could find is using Niagara collision events, which will be fine for a SP game but not if you need MP. With it the collisions are on the CPU and graphics on the GPU. The load seems to split well and you can basically have as many particles as your CPU can handle. There are some other limitations to it, like culling, though.
It’s shocking how terrible BP is at dealing with arrays though. I would hope there’s plans to fix it, but am curious if nativizing the BP at build time would fix the issue when the game is built and this being a situation of “bad performance in just the editor”.
Think I found another option. Just good ol’ object pool with an exception. I’m spawning in an Actor with Tick disabled and all it contains is a Niagara system. That’s it. I gave up after 17k since my FPS wasn’t budging. The spawner will manage the movement as usual. I haven’t added the movement yet, but the act of spawning them in WITHOUT having to use a data representation of their appearance is a huge help to me at least. Will try adding movement in the spawner this week.
Niagara collisions were working to an extent, but it’s noticeable when a projectile misses when it shouldn’t due to frame delay.
Edit: Seems to work fine. The biggest hit to performance is the rendering on the GPU. That can be improved some by going back to a data-driven approach and feeding that data to a Niagara system, but it’s unlikely you’ll ever need to. 500 projectile objects being live indefinitely (recycling them) causes no performance hit and it’s unlikely I’ll ever even have that many.
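For anyone who wants the recycling idea in code form, here is a rough sketch in plain C++, not the actual Blueprint setup. `PooledProjectile` and `ProjectilePool` are hypothetical names, and the `visible` flag stands in for showing or hiding a pre-spawned, tick-disabled actor. It uses a free list so that acquiring and releasing do not need to search the pool, which is an assumption on my part and not necessarily how the pool above is built.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pre-spawned, tick-disabled cosmetic actor.
struct PooledProjectile {
    bool visible = false;
};

class ProjectilePool {
public:
    explicit ProjectilePool(std::size_t size) : items_(size) {
        assert(size > 0);
        free_.reserve(size);
        // Push indices in reverse so index 0 is handed out first.
        for (std::size_t i = size; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns the index of a recycled projectile, or -1 when the pool is exhausted.
    int Acquire() {
        if (free_.empty()) return -1;
        const std::size_t index = free_.back();
        free_.pop_back();
        items_[index].visible = true;
        return static_cast<int>(index);
    }

    // Hides the projectile and makes its slot available again.
    void Release(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < items_.size());
        PooledProjectile& item = items_[static_cast<std::size_t>(index)];
        if (!item.visible) return; // already released, ignore
        item.visible = false;
        free_.push_back(static_cast<std::size_t>(index));
    }

    std::size_t LiveCount() const { return items_.size() - free_.size(); }

private:
    std::vector<PooledProjectile> items_;
    std::vector<std::size_t> free_; // indices of unused slots
};
```

Nothing is spawned or destroyed after construction; a projectile that finishes its flight is released and the same slot is handed out on the next request.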
Just to give an update: I updated the entire thing and did the following:
- I spawn the singleton when the world loads. Every weapon actor has to request a projectile.
- In the constructor the manager allocates a fixed size to the array of structs. This struct stores location, direction, a pointer to the actor that is requesting, floats for speed, damage, and the time the projectile will remain active if it doesn’t hit, and finally a bool to know if the projectile is active or not.
- The manager also has the option on BeginPlay to spawn and populate an array of projectile actors that are set to hidden by default: these are purely cosmetic, with tick disabled, no collision or physics, etc.
- When a request for a projectile is received, the manager searches for the first index that has the bool set to false. If it cannot find a struct that is available, the request is ignored. If it finds a struct, it is updated with the info of the actor that is making the request (location, direction, pointer; the rest is data driven), and the corresponding index of the projectile actor array is set to visible with its location and rotation updated.
- The main loop goes through the struct array in order and updates accordingly, skipping those indexes that have the bool set to false.
- When a trace hits, there is a chance for ricochet depending on hit angle. If this chance returns false, the bool in the struct is set to false and it will be ignored on the next loop. It also checks if it hit a physics actor to apply force corresponding to its speed.
- If there is no hit, bullet life gets decreased on every tick until <=0, then the bool is set to false and it is ignored on the next loop.
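The steps above can be sketched roughly as follows. This is a simplified, engine-free illustration and not the attached code: `Vec3`, `ProjectileData`, `ProjectileManager`, and all field names are hypothetical, the owner pointer is a plain `const void*`, and the line trace, ricochet, physics impulse, and cosmetic actor array are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Hypothetical layout mirroring the data listed in the post.
struct ProjectileData {
    Vec3 location{0.0f, 0.0f, 0.0f};
    Vec3 direction{0.0f, 0.0f, 0.0f}; // assumed unit length
    const void* owner = nullptr;      // stands in for the requesting actor
    float speed = 0.0f;
    float damage = 0.0f;
    float lifeRemaining = 0.0f;       // seconds left if nothing is hit
    bool active = false;
};

class ProjectileManager {
public:
    // Fixed-size allocation up front; the array never grows afterwards.
    explicit ProjectileManager(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Claims the first inactive slot. Returns its index, or -1 if none is
    // free, in which case the request is simply ignored.
    int Request(const void* owner, Vec3 location, Vec3 direction,
                float speed, float damage, float life) {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            ProjectileData& p = slots_[i];
            if (p.active) continue;
            p.location = location;
            p.direction = direction;
            p.owner = owner;
            p.speed = speed;
            p.damage = damage;
            p.lifeRemaining = life;
            p.active = true;
            return static_cast<int>(i);
        }
        return -1;
    }

    // Main loop: walk the array in order and skip inactive slots.
    void Update(float deltaSeconds) {
        for (ProjectileData& p : slots_) {
            if (!p.active) continue;
            const float step = p.speed * deltaSeconds;
            p.location.x += p.direction.x * step;
            p.location.y += p.direction.y * step;
            p.location.z += p.direction.z * step;
            // A trace from the old to the new location would go here (omitted).
            p.lifeRemaining -= deltaSeconds;
            if (p.lifeRemaining <= 0.0f) p.active = false; // slot is reusable
        }
    }

    std::size_t ActiveCount() const {
        std::size_t count = 0;
        for (const ProjectileData& p : slots_) {
            if (p.active) ++count;
        }
        return count;
    }

    const ProjectileData& At(std::size_t index) const { return slots_[index]; }

private:
    std::vector<ProjectileData> slots_;
};
```

The linear search in `Request` matches the "first index with the bool set to false" description; with a few thousand slots it is cheap next to the traces, but it is the part that scales with pool size.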
So now I’m able to maintain 64 actors requesting simultaneously every 0.05 seconds, that’s roughly 2000-2100 active projectiles with an average of 48fps (including cosmetics)… this is worst case scenario when every bullet lives its full life. In this test they are set to live 1.64 seconds, time it takes them to travel 1 km.
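As a sanity check on those numbers, the worst-case steady-state count is just requesters times lifetime divided by request interval. The helper names below are made up for the example, and it assumes every projectile lives its full lifetime.

```cpp
#include <cassert>

// Worst-case live projectiles once spawning and expiry balance out,
// assuming every projectile lives its full lifetime.
inline double SteadyStateActive(int requesters, double lifetimeSeconds, double intervalSeconds) {
    assert(intervalSeconds > 0.0);
    return requesters * (lifetimeSeconds / intervalSeconds);
}

// Average speed needed to cover a distance in a given time.
inline double AverageSpeed(double distanceMeters, double seconds) {
    assert(seconds > 0.0);
    return distanceMeters / seconds;
}
```

`SteadyStateActive(64, 1.64, 0.05)` comes out at about 2,099, which lines up with the 2000-2100 observed, and `AverageSpeed(1000.0, 1.64)` is roughly 610 m/s.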
Performance is a lot better just by removing everything that is cosmetic: this is 160 actors making requests every 0.05 seconds.
I’m out of ideas on how to squeeze more out of this.
Also… please ignore typos lol
I wonder if we can use Mass for this. It’s experimental, but it’s the ideal tool to be used for something like this. Might try playing with it in a new project and see if I can at least get it working for my AI.