A couple of useful macros: Loops Per Tick

Monokkel · June 6, 2015, 11:41pm

Blueprints are awesome and can do almost everything you can imagine. Their biggest drawback is that they are significantly slower than C++ code. This does not matter in many cases, and they are fast enough for most things, but sometimes you might encounter problems that are hard to solve without using lots of processing power, and where the relatively slower speed of blueprint scripting will be noticeable.

When I began making A* pathfinding in blueprints this was a constant worry, and my early algorithms would often cause a noticeable drop in fps as they were running. Asking around on these forums I was sometimes advised to split up my calculations over several ticks, but when I was just starting out I was uncertain how best to do this. So instead I focused on making my algorithm as fast and efficient as possible, and in the end I was able to create pathfinding in blueprints that could calculate any path I wanted in less than a frame, making it less pressing for me to split it over multiple ticks.

As I have continued working on the toolkit, however, I have started working on more complex algorithms to improve my AI, and am also adding support for slower devices such as mobiles. This has brought back the thought of splitting up loops, and so I’ve made a couple of macros that can do this quite easily. I haven’t had to use them yet, but they will be great to fall back on if I encounter a problem that I cannot optimize further.

I thought I should share these with the community as this is something I would have found very useful when I was starting out. I have created macros both for ForLoops and ForEachLoops. The macros have an exec input for initializing the loop, which opens up a gate for a tick event. After that, each time the tick input fires, the macro runs a number of iterations equal to the Loops Per Tick input, until it reaches the Last Index. The macro outputs Loop Body and Index just like regular loops, as well as an exec that fires every tick and one that fires when the loop is completed.
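For readers who think more easily in text than in node graphs, here is a rough Python sketch of the behaviour described above. It is not a transcription of the macros; the class name `SlowForLoop`, the `tick` method and the `loop_body` callback are names made up for illustration, and the sketch assumes Loops Per Tick is at least 1.

```python
class SlowForLoop:
    """Runs a for loop in slices of `loops_per_tick` iterations per tick.

    Hypothetical stand-in for the macro: creating the object plays the role
    of the initializing exec input, and tick() plays the role of the tick input.
    """

    def __init__(self, first_index, last_index, loops_per_tick, loop_body):
        self.index = first_index
        self.last_index = last_index            # inclusive, like a Blueprint ForLoop
        self.loops_per_tick = loops_per_tick    # assumed to be >= 1
        self.loop_body = loop_body              # called with the current index
        self.completed = first_index > last_index

    def tick(self):
        """Call once per frame. Returns True once the loop has completed."""
        if self.completed:
            return True  # the gate is closed, so further ticks do no work
        for _ in range(self.loops_per_tick):
            self.loop_body(self.index)
            self.index += 1
            if self.index > self.last_index:
                self.completed = True
                break
        return self.completed


if __name__ == "__main__":
    visited = []
    loop = SlowForLoop(0, 9, 4, visited.append)
    ticks = 1
    while not loop.tick():
        ticks += 1
    print(ticks, visited)
```

Because all of the state lives on the object, two such loops can run side by side without sharing a counter.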

I cannot create proper pastebins for the macros as inputs and outputs can’t easily be copied, so here are both the pastebins in Blueprintue form and screenshots to let you build it easily enough. I apologize that they are not the prettiest blueprint graphs. I’m usually pretty good at keeping the graphs tidy, but it’s hard to do with all the execs going back and forth.

(Edit: I’ve made some improved versions that can be found further down in this thread)

Screenshots of the loops (ForLoop | ForEachLoop)

BlueprintUE (ForLoop | ForEachLoop)

A few things to be aware of:

The macros need to be contained within a master blueprint that contains the IndexTemp variable.
Multiple loops should probably not run at the same time using the same macro (haven’t tested this).
The macros should be set up in the master blueprint something like this, with a custom event that can be called from outside:

I hope these will prove useful to someone else who wants to do some heavy lifting with blueprints. Happy blueprinting!

Zeustiak · June 7, 2015, 3:34am

Awesome! This is coming up on my to do list, so I am sure you will save me loads of time otherwise spent reinventing the wheel!

Monokkel · June 7, 2015, 8:10pm

I’m sure you’ll find some use for them. In my toolkit I’m planning to use them for pregeneration. I’ll run it in the background before it is needed; say, pregeneration for the next unit while the previous unit is moving, so that it is already calculated when the next unit is activated. If you want to be fancy it is also pretty easy to set up a queue of slow loops by checking if the “ongoing slow loop” variable is true before starting a loop and storing the input in an array until it is false again.
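The queue idea can be sketched in a few lines of Python. This is only one possible reading of it: `SlowLoopQueue`, `request` and `pending` are hypothetical names, `ongoing` stands in for the “ongoing slow loop” variable, and the sliced loop is inlined so the example is self-contained.

```python
class SlowLoopQueue:
    """Runs one sliced loop at a time and queues requests made in the meantime."""

    def __init__(self, loops_per_tick, loop_body):
        self.loops_per_tick = loops_per_tick
        self.loop_body = loop_body
        self.ongoing = False   # the "ongoing slow loop" flag
        self.pending = []      # inputs stored while a loop is running
        self._items = []
        self._index = 0

    def request(self, items):
        """Start a loop over `items`, or store the input if one is running."""
        if self.ongoing:
            self.pending.append(items)
        else:
            self._start(items)

    def _start(self, items):
        self._items = list(items)
        self._index = 0
        self.ongoing = True

    def tick(self):
        """Call once per frame; processes at most `loops_per_tick` items."""
        if not self.ongoing:
            return
        end = min(self._index + self.loops_per_tick, len(self._items))
        for i in range(self._index, end):
            self.loop_body(self._items[i])
        self._index = end
        if self._index >= len(self._items):
            self.ongoing = False
            if self.pending:
                # Start the next stored request on the following tick.
                self._start(self.pending.pop(0))
```

Requests are served in the order they arrive, and a stored request only begins on the tick after the previous loop has finished.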

Here is a quick video I made a while ago demonstrating a slow loop in action, using my visibility algorithm. Both instantaneous and slow generation is shown. Having the loop be this slow is certainly overkill, but I’ve slowed it down as much as I can for demonstration purposes:

Zeustiak · June 23, 2015, 11:10am

Started implementing your nodes. Definitely saved me a ton of time with your work as a starting point! I was running into an issue where it wouldn’t output a Completed string, and that was causing problems, but it went away and I am not sure why. Could be a bug, or just my foggy brain these days.

It is going to be a while before I can share any results though. There is a lot of blueprint that will have to be retrofitted before I can have the whole generation using it and am able to run some tests.

What is the best way you have found to throttle the number of loops per tick?

Also, do you use a sequence to pipe the tick into multiple loops or is there a better way?

Chooka · June 23, 2015, 2:06pm

set timer is your friend and tick is your enemy.

Zeustiak · June 24, 2015, 12:34pm

I don’t think you understand what tick is being used for here.

Monokkel · June 30, 2015, 9:43pm

Hi Zeus, sorry for the very late reply. I hadn’t subscribed to the thread, and didn’t notice you had answered until I read through my old threads. Don’t know what the string output bug could be, I’m afraid. As for throttling the number of loops per tick, I don’t completely understand what you mean. There is an input in my function where you can choose the number of loops you want to run each tick, but you are of course aware of that. I haven’t had to run multiple loops in succession in my own work yet, but I think a sequence is probably what makes the most sense in your case, where you are always going to run the world generation algorithm in the same sequence. I guess you could have a multi-gate with a tick input that outputs to multiple slow loops, and open and close the gate as appropriate each time a loop has finished.

I’m also a bit uncertain how that would work, Chooka. Could you give an example of how you would achieve a similar effect with set timer?

Zeustiak · July 1, 2015, 7:39am

The multi gate idea might work, though perhaps not necessary. Using a sequence to spread the tick out over the various loops seems to work well enough. I may look at various other options later.

For now, I have most of the generation working on Tick using your loops and it has been going pretty well. It is good to be able to watch a performance monitor as the map generates so I can see areas that are extremely performance heavy. Generally though, generation of a 5000 tile map has increased from 26 seconds of balls to the wall processing, to about 30 seconds using tick, and that is without any optimization effort and random loop-per-tick values chosen. I will laugh if I can get it running faster than it was before. I figured that spreading it over a tick would actually be significantly slower. Maybe there was more of a traffic jam going on than I realized. Bodes well for AI indeed, especially since you don’t have to do that much to use this method.

The only part I don’t like is that you can’t put events or tick directly in a function, which meant I had to drag everything out of the functions and paste it on the main event graph. Small price to pay, unless you know how to insert the tick into the functions. Don’t really want to make them into macros, so if not then it is fine as is.

What I meant by throttle the number of loops, is how do you adjust for people with faster/slower computers? I know it should self correct to an extent, but I feel like 100 loops per tick on a given function might be ok for my machine, but destroy someone else’s. I might need to work out a test that checks their FPS under heavy load, and then set a master variable that throttles the loops per tick to some %. Or maybe not, we will see how it goes.

Monokkel · July 1, 2015, 11:30am

Like I said, I think a sequence makes most sense in your case, but I’ll let you know if I have any bright ideas.

It’s very surprising to hear that using my loops is almost as fast as doing everything in one tick. That’s very good news to be sure! I’m excited to see how far you can push it. But I have to wonder how you’ve set it up. If you have loops per tick set to a sufficiently large number the slow loop should be identical to a regular loop. This is with 100 loops per tick? Does the framerate remain high through the entire world generation? By the way, is your map generated in such a way that all the different procedural generation steps are visible, so that you can see the world generate procedurally in front of your eyes? In that case I’d love to see a video!

I don’t think it’s possible to put the slow loop in a function. I believe functions have to be resolved during a single tick. Why do you feel it is problematic to use macros instead? That’s what I have done in my own project. Make a new macro that holds all the procedural generation code for one aspect of your system and put the slow loop macro inside this macro, connecting the input and output of the slow loop to similar input and output of the containing macro. I don’t see why you would want to have everything in one large event graph again. We both know that has some big tradeoffs.

For throttling the number of loops I’ve given it some thought before, but haven’t implemented a solution yet. The simplest and laziest solution is to have the number of loops per tick be multiplied by some percentage depending on the graphic settings set by the individual user. Another might be to look at the fps and dynamically increase the amount of loops per tick depending on the current fps. That might cause problems in games where fps has been brought down due to several other processes, though. I’ll let you know if I can think of something else.

Zeustiak · July 2, 2015, 6:20am

Yeah, much of the generation is significantly below the red line, so I have room to fine tune things and make it even faster. I just chose 100 loops per tick as a general test case that turned out to be pretty close to what I actually need.

The map just generates the information a step at a time, and then spawns everything in at once. It could be pretty cool to see the map grow before your eyes, so maybe one day if I have time I could do something like that. I don’t have a gameplay case for it, but could be useful for debug.

It isn’t really a problem, just a preference I guess. I still have things split among several blueprints, so it isn’t too bad really. Most of the event graphs can still be viewed on a single screen when zoomed all the way out anyway. I will worry about organizing it sometime later.

Yeah, I was thinking about using FPS to set a % variable that drives each loop-per-tick, so 60+ might be 100%, 40 might be 75%, etc. Then on top of that you could still have game settings that drive that even lower if a user knows his PC can’t generate a map normally and doesn’t mind waiting an hour for it to finish… Maybe with settings like that you could even do it on an iPad or something.
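One way to read the FPS-to-percentage idea is as a straight line through the two points mentioned (60 FPS gives 100%, 40 FPS gives 75%), with a user setting multiplied on top. The sketch below assumes exactly that; the function names, the 25% floor and the rounding are made up for illustration and would need tuning.

```python
def fps_scale(fps, full_fps=60.0, min_scale=0.25):
    """Map the current FPS to a fraction of the normal loops per tick.

    Assumed shape: 100% at `full_fps` or above, dropping linearly so that
    40 FPS gives 75%, and never going below `min_scale`.
    """
    if fps >= full_fps:
        return 1.0
    # Line through (60, 1.0) and (40, 0.75): 1.25% per frame per second lost.
    scale = 1.0 - (full_fps - fps) * 0.0125
    return max(min_scale, scale)


def scaled_loops_per_tick(base_loops, fps, user_scale=1.0):
    """Apply the FPS scale and an optional user setting, keeping at least 1."""
    return max(1, round(base_loops * fps_scale(fps) * user_scale))


if __name__ == "__main__":
    for fps in (90, 60, 40, 20):
        print(fps, scaled_loops_per_tick(100, fps))
```

The `user_scale` argument is where a low-power game setting would plug in, so a user on a slow machine can push the value down further.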

Monokkel · July 2, 2015, 9:59am

Yeah, combining the two solutions seems like a decent enough way to do it. There might be better metrics than fps, though, but it’s the best I can think of at the moment. By the way, you’d probably want to vary the number of loops per tick for each particular slow loop. The threshold for when fps starts to be affected will naturally vary depending on the amount of stuff you have within each loop. I can’t think of a good way to figure out these values besides trial and error, though. You might time how long a single loop iteration takes in real time and adjust loops per tick depending on that.
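The last suggestion, timing a single iteration and deriving loops per tick from it, could look roughly like this in Python. The per-tick time budget is an assumption (nothing in the thread fixes a value), and `measure_iteration` and `loops_for_budget` are hypothetical helper names.

```python
import time


def measure_iteration(loop_body, samples=100):
    """Return the average wall-clock seconds one call to `loop_body` takes."""
    start = time.perf_counter()
    for i in range(samples):
        loop_body(i)
    return (time.perf_counter() - start) / samples


def loops_for_budget(seconds_per_iteration, budget_seconds):
    """How many iterations fit in the per-tick time budget (at least 1)."""
    if seconds_per_iteration <= 0:
        raise ValueError("seconds_per_iteration must be positive")
    return max(1, int(budget_seconds / seconds_per_iteration))


if __name__ == "__main__":
    # Assumed budget: spend at most 4 ms of each frame on the slow loop.
    per_iteration = measure_iteration(lambda i: sum(range(1000)))
    print(loops_for_budget(per_iteration, 0.004))
```

Measuring each slow loop separately gives each one its own value, which matches the point that the threshold depends on how much work sits inside the loop body.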

Zeustiak · July 2, 2015, 1:27pm

Yeah, that is the plan. There is a performance monitor that plots a real time graph on screen with Game Time, GPU Time, Draw Time, etc. Since many of my loops take several seconds to complete it gives me a clear view of which loop needs tweaking and in which direction.

Monokkel · July 2, 2015, 6:19pm

Sounds good! Keep me posted on your work going forward.

Monokkel · July 12, 2015, 3:25pm

I’ve made some slight changes to the loops. I’ve removed the “Index Temp” variables as I found out they weren’t really necessary. Since there are no references to stored variables in the loops anymore this means that they can be held in macro libraries, so that they can more easily be called in different blueprints. It should now also be possible to run the loops in parallel without any interference. Here are pictures of the new loops:

For Each Loop (screenshot | pastebin)
For Loop (screenshot | pastebin)
For Loop With Break (screenshot | pastebin)

Zeustiak · July 13, 2015, 10:11am

You just had to make a macro library version after I got 75% of the generator working with the other version. I was actually wondering why you had those variables set up like that, but wasn’t going to question something that was working well as it was. I put your new macro to work in a library and so far so good. I will implement the library for the rest of the loops as well. Should save tons of time going forward.

In order to smooth out some spikes that are still showing up in the generation I will most likely convert the main outer functions to macros so I can insert tick into them and keep things clean. Generation time is at about 36 seconds right now, with room for optimization in the loops per tick. Certain functions do exponentially higher work on larger map sizes though, so I will definitely have to work out a way to throttle it based on a multitude of factors.

Monokkel · July 13, 2015, 11:13am

Yeah, sorry about that. My initial tests seemed to indicate that I had to use an external variable if I was to use a loop counter that would not reset each time the tick went through the macro. Either Epic changed something in 4.8 so that this was no longer needed or (more likely) I overlooked something. Anyway, I’m glad I took a second look. It’s a lot easier to implement like this.

Let me know if you find a good way to have loops per tick be adjusted automatically depending on the requirements of the specific loop. I’ve implemented these loops in my toolkit the last few days in preparation for my mobile update. Turns out it might not really be needed, as even my pretty old smartphone is able to run the pathfinding pretty smoothly without having to split up the loops. I guess one should always check whether optimization is really needed before starting on it. Ah well, I’m sure I’ll find some use for them in any case.

Zeustiak · July 13, 2015, 12:11pm

Oh well, at least I got to benefit from your work!

Anyway, it sounds like you have room to put more robust logic into your AI and/or have more AI on the field at once.

Monokkel · July 13, 2015, 12:33pm

Indeed. I’m already well on the way to implementing some greatly improved AI that benefits from the new loops. I have given the AI units a one-second delay from when they are activated until they start acting, to stop things from feeling extremely hectic in any case. If I use all this time for AI computations I can increase the amount of information processed by a factor of at least 60. Also, the loops are still useful for calculating huge distances, like in large Civ-style maps.

Zeustiak · July 18, 2015, 12:20pm

Here is the basic throttle that I made for the per tick function:

Loop Per Tick Baseline defaults to 1, and is something I will use to allow users to choose high or low power to manually reduce the load (as well as for dev testing). Otherwise, dividing the frame time into .025 you get maybe 2.5 times as many loops at 100 FPS and 50% fewer loops at 20 FPS. You can play with the ratio as needed.

It isn’t what I would call a final product, but it works a bit at least. It cut my generation time from 52 seconds to 44 seconds by itself, so it basically works. It will only go so far though, so fine tuning of the input number of loops per tick is still needed.

Ideally, you could create a function that would attempt to keep it at your desired FPS at all times, but I will work on that later.
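Since the throttle graph itself is not reproduced in text, here is a guess at its arithmetic in Python, reconstructed only from the numbers given above: 0.025 divided by the frame time gives 2.5 at 100 FPS and 0.5 at 20 FPS. The function names and the rounding are assumptions, not a copy of the actual blueprint.

```python
def throttle_multiplier(delta_seconds, baseline=1.0, reference=0.025):
    """Scale factor for loops per tick, based on the last frame time.

    `baseline` stands in for Loop Per Tick Baseline (default 1) and
    `reference` for the .025 constant; both can be tuned.
    """
    return baseline * reference / delta_seconds


def throttled_loops_per_tick(loops_per_tick, delta_seconds, baseline=1.0):
    """Apply the multiplier to a loop's own loops-per-tick input (at least 1)."""
    scaled = loops_per_tick * throttle_multiplier(delta_seconds, baseline)
    return max(1, round(scaled))


if __name__ == "__main__":
    for fps in (100, 40, 20):
        print(fps, throttled_loops_per_tick(100, 1.0 / fps))
```

With a reference of .025 the multiplier is exactly 1 at 40 FPS, so that is the frame rate at which a loop runs with its unmodified input value.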

Monokkel · July 19, 2015, 9:43am

Thanks! That’s a simple and elegant solution, and I’m glad to hear it works. I’ll try to implement it in the toolkit, and if I can think of any way to improve it I’ll post it in this thread.