Hi, is there any documentation that helps understand the best usage and any tips or tricks for Unreal Build Accelerator (Horde)?
We have started using it with a single Windows Horde server (no authentication) and it seems very promising.
However it is not so simple to understand if we are using it properly or if it needs tweaking.
At the moment we are using 4 dedicated agents, with 88 cores each, helping out about 20 programmers and a build machine.
Today we ran a test: 4 programmers started editor rebuilds, then 2 minutes later another 4 programmers did the same, and so on, for a total of 19 programmers.
It seems as if each programmer requests agents and obtains 1 or more, but without following a queue. The programmer who joined last might be the one who obtains one or more agents.
Is there a concept of queue / priority that can be used?
Other information such as
- More agents with fewer cores vs fewer agents with more cores. We have the impression that an agent helps only one machine at a given time, so cores could be idle until all processes requested by a machine are completed
- how much can StoreCapacityGb help and where it helps (local vs agent). How much would 20GB be useful versus 64GB or even more?
- bUsePCHFiles / bUseUnityBuild etc. what setup would be more efficient?
If it does not already exist, I think a guide with tips and tricks / best practices would be really appreciated.
Thanks and best regards,
David
Hey there David,
Yes, we do have a UBA practical debugging tips Knowledge Base article here. It covers more nuanced debugging techniques for the system, and some configurability points. Simply put, there is no real queueing system; it’s more a matter of how the Horde Coordinator receives the requests and services them. It’s important to note that UBA works best with Horde, but Horde is not a hard requirement. By using Horde with UBA, you’re utilizing it as a UBA Coordinator, of which the most important logic can be seen here. Fundamentally it’s all about how many cores you’re trying to acquire at any given moment: if you cannot obtain any, the build continues with local actions and tries again at a later point, which is why there is no queue. The nature of the work, and the likelihood of a local user cancelling a build, is high enough that managing a queue could result in a lot more work than simply retrying. To my understanding, there is no queue/priority concept within the UBA Horde Coordinator at the moment.
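To make the "retry instead of queue" behaviour concrete, here is a minimal sketch. It is a toy model, not the actual Horde coordinator code: `AgentPool`, `try_acquire` and `release` are hypothetical names invented for illustration. It shows how a build that asked first can still lose out to a build that started later, simply because of when each one happens to ask.

```python
from dataclasses import dataclass


@dataclass
class AgentPool:
    """Hypothetical stand-in for the agents a coordinator can lease out."""
    free_agents: int

    def try_acquire(self, wanted: int) -> int:
        """Grant up to `wanted` agents right now; the caller is never enqueued."""
        granted = min(wanted, self.free_agents)
        self.free_agents -= granted
        return granted

    def release(self, count: int) -> None:
        """Return agents to the pool when a lease ends."""
        self.free_agents += count


pool = AgentPool(free_agents=0)
log = []
log.append(("A", pool.try_acquire(1)))  # A asks first, nothing is free, A keeps building locally
pool.release(1)                          # some other build finishes and frees an agent
log.append(("B", pool.try_acquire(1)))  # B started later but happens to ask next
log.append(("A", pool.try_acquire(1)))  # A's retry arrives after B already took the agent
print(log)
```

In this model nobody remembers that A asked first, so the order of grants depends only on retry timing, which matches what you observed in your test.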
Now, given the topology you’ve got (4 incredibly powerful machines), I can see where a queue system would make sense and feel necessary. I’d probably recommend more machines with fewer cores (16/32). To my knowledge, multiple concurrent leases on a single agent isn’t something we support; that is to say, you can’t split up the 88 cores evenly to be used by 11 different users.
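A quick back-of-the-envelope comparison of the two topologies, as a rough sketch. It assumes one lease per agent (as described above), 19 concurrent builds as in your test, and a purely hypothetical figure of 32 cores that a single build can keep busy at a time. The function names are made up for illustration.

```python
def builds_helped(agents: int, concurrent_builds: int) -> int:
    """With one lease per agent, at most one build is helped per agent."""
    return min(agents, concurrent_builds)


def idle_cores_per_agent(cores_per_agent: int, cores_build_can_use: int) -> int:
    """Cores on a leased agent that the single build using it cannot fill."""
    return max(0, cores_per_agent - cores_build_can_use)


CONCURRENT_BUILDS = 19     # from the test described in the question
CORES_BUILD_CAN_USE = 32   # hypothetical, depends heavily on the build

# Both layouts have the same total core count (352).
layouts = {"4 x 88": (4, 88), "11 x 32": (11, 32)}

for name, (agents, cores) in layouts.items():
    helped = builds_helped(agents, CONCURRENT_BUILDS)
    idle = idle_cores_per_agent(cores, CORES_BUILD_CAN_USE)
    print(f"{name}: {agents * cores} cores, {helped} builds helped at once, "
          f"{idle} idle cores per leased agent")
```

Under these assumptions the total core count is identical, but the smaller agents help more builds at once and leave fewer cores idle on each leased agent.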
- To my understanding, there is no such queue/priority concept that we currently have within the UBA Horde Coordinator.
- Answered above, but one extra detail here is MaxCPU, which is particularly useful in constraining the number of CPU cores you require from the cluster (so ideally a common denominator for your cluster, so you’re using clean increments)
- ubaStorageServer utilizes the StoreCapacityGb for its CAS, which is used to host the inputs/outputs that will be served.
  - This is particularly relevant in preventing over-the-wire transmission, as you’ll utilize local storage before coordinating via detours.
  - The answer here really depends on the variability of your builds (branches, etc.) across your UBA agents, their lifetime, and just how much could practically be reused.
  - The default is 40 GB, with a more constrained 20 GB where necessary.
  - Regarding further comparative benchmarks, this isn’t something we’ve strictly done. You can monitor your UP/DOWN via UBAVisualizer to get a sense of average local reuse.
- bUsePCHFiles & bUseUnityBuild
  - No clear answer here, as intent & environment really matter (especially for unity builds).
  - In short, bUseUnityBuild==false will result in far more actions, which will cause more contention for cores, albeit for a shorter duration.
  - PCH files can be pretty large, so this will place further demand on bandwidth. UBA is at its best in a bandwidth-rich environment, but should you have lower upload speed, you may want to disable PCHs via -NoPCH.
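Coming back to the StoreCapacityGb point, the "it depends on how much can be reused" answer can be illustrated with a toy capacity-limited, content-addressed store. This is only a sketch: the `CappedCas` class, the least-recently-used eviction policy and the tiny file sizes are all assumptions for illustration, and UBA’s actual storage and eviction behaviour may differ.

```python
import hashlib
from collections import OrderedDict


class CappedCas:
    """Toy content-addressed store with a byte budget (LRU eviction is an assumption)."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.blobs = OrderedDict()  # content hash -> size, least recently used first
        self.hits = 0
        self.misses = 0

    def fetch(self, content: bytes) -> bool:
        """Return True if the blob was already stored locally (no transfer needed)."""
        key = hashlib.sha256(content).hexdigest()
        if key in self.blobs:
            self.blobs.move_to_end(key)  # mark as recently used
            self.hits += 1
            return True
        self.misses += 1                 # would have to come over the wire
        self.blobs[key] = len(content)
        self.used += len(content)
        while self.used > self.capacity and self.blobs:
            _, size = self.blobs.popitem(last=False)  # evict the oldest blob
            self.used -= size
        return False


def replay(capacity_bytes: int):
    """Alternate builds of two hypothetical branches, 3 files of 10 bytes each."""
    cas = CappedCas(capacity_bytes)
    for branch in "ABAB":
        for i in range(3):
            cas.fetch(f"{branch}-file{i}".encode().ljust(10, b"."))
    return cas.hits, cas.misses


print("fits working set:", replay(60))
print("half the working set:", replay(30))
```

When the store can hold both branches, the second round of builds is served entirely from local storage. When it can hold only one branch, alternating between branches evicts everything before it is reused. The same reasoning applies when choosing between 20 GB, 40 GB or more: what matters is whether the capacity covers the set of inputs/outputs you realistically revisit.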
Related - we will be working a bit on our Horde Analytics & Telemetry over the coming quarters, and something I’m looking to provide more data & hooks for is the UBA build context. As with any optimization and profiling efforts, it’s best to work off data. Given the amount of variability of setups in infrastructure & network, it’s hard to prescribe a perfect set of tuning variables - so it’s best for us to create a telemetry & analytics stream for users to nail their setup through measurement.
Kind regards,
Julian