Iris: Request to Contribute Parallel Client Ticking

Hi, I’m opening this question to ask about contributing to the Iris parallelization efforts.

I’ve made a few posts in the past about parallelizing the client tick phase of the NetDriver when using Iris.

It’s my understanding that Epic does have a developer working on this, but that it takes a backseat depending on other feature work and bug fixes.

I’m happy to say that I recently implemented this in our version of the engine, and for nearly 2 months now we have been running our servers with it enabled without any issues.

If anyone at Epic is interested, we’d love to contribute our implementation and findings. Just let us know what channels would be best.

We playtest with roughly 50 players twice a week and run automated scale tests daily (each with 50 clients), so we’re fairly confident that the implementation is sound.

We’ve also recently had tests with 100 players in a server from various countries without issue.

When it comes to the performance, there’s some variance in our scale tests as they test different parts of our game, but the phase now runs 2x-3x faster on those tests, with our Weapons test dropping from 9.3ms average to 3.46ms.

We also see a massive improvement when the clients are joining the test server, dropping from ~35ms to ~13ms.

For non-synthetic performance, a playtest with 50 players used to average ~23ms. Now we consistently run at ~8ms for the same number of players.
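For anyone comparing against their own captures, here is a rough sketch of the arithmetic behind the figures above. The helper names `Speedup` and `PercentSaved` are made up for illustration and are not part of the engine; the inputs are the approximate averages quoted in this post.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: how many times faster the phase is after the change.
constexpr double Speedup(double BeforeMs, double AfterMs)
{
    return BeforeMs / AfterMs;
}

// Hypothetical helper: percentage of the original phase time that was removed.
constexpr double PercentSaved(double BeforeMs, double AfterMs)
{
    return 100.0 * (BeforeMs - AfterMs) / BeforeMs;
}

// Figures quoted above (all approximate averages, in milliseconds):
//   Weapons scale test:  Speedup(9.3, 3.46)
//   Clients joining:     Speedup(35.0, 13.0)
//   50-player playtest:  Speedup(23.0, 8.0)
```

All three cases land inside the 2x-3x range mentioned above, with the live playtest showing the largest relative gain.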

I tried to make the changes in a way that would minimize friction when upgrading engine versions, so it’s likely that extra performance could be squeezed out with expertise on Epic’s end, for example by making architectural changes or going lockless in some places.

Hi Liam - thanks for this

I’ve currently got a functional MVP version of Taskifying the NetConnection::Tick step which runs individual connections in parallel, which is very promising but does need a bit of work to finish it off.

I’d be more than happy to take a look at your implementation and see if I can draw inspiration from your approach, although I can’t promise I’d take it wholesale (as much as that would save a lot of merging, I’m sure!)

If you’re up for sharing your implementation, then I’d recommend porting it into a sample UE project like Lyra and attaching it to this case. This would mean other EPS users would be able to see it though, so if you require more privacy for your code, we can set you up with a Box file share location instead.

Cheers!

Hi Liam - the latter (whole engine source tree) would be easier for me. Cheers!

Thanks so much for providing this Liam, it’s really useful to see your ideas. I was taking a look through it and you’ve taken some similar approaches to what I’ve been doing in my local prototype, starting off with some coarse-grained mutexes and adding feature-level mutexes where required (like in the DeltaCompressionBaselineManager). I ended up seeing some contention on that, so I started improving areas like ObjectReferenceCache. Hopefully my improvements to these areas will benefit your game, and if you decide to take my version of these changes, I’d be interested to hear if you get a speedup.
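For readers following along, the general pattern being described (per-connection ticks running in parallel, with a feature-level mutex around the shared state they touch) might look roughly like the sketch below. This is plain standard C++ rather than engine code: `FSharedBaselineTable`, `FSketchConnection` and `TickConnectionsInParallel` are hypothetical names, one thread per connection stands in for whatever task system the engine would actually use, and it is not taken from either implementation discussed here.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for shared per-driver state (e.g. a baseline manager).
class FSharedBaselineTable
{
public:
    // Feature-level lock: only ticks that touch this table contend here.
    void RecordBaseline(std::uint32_t ObjectId, std::uint32_t Value)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        Baselines[ObjectId] = Value;
        ++WriteCount;
    }

    int GetWriteCount() const
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return WriteCount;
    }

private:
    mutable std::mutex Mutex;
    std::unordered_map<std::uint32_t, std::uint32_t> Baselines;
    int WriteCount = 0;
};

// Hypothetical stand-in for a client connection.
struct FSketchConnection
{
    std::uint32_t Id = 0;
    int TicksRun = 0;

    void Tick(FSharedBaselineTable& Shared)
    {
        ++TicksRun; // connection-local state, no lock needed
        Shared.RecordBaseline(Id, static_cast<std::uint32_t>(TicksRun));
    }
};

// Ticks every connection on its own thread, then waits for all of them.
// Returns the number of writes the shared table observed.
int TickConnectionsInParallel(std::vector<FSketchConnection>& Connections,
                              FSharedBaselineTable& Shared)
{
    std::vector<std::thread> Workers;
    Workers.reserve(Connections.size());
    for (FSketchConnection& Connection : Connections)
    {
        Workers.emplace_back([&Connection, &Shared]() { Connection.Tick(Shared); });
    }
    for (std::thread& Worker : Workers)
    {
        Worker.join();
    }
    return Shared.GetWriteCount();
}
```

The trade-off is the one mentioned above: a single mutex per feature is simple to reason about, but every connection that touches that feature serializes on it, which is where contention tends to show up first.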

Also, as a heads up I’ll likely need to add NoSync to some more of the net trace events, unfortunately - I’m hoping that we’ll be able to get a proper fix in for this but it’s not trivial due to the need to support the tail buffer. Please let me know if those changes give you any problems.

One thing I noticed was that you mentioned PIE Throttling causing a crash in FSlateApplication::Get() which I haven’t seen in my local tests yet. Do you have an approximate description of the repro steps for getting that to occur?

Hi Matt, could we go with a Box file share location? Thanks.

I’ll clean up our changes and remake them using the 5.7 Preview branch once the GitHub repo is back up and running.

Hey Matt, just to clarify.

Do you just want me to send a pre-built Lyra project with a patch file for the changes, or did you want the whole engine source tree with the modifications and Lyra?

Hey Matt, uploaded our implementation with some notes just now.

Excited to see your groundwork for this going in on GitHub this morning.

Glad to hear it’s proved useful. We’ll end up upgrading to Epic’s implementation over time but we’re currently locking down to focus on stability, so it won’t be for a while yet.

As for the NoSync changes, I’ll make sure to notify you if there are any issues in the future.

For the editor issue, this is the stack trace of the FSlateApplication::Get() call.

[Image Removed]

I can repro this using the following settings:

  • Launch Separate Server
  • Play as Client
  • Run Under One Process
  • Number of Players: 4

You also need to have `Use Less CPU when in Background` enabled in the Editor settings. If you do this and then click onto a non-UE window, it will trigger the crash.