Iris: Unreliable RPCs such as ClientAck are being treated as reliable

Hello, as the subject line states, it looks to me like unreliable RPCs are actually being treated as reliable in Iris.

FNetBlobManager::SendUnicastRPC -> UNetRPCHandler::CreateRPC

When creating the FNetRPC, we set the Ordered flag on CreationInfo as long as the RPC is not multicast.

FNetObjectAttachmentSendQueue::Enqueue

Then when it comes time to enqueue the associated FNetBlob, the flags are checked and Ordered RPCs are put into the ReliableQueue.
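
To make the routing concrete, here’s how I read that branch (a simplified paraphrase of the engine source, not verbatim; the unreliable queue’s member name in particular is my approximation):

```cpp
// Simplified paraphrase of the flag check in FNetObjectAttachmentSendQueue::Enqueue.
// Ordered alone is enough to route a blob into the reliable queue, even when the
// Reliable flag is not set.
const UE::Net::ENetBlobFlags Flags = Attachment->GetCreationInfo().Flags;
if (EnumHasAnyFlags(Flags, UE::Net::ENetBlobFlags::Reliable | UE::Net::ENetBlobFlags::Ordered))
{
	ReliableQueue.Enqueue(Attachment); // all unicast RPCs land here, reliable or not
}
else
{
	UnreliableQueue.Enqueue(Attachment); // member name approximate: the unordered, unreliable path
}
```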

I’m not certain this is actually causing issues, but I found it while debugging an issue in which our client stopped receiving adjustments for a fairly long time (on the order of seconds) even though there was no drop in server frame rate. My expectation was that client acks/adjustments would use the OOB path, and I think that is ultimately my question: can client acks/adjustments be sent using the OOB queue to ensure that the client gets those messages ASAP?

Thanks,

Nick

```cpp
TRefCountPtr<UE::Net::Private::FNetRPC> UNetRPCHandler::CreateRPC(const UE::Net::FNetObjectReference& ObjectReference, const UFunction* Function, const void* Parameters) const
{
	FNetBlobCreationInfo CreationInfo;
	CreationInfo.Type = GetNetBlobType();
	CreationInfo.Flags = ((Function->FunctionFlags & FUNC_NetReliable) != 0) ? UE::Net::ENetBlobFlags::Reliable : UE::Net::ENetBlobFlags::None;
	// Unicast RPCs should be ordered with respect to other reliable and unicast RPCs.
	if ((Function->FunctionFlags & FUNC_NetMulticast) == 0)
	{
		CreationInfo.Flags |= UE::Net::ENetBlobFlags::Ordered;
	}

	FNetRPC* RPC = FNetRPC::Create(ReplicationSystem, CreationInfo, ObjectReference, Function, Parameters);
	return RPC;
}
```

The above is throwing me: why would we mark an RPC as Ordered if it’s unreliable, regardless of whether it’s multicast or unicast?

Lastly, am I reading this right that ENetObjectAttachmentSendPolicyFlags::ScheduleAsOOB is not actually being used? I see it being queried in multiple places, but I don’t see it being set anywhere.

Hi,

It is intended for unreliable RPCs to end up in the reliable queue; this is done to make sure these RPCs are executed in the expected order. However, they should not be resent if they are dropped, and the client shouldn’t wait for previously dropped RPCs to execute.

For full context, there were previously some issues with ordered RPCs waiting for older reliable/unreliable attachments to be received, causing large delays, but this was fixed with CL 29916908. Another fix related to this functionality was made at CL 31719627, which addressed delays caused by large RPCs being split and sent reliably.

There were also some more related changes made at CL 35707422, which increased the send window size on ReliableNetBlobQueue from 256 to 1024, as well as CL 35797135, which fixed an issue with packet emulation being applied multiple times to packets, causing higher than expected packet loss when using the editor’s network emulation settings.

All these changes should be in 5.5, but it’s worth double-checking that you have them.

As for ScheduleAsOOB, you are correct that this isn’t set anywhere in the engine code. Rather, the intention is that projects can call UReplicationSystem::SetRPCSendPolicyFlags to mark certain RPCs as needing to be sent immediately.

If too many RPCs are being called per frame, it is possible that the engine won’t be able to send all of them, causing an RPC backlog to build up. If possible, could you provide some more information on your setup and the issue you’re seeing? Verbose net traces from the client and server would be especially useful for getting an idea of why the slowdown is occurring.

Thanks,

Alex

Thanks Alex,

Our server ticks at 20 Hz and our server-to-client bandwidth is 2 Mbps per connection. We do have a custom character movement component, but that lives outside of what we’re currently interested in. We have some known bandwidth issues at the moment (lots of actors wanting to replicate a decent amount of data every frame). We see in the server logs that the character movement component wants to send an adjustment and sends it via RPC, but we don’t actually see it getting processed client-side until N seconds later. It’s very hard to repro, but it does happen at least a couple of times a day during playtesting.

The bandwidth issues led me to dig into reliable vs. unreliable, as I know the ClientAck is unreliable, so at the very least I would have expected the ack to get dropped on the floor and never even go out the door server-side. But it did go out; the question then is whether it left late or the client held on to it for a long time. My guess is it’s the former. I strongly feel that client acks/adjustments shouldn’t wait and should be treated as truly unreliable. I’m curious what your thoughts are, and whether it makes sense for us to pursue using SetRPCSendPolicyFlags for client acks/adjustments.

One more note with respect to this comment: “as this is done to make sure these RPCs are executed in the expected order.” Why then is it only done for unicast? Would multicast not suffer from this same problem?

Lastly, if we were to end up going the route of using OOB for client acks/adjustments, where does it make the most sense to call SetRPCSendPolicyFlags?

Thanks again!

Hi,

This is done for unicast to mimic the behavior of the current replication system. When not using Iris, unicast RPCs are written to the bunch as soon as they are called. If there’s no packet loss, this results in the call ordering for unicast RPCs on a single actor being preserved, regardless of reliability. Unreliable multicast RPCs are queued and sent along with the actor’s next bunch of property data, so the call order for these is not preserved.

In Iris, unicast RPCs are written to the reliable queue in order to similarly preserve this ordering, so projects don’t run into issues when switching to Iris due to RPC orderings changing. They aren’t actually treated as reliable, and they also won’t be sent less frequently as a result. However, depending on the number of calls, we may not be able to fit them all into the packet, causing congestion.

I do think this could make sense, but we have not tried or tested setting the client acks/adjustments as SendImmediate ourselves. To set this up, you can have a game module hook into the delegates on FReplicationSystemFactory in order to know when replication systems are being created. From here, you can set the send policy flags for the desired RPCs on the newly created system.
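
As a rough illustration, the hookup could look something like the sketch below. Treat this as an untested sketch rather than a drop-in implementation: the delegate accessor on FReplicationSystemFactory and the exact SetRPCSendPolicyFlags signature should be verified against your engine version, and ClientAckGoodMove/ClientAdjustPosition are just the stock ACharacter adjustment RPCs, to be substituted with your own.

```cpp
#include "Modules/ModuleManager.h"
#include "GameFramework/Character.h"
#include "Iris/ReplicationSystem/ReplicationSystem.h"

static void OnReplicationSystemCreated(UReplicationSystem* ReplicationSystem)
{
	// Mark the movement ack/adjustment RPCs for immediate sending so they bypass
	// the ordered queue. The flag name follows the SendImmediate policy mentioned
	// above; verify it against ENetObjectAttachmentSendPolicyFlags in your version.
	for (const TCHAR* FunctionName : { TEXT("ClientAckGoodMove"), TEXT("ClientAdjustPosition") })
	{
		if (const UFunction* Function = ACharacter::StaticClass()->FindFunctionByName(FunctionName))
		{
			ReplicationSystem->SetRPCSendPolicyFlags(Function, UE::Net::ENetObjectAttachmentSendPolicyFlags::SendImmediate);
		}
	}
}

class FMyGameModule : public FDefaultGameModuleImpl
{
	virtual void StartupModule() override
	{
		// Illustrative hook: check FReplicationSystemFactory for the actual
		// creation delegate exposed by your engine version.
		UE::Net::FReplicationSystemFactory::GetReplicationSystemCreatedDelegate().AddStatic(&OnReplicationSystemCreated);
	}
};
```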

Thanks,

Alex