Iris: Fast Array does not reliably remove items under certain conditions.

We are experiencing an issue where Fast Arrays do not reliably remove items that the Server has removed. This issue only manifests when using Iris replication; it does not occur when using “standard” replication. It also only appears to manifest when the fast array contains non-replicating items on the Server, a feature that is supported by the “standard” fast array serializer and is also relied upon in the Gameplay Ability System (see: FGameplayAbilitySpecContainer).

I have attached a sample project which demonstrates the issue. We have a simple fast array to which the player can add “Replicated” and “Local” items. If a “Local” item is present in that array when a “Replicated” item is removed, it is likely that the “Replicated” item will not be correctly removed on the Client. In “standard” replication, everything behaves as expected.

The failure sequence is as follows:

  1. Add 2x ‘Replicated’ items to the array.
  2. Add 1x ‘Non-Replicated’ (local) item to the array.
  3. Remove 1x Replicated Item.

The replicated item will be removed on the Server, and an update is sent to the Client. However, the Client will not remove the item. The failure point appears to be the client’s call to FFastArrayReplicationFragmentHelper::ApplyReplicatedState(), where the DstArray and SrcArray arguments both still contain both replicated items.

If a “Local” item does not exist in the Server’s array, then the ‘SrcArray’ will have the replicated item correctly removed, and the removal will be detected and applied as expected. Adding a breakpoint to the function and inspecting the arguments during the sequence may highlight the issue more clearly.
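To illustrate why a stale ‘SrcArray’ prevents the removal, here is a minimal standalone sketch in plain C++. It is not engine code and does not reproduce what ApplyReplicatedState() actually does; FSketchItem, ApplyRemovalsById and the use of -1 as a "local-only" ID are all hypothetical names and conventions chosen for the example. It only assumes that removals are detected by comparing the IDs the client holds against the IDs it received.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a fast array item; not an engine type.
struct FSketchItem {
    std::int32_t ReplicationID; // -1 marks a local-only item (assumption for this sketch)
    int Payload;
};

// Removes from Dst every replicated item whose ID is absent from Src.
// Local-only items in Dst are left untouched. Returns the number removed.
inline int ApplyRemovalsById(std::vector<FSketchItem>& Dst,
                             const std::vector<FSketchItem>& Src) {
    // Collect the replicated IDs present in the received state.
    std::unordered_set<std::int32_t> Received;
    for (const FSketchItem& Item : Src) {
        if (Item.ReplicationID != -1) {
            Received.insert(Item.ReplicationID);
        }
    }

    const std::size_t OldSize = Dst.size();
    Dst.erase(std::remove_if(Dst.begin(), Dst.end(),
                  [&Received](const FSketchItem& Item) {
                      return Item.ReplicationID != -1 &&
                             Received.count(Item.ReplicationID) == 0;
                  }),
              Dst.end());
    return static_cast<int>(OldSize - Dst.size());
}
```

With this model, a received state that omits the removed ID triggers the removal, while a received state that still lists both replicated IDs (the situation described above) removes nothing.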

---

In GAS, this issue manifests when you have a mixture of replicated and non-replicated abilities and remove a replicated spec: that spec will not be removed on the client. The client will desync permanently, until all non-replicated abilities are removed on the Server and another replicated ability is also removed.

Steps to Reproduce
In Sample Project:

  1. Build + Open the project in 5.6, select “Play As Client” under Play Settings.
  2. Press Play In Editor
  3. Press the “Right” arrow key to show a working sequence of events.
  4. Press the “Left” arrow key to show a broken sequence of events.
  5. Now open DefaultEngine.ini, and change net.Iris.UseIrisReplication=1 to net.Iris.UseIrisReplication=0
  6. Repeat Steps 1-4 - notice the error does not occur.

In Gameplay Abilities (In Iris)

  1. Grant a player 2x Replicated Abilities, and 1x Non-Replicated Ability
  2. Remove one of the Replicated Abilities
  3. Notice the replicated ability is NOT removed client-side.
  4. Remove the non-replicated ability, and the remaining replicated ability.
  5. Notice the replicated abilities are all removed.

Hi,

Thank you for the detailed bug report and repro project!

I’ve confirmed this issue still occurs in the latest version of the engine, and I’ve opened a new issue for this, UE-319174, which should be visible in the public tracker in a day or so.

Thanks,

Alex

Hi,

I see that you are using FIrisFastArraySerializer, which might be the issue, as it is not really used outside of some test code.

Try using the standard FFastArraySerializer instead.

I will look into the JIRA Alex created as soon as I can and see why it does not work as expected.

Best regards

Mattias

Hi

I took a quick look at the code and it looks like you are correct. We did not expect predictive adds to be used on the server, and we have no internal code using the ShouldWriteFastArrayItem() override, so this has not been exposed on our end.

The predictive add was mostly used on the client to allow for prediction to use the same array. In general it is probably wise to store replicated data and non-replicated data in separate arrays to avoid additional work dealing with it.

I will see what I can do to fix it and get back to you when I have a fix.

Best regards

Mattias

Hi again,

I confirmed and fixed the issue in CL 45820493 on //UE5-Main.

Note: The Iris native fast array serializer does not support this properly. It will replicate “not-replicated” items as well, but will never apply them on the client, so the logic will work.

So make sure to derive from FFastArraySerializer instead of FIrisFastArraySerializer if local items are required. (It is also possible to set the cvar net.Iris.UseNativeFastArray 0 to avoid this.)

Let me know if you cannot access the code and I will help you get it.

Best regards

Mattias

Great, just let me know if you have any further issues!

Hi Alex, thanks for the update.

I’ve investigated this further as we are somewhat desperate for a fast solution, but this raised some new concerns.

I was surprised to find that the fast array fragment uses an FArrayPropertyNetSerializer for the Quantize/Serialize stage. This serializer has no knowledge of Fast Array internals and always serializes/quantizes based on the total number of elements in the array, using indices to identify items, not their ReplicationID. Local-only items which do not have ReplicationIDs are still serialized: there is no call to ShouldWriteFastArrayItem() at the serialize/quantize stage, only at the polling stage in TNativeFastArrayReplicationFragment::PollAllState().
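As a rough illustration of the difference between a count-based write and a filtered write, here is a standalone sketch in plain C++. It is not how FArrayPropertyNetSerializer or the standard fast array serializer are implemented; FWireSketchItem, ShouldWriteSketchItem, WriteAllByIndex and WriteFiltered are hypothetical names, with ShouldWriteSketchItem only playing the role that ShouldWriteFastArrayItem() is described as playing above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a fast array item; not an engine type.
struct FWireSketchItem {
    std::int32_t ReplicationID; // -1 for local-only items (assumption for this sketch)
    bool bLocalOnly;
};

// Hypothetical per-item predicate: local-only items should stay off the wire.
inline bool ShouldWriteSketchItem(const FWireSketchItem& Item) {
    return !Item.bLocalOnly;
}

// Count-based write: every element is emitted, identified only by its index.
inline std::vector<FWireSketchItem> WriteAllByIndex(
    const std::vector<FWireSketchItem>& Items) {
    return Items;
}

// Filtered write: only items that pass the predicate are emitted.
inline std::vector<FWireSketchItem> WriteFiltered(
    const std::vector<FWireSketchItem>& Items) {
    std::vector<FWireSketchItem> Out;
    std::copy_if(Items.begin(), Items.end(), std::back_inserter(Out),
                 ShouldWriteSketchItem);
    return Out;
}
```

For an array holding two replicated items and one local-only item, the count-based write emits three elements and the filtered write emits two.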

The other concern is the way the ChangeMask property is used by serialization. Because it identifies dirty elements by index, you cannot safely mark an item dirty and then move it to another index in the array. This is another feature that “standard” fast arrays inherently support, because elements are always identified by their ReplicationID for the purposes of net serialization, so order is irrelevant.

This is quite crucial for fast arrays, since we often need/want to sort or otherwise reorder the array locally, without affecting net serialization dirty state tracking. Since historically the order of items in fast arrays is also unstable, clients may want to sort them locally for gameplay purposes.
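The reordering concern can be shown with a small standalone sketch in plain C++. It does not model the real ChangeMask or any engine type; FOrderSketchItem, DirtyIdsFromIndexMask and DirtyIdsFromIdSet are hypothetical names. It only assumes that one scheme remembers dirty positions and the other remembers dirty IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a fast array item; not an engine type.
struct FOrderSketchItem {
    std::int32_t ReplicationID;
    int Value;
};

// Index-based tracking: the mask remembers positions, so it reports whichever
// items currently sit at the marked indices.
inline std::vector<std::int32_t> DirtyIdsFromIndexMask(
    const std::vector<FOrderSketchItem>& Items, const std::vector<bool>& Mask) {
    assert(Mask.size() == Items.size());
    std::vector<std::int32_t> Out;
    for (std::size_t Index = 0; Index < Items.size(); ++Index) {
        if (Mask[Index]) {
            Out.push_back(Items[Index].ReplicationID);
        }
    }
    return Out;
}

// ID-based tracking: the set remembers identities, so array order is irrelevant.
inline std::vector<std::int32_t> DirtyIdsFromIdSet(
    const std::vector<FOrderSketchItem>& Items,
    const std::unordered_set<std::int32_t>& DirtyIds) {
    std::vector<std::int32_t> Out;
    for (const FOrderSketchItem& Item : Items) {
        if (DirtyIds.count(Item.ReplicationID) != 0) {
            Out.push_back(Item.ReplicationID);
        }
    }
    return Out;
}
```

If the item with ID 1 is marked dirty at index 0 and then swapped with the item at index 2, the index mask reports the item now at index 0 (ID 3), whereas the ID set still reports ID 1.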

These are somewhat lesser-known/underutilised features of Fast Array, but for our purposes and the purposes of GAS it’s crucial that Iris matches the behaviour of the “standard” fast array replication, and it would be great to confirm whether Epic plans to match these features 1:1 in Iris.

If you have any tips for how we might implement a custom serializer for the array in the meantime that would be helpful.

Hey Mattias,

Yeah, we updated all of our FastArray types to inherit FIrisFastArraySerializer when switching to Iris, as we assumed that was required. Unfortunately, however, this doesn’t fix the issue. The Iris Net Serializer for FastArrays is still used, and it is still backed by the FArrayPropertyNetSerializer for the internal array property, so it fails to ID items correctly.

Thanks for looking into it!

Thanks Mattias - I’m on break currently but will integrate asap! Thanks for attacking it so quickly!