We are experiencing an issue where Fast Arrays do not reliably remove items that the Server has removed. This issue only manifests when using Iris replication; it does not occur when using “standard” replication. It also only appears to manifest when the fast array contains non-replicating items on the Server, a feature that is supported by the “standard” fast array serializer and is also relied upon in the Gameplay Ability System (see: FGameplayAbilitySpecContainer).
I have attached a sample project which demonstrates the issue. We have a simple fast array which the player can add “Replicated” and “Local” items to. If a “Local” item is present in that array when a “Replicated” item is removed, it is likely that the “Replicated” item will not be correctly removed by the Client. In “standard” replication, everything behaves as expected.
The failure sequence is as follows:
1. Add 2x ‘Replicated’ items to the array.
2. Add 1x ‘Non-Replicated’ (local) item to the array.
3. Remove 1x ‘Replicated’ item.
The replicated item will be removed on the Server, and an update is sent to the Client. However, the Client will not remove the item. The failure point appears to be the Client’s call to FFastArrayReplicationFragmentHelper::ApplyReplicatedState(), where the DstArray and SrcArray arguments both still contain both replicated items.
If a “Local” item does not exist in the Server’s array, then the ‘SrcArray’ will have the replicated item correctly removed, and the removal will be detected and applied as expected. Adding a breakpoint to the function and inspecting the arguments during the sequence may highlight the issue more clearly.
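For reference, the behaviour we expect from the sequence above can be modelled outside the engine. The following is a minimal standalone sketch, not engine code: the `Item` struct, the `kLocalOnly` marker, and the `ShouldWrite`, `BuildWireState`, `ApplyRemovalsById` and `RunReproSequence` helpers are all hypothetical names invented for illustration. `ShouldWrite` only stands in for the role a ShouldWriteFastArrayItem() override plays. The sketch assumes local items are filtered out before sending and that the receiver detects removals by ID rather than by array index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a fast array item; not an engine type.
struct Item {
    int32_t ReplicationID; // kLocalOnly marks an item that must never be sent
    int Value;
};

constexpr int32_t kLocalOnly = -1; // assumption made for this model only

// Plays the role of a ShouldWriteFastArrayItem() override: skip local items.
bool ShouldWrite(const Item& item) { return item.ReplicationID != kLocalOnly; }

// Server side: build the state that goes on the wire (replicated items only).
std::vector<Item> BuildWireState(const std::vector<Item>& serverItems) {
    std::vector<Item> wire;
    for (const Item& item : serverItems) {
        if (ShouldWrite(item)) wire.push_back(item);
    }
    return wire;
}

// Client side: remove every replicated item whose ID is missing from the wire state.
void ApplyRemovalsById(std::vector<Item>& clientItems, const std::vector<Item>& wire) {
    std::unordered_set<int32_t> alive;
    for (const Item& item : wire) alive.insert(item.ReplicationID);
    clientItems.erase(
        std::remove_if(clientItems.begin(), clientItems.end(),
                       [&](const Item& item) {
                           return item.ReplicationID != kLocalOnly &&
                                  alive.count(item.ReplicationID) == 0;
                       }),
        clientItems.end());
}

// Runs the repro sequence and returns the IDs the client still holds afterwards.
std::vector<int32_t> RunReproSequence() {
    std::vector<Item> server = {{1, 10}, {2, 20}};     // 2x replicated items
    std::vector<Item> client = BuildWireState(server); // initial sync
    server.push_back({kLocalOnly, 99});                // 1x local item
    server.erase(server.begin());                      // remove replicated item with ID 1
    ApplyRemovalsById(client, BuildWireState(server));

    std::vector<int32_t> remaining;
    for (const Item& item : client) remaining.push_back(item.ReplicationID);
    return remaining;
}
```

In this model the local item never reaches the wire, so its presence on the Server cannot affect which removals the Client sees. That is the behaviour we get from “standard” replication.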
---
In GAS, this issue manifests when you have a mixture of replicated and non-replicated abilities and remove a replicated spec: that spec will not be removed by the Client. The Client then stays desynced until all non-replicated abilities are removed on the Server and another replicated ability is also removed.
Thank you for the detailed bug report and repro project!
I’ve confirmed this issue still occurs in the latest version of the engine, and I’ve opened a new issue for this, UE-319174, which should be visible in the public tracker in a day or so.
I took a quick look at the code and it looks like you are correct. We did not expect predictive adds to be used on the server, and we have no internal code using the ShouldWriteFastArrayItem() override, so this has not been exposed on our end.
The predictive add was mostly used on the client to allow for prediction to use the same array. In general it is probably wise to store replicated data and non-replicated data in separate arrays to avoid additional work dealing with it.
I will see what I can do to fix it and get back to you when I have a fix.
I confirmed and fixed the issue in CL 45820493 on //UE5-Main.
Note: the Iris native fast array serializer does not support this properly. It will replicate “non-replicated” items as well, but will never apply them on the client, so the logic will still work.
If local items are required, make sure to derive from FFastArraySerializer instead of FIrisFastArraySerializer. (It is also possible to set the cvar net.Iris.UseNativeFastArray 0 to avoid this.)
Let me know if you cannot access the code and I will help you get it.
I’ve investigated this further as we are somewhat desperate for a fast solution, but this raised some new concerns.
I was surprised to find that the fast array fragment uses an FArrayPropertyNetSerializer for the Quantize/Serialize stage. This serializer has no knowledge of Fast Array internals: it always Serializes/Quantizes based on the total number of elements in the array, and identifies items by index, not by their ReplicationID. Local-only items which do not have ReplicationIDs are still serialized. There is no call to ShouldWriteFastArrayItem() at the serialize/quantize stage, only at the polling stage in TNativeFastArrayReplicationFragment::PollAllState().
The other concern is the way the ChangeMask property is used by serialization. Because it identifies dirty elements by index, you cannot mark an item dirty, then move it to another index in the array safely. This is another feature that is inherently “supported” by fast arrays because elements are always identified by their ReplicationID for the purposes of net serialization, so order is irrelevant.
This is quite crucial for fast arrays, since we often need/want to sort or otherwise reorder the array locally, without affecting net serialization dirty state tracking. Since historically the order of items in fast arrays is also unstable, clients may want to sort them locally for gameplay purposes.
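To illustrate the reordering concern, here is a small standalone sketch. It is a simplified model under our own assumptions and not a description of how the Iris ChangeMask is actually implemented: `Item`, `DirtyTracking`, `MarkItemDirty`, `IdsSentByIndexMask`, `IdsSentByIdSet` and `DirtyThenSort` are hypothetical names. It marks one item dirty, sorts the array, and then compares which item an index-based mask would send with which item an ID-based set would send.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a fast array item; not an engine type.
struct Item {
    int32_t ReplicationID;
    int SortKey;
};

// Two ways of remembering "what changed" (both are models for illustration).
struct DirtyTracking {
    std::vector<bool> IndexMask;          // bit i set: element at index i is dirty
    std::unordered_set<int32_t> DirtyIDs; // IDs of dirty items, independent of order
};

// Record the item at `index` as dirty in both tracking schemes.
DirtyTracking MarkItemDirty(const std::vector<Item>& items, std::size_t index) {
    DirtyTracking tracking;
    tracking.IndexMask.assign(items.size(), false);
    tracking.IndexMask[index] = true;
    tracking.DirtyIDs.insert(items[index].ReplicationID);
    return tracking;
}

// IDs that would be sent if dirtiness is looked up by current array index.
std::vector<int32_t> IdsSentByIndexMask(const std::vector<Item>& items, const DirtyTracking& tracking) {
    std::vector<int32_t> sent;
    for (std::size_t i = 0; i < items.size() && i < tracking.IndexMask.size(); ++i) {
        if (tracking.IndexMask[i]) sent.push_back(items[i].ReplicationID);
    }
    return sent;
}

// IDs that would be sent if dirtiness is looked up by ReplicationID.
std::vector<int32_t> IdsSentByIdSet(const std::vector<Item>& items, const DirtyTracking& tracking) {
    std::vector<int32_t> sent;
    for (const Item& item : items) {
        if (tracking.DirtyIDs.count(item.ReplicationID) != 0) sent.push_back(item.ReplicationID);
    }
    return sent;
}

struct ReorderResult {
    std::vector<int32_t> ByIndex;
    std::vector<int32_t> ById;
};

// Mark the item with ID 1 dirty, then sort the array before "sending".
ReorderResult DirtyThenSort() {
    std::vector<Item> items = {{1, 30}, {2, 10}, {3, 20}};
    const DirtyTracking tracking = MarkItemDirty(items, 0); // ID 1 lives at index 0
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.SortKey < b.SortKey; });
    // After sorting, index 0 holds ID 2 and ID 1 has moved to the end.
    return {IdsSentByIndexMask(items, tracking), IdsSentByIdSet(items, tracking)};
}
```

In this model the index-based mask ends up sending the item with ID 2, which did not change, while the ID-based set still sends the item with ID 1. This is the property we rely on when sorting fast arrays locally.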
These are somewhat lesser-known/underutilised features of Fast Array, but for our purposes and the purposes of GAS it’s crucial that Iris matches the behaviour of the “standard” fast array replication, and it would be great to confirm whether Epic plans to 1:1 these features in Iris.
If you have any tips for how we might implement a custom serializer for the array in the meantime that would be helpful.
Yeah, we updated all of our FastArray types to inherit FIrisFastArraySerializer when switching to Iris, as we assumed that was required. Unfortunately, however, this doesn’t fix the issue: the Iris Net Serializer for FastArrays is still used, and it is still backed by the FArrayPropertyNetSerializer for the internal array property, so it fails to identify items correctly.