Hello.
We hit a crash that we are not sure how to reproduce or how to protect ourselves from.
The crash happened while unspawning an AI, which destroys its entity. Inside DestroyEntities, “IsProcessing” was true (the checkf failed), but by the time it breaks into the debugger, “ProcessingScopeCount” is 0, so IsProcessing should return false. MassAgentComponent checks “IsProcessing” and, if it is true, defers the job. It didn’t defer the job in this case, so IsProcessing must have been false at that point, yet the checkf still failed.
My only explanation is that some small task managed to run on another thread between these checks, meaning that this flow isn’t 100% thread-safe. Am I missing some piece of logic that would make this thread-safe? I couldn’t find any suspicious task in “parallel threads”, but ProcessingScopeCount is 0, so any such task must have completed by then.
My theory:
- MassAgentComponent checks IsProcessing, which returns false
- MassAgentComponent calls “DestroyEntities”
- Some other job starts processing, sets ProcessingScopeCount to 1
- MassSpawnerSubsystem::DestroyEntities checks IsProcessing, the checkf fails, and it starts breaking into the debugger
- Other job completes, sets ProcessingScopeCount to 0
- Break into debugger completes, ProcessingScopeCount is seen as 0
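To make the suspected interleaving concrete, here is a minimal single-threaded replay of the six steps in standalone C++. FakeEntityManager, Observation, and ReplayTheorizedInterleaving are hypothetical stand-ins, not Mass types, and the second thread is simulated by performing its increment and decrement in line. It only shows that the theory is consistent with what was observed; it does not prove that the engine behaves this way.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the manager's processing-scope counter.
// This is NOT the engine's type; it only mirrors the names from the post.
struct FakeEntityManager {
    std::atomic<int> ProcessingScopeCount{0};

    bool IsProcessing() const { return ProcessingScopeCount.load() > 0; }
};

// What each party observes during the replay.
struct Observation {
    bool CallerSawProcessing;  // step 1: check in the calling component
    bool CalleeSawProcessing;  // step 4: check inside DestroyEntities
    int CountSeenInDebugger;   // step 6: value inspected after the break
};

Observation ReplayTheorizedInterleaving() {
    FakeEntityManager Manager;
    Observation Result{};

    // Step 1: the caller checks and sees "not processing".
    Result.CallerSawProcessing = Manager.IsProcessing();
    assert(!Result.CallerSawProcessing);

    // Step 2: the caller therefore destroys immediately instead of deferring.

    // Step 3: another job (simulated here) opens a processing scope.
    Manager.ProcessingScopeCount.fetch_add(1);

    // Step 4: the callee re-checks; this is where the check would fail.
    Result.CalleeSawProcessing = Manager.IsProcessing();

    // Step 5: the other job finishes and closes its scope.
    Manager.ProcessingScopeCount.fetch_sub(1);

    // Step 6: the debugger finally shows the counter back at zero.
    Result.CountSeenInDebugger = Manager.ProcessingScopeCount.load();
    return Result;
}
```

The point of the sketch is that the two checks are separate reads of shared state, so nothing ties the result of the first to the result of the second unless the whole check-and-destroy sequence is made atomic or the destroy is always deferred.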
We have been investigating some crashes that appear to be from multithreading issues in Mass, and this looks like it may be another one to address. I have a few questions that may help us narrow down the code path that could cause this. Did the crash happen while still playing or was it part of a world teardown such as ending PIE? Is this the first/only time it has happened that you are aware of? If not, how often does this crop up?
-James
It was while playing. Cooked build.
It’s the first instance that I’m aware of. I couldn’t find anything in our crash database. We upgraded from 5.5.4 to 5.6.1 a few weeks ago. Based on this, I would say that it’s a rare crash.
Thank you for the extra information! I simply wanted to see how often it could be happening for our own testing purposes. That does indeed seem quite rare.
The team is currently investigating the issue on a couple of fronts to find the culprit. I have created a bug report for the issue as we look into this. Here is a link to the report on our public issue tracker: https://issues.unrealengine.com/issue/UE-318908. It may take a couple of days to update and mirror correctly.
One way to avoid this is to always use the deferred commands to destroy the entities. This currently requires diverging from the engine code, but it may be a way to handle this, at least temporarily.
-James
Thank you! If we see it again, I will remove the “if not processing” part and always defer.
That is a workaround that should be perfectly safe. If you do encounter it again, please let us know. I would gladly take any additional info you may be able to capture.
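For readers who want to see the shape of the "always defer" workaround, here is a minimal standalone C++ sketch. It does not use the Mass API: DeferredCommandQueue, FakeWorld, RequestDestroy, and Flush are hypothetical names, and the assumption is that the queue is flushed at a single known-safe point where no processing is in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <unordered_set>
#include <utility>
#include <vector>

using EntityId = std::uint32_t;

// Hypothetical command queue: callers only enqueue work, and one owner
// runs it later at a point where it is known to be safe.
class DeferredCommandQueue {
public:
    void Push(std::function<void()> Command) {
        assert(Command && "empty command");
        std::lock_guard<std::mutex> Lock(Mutex);
        Pending.push_back(std::move(Command));
    }

    // Runs every queued command and returns how many were executed.
    std::size_t Flush() {
        std::vector<std::function<void()>> ToRun;
        {
            // Swap under the lock so commands run without holding it.
            std::lock_guard<std::mutex> Lock(Mutex);
            ToRun.swap(Pending);
        }
        for (auto& Command : ToRun) {
            Command();
        }
        return ToRun.size();
    }

private:
    std::mutex Mutex;
    std::vector<std::function<void()>> Pending;
};

// Hypothetical world that owns the entities and the queue.
struct FakeWorld {
    std::unordered_set<EntityId> Entities;
    DeferredCommandQueue Commands;

    // No "is processing?" branch: destruction is always deferred, so there
    // is no window between a check and the destroy call.
    void RequestDestroy(EntityId Id) {
        Commands.Push([this, Id] { Entities.erase(Id); });
    }
};
```

With this shape, a call such as `World.RequestDestroy(2)` leaves the entity in place until the owner calls `World.Commands.Flush()`. The trade-off is that destruction is never immediate, so code that runs before the flush must tolerate the entity still existing.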