## Description
We found an intermittent deadlock in `ParallelQueue.Drain()` (`Engine/Source/Programs/Shared/EpicGames.Build/System/Parallel2.cs`) that causes UnrealBuildTool to hang forever on macOS. The deadlock occurs during `Rules.PrefetchRulesFilesInternal()` when discovering `.Build.cs` and `.Target.cs` rule files.
The bug is a race condition in the producer/consumer coordination between worker threads and `Enqueue()` callbacks. When an `action()` callback (e.g., `FindAllRulesFilesRecursively`) calls `Enqueue()` to add subdirectories, there is a window in which the following interleaving can occur:
1. Worker A finishes its action and calls `Interlocked.Decrement(ref _outstanding)`, which returns 0.
2. Worker B (on another thread, inside its own `action()`) calls `Enqueue()`, which calls `Interlocked.Increment(ref _outstanding)` and brings the count back to 1.
3. Worker A sets `_accepting = 0` (line 305).
4. Worker A re-reads `_outstanding`, sees 1, and therefore does not set `_done` or release the other workers.
5. Worker B’s enqueued item is processed and `_outstanding` returns to 0.
6. By this point `_accepting` is 0 and the done-signal path has already been skipped.
7. All workers block on `_available.Wait()` forever, and the build deadlocks.
The core issue is that the `Decrement` returning 0, the `_accepting = 0` assignment, and the `_outstanding == 0` re-check (lines 300-310) are not atomic with respect to concurrent `Enqueue()` calls from action callbacks.
## Workaround
Calling `queue.Drain(helperCount: 0)` in `Rules.PrefetchRulesFilesInternal()` forces single-threaded execution, eliminating the concurrent race. File enumeration is I/O-bound, so the performance impact is minimal.
## Proposed Fix
Closing the race in `Drain()` (lines 300-310) requires the `_outstanding` decrement and the done-check to be atomic with respect to `Enqueue()`. Options:
- Lock-based: wrap both `Enqueue`'s increment and the worker’s decrement+done-check in the same lock
- Restructure: use `CountdownEvent` or a different signaling mechanism that handles concurrent `AddCount`/`Signal` atomically
We attempted several lock-free and lock-based fixes, but the race has multiple interleaving paths that are difficult to close without restructuring the `Drain()` method. We’d appreciate Epic’s take on the correct fix for `Parallel2.cs`.
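To illustrate the lock-based option, here is a minimal C# sketch of a work queue in which the outstanding count, the pending items, and the done flag are all guarded by a single lock, so the decrement and the done-check cannot be interleaved with an `Enqueue()`. `LockedWorkQueue` and `LockedWorkQueueDemo` are hypothetical names invented for this example. This is not the code in `Parallel2.cs`, it has not been tested inside UnrealBuildTool, and it omits handling for actions that throw.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for ParallelQueue; NOT the implementation in Parallel2.cs.
public sealed class LockedWorkQueue
{
    private readonly object _gate = new object();
    private readonly Queue<Action> _items = new Queue<Action>();
    private int _outstanding; // queued + running items, guarded by _gate
    private bool _done;       // set once _outstanding reaches 0, guarded by _gate

    // Safe to call from inside a running action: that action is still counted
    // in _outstanding, so the count cannot reach 0 before this increment.
    public void Enqueue(Action action)
    {
        lock (_gate)
        {
            if (_done) throw new InvalidOperationException("Queue already drained.");
            _outstanding++;
            _items.Enqueue(action);
            Monitor.Pulse(_gate); // wake one waiting worker
        }
    }

    // Runs queued work on the calling thread plus helperCount helper threads.
    public void Drain(int helperCount)
    {
        var helpers = new Thread[helperCount];
        for (int i = 0; i < helperCount; i++)
        {
            helpers[i] = new Thread(WorkerLoop);
            helpers[i].Start();
        }
        WorkerLoop();
        foreach (Thread helper in helpers) helper.Join();
    }

    private void WorkerLoop()
    {
        while (true)
        {
            Action action;
            lock (_gate)
            {
                // Covers Drain() being called on an empty queue.
                if (_outstanding == 0) { _done = true; Monitor.PulseAll(_gate); }
                while (_items.Count == 0 && !_done) Monitor.Wait(_gate);
                if (_items.Count == 0) return; // _done is set and nothing is left
                action = _items.Dequeue();
            }
            try { action(); }
            finally
            {
                lock (_gate)
                {
                    // Decrement and done-check happen under the same lock as Enqueue.
                    if (--_outstanding == 0) { _done = true; Monitor.PulseAll(_gate); }
                }
            }
        }
    }
}

public static class LockedWorkQueueDemo
{
    // Each item enqueues two children until maxDepth, mimicking a recursive directory walk.
    public static int CountNodes(int maxDepth, int helperCount)
    {
        var queue = new LockedWorkQueue();
        int visited = 0;
        void Visit(int depth)
        {
            Interlocked.Increment(ref visited);
            if (depth < maxDepth)
            {
                queue.Enqueue(() => Visit(depth + 1));
                queue.Enqueue(() => Visit(depth + 1));
            }
        }
        queue.Enqueue(() => Visit(0));
        queue.Drain(helperCount);
        return visited;
    }
}
```

In this sketch, `LockedWorkQueueDemo.CountNodes(10, 4)` walks a binary tree of depth 10 with four helper threads and should return once all 2047 nodes have been visited. The trade-off is that every enqueue and every completion takes the lock. That is likely acceptable for I/O-bound file enumeration, but whether the same structure fits the real `Drain()` is an open question for Epic.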
## Environment
- macOS 15.6 (Sequoia), ARM64 (Mac M-series EC2 instances)
- .NET 8.0.412
- UE 5.7
- FASTBuild executor with ParallelExecutor for local actions
- Xcode 16.4 + Xcode 26 beta installed (Xcode 26 install increased repro rate, likely due to changed file enumeration timing from additional SDK directories)
## Related
- EPS 0D5QP00001XJH7Y0AX (ImmediateActionQueue race — different bug, same codebase area)
- Epic PR #13917 (ImmediateActionQueue fix — does not help this bug)