Since moving to 5.7, our Staging step intermittently fails with this error:
Reading oplog from Zen...
Failed sending oplog request to Zen at [::1]:8558 for oplog ABC.1a23b45c.Windows: Error while copying content to a stream..
(see D:\Horde\ABC\Sync\Unreal\Engine\Programs\AutomationTool\Saved\Logs\BuildCookRun\Log.txt for full exception trace)
AutomationException: Failed sending oplog request to Zen at [::1]:8558 for oplog ABC.1a23b45c.Windows: Error while copying content to a stream..
at AutomationScripts.Project.ReadZenCookedFilesFromZenServer(ProjectParams Params, DeploymentContext SC, Boolean bAutoLaunch, String PackageStoreFileArgName, String PackageStoreFileArgValue) in D:\Horde\ABC\Sync\Unreal\Engine\Source\Programs\AutomationTool\Scripts\CopyBuildToStagingDirectory.Automation.cs:line 1001
...
This occurs roughly once every 10 runs, on different machines and target platforms.
Our Staging step is part of a single BuildCookRun command (`-cook -stage -package`), which looks like this:
<Property Name="PROJECT" Value="ABC"/>
<Property Name="Platform" Value="Win64"/>
<Property Name="Config" Value="Test"/>
<Property Name="ArchiveDirectory" Value="$(RootDir)\$(PROJECT)\Publish\Package"/>
<!-- Configuration: Root=(Type=Hierarchical, Inner=<Local>, Inner=<Remote>, Inner=<Cloud>) -->
<Property Name="DDCGraph" Value="DDCGraph_Horde"/>
<Command Name="BuildCookRun" Arguments="-project=$(PROJECT) -target=$(PROJECT) -platform=$(Platform) -configuration=$(Config) -archivedirectory=$(ArchiveDirectory) -NoCodeSign -skipbuild -cook -CookIncremental -ddc=$(DDCGraph) -stage -pak -iostore -package -archive -crashreporter -prereqs"/>
Debugging showed 3 scenarios for the "Reading oplog from Zen..." part of the Staging step, in `ReadZenCookedFilesFromZenServer()`:
- `IsZenServerRunning(SocketHostNameAndPort)` = true, and the Zen GET request succeeds.
- `IsZenServerRunning(SocketHostNameAndPort)` fails the GET `health/ready` with "No connection could be made because the target machine actively refused it.", which results in a call to `RunUnrealPak(…, "ZenAutoLaunch")`, and staging then continues successfully. This is the most common scenario for successful runs.
- `IsZenServerRunning(SocketHostNameAndPort)` = true, and the Zen GET request fails with "Error while copying content to a stream..".
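For anyone who wants to reproduce the three scenarios above outside of AutomationTool, here is a minimal Python sketch of a readiness probe that separates "connection refused" from other failures. It is not what `IsZenServerRunning` does internally. The `health/ready` path and the `[::1]:8558` address come from the log and debugging notes above, while the function names `probe_ready` and `sample_readiness`, the `http://` URL shape, the assumption that a ready server answers with HTTP 200, and the timeout values are all my own.

```python
import time
import urllib.error
import urllib.request

# Hypothetical helper names; only the health/ready path comes from the post above.
# The empty ProxyHandler makes sure a system proxy is never used for a local server.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_ready(base_url: str, timeout: float = 2.0) -> str:
    """GET <base_url>/health/ready and return 'ready', 'refused', or 'error:<detail>'."""
    url = base_url.rstrip("/") + "/health/ready"
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            # Assumption: a ready server answers 200; anything else is reported as an error.
            return "ready" if resp.status == 200 else f"error:HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return f"error:HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connection-level failure; 'actively refused' shows up as ConnectionRefusedError.
        if isinstance(exc.reason, ConnectionRefusedError):
            return "refused"
        return f"error:{exc.reason}"
    except OSError as exc:
        # Timeouts or dropped connections while reading the response.
        return f"error:{exc}"


def sample_readiness(base_url: str, samples: int = 5, interval: float = 1.0) -> list:
    """Probe repeatedly so readiness can be tracked between the Cook and Stage phases."""
    results = []
    for i in range(samples):
        results.append(probe_ready(base_url))
        if i + 1 < samples:
            time.sleep(interval)
    return results
```

Calling `sample_readiness("http://[::1]:8558")` on the build agent right after the cook finishes would show whether the server is going away (a run of `refused`) or still answering when the oplog request fails.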
The Zen logs for all three scenarios start during the Cook step and look similar going into the Staging step, regardless of outcome.
The temporary fix that works for us so far is to split the Cook out into its own Node/step `<Cook …/>`.
Is this a bug or a misconfiguration on our part? I initially split out the steps thinking Zen might need some time after cooking, but the most common scenario is still the second one above, where the health GET fails.