Pixel Streaming connection freezes indefinitely for client

Hello,

We have been having issues for a few months where a client's stream freezes and does not recover. This is currently only reproducible with some browsers, like Vivaldi. With Edge it does not happen, and with Chrome it recovers after up to 10 seconds.

At the same time, the output window on the EC2 instance is completely responsive and works fine. We are using our own Signalling / Cirrus server and a TURN server from AWS.

I attached the WebRTC internals dump and the Unreal log with verbose logging for Pixel Streaming. The issue occurred at 16:34 (17:34 in the WebRTC stats).

We are quite lost as to what we could do about this issue.

Thanks a lot!

Best

Dominic

[Attachment Removed]

Hi there Dominic!

We’ve had a look through the provided logs and we can’t see anything that should be causing unrecoverable stream freezes. There is a lot of re-transmitting however, which could be indicative of network issues or other flakiness.

From what we can tell we think it’s likely a browser issue. We haven’t really tested much outside of the major browsers (Opera, Safari, Chrome, Firefox etc), so we can’t guarantee success on browsers like Vivaldi. See if you can reproduce the issue on one of the above mentioned browsers.

Another option that may potentially help is enabling Flexfec. Flexfec will attempt to resolve network instability with WebRTC streaming packets without relying on re-transmission.

To enable Flexfec, run your Pixel Streaming application with:

-PixelStreamingWebRTCEnableFlexFec

Let us know if this helps resolve the issue!
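
For reference, here is a minimal Python sketch of launching the application with that flag appended. Only `-PixelStreamingWebRTCEnableFlexFec` comes from this reply; the executable name `MyApp.exe`, the extra argument, and the helper `build_launch_args` are hypothetical placeholders for your own launch setup.

```python
import shlex

# Flag quoted in the reply above.
FLEXFEC_FLAG = "-PixelStreamingWebRTCEnableFlexFec"


def build_launch_args(executable, extra_args=(), enable_flexfec=True):
    """Return the argument list for starting the app, optionally with FlexFEC."""
    args = [executable, *extra_args]
    if enable_flexfec and FLEXFEC_FLAG not in args:
        args.append(FLEXFEC_FLAG)
    return args


# "MyApp.exe" and "-RenderOffscreen" are placeholders, not taken from the thread.
launch_args = build_launch_args("MyApp.exe", ["-RenderOffscreen"])
print(shlex.join(launch_args))
# To actually start the process, pass launch_args to subprocess.Popen.
```

Keeping the flag behind a boolean makes it easy to switch FlexFEC off again per deployment if it turns out not to help.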

Kind Regards,

Michael

[Attachment Removed]

Today I tested with UE5.7 and Pixelstreaming2.

First of all a big shout out to the Pixelstreaming2 Developers, it looks like a huge jump forward!

With UE5.7 + Pixelstreaming2:

  1. I could not reproduce the Freeze. (for sure because of point 2)
  2. Now every user gets a separate encoded stream. Even one with AV1 and one with H264 is possible. This brings me to the question of whether an SFU will still be needed in the future.
  3. It felt much snappier!
  4. Sadly, I get crashes when 2 streams are connected. See the log and the call stack below, which is missing from the log but was captured by our Sentry system:
EXCEPTION_STACK_BUFFER_OVERRUN / FAST_FAIL_INVALID_ARG / 0x7fff6be1d784
Fatal Error: EXCEPTION_STACK_BUFFER_OVERRUN / FAST_FAIL_INVALID_ARG / 0x7fff6be1d784
 
Thread 3852 Crashed:
0   nvEncodeAPI64.dll               0x7fff6be1d784      <unknown>
1   nvEncodeAPI64.dll               0x7fff6be1d71b      <unknown>
2   D3D12Core.dll                   0x7fff74e661cb      D3D12GetInterface
3   D3D12Core.dll                   0x7fff74f3ebf5      D3D12GetInterface
4   nvEncodeAPI64.dll               0x7fff6bdb155f      <unknown>
5   nvEncodeAPI64.dll               0x7fff6be4c6c0      <unknown>
6   nvEncodeAPI64.dll               0x7fff6bdad2dd      <unknown>
7   nvEncodeAPI64.dll               0x7fff6bdb20d1      <unknown>
8   nvEncodeAPI64.dll               0x7fff6be4c608      <unknown>
9   nvEncodeAPI64.dll               0x7fff6bdc5585      <unknown>
10  nvEncodeAPI64.dll               0x7fff6bdeb1e9      <unknown>
11  AVE_Application.exe             0x7ff6b20e710c      FEncoderNVENC::SendFrame
12  AVE_Application.exe             0x7ff6b20d031a      FVideoEncoderNVENCD3D12::SendFrame
13  AVE_Application.exe             0x7ff6b20cb148      TVideoEncoder<T>::TWrapper<T>::SendFrame
14  AVE_Application.exe             0x7ff6b1f09c48      TVideoEncoder<T>::TWrapper<T>::SendFrame
15  AVE_Application.exe             0x7ff6b1f1770f      TVideoEncoderRHI<T>::SendFrame
16  AVE_Application.exe             0x7ff6b1ff3355      UE::PixelStreaming2::TEpicRtcVideoEncoder<T>::Encode
17  AVE_Application.exe             0x7ff6a7bf2152      EpicRtc::Video::EncoderWebRtc::Encode
18  AVE_Application.exe             0x7ff6a7f28e83      webrtc::VideoStreamEncoder::EncodeVideoFrame
19  AVE_Application.exe             0x7ff6a7f280b2      webrtc::VideoStreamEncoder::MaybeEncodeVideoFrame
20  AVE_Application.exe             0x7ff6a7f27404      webrtc::VideoStreamEncoder::OnFrame
21  AVE_Application.exe             0x7ff6a7f200a1      std::deque<T>::_Xlen
22  AVE_Application.exe             0x7ff6a7d6aa68      webrtc::CreateTaskQueueWinFactory
23  AVE_Application.exe             0x7ff6a7d69cda      std::make_unique<T>
24  AVE_Application.exe             0x7ff6a7d69b16      std::make_unique<T>
25  KERNEL32.DLL                    0x7fff97c34caf      BaseThreadInitThunk
26  ntdll.dll                       0x7fff992bedca      RtlUserThreadStart

[Attachment Removed]

Hi there Dominic, apologies for the delay and thank you for your patience! You’ve sent a ton of great information through.

The crash is particularly confusing. As every user gets their own stream, we’d recommend setting up an SFU in the multi-user case, so that application performance isn’t degraded when multiple people connect simultaneously.

For the Pixel Streaming 1 deployment, it would be worth modifying your frontend to not request quality controller for a new player on connection. The idea being that the first peer connected should keep receiving their initial quality. We’re not 100 percent sure of the effects on the second peer, but they should receive a stable stream just fine.

For Pixel Streaming 2, that crash is new, we’ve not seen it before. Judging from the log it appears the first peer disconnects as the second peer connects. It may be a race condition and worth further investigation on our end.

Let me know if any of the above suggestions help out and we can keep investigating

Kind Regards,

Michael

[Attachment Removed]

Hi Dominic,

I’ve discussed this issue with the team at length, and unfortunately we can’t see too much of a solution for your issues with Pixel Streaming 1.

With PS1, we don’t have access to metrics such as available bandwidth per player. This means it isn’t feasible to implement dynamic allocation of quality controller per player.

Making the player with lower bandwidth drop packets more aggressively is something that will need to be implemented on the peer side (browser). So unless you are willing to make modifications to browser source code, this is also not feasible.

The best approach would honestly be to migrate to Pixel Streaming 2, as it provides vastly more metrics data, and then figuring out the crash from there.

If suitable, it’s also worth using one of the more mainstream browsers that Pixel Streaming is tested on, namely Chromium-based browsers, Firefox, and Safari.

Apologies that we don’t have an immediate fix available, please let me know if you had any further questions.

Kind Regards

Michael

[Attachment Removed]

Hello,

I want to give a short summary of the current state, until we can switch to Pixelstreaming2.

What seems to help is setting PixelStreaming.WebRTC.MaxBitrate to a lower value such as 20 Mbit/s, which makes a huge difference to the gap between the two streams' available and used bandwidth.

Regarding the FlexFec option, I added a log because it is causing a crash in some setups with 2 streams. It seems to have something to do with enterprise networks and therefore with using a TURN server.
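
As a small sketch of that setting: the helper below builds the console command for a given cap. It assumes the CVar takes its value in bits per second, which should be checked against the Pixel Streaming reference for your engine version; the helper names are made up for illustration.

```python
def mbit_to_bps(mbit_per_s):
    """Convert Mbit/s to bit/s (decimal megabits)."""
    return int(mbit_per_s * 1_000_000)


def max_bitrate_command(mbit_per_s):
    """Build the console command string for the MaxBitrate CVar.

    Assumption: the CVar value is expressed in bits per second.
    """
    return f"PixelStreaming.WebRTC.MaxBitrate {mbit_to_bps(mbit_per_s)}"


command = max_bitrate_command(20)
print(command)
```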

We disabled it for now.

Best

Dominic

[Attachment Removed]

Thank you for the update Dominic!

I’m glad the reduced MaxBitrate has improved the performance. Has it helped mitigate the freezing overall, or is it just a performance improvement?

Hmm, we haven’t done a lot of testing of FlexFec in enterprise networks, but a strict network can introduce some issues.

Let us know if/when you can update to PS2 and if it resolves the issue.

Thank you and kind regards

Michael

[Attachment Removed]

Hi Dominic,

I’ve recently done a pass over Pixel Streaming 2, and I’ve been unable to reproduce this bug at all.

Has the issue persisted? If it’s seemingly resolved we can close this ticket.

Thank you and kind regards

Michael

[Attachment Removed]

Great news Dominic!

I’ll close this ticket here, but please reach out or open another if you find any further problems.

Kind Regards

Michael

[Attachment Removed]

Hello [mention removed]​,

Thanks for the feedback. Technically I am not able to set this start parameter / CVar yet; we are currently enabling key input in our streaming solution so that I can use the UE console in our dev environments to test this. But I also have the feeling this is a browser-specific issue.

But I have finally reproduced the “original” Freeze, which was reported by our customers:

Setup:

  • UE 5.5 project is running on an EC2 instance
  • The Unreal signalling server with TURN is used
  • We need up to 80 Mbit/s for the stream

Reproducer:

  1. First user connects with Edge
  2. Second user connects via a Chromebox and gets quality control
  3. First user has a bandwidth drop to 10 Mbit/s (done with NetLimiter)
  4. Stream freezes for the first user
  5. Bandwidth is set back to normal, but the stream recovers only after a very long time, from 30 seconds up to 3 minutes

I also set EnableFlexFec to true, which did not help. The freeze is longer with higher resolutions, and therefore more data, I guess.

I am also aware that these scenarios will always be problematic if the non-quality-controller has less bandwidth, but the question is whether we can do anything to let the stream recover faster.

I attached a log, please reach out if I can gather any further information. In the log the Freeze can be seen between 14:44 -> 14:47
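
A freeze window like this can also be located in receiver-side stats by looking for periods where the decoded frame counter stops increasing. The sketch below assumes the samples were already extracted as `(timestamp_s, frames_decoded)` pairs, for example from the `framesDecoded` counter of the inbound video stats. The extraction itself, the function name, and the 2-second threshold are illustrative assumptions.

```python
def find_freezes(samples, min_duration_s=2.0):
    """Return (start_s, end_s) intervals where frames_decoded did not increase.

    samples: list of (timestamp_s, frames_decoded), sorted by timestamp.
    A stall starts at the last sample that still showed progress and ends at
    the first sample where the counter increases again.
    """
    if not samples:
        return []
    freezes = []
    stall_start = None
    prev_t, prev_frames = samples[0]
    for t, frames in samples[1:]:
        if frames > prev_frames:
            # Decoding resumed: close any open stall that lasted long enough.
            if stall_start is not None and t - stall_start >= min_duration_s:
                freezes.append((stall_start, t))
            stall_start = None
        elif stall_start is None:
            stall_start = prev_t
        prev_t, prev_frames = t, frames
    # A stall that is still open at the end of the data.
    if stall_start is not None and prev_t - stall_start >= min_duration_s:
        freezes.append((stall_start, prev_t))
    return freezes


# Made-up samples: the counter is stuck at 60 between t=2 and t=6.
samples = [(0, 0), (1, 30), (2, 60), (3, 60), (4, 60), (5, 60), (6, 90), (7, 120)]
freezes = find_freezes(samples)
print(freezes)
```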

I will test different setups / Browsers, and also try to test with UE5.7 and Pixelstreaming2.

Best

Dominic

[Attachment Removed]

I guess this issue happens when the player who is not the quality owner has less bandwidth than the quality owner, and not enough for the data being streamed.

Now the question is whether we can do something about it. Upgrading our live systems to UE5.7/5.8 is 6 months away for us.

Is there a way on the client side to drop data / frames more aggressively? Or can we change something about the quality ownership behavior?

Any ideas would be greatly appreciated.

Best

Dominic

[Attachment Removed]

Hi [mention removed]​,

Thanks for the feedback.

Regarding the Pixelstreaming2 Crash:

  • I used the Pixelstreaming2 Signalling server in the samples folder
  • Both streams (one with Chrome, one with Vivaldi) were connected and I just interacted with the stream by moving the camera. After a while it crashes every time. I can check whether it also happens with other browsers.

Regarding the original problem / the freezing:

I think even if we leave the quality ownership with the first player, it can fail in the same way if the second player has too little bandwidth.

Either we would need a combined quality ownership, where we always use the lowest bandwidth values reported by both players, or the player with the lower bandwidth needs to drop data more aggressively to be able to catch up.
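
The "combined ownership" idea can be sketched as taking the minimum over all reported bandwidths and clamping it to sensible limits. This is purely illustrative: the per-player bandwidth reports are assumed to come from some measurement that Pixel Streaming 1 may not expose, and `combined_bitrate_cap`, the player names, and the floor value are hypothetical.

```python
def combined_bitrate_cap(reported_bps, ceiling_bps, floor_bps=1_000_000):
    """Pick one encoder bitrate that every connected player can receive.

    reported_bps: dict of player id -> estimated available bandwidth in bit/s
                  (hypothetical input, not something PS1 is known to provide).
    ceiling_bps:  upper limit, e.g. the configured maximum bitrate.
    floor_bps:    lower limit so a single bad report cannot drive quality to zero.
    """
    if not reported_bps:
        return ceiling_bps
    lowest = min(reported_bps.values())
    return max(floor_bps, min(lowest, ceiling_bps))


# Values taken from the reproducer: one player limited to 10 Mbit/s, one at 80 Mbit/s.
reports = {"edge-player": 10_000_000, "chromebox-player": 80_000_000}
cap = combined_bitrate_cap(reports, ceiling_bps=80_000_000)
print(cap)
```

The trade-off is that the player with the better connection is held back to the quality of the weakest one.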

But that’s just guesswork on my part; I do not completely understand the underlying issue …

Another idea I have is to dynamically switch the quality ownership to the player with less bandwidth. Do you think that can work? Or will there be issues if we switch it several times in quick succession?
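
One way to limit rapid back-and-forth switching would be a minimum hold time plus a margin, so ownership only moves when another player is clearly worse off. The sketch below shows only that selection logic under those assumptions; how ownership is actually reassigned in Pixel Streaming is not covered, and the class name, the bandwidth inputs, and the thresholds are hypothetical.

```python
class QualityOwnerSelector:
    """Choose the lowest-bandwidth player as quality owner, with hysteresis."""

    def __init__(self, min_hold_s=10.0, margin=0.2):
        self.min_hold_s = min_hold_s  # minimum time between two switches
        self.margin = margin          # candidate must be this fraction lower
        self.owner = None
        self.last_switch_s = None

    def update(self, now_s, bandwidth_bps):
        """Return the player that should own quality control at time now_s."""
        if not bandwidth_bps:
            self.owner = None
            return None
        candidate = min(bandwidth_bps, key=bandwidth_bps.get)
        if self.owner not in bandwidth_bps:
            # No owner yet, or the owner disconnected: switch immediately.
            self._switch(candidate, now_s)
        elif candidate != self.owner:
            held_long_enough = now_s - self.last_switch_s >= self.min_hold_s
            threshold = bandwidth_bps[self.owner] * (1 - self.margin)
            if held_long_enough and bandwidth_bps[candidate] < threshold:
                self._switch(candidate, now_s)
        return self.owner

    def _switch(self, player, now_s):
        self.owner = player
        self.last_switch_s = now_s


selector = QualityOwnerSelector(min_hold_s=10.0, margin=0.2)
owners = [
    selector.update(0.0, {"A": 80_000_000, "B": 60_000_000}),   # B is lowest
    selector.update(2.0, {"A": 10_000_000, "B": 60_000_000}),   # too soon to switch
    selector.update(12.0, {"A": 10_000_000, "B": 60_000_000}),  # switch to A
    selector.update(25.0, {"A": 60_000_000, "B": 58_000_000}),  # B not clearly lower
]
print(owners)
```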

Best

Dominic

[Attachment Removed]

Hey,

I am not 100% sure that there are no freezes anymore, but according to my tests it is barely reproducible, and if a freeze happens it recovers in 1-3 seconds.

Best

Dominic

[Attachment Removed]

Hi Michael,

We have not switched to UE5.7, but we have started migrating to UE5.8 and will therefore do a first testing iteration in the next few weeks. We will try to reproduce this with UE5.8 and Pixelstreaming2 and come back to you!

Thanks.

Best

Dominic

[Attachment Removed]

Hi Michael,

I am no longer able to reproduce the issue either, with UE5.8 or with UE5.7, so everything seems to be fine now.

Thanks.

Best

Dominic

[Attachment Removed]