We have been having issues for a few months where a client stream freezes and does not recover. This is currently only reproducible with some browsers, such as Vivaldi. With Edge it does not happen, and with Chrome the stream recovers after up to 10 seconds.
At the same time, the output window on the EC2 instance is completely responsive and works fine. We are using our own Signalling / Cirrus and a TURN server from AWS.
I attached the WebRTC internals dump and the Unreal log with verbose logging for Pixel Streaming. The issue occurred at 16:34 (17:34 in the WebRTC stats).
We are quite lost as to what we could do about this issue.
We’ve had a look through the provided logs and we can’t see anything that should be causing unrecoverable stream freezes. There is a lot of re-transmitting however, which could be indicative of network issues or other flakiness.
From what we can tell we think it’s likely a browser issue. We haven’t really tested much outside of the major browsers (Opera, Safari, Chrome, Firefox etc), so we can’t guarantee success on browsers like Vivaldi. See if you can reproduce the issue on one of the above mentioned browsers.
Another option that may potentially help is enabling Flexfec. Flexfec will attempt to resolve network instability with WebRTC streaming packets without relying on re-transmission.
To enable Flexfec, run your Pixel Streaming application with:
-PixelStreamingWebRTCEnableFlexFec
Let us know if this helps resolve the issue!
First of all a big shout out to the Pixelstreaming2 Developers, it looks like a huge jump forward!
With UE5.7 + Pixelstreaming2:
1. I could not reproduce the freeze (most likely because of point 2).
2. Every user now gets a separately encoded stream. Even one with AV1 and one with H264 is possible. This brings me to the question of whether an SFU will still be needed in the future.
3. It felt much snappier!
4. Sadly, I get crashes when two streams are connected. See the log and the call stack, which is missing from the log but is captured in our Sentry system.
Hi there Dominic, apologies for the delay and thank you for your patience! You’ve sent a ton of great information through.
The crash is particularly confusing. As every user gets their own stream, we’d recommend setting up an SFU in the multi-user case, so that application performance isn’t degraded when multiple people connect simultaneously.
For the Pixel Streaming 1 deployment, it would be worth modifying your frontend to not request quality controller for a new player on connection. The idea being that the first peer connected should keep receiving their initial quality. We’re not 100 percent sure of the effects on the second peer, but they should receive a stable stream just fine.
For Pixel Streaming 2, that crash is new, we’ve not seen it before. Judging from the log it appears the first peer disconnects as the second peer connects. It may be a race condition and worth further investigation on our end.
Let me know if any of the above suggestions help out and we can keep investigating
I’ve discussed this issue with the team at length, and unfortunately we can’t see too much of a solution for your issues with Pixel Streaming 1.
With PS1, we don’t have access to metrics such as available bandwidth per player. This means it isn’t feasible to implement dynamic allocation of quality controller per player.
Making the player with lower bandwidth drop packets more aggressively is something that will need to be implemented on the peer side (browser). So unless you are willing to make modifications to browser source code, this is also not feasible.
The best approach would honestly be to migrate to Pixel Streaming 2, as it provides vastly more metrics data, and then figuring out the crash from there.
If suitable, it’s also worth using one of the more mainstream browsers that Pixel Streaming is tested on, namely Chromium browsers, Firefox, and Safari.
Apologies that we don’t have an immediate fix available, please let me know if you had any further questions.
I want to give a short summary of the current state until we can switch to Pixelstreaming2.
What seems to help is setting PixelStreaming.WebRTC.MaxBitrate to a lower value such as 20 Mbit/s, which makes a huge difference to the available and used bandwidth of the two streams.
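As a small sketch of the unit conversion involved: the helper below turns a value in Mbit/s into the console command text for the CVar mentioned above. It assumes the CVar expects bits per second, which should be checked against the engine version in use. The function names are made up for illustration.

```typescript
// Assumption: PixelStreaming.WebRTC.MaxBitrate is given in bits per second.
const BITS_PER_MEGABIT = 1000000;

// Converts Mbit/s to bit/s, rejecting values that make no sense as a bitrate.
function mbitToBps(mbit: number): number {
  if (!Number.isFinite(mbit) || mbit <= 0) {
    throw new RangeError("bitrate must be a positive, finite number");
  }
  return Math.round(mbit * BITS_PER_MEGABIT);
}

// Builds the console command text for the CVar named in this thread.
function maxBitrateCommand(mbit: number): string {
  return `PixelStreaming.WebRTC.MaxBitrate ${mbitToBps(mbit)}`;
}
```

For example, maxBitrateCommand(20) produces the text "PixelStreaming.WebRTC.MaxBitrate 20000000", which can be typed into the UE console.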
Regarding the FlexFec option, I added a log because it is causing a crash in some setups with 2 streams. It seems to have something to do with enterprise networks and therefore with using a TURN server.
Thanks for the feedback. Technically I am not able to set this start parameter / CVar; we currently enable key input in our streaming solution so that I can use the UE console in our dev environments to test this. But I also have the feeling this is a browser-specific issue.
But I have finally reproduced the “original” Freeze, which was reported by our customers:
Setup:
The UE 5.5 project is running on an EC2 instance
The Unreal signalling server with TURN is used
We need up to 80 Mbit/s for the stream
Reproducer:
First user connects with Edge
Second user connects via a Chromebox and gets quality control
First user has a bandwidth drop to 10 Mbit/s (done with NetLimiter)
Stream freezes for the first user
Bandwidth is set back to normal, but the stream recovers only after a very long time, 30 seconds up to 3 minutes
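To put numbers on the freeze and recovery times in a reproducer like this, one option is to sample the browser's WebRTC stats and look for periods where the decoded frame count stops advancing. The following is a minimal sketch, not part of Pixel Streaming itself. The VideoSample shape and the findFreezes helper are hypothetical names, and the reported end time is a lower bound, since recovery happens somewhere between two samples.

```typescript
// One sample taken from an "inbound-rtp" video stats report.
interface VideoSample {
  timestampMs: number;   // time the sample was taken, in milliseconds
  framesDecoded: number; // cumulative number of frames decoded so far
}

interface FreezeInterval {
  startMs: number; // last sample before the frame count stopped advancing
  endMs: number;   // last sample at which the stream was still stalled
}

// Returns every interval of at least minFreezeMs in which framesDecoded did not advance.
function findFreezes(samples: VideoSample[], minFreezeMs: number): FreezeInterval[] {
  const freezes: FreezeInterval[] = [];
  let stallStart: number | null = null;

  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1];
    const cur = samples[i];
    const advanced = cur.framesDecoded > prev.framesDecoded;

    if (!advanced && stallStart === null) {
      stallStart = prev.timestampMs; // stall begins at the last good sample
    } else if (advanced && stallStart !== null) {
      if (prev.timestampMs - stallStart >= minFreezeMs) {
        freezes.push({ startMs: stallStart, endMs: prev.timestampMs });
      }
      stallStart = null;
    }
  }

  // Report a stall that is still ongoing at the final sample.
  if (stallStart !== null && samples.length > 0) {
    const last = samples[samples.length - 1];
    if (last.timestampMs - stallStart >= minFreezeMs) {
      freezes.push({ startMs: stallStart, endMs: last.timestampMs });
    }
  }
  return freezes;
}
```

In the browser, the samples could be collected by calling getStats() on the RTCPeerConnection about once per second and, for each report with type "inbound-rtp" and kind "video", storing its timestamp and framesDecoded values.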
I also set EnableFlexFec to true, which did not help. The freeze is longer with higher resolutions and therefore more data, I guess.
I am also aware that these scenarios will always be problematic if the non-QualityController has less bandwidth, but the question is whether we can do anything to let the stream recover faster.
I attached a log, please reach out if I can gather any further information. In the log the Freeze can be seen between 14:44 -> 14:47
I will test different setups / Browsers, and also try to test with UE5.7 and Pixelstreaming2.
I used the Pixelstreaming2 Signalling server in the samples folder
Both streams (one with Chrome, one with Vivaldi) were connected and I just interacted with the stream by moving the camera. After a while it crashes every time. I can check whether it also happens with other browsers.
Regarding the original problem / the freezing:
I think even if we leave the quality ownership with the first player, it can fail in the same way if the second player has too little bandwidth.
Either we would need a combined quality ownership, where we always use the lowest reported bandwidth values from both players, or the player with the lower bandwidth needs to drop data more aggressively to be able to catch up.
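The "lowest reported value" idea could look roughly like the sketch below. It is only an illustration: the PlayerBandwidth type and combinedTargetBitrate function are hypothetical, and the per-player bandwidth figures are assumed to be reported from each browser, since nothing in this thread indicates the streamer provides them.

```typescript
// Hypothetical per-player report, e.g. sent periodically by each browser.
interface PlayerBandwidth {
  playerId: string;
  availableBps: number; // estimated available bandwidth in bits per second
}

// Picks a shared target bitrate: the lowest reported bandwidth, capped at maxBps.
// With no reports at all, the configured maximum is used.
function combinedTargetBitrate(reports: PlayerBandwidth[], maxBps: number): number {
  if (reports.length === 0) {
    return maxBps;
  }
  const lowest = Math.min(...reports.map((r) => r.availableBps));
  return Math.min(maxBps, lowest);
}
```

With one player at 80 Mbit/s and another limited to 10 Mbit/s, this yields a 10 Mbit/s target, so the faster player's quality is reduced in exchange for the slower player not falling behind.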
But that’s just guesswork on my part; I do not completely understand the underlying issue …
Another idea I have is to dynamically switch the quality ownership to the player with less bandwidth. Do you think that can work? Or will there be issues if we switch it quite often in a row?
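One way to keep such switching from happening too often is to add a minimum hold time and a bandwidth margin before ownership changes. The sketch below only shows that decision logic. All names are hypothetical, the bandwidth reports are assumed to come from the browsers, and it does not show how ownership is actually handed over.

```typescript
// Hypothetical per-player report, e.g. sent periodically by each browser.
interface PlayerBandwidth {
  playerId: string;
  availableBps: number; // estimated available bandwidth in bits per second
}

interface OwnerState {
  ownerId: string;      // player currently holding quality ownership
  lastSwitchMs: number; // time of the most recent ownership change
}

// Decides who should own quality control, switching only when the lowest-bandwidth
// player is clearly lower (marginBps) and the current owner has held it for minHoldMs.
function pickQualityOwner(
  state: OwnerState,
  reports: PlayerBandwidth[],
  nowMs: number,
  minHoldMs: number,
  marginBps: number,
): OwnerState {
  if (reports.length === 0) {
    return state;
  }
  const lowest = reports.reduce((a, b) => (b.availableBps < a.availableBps ? b : a));
  if (lowest.playerId === state.ownerId) {
    return state;
  }
  const current = reports.find((r) => r.playerId === state.ownerId);
  if (current === undefined) {
    // The current owner is no longer reporting (e.g. disconnected): switch immediately.
    return { ownerId: lowest.playerId, lastSwitchMs: nowMs };
  }
  const heldLongEnough = nowMs - state.lastSwitchMs >= minHoldMs;
  const clearlyLower = current.availableBps - lowest.availableBps >= marginBps;
  return heldLongEnough && clearlyLower
    ? { ownerId: lowest.playerId, lastSwitchMs: nowMs }
    : state;
}
```

The hold time and margin act as hysteresis: two players with similar, fluctuating bandwidth will not trade ownership back and forth on every report.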
We have not switched to UE5.7, but we have started to migrate to UE5.8 and will therefore do a first testing iteration in the next few weeks. We will try to reproduce this with UE5.8 and Pixelstreaming2 and come back to you!