Best setup for running multiple avatars with Pixel Streaming (50–500 users)

Hi everyone,

I’m planning a deployment using Unreal Engine and Pixel Streaming to support a relatively large number of users, somewhere between 50 and 500 concurrent connections. Each user would control their own avatar in a shared or isolated environment (depending on performance/scalability).

I’m looking for advice on the best architecture in terms of GPU and machine setup:

- Would it be better to run multiple instances of the engine on a single GPU per machine, or
- Use machines with multiple GPUs to reduce the total number of servers needed?

Additionally, which GPUs would be ideal for this kind of workload? Should we consider high-end gaming cards like the RTX 4090/5090, or would professional-grade cards like the RTX 6000 Ada be a better fit (e.g., for virtualization, stability, VRAM)?

We’re aiming for good concurrency, efficient GPU usage, and manageable scaling costs.

Any input would be greatly appreciated!

Thanks in advance.

Finding the best and most cost-efficient infrastructure for your deployment is important, but the first thing to get a grasp on is the performance of your application. You can likely get a good feel for this on local hardware, or on something you can access that is close to the specs you’re targeting. Launch multiple instances of the Unreal app and check how it performs and whether you’re hitting your target FPS. You can definitely optimize your content to run well across multiple instances, but eventually you’ll hit a bottleneck on GPU, CPU, RAM, or VRAM, which you should be watching and measuring (e.g., with `stat unit` in-engine and Task Manager). The reason I make this point is that people often throw money at GPUs even when the GPU isn’t the bottleneck.
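
If it helps, here is a minimal sketch of that kind of local load test: it launches N copies of a packaged Pixel Streaming build so you can watch GPU/CPU/RAM/VRAM while they run. The build path, port layout, and resolution are assumptions, and the `-PixelStreamingURL` flag is the UE5-style argument (older releases use `-PixelStreamingIP`/`-PixelStreamingPort`), so adjust to your engine version and setup.

```python
# Sketch: launch several instances of a packaged Pixel Streaming build for
# local profiling. Paths, ports, and flags are assumptions -- adjust to your
# own build, signalling-server layout, and UE version.
import subprocess
import sys

APP = r"C:\Builds\MyApp\MyApp.exe"   # hypothetical packaged build path
NUM_INSTANCES = 4                    # how many copies to test on one box
BASE_PORT = 8888                     # assumes one signalling endpoint per instance

procs = []
for i in range(NUM_INSTANCES):
    args = [
        APP,
        # UE5-style flag; older builds use -PixelStreamingIP / -PixelStreamingPort
        f"-PixelStreamingURL=ws://127.0.0.1:{BASE_PORT + i}",
        "-RenderOffscreen",   # no local window; render for the stream only
        "-Unattended",
        "-ForceRes", "-ResX=1280", "-ResY=720",  # pin resolution so runs are comparable
    ]
    procs.append(subprocess.Popen(args))
    print(f"Launched instance {i} -> port {BASE_PORT + i}")

try:
    for p in procs:
        p.wait()
except KeyboardInterrupt:
    # Ctrl+C tears down all instances once you've captured your measurements
    for p in procs:
        p.terminate()
    sys.exit(0)
```

While the instances are running, compare frame times from `stat unit` against your target and watch which resource saturates first; that tells you whether more instances per machine is even on the table.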

I’ve seen some applications run comfortably with 4 instances on a single machine, but in my experience most people end up with one machine per UE instance. Additionally, most self-serve Pixel Streaming services out there are architected this way for simplicity.

For machines with multiple GPUs, you would need to launch the UE apps with the command-line parameter `-graphicsadapter=[0 or 1]`. I haven’t tried this technique myself, so you’ll have to test it, and also check which providers offer these kinds of machines and how the price compares to single-GPU machines.
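
As a rough, untested sketch of that pattern (same hypothetical build path and port layout as above, and assuming the adapter indices actually map to the physical cards you expect, which you should verify per machine):

```python
# Sketch: pin one UE instance per GPU on a multi-GPU machine via
# -graphicsadapter. Untested pattern -- adapter index ordering depends on the
# OS/driver, so confirm which index maps to which card before relying on it.
import subprocess

APP = r"C:\Builds\MyApp\MyApp.exe"   # hypothetical packaged build path
GPU_INDICES = [0, 1]                 # one instance per GPU
BASE_PORT = 8888                     # assumed separate signalling endpoint per instance

for gpu in GPU_INDICES:
    subprocess.Popen([
        APP,
        f"-graphicsadapter={gpu}",                               # pin rendering to this adapter
        f"-PixelStreamingURL=ws://127.0.0.1:{BASE_PORT + gpu}",  # UE5-style flag, as above
        "-RenderOffscreen",
    ])
    print(f"Instance pinned to GPU {gpu}")
```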

This leads to my question for you. Are you trying to set up your own physical infrastructure, set up cloud infrastructure with a preferred CSP, or find an easy off-the-shelf provider who takes your executable and handles the rest? We could theorize on the best possible infrastructure, but generally the decision comes down to ease of use or pre-existing IT limitations.

Hi @anonymous-edc

For 50–500 concurrent connections, you’re looking at massive infrastructure complexity. Professional cards like the RTX 6000 Ada would be better than gaming GPUs due to higher VRAM, virtualization support, and stability for 24/7 operation. You’d typically run one UE instance per GPU to avoid encoding bottlenecks.

However, managing hundreds of concurrent Pixel Streaming instances involves load balancing, auto-scaling, geographic distribution, session management, and substantial costs; you’re essentially building a cloud gaming platform from scratch.

I’d strongly recommend checking out Vagon Streams instead. They specialize in exactly this use case: high-scale Unreal Engine deployment with Pixel Streaming. They handle all the infrastructure complexity, GPU optimization, and global edge distribution, and can scale from dozens to thousands of concurrent users without you managing servers or GPUs. This could save you months of development and potentially significant costs compared to building your own multi-GPU server farm.

Worth exploring before committing to the infrastructure build!