Multiplayer | Discussion on client-server model and game instance design

Hello everyone,

I wanted to start a discussion about UE5 and creating dedicated servers, primarily focused around running multiple dedicated servers and integrating game instances for handling large player counts.

I have gone through endless Google searches trying to find information on how this would be achieved with UE, but most articles simply don’t cover it, or only touch on a small part of the design. I even searched the course material on Udemy to see if anyone had created tutorials on this, but surprisingly found nothing about client-server setups with multiple servers.

To give background on the question, I have been looking into whether it’s possible to create an MMORPG using UE5. In this day and age there are countless games out there that achieve this using megaservers. However, it’s not clear whether you can achieve the same architecture with UE5, or whether it is the correct tool for the job.

In terms of game instances, I was not looking for a single dedicated server to support thousands of players. In theory, I thought game instances of 100-250 players each would be more feasible, with scalable services deploying additional dedicated servers as required to run them.

I want to keep this discussion theoretical and leave out the technical details, as they could get complex, and simply outline how this would be achieved.

The sub-questions I wanted to raise to the community are the following:

  • How would you handle communication between multiple dedicated servers?
  • What scalable services would you use? (self-hosted / AWS / GC)
  • How would you handle redundancy? (server-down)
  • How would you handle persistent data? (storing player data regardless of game instances)

I appreciate in advance anyone giving their input on this discussion, and I hope it can be useful material for the whole community and give people the answers they’re looking for.

Hello

The reason why you won’t find much information online is probably that the architecture for an MMORPG is very complicated and has to be game-specific to be optimized properly.

Setting a player limit for each area / level seems to be the most common approach: portals between areas lead to one of several instances of the same level, even if you are on the same regional “server”.

Theoretically you could set the player limit to 1000 players, but it would be nearly impossible to optimize properly unless you keep the players spread apart somehow.
Servers do crash, and a crash on a large instance affects a lot more players; smaller instances alleviate this problem.

Party members have to be able to play together, so instances can hold reserved slots for them.

Some parts of the game, like chat, and player data such as inventory and abilities, have to be handled by their own service to be persistent across server instances. This would not be an Unreal Engine server but a custom or off-the-shelf game service provider.

The overall server instance manager also has nothing directly to do with Unreal Engine. It would be a service that starts and stops server instances as player demand increases or decreases. It also has to move players between servers when they go through portals.
“Dungeons” or layered quests could all be on the same server instance, so you don’t make the server instances too small and avoid having to connect to new server instances constantly.
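
To make the instance manager idea a bit more concrete, here is a minimal sketch in Python (the manager would sit outside Unreal anyway). Everything in it is an assumption for illustration: the names `ZoneInstance`, `InstanceManager`, `enter_zone` and `join_reserved` are hypothetical, and "starting a server" is just appending an object to a list rather than launching a real process.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)
class ZoneInstance:
    """One running copy of a level (hypothetical stand-in for a dedicated server)."""
    zone: str
    capacity: int
    players: Set[str] = field(default_factory=set)
    reserved: int = 0  # slots held for party members who have not arrived yet

    def free_slots(self) -> int:
        return self.capacity - len(self.players) - self.reserved


class InstanceManager:
    """Toy allocator: reuse an instance with room, otherwise start a new one."""

    def __init__(self, capacity_per_instance: int):
        self.capacity = capacity_per_instance
        self.instances: List[ZoneInstance] = []

    def enter_zone(self, zone: str, party: List[str], arriving: List[str]) -> ZoneInstance:
        """Place the arriving members and reserve slots for the rest of the party."""
        needed = len(party)
        if needed > self.capacity:
            raise ValueError("party is larger than one instance")
        for inst in self.instances:
            if inst.zone == zone and inst.free_slots() >= needed:
                break
        else:
            # No instance can take the whole party: "start" a new one.
            inst = ZoneInstance(zone, self.capacity)
            self.instances.append(inst)
        inst.players.update(arriving)
        inst.reserved += needed - len(arriving)
        return inst

    def join_reserved(self, inst: ZoneInstance, player: str) -> None:
        """A party member comes through the portal later and uses a reserved slot."""
        inst.reserved -= 1
        inst.players.add(player)

    def leave(self, inst: ZoneInstance, player: str) -> None:
        inst.players.discard(player)
        if not inst.players and inst.reserved == 0:
            self.instances.remove(inst)  # scale down: stop empty instances
```

A real manager would also need timeouts on reservations and health checks on the instances; those are left out here to keep the sketch short.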

Making an MMORPG is not for the faint-hearted. It requires a large development team, as it involves a lot of technical challenges and resources.

Actual questions

  • How would you handle communication between multiple dedicated servers?
    If they are instances, they don’t need to communicate with each other, except for loading a player’s inventory and other stats from an API / database when the player logs in
  • What scalable services would you use (self-hosted /AWS/GC)
    Cloud hosting, Kubernetes, like any load-dependent web app
  • How would you handle redundancy? (server-down)
    Didn’t get the question. If a server is not working, it should be taken down. Still a task for Kubernetes
  • How would you handle persistent data (storing player data regardless of game instances)
    Same as first question, db
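
As a rough illustration of the "load from the db on login" answer, here is a small Python sketch using the standard `sqlite3` module. The table layout and the function names (`open_store`, `save_player`, `load_player`) are my own assumptions, and a real game would put this behind an API on a proper database server rather than an in-memory SQLite file.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the player store (in-memory by default, for illustration only)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS players ("
        "player_id TEXT PRIMARY KEY, level INTEGER NOT NULL, inventory TEXT NOT NULL)"
    )
    return conn


def save_player(conn, player_id, level, inventory):
    """Called by a game instance on logout or at checkpoints."""
    conn.execute(
        "INSERT OR REPLACE INTO players (player_id, level, inventory) VALUES (?, ?, ?)",
        (player_id, level, json.dumps(inventory)),
    )
    conn.commit()


def load_player(conn, player_id):
    """Called by whichever game instance the player logs in to."""
    row = conn.execute(
        "SELECT level, inventory FROM players WHERE player_id = ?", (player_id,)
    ).fetchone()
    if row is None:
        return None  # unknown player: the caller creates a fresh character
    level, inventory_json = row
    return {"level": level, "inventory": json.loads(inventory_json)}
```

Because the data lives outside any single game instance, it does not matter which instance the player lands on next time.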

About large number of players

My experience: I wanted my shooter to support thousands of players on the same map, but of course I started working on the game itself before diving into that task. From the start, though, I tried to do everything possible to be able to scale the game later (using Lyra-style animation blueprints with multithreading support, trying to optimize network bandwidth and game performance, etc.)

Results:

  1. I got about 5K people on my Discord server, but the maximum number of players in one match was never more than 50. Several months after the game launch, I get 16-30 people in planned matches on the weekends. Take this as a hint: even if you build a game that supports 1K players, it’s not certain you’ll be able to test it.
  2. With 16 people, the ping from a dedicated server (in the cloud) for people in the same city is 30 ms, but with 30 people it goes up to 100 ms, and it’s less comfortable to play.

To be able to support more players, you can achieve it by:

  1. Reducing the amount of replicated data per second: Gameplay Ability System - Advanced Network Optimizations - Devtricks
  2. Moving “hard calculations” to the client side, like in PlanetSide: clients telling the server that they hit the target while shooting (standard Lyra behaviour), client-authoritative movement (like MMOs), …, and then fighting an insane number of cheaters

Also some scalability calculations:

  1. Network (Server)
    With 30 people in the match and me hosting it, my Lyra-based shooter had an average bandwidth load of 300 kB/s, most of it replicated movement and montages.
    As it scales quadratically (number of players to replicate to, number of properties to replicate), 200 people would take about 13 MB/s. Still possible for a server. Now plug 1000 into
    (n / 30)^2 * 300 kB/s
  2. CPU, RAM (Server)
    Didn’t measure it too much, but I hope it doesn’t increase faster than N. Also, some initial part of it is reserved for the engine itself, of course. 1 CPU and 2 GB were enough for 30 players for me, with headroom to spare.
  3. FPS (Client)
    You can probably have several hundred low-poly models with multithreaded ABPs without worrying about them, I guess. But I didn’t check too much.
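
The network estimate above can be written out as a tiny Python function. This is only the quadratic rule of thumb from this post, (n / 30)^2 * 300 kB/s, anchored to the single measurement of 300 kB/s at 30 players; the function name and parameters are made up for the example, and real traffic depends heavily on relevancy and replication settings.

```python
def estimated_server_bandwidth_kb_s(players, baseline_players=30, baseline_kb_s=300.0):
    """Quadratic rule of thumb: every player replicates to every other player."""
    return (players / baseline_players) ** 2 * baseline_kb_s


for n in (30, 200, 1000):
    mb_s = estimated_server_bandwidth_kb_s(n) / 1000
    print(f"{n} players: about {mb_s:.1f} MB/s")
```

Under this model 200 players land at roughly 13 MB/s and 1000 players at roughly 333 MB/s, which is why the replication load per player has to come down well before the player count goes up.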

Final resolution: Fortnite got 100 people into a match with heavy GAS network optimizations, so I guess you could do that too with a professional team.

I’m going to answer these a bit out of order, if you don’t mind.

I would probably use a container-based cloud service, like EKS (Elastic Kubernetes Service on AWS), and deploy multiple instances. This not only offers redundancy and HA, but also load-balances clients across all active instances.
I would tie in some auto scaling, both vertical and horizontal (scale the services until the node resources hit a certain %, then spawn a new node and distribute more instances across it). This would create a theoretically unlimited player capacity at the infrastructure level.
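
For what it’s worth, the scaling decision itself is simple arithmetic. The Python sketch below is a simplified capacity calculation, not the algorithm any particular autoscaler uses; the names `desired_instances` and `nodes_needed` and the 80% target are assumptions picked for the example.

```python
import math


def desired_instances(total_players, players_per_instance,
                      target_utilization=0.8, min_instances=1):
    """How many game instances to run so each stays below the target fill level."""
    usable_slots = players_per_instance * target_utilization
    return max(min_instances, math.ceil(total_players / usable_slots))


def nodes_needed(instances, instances_per_node):
    """How many nodes are required to host that many instances."""
    return math.ceil(instances / instances_per_node)


instances = desired_instances(total_players=1000, players_per_instance=100)
print(instances, "instances on", nodes_needed(instances, instances_per_node=4), "nodes")
```

Keeping the target below 100% leaves room for players who are already travelling to an instance while new instances are still starting up.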

For data distribution, I’d want to determine which data is to be stored, vs which data is to be either cached or broadcast with a cadence that would not require it to be stored.

The speed of the data is clutch here. If ClientA is attached to ServerA and ClientB to ServerB, I would want ServerB to update ServerA, and vice versa, as soon as is humanly possible, so the clients get the relevant information as close to real time as possible. How would we do that? A replication layer? Broadcast messaging with minimal queueing? There are some options out there to help with that, but it does come down to what UE5 supports.

Also, which database would be fastest for storing the data, and what structure would you use?
For the cached items, perhaps a souped-up Redis cluster?

The things I would watch out for there are how long reads and writes take, and what kind of parallelization I can do to increase read/write capacity across multiple servers/clients. How to store that data is best designed once all of these things have been determined.

If UE5 supported something like full backend broadcast messaging so all servers could efficiently broadcast their changes to each other, then you could build some kind of replication layer to front-end the datastores and orchestrate this data to the servers.
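
To show what I mean by broadcast messaging between servers, here is a toy in-process sketch in Python. The `MessageBus` and `GameServer` classes are hypothetical, and the bus is just a dictionary of subscriber lists. In a real deployment the bus would be an external system (Redis pub/sub or a message queue, for example), and the servers would be separate processes.

```python
from collections import defaultdict, deque


class MessageBus:
    """Minimal publish/subscribe hub standing in for a real message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, server):
        self._subscribers[topic].append(server)

    def publish(self, topic, sender, message):
        # Deliver to every subscriber except the server that sent the update.
        for server in self._subscribers[topic]:
            if server is not sender:
                server.inbox.append((topic, message))


class GameServer:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.inbox = deque()
        self.known_positions = {}
        bus.subscribe("positions", self)

    def move_player(self, player, x, y):
        """Apply a local change and broadcast it to the other servers."""
        self.known_positions[player] = (x, y)
        self.bus.publish("positions", self, {"player": player, "pos": (x, y)})

    def drain_inbox(self):
        """Apply updates received from other servers (e.g. once per tick)."""
        while self.inbox:
            topic, message = self.inbox.popleft()
            if topic == "positions":
                self.known_positions[message["player"]] = message["pos"]
```

The interesting design question is how often `drain_inbox` runs and how much gets published per tick, since that decides how close to real time the other servers are.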

So there’s a few ideas. This is from a product dev/Infrastructure/network background. Not super strong on UE5, but have been messing around with it. I don’t have a game yet, but I’ve been playing around with containerization of a UE5 environment.

I’m happy to hear advice and recommendations!

Thanks,
D

Don’t replicate montages. Make everything data driven. Once you do this your bandwidth usage will drop drastically.

Use enumerator states… 4 bits. Don’t pass object references over the network.

Can you elaborate on this a bit more?

Right now I use a Multicast to tell each client to play a specific montage, and I am sending the montage reference in the multicast. I am still in the prototyping stage and have been very curious how much traffic this decision would cause, but I have not done any network load testing, so hearing you specifically mention not to do this has my attention.

When you say to use enumerator states, do you mean something like creating an enum of possible montages, then simply passing the enum in the multicast and finding the appropriate montage client side?

This is roughly what I imagined doing (though my first thought was some sort of table with “Animation IDs” to reference, thinking there would be a lot of possible animations once I factor in different abilities and different models.)

For montages I’d pass an ID (String, Name, Int), or have a specific event for each to call.


I use enumerators for hard states like doors, elevators, lifts, windows (Opened, Closed etc). I also use them for character movement states like Movement Pace [walking, jogging, sprinting].

In most cases the enum is a RepNotify and I use the OnRep function to drive change logic.
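
Here is the ID / enum idea as an engine-agnostic Python sketch. It is not how Unreal serializes anything; the asset names in `MONTAGE_TABLE`, the one-byte payload and the helper names are assumptions, only there to show that the small ID is all that crosses the network and the lookup happens locally.

```python
from enum import IntEnum


class MovementPace(IntEnum):
    WALKING = 0
    JOGGING = 1
    SPRINTING = 2


class MontageId(IntEnum):
    ATTACK_SWING = 0
    RELOAD = 1
    EMOTE_WAVE = 2


# Every client holds the same table locally (asset names are made up).
MONTAGE_TABLE = {
    MontageId.ATTACK_SWING: "AM_AttackSwing",
    MontageId.RELOAD: "AM_Reload",
    MontageId.EMOTE_WAVE: "AM_EmoteWave",
}


def bits_needed(enum_cls):
    """Smallest number of bits that can represent every value of the enum."""
    return max(1, (len(enum_cls) - 1).bit_length())


def encode_play_montage(montage_id):
    """Server side: the payload is just the ID, here packed into one byte."""
    return bytes([int(montage_id)])


def decode_play_montage(payload):
    """Client side: turn the ID back into a local asset and play that."""
    return MONTAGE_TABLE[MontageId(payload[0])]
```

An enum with up to 16 values fits in 4 bits, and swapping the table per character model is one way to line the same IDs up with different animations.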


Don’t pass obj references…
By this I mean don’t send a pointer to a specific character, or any other obj in the game world, over the network.

If I do a trace on my local client and hit an obj I need the server to do something with, I just have the server do the same trace to hit the same thing.

Client Trace → If hit true → RPC server to Trace
Srv Trace → If hit true → do something

There’s no reason for me to have to send it an actor obj ref of what I hit locally. If I do, I’m skirting server authority.
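
The trace flow above could look roughly like this outside the engine. It is a Python sketch with a hand-rolled ray vs. sphere test instead of Unreal’s line trace, and `Target`, `trace`, `client_fire` and `server_handle_fire` are hypothetical names. The point is that the RPC payload carries only the origin and direction, and the server decides what was hit from its own copy of the world.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Target:
    name: str
    center: Tuple[float, float, float]
    radius: float


def trace(origin, direction, targets, max_distance=100.0):
    """Return the name of the nearest sphere hit by the ray, or None."""
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        return None
    unit = tuple(c / length for c in direction)
    best_name, best_t = None, max_distance
    for target in targets:
        to_center = tuple(c - o for c, o in zip(target.center, origin))
        t_closest = sum(a * b for a, b in zip(to_center, unit))
        if t_closest < 0:
            continue  # target is behind the shooter
        dist_sq = sum(a * a for a in to_center) - t_closest ** 2
        if dist_sq > target.radius ** 2:
            continue  # ray passes beside the sphere
        t_hit = t_closest - math.sqrt(target.radius ** 2 - dist_sq)
        if 0 <= t_hit <= best_t:
            best_name, best_t = target.name, t_hit
    return best_name


def client_fire(origin, direction, client_world):
    """Client: trace locally and only bother the server on a local hit."""
    if trace(origin, direction, client_world) is None:
        return None
    return {"origin": origin, "direction": direction}  # no object reference


def server_handle_fire(rpc, server_world):
    """Server: repeat the trace against its own authoritative world."""
    return trace(rpc["origin"], rpc["direction"], server_world)
```

If the client’s world and the server’s world disagree, the server’s trace simply misses, and a modified client cannot claim a hit on an actor of its choosing.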

Ok yes the authority concept makes perfect sense, and for most actions I am using a similar model of

Client checks if it can → RPC to Server to TRY
Server Checks if Client Can → Process Action → MultiCast animation of client doing thing

This multicast is where I am currently sending a montage reference so each client machine plays the montage (attack swing, for example). I will likely change this to an enum setup as well, so that I can line up enum states with various animations for different models. But is the act of sending a montage reference across the network itself problematic for traffic, or for other reasons?

Personally I’d create an Event on the character and have it use control logic to direct what each proxy does.

Keeping with CMC’s client-side prediction model, on input I’d have the client do the attack locally, without dealing damage. Then RPC the server to Try.

The server checks if it can, then multicasts an Attack Swing.

The multicast determines per proxy what to do.
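
A rough sketch of "the multicast determines per proxy what to do", written in Python rather than Blueprint. The three roles mirror Unreal’s authority / autonomous proxy / simulated proxy split, but the handler name, the action strings and the choice of what each role does are my assumptions, not engine behaviour.

```python
from enum import Enum


class NetRole(Enum):
    AUTHORITY = "authority"                # the server’s copy of the character
    AUTONOMOUS_PROXY = "autonomous_proxy"  # the owning client’s copy
    SIMULATED_PROXY = "simulated_proxy"    # everyone else’s copy


def on_multicast_attack_swing(role, already_predicted=False):
    """Decide what this copy of the character does when the multicast arrives."""
    if role is NetRole.AUTHORITY:
        # Assumption: a dedicated server skips cosmetics and resolves the hit.
        return ["apply_damage"]
    if role is NetRole.AUTONOMOUS_PROXY:
        # The owning client already played the swing when the input happened.
        return [] if already_predicted else ["play_swing_montage"]
    return ["play_swing_montage"]
```

The payload for this event can be empty, since each receiver already knows which character it belongs to and which animation an attack swing maps to.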


Edit…
Another approach would be similar to a client fakey weapon firing setup.