Dec 2, 2020. Knowledge
Advice on hosting depends a lot on the provider, the processing power of the hardware you use, and how efficient your game is. In general, you want to pack as many server processes onto a machine as possible. When doing so, there are two main bottlenecks: CPU usage and memory. Profiling your servers will be necessary to determine where bottlenecks are occurring and what optimizations are needed.
There are a number of ways to improve server CPU usage that may or may not be useful for your game, such as using the replication graph, utilizing dormancy, or adjusting actor NetUpdateFrequency and net cull distance. You can find some more information on improving replication performance here:
If you find that memory is causing a bottleneck, you can try using Unreal’s tools for debugging and optimizing memory usage in your game, such as memreport, the memory profiler (mprof), and the low level memory tracker:
You can also try enabling KSM (Kernel Samepage Merging; see the Linux kernel documentation), which merges identical memory pages and may reduce the combined memory footprint of all of the server processes.
Finally, Unreal Engine 4.26 added support for forking child processes from a master process. The master process can load global data before forking, and any memory allocated up to that point is shared between the child processes. Static data, like content definitions or physics meshes, can and should be shared, but dynamic data can be shared as well until a child modifies it (copy-on-write). For example, the master process may load some general character data to be shared, and each child process creates its own copy only if it needs to modify that data. If there are tight constraints on memory, this approach can be beneficial, depending on how much data the child processes can share. For more info, you can look into the engine’s helper functions in Fork.h as well as the forking functions in UnixPlatformProcess.h/.cpp.
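The sharing behavior described above comes from the operating system's fork semantics, not from anything specific to the engine. The following is a minimal sketch of those semantics using Python's `os.fork` on Linux. The name `run_forked_children` and the stand-in data are hypothetical, and this is not how the engine implements forking; it only shows that a child's write stays private to that child.

```python
import os


def run_forked_children(num_children):
    """Fork children that inherit preloaded data; return (exit codes, parent's first entry)."""
    # Hypothetical stand-in for global data the parent loads once before forking.
    shared_table = list(range(1000))

    pids = []
    for index in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child process: it sees the parent's data as it was at fork time.
            exit_code = 1
            try:
                untouched = shared_table[0] == 0
                # This write lands in the child's private copy of the affected
                # pages; the parent and sibling processes never see it.
                shared_table[0] = -(index + 1)
                if untouched and shared_table[1] == 1:
                    exit_code = 0
            finally:
                # Leave immediately so the child never runs the parent's code path.
                os._exit(exit_code)
        pids.append(pid)

    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))

    # The parent's copy is unchanged even though every child wrote to index 0.
    return exit_codes, shared_table[0]
```

Each child exits with code 0 only if it observed the original data before writing, so a result of all zeros together with an unchanged first entry in the parent shows that the modifications were isolated per child.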
When forking, the parent is started with the command line (CLI) argument “-WaitAndFork” to indicate that this process will be forked. The process then waits for an external signal (SIGRTMIN+1), which causes forked processes to be created. The parent also receives an argument naming the directory that holds files containing the CLI args of the forked children. If not forking, these CLI args need to be set on each process directly. Some CLI args you may want to use include:
- “-WaitAndFork” to indicate that this process will be forked.
- “-WaitAndForkRequireResponse” to require that child processes proceed only after a SIGRTMIN+2 signal is sent to them.
- “-WaitAndForkCmdLinePath=” to set the directory where files containing the child processes’ CLI args will be found.
- “-NumForks=” to set the number of forks to be created when WaitAndFork is called.
- “-PostForkThreading” to set whether the process will support multithreading post fork.
- “-nothreading” to set whether the process supports multithreading.
- “-useksm” to enable KSM.
- “-vmapoolscale=” to set the scale parameter used when growing the virtual memory pools on allocation (and when scaling them back).
- “-virtmemkb=” to set the process’ virtual memory limit.
- “-preloadmodulesymbols” to load the main module symbols file into memory.
- “-port=”, “-beaconport=”, and “-statsport=” to set the port number for the game, beacons, and stat requests respectively.
- As well as any other args needed for logging or memory and crash reporting.
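As a rough illustration of how an external launcher might use these arguments, the sketch below assembles a parent command line from the flags listed above and computes the two signal numbers. The helper names, the binary name, and the directory are hypothetical. `request_fork` sends a plain signal with no attached value; check UnixPlatformProcess.cpp for exactly what the engine expects before relying on it.

```python
import os
import signal


def build_parent_args(binary, num_forks, cmdline_dir, require_response=False):
    """Assemble the parent's argument list from the flags described in the list above."""
    args = [
        binary,
        "-WaitAndFork",
        f"-NumForks={num_forks}",
        f"-WaitAndForkCmdLinePath={cmdline_dir}",
    ]
    if require_response:
        args.append("-WaitAndForkRequireResponse")
    return args


def fork_signal():
    """Signal number the waiting parent listens for (SIGRTMIN+1). Linux only."""
    return int(signal.SIGRTMIN) + 1


def response_signal():
    """Signal number children wait for with -WaitAndForkRequireResponse (SIGRTMIN+2)."""
    return int(signal.SIGRTMIN) + 2


def request_fork(parent_pid):
    """Send SIGRTMIN+1 to the parent process.

    Assumption: a plain signal without a payload is sufficient; this sketch
    does not verify what the engine reads from the signal.
    """
    os.kill(parent_pid, fork_signal())
```

For example, `build_parent_args("./MyGameServer", 4, "/tmp/forkargs")` returns a list that can be passed to `subprocess.Popen`; both the binary name and the path are placeholders.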
If forking processes, it’s also recommended that each child process only serves one session before being destroyed. Forking off a new child should be very fast and makes memory management easier, as you don’t have to be as concerned with cleaning up properly before serving a new session. If not forking, this approach still works fine. However, if server start-up time is a concern, it may be more efficient to have the server leave the map and wait for a new session when one ends.
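For the non-forking case, the one-session-per-process approach can be pictured as a small pool that starts a fresh process whenever one finishes. This is a minimal sketch with a hypothetical `run_sessions` helper and a stand-in command; a real manager would also watch for crashes, hand out ports, and report capacity.

```python
import subprocess
import sys


def run_sessions(command, total_sessions, pool_size):
    """Run total_sessions one-shot processes, keeping at most pool_size alive at once."""
    running = []
    started = 0
    exit_codes = []
    while started < total_sessions or running:
        # Top the pool back up with fresh processes.
        while started < total_sessions and len(running) < pool_size:
            running.append(subprocess.Popen(command))
            started += 1
        # Wait for the oldest process to end its session. A production
        # manager would poll every process instead of blocking on one.
        finished = running.pop(0)
        exit_codes.append(finished.wait())
    return exit_codes


if __name__ == "__main__":
    # Stand-in for a server that hosts a single session and then exits.
    stand_in = [sys.executable, "-c", "pass"]
    print(run_sessions(stand_in, total_sessions=5, pool_size=2))
```

Because each process starts from a clean state, the manager only needs the exit code to decide whether a session ended normally.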
Finally, there are some caveats to consider:
- You’ll need a separate system to manage and monitor the processes’ lifetimes and determine when to create, destroy, or fork a process. For example, if a process crashes, you’ll want to be able to detect this and start a new one to replace it. This type of system is not a feature of Unreal Engine, so a custom-made or third-party solution is needed.
- Forking processes is only supported on Linux.
- If forking child processes, watch for the cost of context switching between threads, which can be mitigated by adding some process affinity management (pinning processes to specific cores).
- Machines that run a lot of servers can also cause outages if one of them dies. For example, if one set of capacity is spread evenly across three machines and one of them dies while fewer than one third of the servers are free (not hosting a match), a capacity outage will occur, because demand already exceeds what the two remaining machines can host.
- You should also be careful with NUMA node issues on cloud providers’ machines, as accessing shared memory across NUMA node boundaries will be very slow. It is still possible to fork on a multi-NUMA architecture, but you’ll need to keep each parent on a single node and limit each of its forked children to a taskset within that same node.
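The affinity and NUMA caveats above can be handled by whatever system manages the processes. As a minimal sketch, the helpers below read which CPUs belong to a NUMA node from Linux sysfs and restrict a process to them. The function names are hypothetical, and the approach (CPU affinity through `os.sched_setaffinity`) is one possible way to keep a parent and its children on one node, not something the engine does for you.

```python
import os


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_node_cpus(node):
    """Read the CPUs that belong to one NUMA node from sysfs (Linux only)."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as handle:
        return parse_cpulist(handle.read())


def pin_to_cpus(pid, cpus):
    """Restrict a process to the given CPUs (Linux only); pid 0 means the caller."""
    os.sched_setaffinity(pid, set(cpus))
```

A manager could call `pin_to_cpus(parent_pid, read_node_cpus(0))` before triggering any forks and then pin each child to a subset of the same list, which is similar in effect to launching under `taskset`.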
Again, what is needed and what works best will depend on a variety of factors specific to your game and setup, but for more information on hosting and optimizing servers, Riot has an article on how they optimized their servers for Valorant: VALORANT's 128-Tick Servers | Riot Games Technology