So we're seeing the weirdest behavior on our Linux dedicated server running on GameLift Spot instances.
Symptoms:
- CPU usage goes to 99%-100%
- Logs stop coming in (we have a health check that runs every minute)
- The GameLift health check continues to pass, and players are still shown as active after they disconnect
- Server process continues to run for days if we allow it
- Other server processes running on the same EC2 instance are fine
Has anyone seen anything like this? Any ideas on where to even start troubleshooting? We’ve been taking shots in the dark for weeks with no luck. I’d say the repro rate is roughly 1 in 100 instances.