A sudden spike on GPU / 100% GPU use

Hey,

I’ve been working on my Unreal Engine project and recently noticed that, out of nowhere and at random locations in my game, the engine starts getting massive hitches that drop the game to roughly 1 frame every 3 seconds. When I check the GPU usage, it’s suddenly at 100%.

  • Note that I’m using an RTX 4090 and normally the game uses only 20% of my GPU and the GPU stat on the top right is on 9ms before the lag hits.

  • I also tried GPU profiling, and every time I get these lags one of the lights in the scene takes 3000ms or so to render. The problem doesn’t seem to come from one specific bugged or corrupted light: the slow light is a different one each time, and only one light at a time is affected when the lag hits.

  • Whenever I turn off shadows in the Scalability settings, the lag goes away.

  • Also, after about 2 minutes of the engine lagging, I get an error saying “GPU Crashed or D3D Device Removed” and the engine crashes.

Hey there @Memeready! Great information! This looks to be a fun one to say the least.

What type of shadow mapping are you using?

Are you also using Lumen? (I’ve been getting reports of single-light spikes in Lumen-enabled scenes and am still trying to find a correlation.)

Does this occur during packaged builds as well or only in editor?

Does this occur in other (blank) levels as well?

Hey Supportive Entity!

Yes, I’m using Virtual Shadow Maps and Lumen.

I’m also using Nanite.

I have ray tracing enabled in the project settings; however, I turned it off for most of the lights that don’t need it.

This happens in packaged builds as well, at least in the Development configuration of the packaged project. I can package it for Shipping and let you know as well.

This massive GPU spike only started happening recently; it never happened in the previous 3-4 months that I’ve been working on this project.

Edit: I’ve tried the Shipping build. The lag still occurs at some points.

Does it still occur without VSMs? There’s a minor shadow mapping issue that was reported for 5.1. It doesn’t quite match up with this one, but it’s not entirely out of the question.

Can you consistently reproduce this issue on specific lights? Be it render angle, specific object rendering, or is it really just shadow calculations for the light?

I played the project for 10 minutes and saw no lag without the shadow maps. Some friends tried it on an RTX 2060; they were getting about 20-30 frames and did experience hitches, but never the same lag that I get with the RTX 4090. I get smooth frames on my 4090, but after a few minutes the game suddenly freezes completely. It freezes so badly that it crashes and gives me an error saying “GPU Crashed or D3D Device Removed”.

I also tried other games, such as the new Forza and Doom Eternal, on maximum graphics to check whether the problem came from my graphics card; however, I didn’t get any lag in those games.

It’s almost certainly something engine specific for the 3000ms light problem, but I can’t be sure if there are environmental issues involved as well. Raytracing and Virtual Shadow Maps are likely having some interaction.

Do you get any warnings about VSM max pages while in PIE (Play In Editor)? If so, enter this command in the console to check the current limit; you can adjust the value in DefaultEngine.ini.

r.Shadow.Virtual.MaxPhysicalPages
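
For reference, a minimal sketch of what that override might look like in DefaultEngine.ini. This assumes the console variable is placed under the [SystemSettings] section; the value below is only illustrative (it is the limit reported later in this thread), not a recommendation.

```ini
; Config/DefaultEngine.ini
; Assumption: console variables are read from the [SystemSettings] section.
[SystemSettings]
; Illustrative value only; raise it only if the page-limit warning appears.
r.Shadow.Virtual.MaxPhysicalPages=8192
```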

If that’s not the issue, it’s going to be a tough one to say the least.

For many issues we can talk things through, but given the nature of this one (a light costing 3000ms for no discernible reason), it might be better to submit a bug report with some repro steps if you can find them.

It’s not always 3000ms, by the way; it can be 700ms, 4000ms, and so on. It varies from light to light, but it’s always only one light, chosen seemingly at random.

I also checked the command and the engine said it was at 8192.

By the way I also do get:
LogD3D12RHI: Creating RT View Heap with 250000 entries
LogD3D12RHI: Creating RT Sampler Heap with 2048 entries
in the output log.

It’s so weird how my project goes from getting ~40 frames and using 40% of my GPU to suddenly 100% of my GPU unless I turn the shadows off.

Edit: I just checked again when it lagged, and this time it was two lights. I’ll also attach an image so you can see it as well.
Screenshot

Also I kept getting these logs while I lagged

There’s no way around it other than turning off the shadows; if I don’t, Unreal Engine crashes.

The RHI logs should have a bit more insight. Is there any way I can get the whole RHI log during one of these spikes? This bit shows the end of post processing, and most of the entries are really (really) small. Are there any significantly sized draw calls listed, or is it millions of extremely tiny ones?

I was unable to take a screenshot. Right now, when it lags, I can’t do anything but wait for it to crash.

You should be able to find the logs saved in the logs folder inside your project folder. Hopefully they’ll show what we’re looking for.

Saved\Logs
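
If the log is too long to read by hand, a small script can pull out just the RHI lines. This is only a rough sketch, not an official tool: the function names are made up for illustration, it assumes the log files end in .log and are plain text, and it simply looks for the LogD3D12RHI category that appears in the output quoted earlier in this thread.

```python
from pathlib import Path


def find_category_lines(log_text, categories=("LogD3D12RHI",)):
    """Return (line_number, line) pairs for lines containing 'Category:'."""
    markers = tuple(f"{name}:" for name in categories)
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        # Substring match, so lines with a leading timestamp still count.
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


def newest_log(log_dir):
    """Return the most recently modified .log file in log_dir, or None."""
    folder = Path(log_dir)
    if not folder.is_dir():
        return None
    logs = sorted(folder.glob("*.log"), key=lambda p: p.stat().st_mtime)
    return logs[-1] if logs else None


if __name__ == "__main__":
    # Assumption: run from the project root, next to the Saved folder.
    path = newest_log(Path("Saved") / "Logs")
    if path is not None:
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in find_category_lines(text):
            print(f"{number}: {line}")
```

Run it right after a crash so the newest file is the one covering the spike, and paste the lines it prints here.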

Is there any solution to this problem? Unfortunately I also encountered the same problem