
Engine running in interlaced mode should give 2x the performance of progressive?

Hi, I work primarily in TV, so I know a thing or two about interlaced vs progressive modes.

At the moment I'm producing a realtime system which is running in a 1080i graphics mode (on the card).

My problem is that I don’t see any real performance gains in running in this mode over progressive. Surely when games are running on consoles in this mode there must be an engine mode which halves the vertical image resolution.

1080p is 1920x1080 x 25 (PAL) images per second - this is what I usually run in while I'm developing.
1080i is 1920x540 x 50 (PAL) fields per second, where each field of the interlaced frame is half as tall, so its pixels are effectively squashed vertically.

Am I right in thinking that the engine just runs in full progressive mode regardless of whether the output is interlaced, and just rams the image into the output buffer 50 (PAL) times per second? If so, isn't this massively wasteful in terms of GPU time?

I’m sure 50 1920x540 frames per second is a hell of a lot cheaper than 50 1920x1080 frames per second.
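The arithmetic behind that claim can be sanity-checked with a quick sketch (my own throwaway script, PAL rates assumed, not anything from the engine):

```python
# Quick check of the pixel-throughput arithmetic above (PAL rates assumed).
def pixels_per_second(width, height, rate):
    """Total pixels rendered per second at a given resolution and frame rate."""
    return width * height * rate

# What the engine is suspected of doing: full 1080-line frames at field rate.
full_frames = pixels_per_second(1920, 1080, 50)   # 103,680,000 px/s
# What 1080i actually needs: half-height fields at field rate.
fields_only = pixels_per_second(1920, 540, 50)    #  51,840,000 px/s

print(full_frames / fields_only)  # 2.0 -- exactly the factor in question
```

So, purely in terms of pixels shaded per second, rendering full frames at field rate is double the work of rendering fields.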

I guess that UE4 still renders at 1080p, as that is the resolution the system reports.
Maybe you can dig into the settings and set the Y size manually? In the end you would have half the pixels, so the GPU should be able to render much faster.

It's different from the video signals you are used to in studios. The GPU (and the engine) renders to full-size canvases; all of this happens in memory. It is the video output hardware (the "VGA") that breaks these canvases down into two half-size fields and streams them out in the usual even-odd manner, and the display assembles them back into a full-size frame. Simply put, your GPU renders at 50 Hz, but the interlaced output streams these half frames at 100 Hz. In case you wonder where the scanlines on your screen are: those pixels are probably stored in the display's internal buffer, and the full canvas is only shown once the second field has come in.

Most consoles are also built on the same technology (a computer inside), where the game renders a full-size canvas and the video output breaks it down, purely for compatibility reasons.

I actually believe it is more efficient to work on full-size canvases, since the engine only needs to calculate vector/geometry information for one frame - not for two. Pixel shaders, especially post-process effects, would also be very problematic to run on fields, as they usually look at adjacent pixels to access colour information, which would require some advanced lock/wait synchronisation to access pixels of the next incoming field. That would seriously throttle graphics performance, and you don't want that.
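The even/odd split and reassembly described above can be sketched in a few lines of plain Python (purely illustrative - in reality this happens in the scan-out hardware, not in software):

```python
# Illustrative sketch of the even/odd field split and the display's "weave".
# In reality this happens in scan-out hardware, not in software.
LINES = 1080
frame = list(range(LINES))            # one entry per scanline of a full frame

even_field = frame[0::2]              # lines 0, 2, 4, ... (540 lines)
odd_field  = frame[1::2]              # lines 1, 3, 5, ... (540 lines)

# The display weaves the two fields back into a full-height frame:
woven = [0] * LINES
woven[0::2] = even_field
woven[1::2] = odd_field

assert woven == frame                 # lossless round trip
print(len(even_field), len(odd_field))  # 540 540
```

The round trip is lossless only because both fields came from the same source frame; with true interlaced capture the two fields are sampled at different moments in time.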

If you want 25 frames, then just cap the engine at rendering 25 frames per second, which will be streamed at 50 Hz for your interlaced equipment.



t.MaxFPS 25
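If you want the cap to persist rather than typing it into the console each run, one common place to set it (assuming a standard UE4 project layout - verify the section name against your engine version) is `Config/DefaultEngine.ini`:

```ini
; Config/DefaultEngine.ini -- cap the engine at 25 fps so each rendered
; frame is streamed as two fields at 50 Hz on interlaced output.
[SystemSettings]
t.MaxFPS=25
```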


I don't think you're right. Look at it this way, then: it's 50 fps rendering at 1920x540, where the aspect ratio is squashed. Then yes, with some coercion the GPU will create a 1080i output by merging these frames two at a time into an interleaved buffer. If the engine doesn't do this, for consoles at least, it's a serious oversight.

That's what you need to do, I think. Maybe there should be an 'r.ScreenPercentageY' console command.

By the way - I'm not being dismissive. I've been really impressed with your sharp AA tweaks (I couldn't get it to build myself when I tried it, though). I'm actually looking at other ways to sharpen the engine output myself. I've been using the engine to run a realtime graphics system on a live TV show, but I had a conversation with the gallery about the engine output not being sharp enough, even when I oversample the image at 150 percent or whatever. Yesterday I tried something new which really worked well: I set up Nvidia's DSR super-resolution mode so I could run the engine at 4K, which is downsampled to 1080p, and then in the engine I can set the screen percentage to 40 percent or something and get a really nice image. I think the engine's working just as hard (if not less) and the card is taking care of the output and doing a nice job of it. I've got a dynamic setting to make sure the engine output is locked to around 2.5 MPixels per frame regardless of the resolution.
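That dynamic ~2.5 MPixel budget amounts to solving for a screen percentage from the output resolution. A sketch of the arithmetic (my own hypothetical helper, not engine code - the percentage scales both axes, so pixel count goes with its square):

```python
import math

def screen_percentage(target_mpixels, out_width, out_height):
    """Screen percentage that keeps the internally rendered frame at roughly
    target_mpixels megapixels, regardless of the output resolution.
    The percentage scales both axes, so pixel count scales with its square."""
    out_mpixels = out_width * out_height / 1e6
    scale = math.sqrt(target_mpixels / out_mpixels)
    return min(100.0, round(scale * 100, 1))

# 2.5 MPixel budget at a 4K output:
print(screen_percentage(2.5, 3840, 2160))  # 54.9
# At 1080p the budget exceeds the output, so it clamps to 100:
print(screen_percentage(2.5, 1920, 1080))  # 100.0
```

The result could then be fed to `r.ScreenPercentage` each time the output resolution changes.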

Then it would result in reduced image quality, because you would be predicting (procedurally reconstructing) the odd lines of a full-size canvas. In video technology that might work, since the video signal is a bit blurry and the end result does not suffer much from visible artifacts and quality loss. In computer graphics we always talk about discrete pixels, and there is no room for such quasi-pixels. Without serious digging into the GPU BIOS and modifications, I don't think you have any chance of producing interlaced output directly from your GPU, as it is designed to work with full-size canvases only. The engine implements this standard as well, using the APIs the drivers provide.

Edit:

This actually brings us back to my previous post, where I mentioned that the vector and geometry information would need to be calculated twice per time frame - and that means not only producing pixel information for the geometry but actually processing all the triangles on the GPU. Doing this twice for a full-size canvas would result in a performance loss, equivalent to having twice as many triangles in the scene. As for the pixel-output side of things - producing two half frames versus one full frame - there is no performance difference, because the GPU is a multi-threaded chip whose individual threads work in parallel, so it makes no difference when and how you generate the digital image at the end.
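The trade-off being argued here can be written down as a toy cost model (illustrative numbers only, under the post's assumption that geometry processing runs once per render pass while total pixel work depends only on total pixel count):

```python
# Toy cost model for the trade-off above: two half-height fields per frame
# period double the per-pass geometry work, while total pixel count is equal.
TRIANGLES = 1_000_000          # triangles processed per render pass (made up)
WIDTH, HEIGHT = 1920, 1080

# One full frame per frame period:
full_frame = {"triangle_work": TRIANGLES,
              "pixels": WIDTH * HEIGHT}

# Two half-height fields per frame period:
two_fields = {"triangle_work": TRIANGLES * 2,        # geometry runs twice
              "pixels": 2 * WIDTH * (HEIGHT // 2)}   # same pixel total

assert two_fields["pixels"] == full_frame["pixels"]
assert two_fields["triangle_work"] == 2 * full_frame["triangle_work"]
```

Which side wins in practice depends on whether a given scene is geometry-bound or pixel-bound; the model just makes the two claims in the paragraph explicit.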

I'm sorry to hear that, though maybe you should have modified only the shader code and then pressed the Ctrl+Shift+. key combo in the editor to recompile the shaders. That doesn't require you to recompile the engine source code. If you have trouble finding the location of the line in a different engine build, just hit me up in the topic and we shall find the line number where to put that line of code.

Supersampling actually gives you better image clarity, but as you are aware, there are better anti-aliasing methods available you can try. If you can go with forward rendering, then MSAA should be an option for you, as it is the closest approximation of supersampling with decent framerates. SMAA is also a well-known anti-aliasing method with razor-sharp edges, though it suffers from line discontinuity and other image artifacts. Both MSAA and SMAA are also subject to specular aliasing problems, which you can probably address at the material level.
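If forward rendering plus MSAA is an option, in UE4 this is typically switched on via renderer settings in `Config/DefaultEngine.ini` (cvar names from memory - verify them against your engine version before relying on this):

```ini
; Config/DefaultEngine.ini -- enable forward shading with 4x MSAA.
[/Script/Engine.RendererSettings]
r.ForwardShading=1
r.DefaultFeature.AntiAliasing=3   ; 0=off, 1=FXAA, 2=TAA, 3=MSAA
r.MSAACount=4
```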

That is amazing, and I'm glad to hear you were able to set this up. I'll look into what else can be done with DSR, as I have never really tested this feature before!