How to add image as observation when using Learning Agents

I absolutely could be wrong, but I’m wondering if there isn’t a slight misunderstanding happening.

Allen.The.Alien, are you thinking of feeding the pixel data directly into the model without performing any convolutions first?

While doable, it won’t be particularly efficient, and the size and complexity of the model will grow considerably. Even a 256x256 image per frame (196,608 values with three RGB channels) will dominate the other data passed to the model unless the image is first transformed with convolutions or attention.
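To put rough numbers on that, here's a quick back-of-the-envelope comparison (the function names and ray counts are just illustrative, not anything from Learning Agents):

```python
# Rough per-agent, per-frame observation sizes. All names and numbers
# here are illustrative assumptions, not Learning Agents API.

def image_observation_size(width: int, height: int, channels: int = 3) -> int:
    """Float inputs if raw pixels are fed straight into the policy."""
    return width * height * channels

def raycast_observation_size(num_rays: int, floats_per_ray: int = 2) -> int:
    """E.g. one normalized hit distance plus one hit flag per ray."""
    return num_rays * floats_per_ray

pixels = image_observation_size(256, 256)  # 196,608 inputs
rays = raycast_observation_size(32)        # 64 inputs
print(pixels, rays, pixels // rays)        # the image is ~3000x larger
```

Every other observation (velocity, health, a handful of raycasts) would be a rounding error next to the raw pixels, which is why some learned compression of the image is usually needed before concatenating it with the rest.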

Rendering the images at a resolution anywhere near human vision would make both training and inference difficult due to the sheer size of the data, and on top of that you have to account for every agent rendering its own image each frame.

At the moment, it would probably be simpler and more effective to use raycasts that pass only the information the agent needs to learn from and nothing more. Images carry a lot of information that is unusable, or even detrimental, to the model’s performance.
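As a sketch of what I mean by raycast observations, something along these lines: each ray contributes just a few task-relevant floats instead of thousands of pixels. These type names and the encoding are hypothetical, not anything Learning Agents provides:

```python
from dataclasses import dataclass

# Hypothetical sketch of packing raycast results into a flat observation
# vector; none of these names come from Learning Agents.

@dataclass
class RayHit:
    distance: float  # distance to the first hit, in world units
    hit_type: int    # 0 = nothing, 1 = wall, 2 = enemy, ...

def build_observation(hits: list[RayHit], max_range: float,
                      num_types: int) -> list[float]:
    obs: list[float] = []
    for hit in hits:
        # Normalized distance keeps the input in [0, 1].
        obs.append(min(hit.distance, max_range) / max_range)
        # One-hot encode the hit type so the policy can tell them apart.
        obs.extend(1.0 if t == hit.hit_type else 0.0
                   for t in range(num_types))
    return obs

obs = build_observation([RayHit(5.0, 1), RayHit(100.0, 0)],
                        max_range=50.0, num_types=3)
# -> [0.1, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0]
```

Two rays with three hit types give you eight floats total, and every one of them is something the agent actually needs, which tends to make training much faster than learning to ignore most of an image.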

If I’m misunderstanding, let me know.