Advice on generating realistic environments

kristijanmirceta · September 8, 2018, 3:58pm

Hello. I’m doing a master’s thesis on using synthetic scenes for training machine learning models. I’m asking for some advice on creating realistic scenes: basically neighborhoods, metropolitan areas and forests, just like in reality. I plan to scan these environments from a simulated airplane with a LiDAR scanner, then use the recovered data for machine learning. I have 2 ideas on how to go about this:

1. Get lots of community maps of realistic environments. In this case, could you give me some advice on where I could get these?
2. Use plugins for procedurally generating environments like these. Does anyone know of any good plugins?

Thanks in advance for helping!

darthviper107 · September 9, 2018, 2:46am

Probably neither of those options will work. I can’t see people providing you maps that they’ve put a lot of time into, and there aren’t any highly realistic procedural methods (the more realistic you want something, the more time it takes to make that work, since it adds more variables).
If you want a realistic scene, you’re probably going to have to make one yourself.

spacegojira · September 9, 2018, 10:35am

darthviper is kinda correct. There are plenty of realistic environments you can download on the net for free, but you’ll have a really hard time finding a “complete” solution, and these free resources likely won’t fit together in terms of quality and look (not to mention seamless transitions between areas).

Most likely you’ll have to assemble these scenes from multiple resources yourself, which is going to be a huge project if you’re looking for realism. Also take into account that you’ll have to use workarounds for a map that is more than 4 km².
AND you’ll still have to build a system that does the scanning and gives you correct, useful data, which is going to be quite a lot of work as well.

Why don’t you just use real world LiDAR scans? If the scanned data is what you’re after, then there won’t be much, if any, real difference in the data anyway, especially since you want it to be like in reality. And you’ll never be able to reach real world realism in Unreal Engine.

Not to sound rude, but I really don’t understand the benefit of trying to recreate a realistic environment in UE4 and then scanning it, versus just using real world scans.
I’m aware your thesis is about synthetic scenery, but it’s all just data in the end. If the synthetic scenery is the main point of your thesis, then why not go for a non-realistic style of environment (e.g. just cubes as buildings) and use that? It would save you a huge amount of time and work.

kristijanmirceta · September 9, 2018, 2:42pm

Hello guys, thanks for taking the time to respond. I think there has been a misunderstanding. I’m well aware that real world realism is not achievable, but I’ve seen recent research that uses synthetic scenes in Unity and Unreal Engine (e.g. AirSim for vision). So the scenes are real enough, even though they’re not totally real. Also, I don’t need one huge scene to scan. If I have many maps, each of them 1 km² or even less, that will be perfectly fine. I just need them to be as real as possible, not only rendering-wise but also content-wise. I don’t need space ships, aliens and fantasy castles. I need maps of neighborhoods, forests, and other stuff you would expect to see in the real world.

An example of similar work is this research paper: [1804.00103] A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving. The researchers scripted a mod for GTA V, which put a LiDAR scanner on top of a car and sent the collected data to a server. It was used to train self-driving car technology and was found to be beneficial. I want to do a similar thing, i.e. mount an airborne LiDAR scanner on the bottom of an airplane in GTA V, but that’s another story.

Also no worries, you are not rude at all. Yes, of course real world scenes would be the best, but here is the problem. When you scan actual real world environments, you get 3D scans, but they are not annotated. This means that while you know you hit something, and can determine its location and more, you don’t know what you actually hit. Sure, some datasets are annotated, but they are few (not enough), and if they were annotated by an algorithm (call it A), then that algorithm surely doesn’t have perfect accuracy. This means that my algorithm (call it B) would be bounded in accuracy by A.
You can imagine that it’s different with scenes in Unreal Engine. Say I implement a LiDAR scanner with ray casting. When you cast a ray and it hits something, you are able to retrieve the hit object. If the hit object is, say, an instance of a house, then I can annotate the point as a building. So the advantage of synthetic scenes is that you actually know what you are hitting, which allows me to get perfect training data.
The goal is to then train an algorithm on the synthetic data, and use it to predict the annotations of the real world data.
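
To make the ray casting idea above concrete, here is a minimal sketch in plain Python. It is not the UE4 implementation: the scene is just a list of axis-aligned boxes, and the names `Box`, `cast_ray` and `scan_line` are made up for illustration. Each ray fired straight down from the "airplane" returns the closest hit point together with the label of the object it hit, which is exactly the annotation that real scans lack.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned box standing in for a scene object (hypothetical, not an engine type)."""
    lo: Vec3
    hi: Vec3
    label: str


def ray_box_hit(origin: Vec3, direction: Vec3, box: Box) -> Optional[float]:
    """Slab test: distance along the ray to the box entry point, or None on a miss."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box.lo, box.hi):
        if abs(d) < 1e-12:
            # Ray is parallel to this pair of faces: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def cast_ray(origin: Vec3, direction: Vec3, scene: List[Box]):
    """Return (hit point, label) for the closest object along the ray, or None."""
    best = None
    for box in scene:
        t = ray_box_hit(origin, direction, box)
        if t is not None and (best is None or t < best[0]):
            best = (t, box.label)
    if best is None:
        return None
    t, label = best
    point = tuple(o + t * d for o, d in zip(origin, direction))
    return point, label


def scan_line(scene: List[Box], altitude: float, y: float,
              x_start: float, x_end: float, step: float):
    """Fly along x at a fixed altitude, firing one ray straight down per step."""
    points = []
    count = int((x_end - x_start) / step)
    for i in range(count + 1):
        hit = cast_ray((x_start + i * step, y, altitude), (0.0, 0.0, -1.0), scene)
        if hit is not None:
            points.append(hit)
    return points


# Toy scene: a ground slab, one building and one "tree" (all boxes).
scene = [
    Box((0.0, 0.0, -1.0), (100.0, 100.0, 0.0), "ground"),
    Box((20.0, 20.0, 0.0), (40.0, 40.0, 10.0), "building"),
    Box((60.0, 25.0, 0.0), (64.0, 35.0, 8.0), "tree"),
]
labeled_points = scan_line(scene, altitude=500.0, y=30.0,
                           x_start=0.0, x_end=100.0, step=10.0)
for point, label in labeled_points:
    print(point, label)
```

In an engine, the box test would be replaced by the engine's own line trace, and the label would come from the class or tag of the hit actor. A real airborne scanner also sweeps its beam across the flight direction instead of pointing straight down, which this sketch leaves out.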

Also, I have already built a system. It works correctly: I use the UE4 point cloud plugin to render the result every couple of frames. It needs some tweaks, but it will be ready soon. As I’ve seen that there was interest in such a tool on these forums, I’ll post it here soon. I just need actual scenes to scan.

kristijanmirceta · September 9, 2018, 2:46pm

But I don’t really understand why nobody would want to offer some maps. The development community has open sourced lots of tools, which means they are free for people to use. Surely there are some level designers who are willing to do the same? Also, I’m not going to use this work for commercial purposes. It’s a thesis, so its source code, results and paper will all be open for the public to view, and any researcher who wants to use it for their own purposes may do so freely.

junfanbl · September 13, 2018, 3:38pm

Well, you should understand that 3D art requires a good measure of time, energy and sometimes money. Not just the actual work, but the research it takes to pull it off. Even for an experienced developer, creating a scene large enough to provide reasonable amounts of data like that will take a lot of work. It’s unlikely someone would want to simply give their scene away, because of how work-intensive it can be, especially for a fairly realistic scene. Unless, of course, you pay someone; then they might consider it.

In the meantime, why not check out some city generators and create a procedural algorithm? That might actually be worth doing. Instead of acquiring a single static scene, a city generator can create different setups by simply changing the recipe (or inputs). This way you can perform a variety of tests. It will still require manual effort to add “realism” to the scene. A city generator can create a layout for you, but you still need to set up materials, lighting and whatnot.
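
As a rough illustration of the "recipe" idea, here is a small Python sketch of a grid-based layout generator. It does not correspond to any particular plugin; the function `generate_city` and all of its parameters (`block_size`, `street_width`, `park_ratio`, `max_height`) are invented for the example. The point is only that the same seed and inputs reproduce the same layout, while changing them gives a new variation to scan.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Lot:
    """One city block: footprint origin, size, building height and a class label."""
    x: float
    y: float
    width: float
    depth: float
    height: float
    label: str


def generate_city(seed: int, blocks_x: int, blocks_y: int,
                  block_size: float = 60.0, street_width: float = 12.0,
                  park_ratio: float = 0.15, max_height: float = 40.0) -> List[Lot]:
    """Lay out a rectangular grid of blocks separated by streets.

    Every parameter is part of the hypothetical recipe; the seed makes
    the random choices reproducible.
    """
    rng = random.Random(seed)
    pitch = block_size + street_width  # distance between block origins
    lots = []
    for i in range(blocks_x):
        for j in range(blocks_y):
            x, y = i * pitch, j * pitch
            if rng.random() < park_ratio:
                # Flat block with no building on it.
                lots.append(Lot(x, y, block_size, block_size, 0.0, "park"))
            else:
                height = rng.uniform(6.0, max_height)
                lots.append(Lot(x, y, block_size, block_size, height, "building"))
    return lots


# Same recipe twice gives the same city; a different seed gives a new one.
city_a = generate_city(seed=1, blocks_x=4, blocks_y=3)
city_b = generate_city(seed=1, blocks_x=4, blocks_y=3)
city_c = generate_city(seed=2, blocks_x=4, blocks_y=3)
print(len(city_a), city_a == city_b)
```

A layout like this only gives footprints and heights. Meshes, materials and lighting would still have to be added by hand or by further rules, as noted above.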

NotSoAccurateNo1 · September 13, 2018, 5:46pm

I know NVIDIA trains their self-driving AI with synthetic scenes, so I don’t really see why it can’t be useful. With that said, this sounds like a project that could be well into development without having realism. In which case you could box the scenes out and still achieve some degree of object recognition.

What’s keeping you from testing the tech with scenes like Kite Demo?