Generating A Synthetic Dataset To Train An Image Classifier

    If you’re anything like me, you’ve probably heard all the hype about AI, machine learning, neural networks, etc., but have no idea how any of it works. That’s okay, join the club.
    [Image: Header.jpg]
    I used virtual images to train a computer to recognize real-world objects.

    This is an extension of Edje Electronics' tutorial where he re-trains a neural network to classify playing cards. I wanted to expand his tutorial to include the entire card deck… but with a twist. Instead of taking thousands of images of cards and manually labeling them, I used the Unreal Engine to take virtual images and auto-label them with the appropriate tag.
    [Image: labelling640Speed150.gif]
    (It'll drain your soul)

    The bounding box is determined by converting the object's max/min bounds in 3D space into 2D screen-space bounds.
    [Image: BoundingBox.gif]
    [Image: GetActorBoundsV2.png]
    (Epic did most of the work)
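For anyone who wants to see the idea outside of Blueprints, here is a minimal Python sketch of the 3D-to-2D bounds conversion. It is not the project's actual graph: it assumes a simple pinhole camera at the origin looking down +Z (x right, y down), which is not Unreal's axis convention, and the function name and parameters (`focal_px`, `cx`, `cy`) are made up for illustration.

```python
from itertools import product


def bounds_to_screen_box(bounds_min, bounds_max, focal_px, cx, cy):
    """Project a 3D axis-aligned box to a 2D box (xmin, ymin, xmax, ymax) in pixels.

    Assumes a pinhole camera at the origin looking along +Z (hypothetical setup).
    Returns None if any corner is at or behind the camera plane.
    """
    us, vs = [], []
    # The eight corners are every combination of min/max on each axis.
    for x, y, z in product(*zip(bounds_min, bounds_max)):
        if z <= 0:
            return None
        # Perspective divide, then shift to the image centre.
        us.append(focal_px * x / z + cx)
        vs.append(focal_px * y / z + cy)
    # The screen-space box is the min/max of the projected corners.
    return min(us), min(vs), max(us), max(vs)


box = bounds_to_screen_box((-1, -1, 4), (1, 1, 6), focal_px=100, cx=320, cy=240)
print(box)
```

All eight corners get projected, not just the min and max points, because after perspective projection the extreme screen coordinates can come from any corner.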

    The cards (and their corresponding tags/labels) are updated in real time so you can modify the type of card, location, backdrop, etc. as you need.
    [Image: CardSwapping.gif]
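To give a rough idea of what an auto-generated label could look like on disk, here is a small Python sketch that writes one row per captured image. The column layout (filename, image size, class, box corners) is an assumption modelled on the CSV files often used as an intermediate step in TensorFlow object detection tutorials; the file name, tag, and helper names are hypothetical and not taken from the project.

```python
import csv
import io

# Assumed column layout; adjust to whatever your training pipeline expects.
FIELDS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def make_row(filename, image_size, card_tag, box):
    """Build one label row, clamping the box to the image and rounding to pixels."""
    width, height = image_size
    xmin, ymin, xmax, ymax = box

    def clamp(value, limit):
        return int(round(min(max(value, 0), limit)))

    return {
        "filename": filename,
        "width": width,
        "height": height,
        "class": card_tag,
        "xmin": clamp(xmin, width),
        "ymin": clamp(ymin, height),
        "xmax": clamp(xmax, width),
        "ymax": clamp(ymax, height),
    }


def write_labels(rows, out):
    """Write label rows as CSV to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


buf = io.StringIO()
write_labels([make_row("card_0001.jpg", (640, 480), "ace_of_spades",
                       (295.0, 215.0, 345.4, 500.0))], buf)
print(buf.getvalue())
```

Clamping matters for synthetic captures because a projected box can extend past the edge of the frame when the object is only partly in view.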

    I tried to automate the system with a camera rig that can capture as many images as needed. It lets you define certain behaviors (number of rotations, pitch increments, distance from the object, etc.), and you can set whether a random range should be added to each setting. I also added some "rules" to ensure you get "good" images each time, which helps cut down on post-processing. In this instance, the camera does a ray trace to each corner of the card; if it can reach 3 of the 4, it's a "good" image. This allows the system to track cards that are partially occluded by an object (like your hand).
    [Image: RayTraceTest.gif]
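The "3 out of 4 corners" rule can be sketched in plain Python like this. It is only an illustration under stated assumptions: occluders are approximated as axis-aligned boxes and the trace is a standard slab test on the segment from the camera to each corner, whereas the engine would run its own trace against the real scene geometry. All function names here are hypothetical.

```python
def segment_hits_box(start, end, box_min, box_max):
    """Slab test: does the segment from start to end pass through the box?"""
    t_enter, t_exit = 0.0, 1.0
    for s, e, lo, hi in zip(start, end, box_min, box_max):
        d = e - s
        if abs(d) < 1e-12:
            # Segment is parallel to this slab; it misses if it starts outside.
            if s < lo or s > hi:
                return False
            continue
        t0, t1 = (lo - s) / d, (hi - s) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter = max(t_enter, t0)
        t_exit = min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True


def count_visible_corners(camera, corners, occluders):
    """Count corners whose line of sight to the camera is not blocked."""
    return sum(
        not any(segment_hits_box(camera, corner, lo, hi) for lo, hi in occluders)
        for corner in corners
    )


def is_good_image(camera, corners, occluders, required=3):
    """Accept the capture if at least `required` corners are visible."""
    return count_visible_corners(camera, corners, occluders) >= required


camera = (0.0, 0.0, 0.0)
card = [(-1, -1, 10), (1, -1, 10), (1, 1, 10), (-1, 1, 10)]
finger = ((0.4, 0.4, 4.5), (0.6, 0.6, 5.5))  # blocks the view of one corner
print(is_good_image(camera, card, [finger]))
```

Making `required` a parameter keeps the rule tunable, so a stricter or looser occlusion tolerance can be tried without touching the trace code.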
    Here's a comparison of a virtual image and a real one:
    [Image: CardComparison.jpg]
    It seems to work fairly well (I was honestly surprised it worked at all).
    [Image: CardsTracked.gif]

    Future Work:
    I'll try a more complicated scenario (object) now that I have a sort of benchmark to test with. I also want to test how "real" the images need to be and how well I can match that in the engine.
    [Image: PANO_Office_small.jpg]

    I also need to post some of the training data to the TensorFlow forums(?). There were definitely some differences between training on synthetic data vs. real data... but the synthetic data actually seemed to work better. Still, I don't know enough either way to be sure. Any feedback or critiques are absolutely welcome. I don't really know what I did or why I did it. It just seemed like an interesting experiment.


    This is really cool. Any updates on this?


      Would it be helpful to release the project files? It's pretty easy to generate a large dataset for either TensorFlow or PyTorch.


        Yeah, I guess it would be helpful. I'm really new to Unreal and this concept looks really interesting.